Breast cancer is the most common cancer and the second leading cause of cancer-related death among women in the U.S. Known breast cancer risk factors include age, race/ethnicity, reproductive factors, and benign breast disease. Family history of breast cancer and hereditary cancer syndromes, such as BRCA1/BRCA2 mutations, confer the strongest risk for this disease. Although a number of genome-wide association studies (GWAS) have sought genetic predictors of breast cancer, most have focused on high-risk cohorts of women with a strong family history rather than population-based cohorts, and few have examined genetic predictors by breast cancer subtype. Subtype matters: for example, BRCA1 mutation carriers tend to develop estrogen receptor (ER)-negative breast cancers, whereas the majority of breast tumors in BRCA2 mutation carriers are ER-positive. The purpose of this algorithm is to identify breast cancer subtypes based upon tumor hormone receptor (HR) status. FDA-approved drugs such as tamoxifen have been shown to reduce the incidence of ER-positive breast cancer by as much as 50-65% among high-risk women. Identifying cohorts of women who are more likely to benefit from anti-estrogen therapy may lead to a precision medicine approach to breast cancer prevention.
Comments
Breast densities whether Right or Left?
Hi,
FYI, KP’s pathology database captures both left and right breast densities. However, BreastCancerDd6_V1_breastDensity.csv does not include a variable to indicate which breast was measured (LEFT or RIGHT). How would you like us to handle cases where the right and left breast densities are coded differently in our path database?
Thanks,
Arvind
Reply
Hello Arvind,
Thanks for pointing out this problem. We have added a "location" column to capture this information. Please check the new data dictionary file, BreastCancerDd6_V1b_breastDensity.csv. Again, we really appreciate you catching this.
Best regards,
Sunny
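For sites that want to review the discordant cases before submitting, here is a minimal sketch of how they might be flagged once the side is recorded. It is only an illustration: the "location" column comes from the updated data dictionary, but the LEFT/RIGHT values, the record layout, the density codes, and the function name find_discordant_density are all assumptions, not part of the dictionary.

```python
from collections import defaultdict


def find_discordant_density(records):
    """Return sorted patient IDs whose LEFT and RIGHT density values differ.

    `records` is an iterable of (patient_id, location, density) tuples.
    This layout is hypothetical; adapt it to the actual extract.
    """
    # patient_id -> side -> set of density values seen for that side
    by_patient = defaultdict(lambda: defaultdict(set))
    for patient_id, location, density in records:
        by_patient[patient_id][location.strip().upper()].add(density)

    discordant = []
    for patient_id, sides in by_patient.items():
        left = sides.get("LEFT", set())
        right = sides.get("RIGHT", set())
        # Only flag patients who have values for both sides that disagree.
        if left and right and left != right:
            discordant.append(patient_id)
    return sorted(discordant)


# Made-up example rows, for illustration only.
example_records = [
    ("p1", "LEFT", "B"),
    ("p1", "RIGHT", "C"),
    ("p2", "LEFT", "A"),
    ("p2", "RIGHT", "A"),
    ("p3", "LEFT", "D"),
]
flagged = find_discordant_density(example_records)
```

Patients with a value for only one side are not flagged, since there is nothing to compare against.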
BMI data dictionary
In the BMI data dictionary (BreastCancerDd2_V1_BMI.csv) you have height listed in centimeters (cm) and weight in pounds (lbs). Did you want us to submit weight measures in kilograms (kg) or pounds (lbs)?
Reply to BMI weight unit
Thanks for catching this. I think kg is the better unit to use. I have updated the data dictionary accordingly; the new file is BreastCancerDd2_V1b_BMI.csv.
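For sites whose source systems store weight in pounds, the conversion to kg is straightforward. The sketch below is illustrative only; the function names are hypothetical, and the BMI formula shown is the standard weight (kg) divided by height (m) squared, not something specified in the data dictionary.

```python
LB_TO_KG = 0.45359237  # exact definition of the international pound in kg


def pounds_to_kg(weight_lb: float) -> float:
    """Convert a weight in pounds to kilograms."""
    return weight_lb * LB_TO_KG


def bmi(weight_kg: float, height_cm: float) -> float:
    """Standard BMI: weight in kg divided by the square of height in metres."""
    if height_cm <= 0:
        raise ValueError("height_cm must be positive")
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


# Example: a weight recorded as 154 lb with a height of 175 cm.
weight_kg = pounds_to_kg(154)
example_bmi = bmi(weight_kg, 175)
```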
Hormone receptor status
Regarding the hormone_receptor_status field in the demographics data dictionary (BreastCancerDd1_V1_Demo_phenotyping):
If a patient has an ER UNKNOWN and a PR NEG at age 21388 days, and subsequently has an ER NEG and a PR NEG at age 24042 days, would their hormone_receptor_status be classified as NEG or UNKNOWN?
Reply NEG or UNKNOWN
I think the hormone_receptor_status can be classified as UNKNOWN if ER is UNKNOWN and PR is NEG. Thank you.
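To make the rule concrete, here is a minimal sketch of how a single ER/PR pair might be mapped to hormone_receptor_status. Only the "ER UNKNOWN and PR NEG gives UNKNOWN" case is confirmed in this thread. The other branches (POS if either receptor is POS, NEG only if both are NEG) are assumptions based on common convention, and the function name is hypothetical. How to combine measurements taken at different ages is not settled here, so the sketch does not attempt it.

```python
def classify_hr_status(er: str, pr: str) -> str:
    """Map one ER/PR result pair to POS, NEG, or UNKNOWN.

    Confirmed in the thread: ER UNKNOWN + PR NEG -> UNKNOWN.
    Assumed (not confirmed): POS if either receptor is POS,
    NEG only when both receptors are NEG.
    """
    er = er.strip().upper()
    pr = pr.strip().upper()
    if er == "POS" or pr == "POS":
        return "POS"  # assumption: either receptor positive means HR-positive
    if er == "NEG" and pr == "NEG":
        return "NEG"  # assumption: both receptors must be negative
    return "UNKNOWN"  # covers ER UNKNOWN + PR NEG, as stated in the reply


# The case asked about in the question, evaluated per measurement date.
first_visit = classify_hr_status("UNKNOWN", "NEG")
second_visit = classify_hr_status("NEG", "NEG")
```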
BI-RADS Categories issues
I am seeing values 0-6, which don't match your DD. Is it possible that some of these come from a different scale than the one you are using? These are sometimes identified by Roman numerals, in case that is meaningful to you.
Provide description of 0-6
Thank you for finding this. Can you provide a textual description of values 0-6? I will then compare the two scales and figure out how to combine this information in the data dictionary.
BI-RADS 0-6 categories
The "BI-RADS 0-6 categories" are for reporting abnormal results on the mammogram, not for describing breast density. Our data dictionary is intended to capture the breast density description.
OMOP SQL to pull data in tables 1-5? and family history
Hi, 2 questions:
1) I see in tables 1-5 that there are OMOP concept IDs. Do you have OMOP queries you can share that pull these data?
2) For family history of breast cancer, do you just need a yes/no for any family history of breast cancer, or are you limiting it to first-degree relatives? And are you using NLP to pull this from notes, or just pulling from tumor registry data and other sources?
Thanks!
Jen