A phenotype defining patients with strong evidence of having been diagnosed with colorectal cancer (cases) and patients who clearly do not have such diagnoses (controls). This phenotype is being used for sequencing studies. The only NLP involved in this phenotype is a very simple string search applied to pathology reports.
Suggested Citation
David Carrell and Jane Grafton. Group Health/UW. Colorectal Cancer (CRC). PheKB; 2016 Available from: https://phekb.org/phenotype/514
Kathryn Jackson
Table 6.1
Fri, 2016-08-26 12:00
To identify screened controls, you require the patient to have at least one procedure code for a colonoscopy (6.1.2), and, for the not screened controls, to have no evidence of having received a colonoscopy (6.3.1). Both of these requirements reference Table 6.1, but I do not see this in the document. Can you please advise? Thanks!
David Carrell
Table 6.1
Wed, 2016-08-31 23:40
My apologies; Table 6.1 was inadvertently omitted from the pseudo code document. Will correct that and repost the pseudo code.
-David
David Carrell
Table 6.1
Thu, 2016-09-01 02:37
An updated version of the pseudo code, dated 8/31/2016, has been posted. It includes the (previously omitted) Table 6.1. Please note that there will be some changes to Table 6.1 within a few days, but you may begin programming to it now. Some of the codes in the list will probably be excluded.
Cheers,
-David
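The screened / not screened distinction discussed in this thread could be sketched roughly as follows. This is only an illustration, not the authors' implementation: Table 6.1 is not reproduced on this page, so `SCREENING_PROCEDURE_CODES` holds a single stand-in code (CPT 45378, diagnostic colonoscopy), and the function name `control_type` is hypothetical. It also assumes the patient has already passed every other control eligibility rule.

```python
# Stand-in code set: replace with the procedure codes from Table 6.1 of the
# pseudo code document. CPT 45378 is used here only as a placeholder.
SCREENING_PROCEDURE_CODES = {"45378"}


def control_type(procedure_codes):
    """Classify an otherwise eligible control by colonoscopy evidence.

    Returns "screened" when at least one of the patient's procedure codes is
    in the screening list, and "not_screened" when none are.
    """
    if SCREENING_PROCEDURE_CODES & set(procedure_codes):
        return "screened"
    return "not_screened"
```

For example, `control_type(["45378", "99213"])` gives `"screened"`, while a patient with no procedure codes at all is `"not_screened"`.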
David Carrell
Data dictionary updated 9/7/2016
Wed, 2016-09-07 13:36
An updated version of the data dictionary (CRC_DataDict_PersonLevel_2016_09_07.csv) was posted today. This version adds descriptions, missing from the prior version, that are needed to operationalize a few of the variables. We believe the dictionary is now complete, but please reach out with any questions.
-David
David Carrell
Updated pseudo code with updated Table 6.1
Wed, 2016-09-07 14:22
Updated today is the pseudo code document (GH_UW_CRC_Ptype_Pseudocode_2016_09_07.pdf), now including an updated list of codes used to identify relevant CRC screening procedures. Note that this revised set of codes is a subset of the codes originally posted 9/1/2016 (dated 8/31/2016). Let us know if you have any questions or concerns.
-David
Kathryn Jackson
Crohn's Exclusion
Wed, 2016-11-30 14:36
Hi!
During validation, we noticed that ICD-9 code 555.9 is not included on the exclusion list for Crohn's/UE. We are finding cases who do have this ICD-9 code and Crohn's present in their charts. Should we keep these patients in?
Thanks!
-Katie
David Carrell
Crohn's Exclusion
Wed, 2016-11-30 19:16
Thank you for catching this issue, Katie. ICD-9 code 555.9 should be added to the list of exclusion codes in Table 4.1. We do not want subjects with this code to be eligible cases. We will update Table 4.1 in the pseudo code document for CRC and re-post it to PheKB as soon as possible.
-David
David Carrell
Ready for network implementation
Mon, 2017-01-30 20:01
Hi All,
The CRC phenotype is ready for network-wide implementation. The pseudo code was updated as of today (adding the one overlooked ICD-9 code Kathryn found--thanks, Kathryn!). Also added was a list of ICD-10 codes, but these are unlikely to be of much use as they have not been validated and can't be validated at Group Health/UW (see note inside document "GH_UW_CRC_ICD-10_Codes_2017_01_30.docx").
Cheers,
David
Kathryn Jackson
Thu, 2017-03-09 13:07
Hi!
I noticed in the data dictionary for covariate "Cancer_Dx_Age", there is the requirement to exclude squamous or basal cell carcinoma of the skin, melanoma in situ, carcinoma in situ of the colon or rectum, or carcinoma in situ of the cervix. Are ICD-9 codes available for this?
Additionally, are there codes available for the Radiation to the pelvis covariates?
Thanks!
-Katie
David Carrell
Codes for radiation-involving imaging of the pelvis/abdomen
Wed, 2017-04-26 11:58
We are assembling a list of procedure codes for radiation-involving imaging of the pelvis/abdomen.
-David
Jen Pacheco
Question REL the DD: for NSAIDS
Tue, 2017-03-21 13:20
The DD (data dictionary) says “Age in years at earliest NSAID prescription fill (requires local site definition of all NSAID medications)” and “Count of days on NSAID medications (using days' supply from medication fills)”. We don’t have a local definition, and we don’t have access to medication fills, yet these are required fields. How do you suggest we proceed? Do you have a list of NSAID meds by generic name, and would a prescription for these or their existence in the (home or current) medication list, where they exist, be good enough (as that’s all we would have)?
David Carrell
DD questions
Tue, 2017-04-04 17:24
Working on these questions about the CRC data dictionary (DD). Will update PheKB when we have updated the DD. Thanks for your patience.
David
David Carrell
Revised data dictionary April 5, 2017
Wed, 2017-04-05 08:53
Please note that a revised data dictionary has been provided today (CRC_DataDict_PersonLevel_2017_04_05.csv). This dictionary provides additional details for covariates Cancer_Dx_Age, NSAID_Age, and NSAID_Days_Supply. Please reach out with any questions.
-David
Kathryn Jackson
Tue, 2017-04-18 11:55
Thanks, David!
Are there any codes available for the radiation of the pelvis covariates (Rad_Pelvis_Age, Rad_Pelvis_Days)?
Thanks!
-Katie
David Carrell
Dropping covariate for radiation of the pelvis
Tue, 2017-05-02 11:06
Hi All,
We have decided *not* to include among the covariates for the CRC phenotype a measure of exposure to radiation of the pelvis. We will update the data dictionary to reflect that this measure has been dropped.
-David
PS: We are also providing a separate data dictionary to capture BMI repeated measures.
Jim Linneman
Question regarding heights, weights, BMI in DD
Fri, 2017-04-07 11:55
Your data dictionary indicates that you want height, weight, and BMI as repeated measures.
Do you want all heights, weights, and BMIs for each subject?
If so, do we send this as a separate file and include the following fields: emergeid, bmi_age, height, weight, bmi?
If you only want one BMI measure, at what timepoint do you want the measure from?
David Carrell
Question regarding heights, weights, BMI in DD
Fri, 2017-04-07 12:19
Thank you for this question. We do need repeated measurements. For simplicity, please provide all available measurements (going as far back as is reasonable given your data environment). And, yes, please do provide these measurements as a separate file, including EMERGEID, BMI_AGE, HEIGHT, WEIGHT, and BMI (as defined in the person-level DD). We will post a separate DD for BMI as soon as possible (and update the person-level DD to reflect this).
-David
David Carrell
Data dictionaries revised, 4/7/2017
Sat, 2017-04-08 01:49
Please note that the data dictionaries for the CRC phenotypes have been revised as of 4/7/2017. There are now two dictionaries. The first is the person-level dictionary (named CRC_DataDict_PersonLevel_2017_04_07.csv). It is exactly like the prior version except that the BMI variables have been removed. The second is the BMI repeated measures dictionary (named CRC_DataDict_BMI_Repeated_2017_04_07.csv). It contains the height, weight, and BMI variables that were previously (and incorrectly) included in the person-level dictionary, plus fields EMERGEID and BMI_AGE to document the age at which each subject's measurements were taken. As noted in the description for field BMI_AGE, please provide age with two digits of precision to the right of the decimal point.
Reach out with any questions.
Thanks,
David
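A minimal sketch of writing the BMI repeated measures file described above, one row per measurement. The column names and the two-decimal BMI_AGE format come from this thread; the function name `write_bmi_rows`, the input shape (a list of dicts), and the choice to pass height, weight, and BMI through unchanged are assumptions. Check units and value ranges against CRC_DataDict_BMI_Repeated_2017_04_07.csv.

```python
import csv
import io

# Column order taken from the thread; confirm against the BMI data dictionary.
FIELDS = ["EMERGEID", "BMI_AGE", "HEIGHT", "WEIGHT", "BMI"]


def write_bmi_rows(measurements, out):
    """Write one CSV row per measurement, with BMI_AGE fixed to two decimals.

    `measurements` is assumed to be an iterable of dicts keyed by FIELDS.
    HEIGHT, WEIGHT, and BMI are written as provided (no unit conversion).
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for m in measurements:
        row = dict(m)
        row["BMI_AGE"] = f"{m['BMI_AGE']:.2f}"
        writer.writerow(row)


# Usage: write to an in-memory buffer (swap in an open file for real output).
buffer = io.StringIO()
write_bmi_rows(
    [{"EMERGEID": "X1", "BMI_AGE": 54.5, "HEIGHT": 170, "WEIGHT": 70, "BMI": 24.2}],
    buffer,
)
```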
David Carrell
CRC data dictionaries updated
Tue, 2017-05-02 11:15
Hi All,
This message is to confirm that the current (and hopefully final) versions of the CRC phenotype data dictionaries are:
The 5/2/17 person-level dictionary omits two measures related to radiation exposure of the pelvis.
Reach out with any questions,
David
Adelaide M. Arr...
NSAIDs and data dictionary
Wed, 2017-05-03 14:23
NSAIDs are included in the data dictionary; however, we could not find a list of NSAIDs.
Could you please clarify if this data element (NSAID) should be included in this algorithm or not?
Thank you very much
Regards,
Adelaide Arruda-Olson
David Carrell
Wed, 2017-05-03 18:05
Please see the instruction on the row of the DD for this measure. It provides a method for identifying NSAIDs locally (which is an unavoidably local task, unfortunately).
David
Adelaide M. Arr...
NSAIDs
Fri, 2017-05-05 09:56
We are assuming NSAID is the abbreviation for non-steroidal anti-inflammatory drugs, correct?
If the assumption is correct, the NSAID category covers a group of different medications. Some examples are listed below:
aspirin, celecoxib, diclofenac, ibuprofen, indomethacin.
Which medications should we use to generate the NSAID covariate listed in the data dictionary?
Could you please clarify?
Thank you very much.
David Carrell
Fri, 2017-05-05 11:55
That is correct--NSAID is the acronym for non-steroidal anti-inflammatory drugs. Apologies for not specifying that.
The descriptions of the NSAID measures in the data dictionary describe the method we suggest you apply locally to identify drugs in this category. From the data dictionary: "Age in years at earliest NSAID prescription fill (requires local site definition of all NSAID medications; recommend including as NSAIDs all medications where the therapeutic class contains the string “NSAID”)." As noted in a prior post on this page, it is, unfortunately, not feasible for any one eMERGE site to know what medications are used at any other eMERGE site, and this is the case for all medications, not just NSAIDs. Nor is it possible for us to share an exhaustive list of national drug codes (NDCs) for NSAIDs (or any other medication class) from a commercially available list (such as First DataBank), because that would violate the terms of our contract with the data vendor.
Best,
David
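The local NSAID method recommended above (therapeutic class contains the string "NSAID") could be sketched as follows. This is a hedged illustration, not the authors' code: the record layout (`therapeutic_class`, `fill_date`, `days_supply`), the function names, the case-insensitive match, age computed as days / 365.25, and the simple sum of days' supply without adjusting for overlapping fills are all assumptions. The data dictionary remains the authority on how NSAID_Age and NSAID_Days_Supply are defined and on how to code patients with no NSAID fills.

```python
from datetime import date


def is_nsaid(therapeutic_class):
    """True when the therapeutic class text contains "NSAID" (case-insensitive)."""
    return "NSAID" in (therapeutic_class or "").upper()


def nsaid_covariates(fills, birth_date):
    """Return (age at earliest NSAID fill in years, total days' supply).

    `fills` is assumed to be a list of dicts with keys `therapeutic_class`,
    `fill_date` (datetime.date), and `days_supply` (int).
    Returns (None, 0) when the patient has no NSAID fills.
    """
    nsaid_fills = [f for f in fills if is_nsaid(f["therapeutic_class"])]
    if not nsaid_fills:
        return None, 0
    first_fill = min(f["fill_date"] for f in nsaid_fills)
    age_years = (first_fill - birth_date).days / 365.25
    total_days = sum(f["days_supply"] for f in nsaid_fills)
    return round(age_years, 2), total_days


# Usage with made-up fills for one patient.
example_fills = [
    {"therapeutic_class": "NSAID, oral", "fill_date": date(2010, 1, 1), "days_supply": 30},
    {"therapeutic_class": "Statin", "fill_date": date(2009, 6, 1), "days_supply": 90},
]
nsaid_age, nsaid_days = nsaid_covariates(example_fills, date(1960, 1, 1))
```

Sites without fill data, as raised earlier in this thread, would need a different source for `fills` (for example prescriptions or medication lists), which changes the meaning of the days' supply count.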
Adelaide M. Arr...
RxNorm CUI
Fri, 2017-05-05 14:07
Have you considered generic names of drugs and RxNorm CUI numbers?
PS: Our site will create our list of generic names. Thank you.
Regards.
Adelaide
David Carrell
Mon, 2017-05-15 22:37
Thanks, Adelaide. We did not use generic names of drugs or RxNorm CUI numbers to identify NSAID medications. However, if that is the method that works best at your site (and you can defend it), please feel free to do so. Happy to discuss in person if that would be helpful.
David
Ken Borthwick
Rituxan used to treat Secondary Thrombocytopenia
Mon, 2017-05-15 13:44
Our chart reviews uncovered an interesting scenario that turns into a case by the 5.3 rule. If a patient were to have at least one CRC dx code ever, had chemotherapy treatment related codes, and had no other cancer dx codes, that patient becomes a case. However, if that one CRC dx code was an initial dx that was later found to be something else (like a polyp) and the patient was administered Rituxan as treatment for Secondary Thrombocytopenia, that patient would be falsely labeled as a case. How should I proceed?
Thanks,
Ken
David Carrell
Mon, 2017-05-15 22:43
Hi, Ken. Very interesting case. I will take this up with the PIs and post a response as soon as I have their input.
My inclination is to update the algorithm to "code around" this particular constellation of codes, but I'll get a definitive answer soon.
I'm guessing you did *not* review a large enough random sample to answer the following question, but if you did, please share what you know: Can you estimate the percentage of algorithm-assigned CRC cases that would turn out to be false positives because they only have Secondary Thrombocytopenia?
Thanks,
David
David Carrell
Tue, 2017-05-16 13:31
Hi All,
Thank you, again, Ken for discovering the potential for rule 5.3 of our CRC algorithm to yield false positive CRC cases--in rare circumstances. We have updated the pseudo code today (5/16/2017) to address this particular clinical pattern (presence of a CRC diagnosis code that turns out to be a rule-out code and is followed by a diagnosis of thrombocytopenia treated with Rituxan; Rituxan is also used as a chemotherapy for CRC). The revised pseudo code dated May 16, 2017 has been posted to PheKB, and the revisions are visible as tracked changes.
To any site that has already implemented the CRC phenotype: We would appreciate your checking whether any of your patients qualifying as CRC cases by rule 5.3 also had diagnoses of thrombocytopenia (using the two diagnosis codes provided in the revised pseudo code), and if so removing them from your set of cases.
Thank you,
David
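For sites that have already implemented the phenotype, the check requested above could look roughly like this. It is a sketch under stated assumptions: the two thrombocytopenia diagnosis codes are given in the revised pseudo code and are not reproduced on this page, so `THROMBOCYTOPENIA_DX_CODES` contains placeholders that must be replaced, and the function name and input shapes are hypothetical.

```python
# Placeholders only: substitute the two thrombocytopenia diagnosis codes
# listed in the revised (5/16/2017) pseudo code document.
THROMBOCYTOPENIA_DX_CODES = {"THROMBO_DX_1", "THROMBO_DX_2"}


def drop_thrombocytopenia_cases(rule_5_3_case_ids, dx_codes_by_patient):
    """Keep only rule 5.3 cases with no thrombocytopenia diagnosis code.

    `rule_5_3_case_ids` is a list of patient IDs that qualified by rule 5.3.
    `dx_codes_by_patient` maps patient ID to an iterable of diagnosis codes.
    """
    return [
        patient_id
        for patient_id in rule_5_3_case_ids
        if not THROMBOCYTOPENIA_DX_CODES & set(dx_codes_by_patient.get(patient_id, ()))
    ]
```

Only patients who became cases through rule 5.3 should be passed in, since the request above concerns that rule alone.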
Ken Borthwick
Tue, 2017-05-16 13:57
Thanks! I added the change and the issue is resolved.
Ken Borthwick
Tue, 2017-05-16 13:54
As we only had 5 cases in our eMERGE cohort that followed the rules in 5.3, that one false positive case was all I could find. I have not run the algorithm on a larger cohort yet.
David Carrell
Tue, 2017-05-16 13:56
Good to know. -David
David Carrell
May 15, 2017 update to the CRC data dictionary
Mon, 2017-05-15 23:17
Hi All,
Andy Cagan discovered a flaw in the CRC data dictionary (thanks, Andy!) which we have corrected in the just-released 5/15/2017 version. The problem was that the dictionary instructed you to use the period (".") to indicate missing data for some integer variables, but periods, alone, are not acceptable values for integer variables. We corrected this problem in the 5/15/2017 version of the data dictionary by asking you to instead indicate missing values with "999" or "99999" values (depending on the valid range of integer values). This was the only change in the data dictionary.
Thanks again, Andy, for catching this error.
David
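One way to apply the missing-value convention described above when exporting integer variables is sketched below. The variable names come from this thread, but which sentinel ("999" or "99999") belongs to which variable is set by the 5/15/2017 data dictionary; the assignments in `MISSING_SENTINELS` are illustrative guesses, and the function name is hypothetical.

```python
# Illustrative assignments only: look up the correct sentinel for each
# integer variable in the 5/15/2017 data dictionary.
MISSING_SENTINELS = {
    "NSAID_Age": 999,
    "NSAID_Days_Supply": 99999,
}


def encode_integer(variable, value):
    """Return `value` as an int, or the variable's sentinel when it is missing.

    Missing is represented by None; a bare period (".") is never emitted.
    """
    if value is None:
        return MISSING_SENTINELS[variable]
    return int(value)
```

For example, `encode_integer("NSAID_Age", None)` yields `999` under the illustrative mapping, and a present value such as `45` is passed through unchanged.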
Robert J Carroll
Table 5.1
Tue, 2017-05-16 16:58
The table asks to exclude certain tumor histologies: is it possible to have those specified by codes? E.g., 903 for malignant mesothelioma.
Robert J Carroll
Tue, 2017-05-16 16:59
* 905
Hence my concerns! Thanks.
David Carrell
Wed, 2017-05-17 10:30
An updated 5/17/2017 version of the pseudo code has been provided that now includes ICD-O-3 histology codes for excluded tumor histologies listed in Table 5.1 (p. 10).
David
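A code-based histology exclusion of the kind requested above might be sketched like this. Only the mesothelioma example (905) comes from this thread; the full list is in Table 5.1 of the pseudo code and is not reproduced here. Matching on the first three digits of an ICD-O-3 morphology code written as `NNNN/B`, and the names used, are assumptions.

```python
# Example prefix from the thread (905, mesothelioma). The full set of
# excluded histologies must be taken from Table 5.1 of the pseudo code.
EXCLUDED_HISTOLOGY_PREFIXES = ("905",)


def is_excluded_histology(icd_o_3_morphology):
    """True when the histology part of an ICD-O-3 code has an excluded prefix.

    Accepts codes such as "9050/3" (histology/behavior) or a bare "9050".
    """
    histology = icd_o_3_morphology.split("/")[0].strip()
    return histology.startswith(EXCLUDED_HISTOLOGY_PREFIXES)
```

With this sketch, `is_excluded_histology("9050/3")` is true, while an adenocarcinoma code such as `"8140/3"` is not excluded.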
Robert J Carroll
Wed, 2017-05-17 11:55
Thank you!
David Carrell
5/31/2017 update to pseudo code document
Thu, 2017-06-01 17:21
Thanks to Robert Carroll's careful review of the code sets presented in the pseudo code tables (thank you, Robert!), we have identified a small number of codes that should not be included and have deleted them from the tables (see tracked changes and annotations added to the tables). These codes are for rare conditions that are not expected to impact the overall quality of the phenotype data. Nevertheless, if you have not already completed implementing the CRC pseudo code at your site, please remove the deleted codes as noted in the updated pseudo code document.
Thank you,
David
Robert J Carroll
Question re case/control types
Thu, 2017-06-15 16:46
I noticed that there was no opportunity in the data dictionary to specify what type of case or type of control an individual was. This seemed especially important for controls (screened vs. unscreened). Did I miss this somewhere?