Electronic Health Record-based Phenotyping Algorithm for Familial Hypercholesterolemia

Familial hypercholesterolemia (FH) is a relatively common Mendelian genetic disorder associated with elevated plasma low-density lipoprotein cholesterol (LDL-C) levels and a dramatically increased lifetime risk of premature atherosclerotic cardiovascular disease (ASCVD). FH can be diagnosed based on clinical presentation and/or genetic testing results, with a positive genetic test considered the “gold standard”. Clinical diagnosis is based on a set of criteria that includes lipid panel testing, personal and family history of hypercholesterolemia or premature ASCVD, presence of xanthomas on extensor tendons or thickening of the Achilles tendon, and early corneal arcus. We provide pseudocode to identify cases and controls for primary hypercholesterolemia, followed by FH. Structured data are processed using preset codes, and unstructured data are processed using natural language processing (NLP). The final output consists of (i) a case/control/unknown status for primary hypercholesterolemia, (ii) demographics of each individual (age at the time of qualifying LDL-C ascertainment, gender, race/ethnicity), (iii) lipid profile (total cholesterol, LDL-C, HDL-C, triglycerides), (iv) lipid-lowering treatment and the time difference between the index date and the date of treatment ascertainment, (v) personal history of premature ASCVD and/or hypercholesterolemia, (vi) family history of premature ASCVD, (vii) xanthomas and/or early corneal arcus, and (viii) the Dutch Lipid Clinic Network score and a case/control/unknown status for FH.

Date Created: Thursday, November 10, 2016

Suggested Citation

Safarova MS, Liu H, Arruda-Olson A, Rastegar M, Smith C, Cheng Y, Fan X, Balachandran P, Sohn S, Kullo IJ. Mayo Clinic. Electronic Health Record-based Phenotyping Algorithm for Familial Hypercholesterolemia. PheKB; 2016. Available from: https://phekb.org/phenotype/602

PubMed References

27678441

    Comments

    At Partners HealthCare, our clinical notes are not structured in a standard way, and it is challenging to use NLP to determine family history. We would like to know more about your NLP methods.

    1. Does the program require that notes be in a specific format? Do the notes have to include a family history section?

    2. Are the notes coming from a particular software program?

    3. Did the validation site have the same formatting in their notes?

    4. How does the Java program deal with unstructured notes in other formats? If so, has this been tested and does it work?

    Thanks, Beth Karlson

    Dear Beth, Thank you for this feedback. Please see our comments below. Maya.

    1.       Does the program require that notes be in a specific format? Do the notes have to include a family history section?
    - NLP is run using MedTagger. Per “FH_eAlgorithm_Pseudocode_FullText_2016”, a link to the installation and user guides can be found here:
    http://ohnlp.org/index.php/MedTagger_Project_Page
    There is no specific requirement regarding the format of the patient notes (free text).
    At the primary site, Mayo Clinic, we used solely the “Family History” section of clinical notes.
    Please see “VALIDATION OF THE FH eALGORITHM IN THE GEISINGER HEALTH SYSTEM_2017” for the feedback from the validation site: “In selected cases based on the adopted strategy to record encounters in the index implementation center, search space for the family history of early-onset ASCVD could be expanded to the “Personal|Past Medical History”.”

    2.       Are the notes coming from a particular software program?
    Given the diversity of medical language, we advise modifying the NLP system to match the strategy adopted for recording encounters at the implementation site.
    Regardless of the EHR vendor, free text within generated clinical notes is amenable to MedTagger.

    3.       Did the validation site have the same formatting in their notes?
    Since primary and validation sites used different EHR vendors, there were differences in the baseline formatting of the clinical notes.

    4. How does the Java program deal with unstructured notes in other formats? If so, has this been tested and does it work?
    Given that the input is text per se from the sections relating to individual family history (FHx), there should not be any issue regarding formatting. A list of references that may be helpful can be found here: http://ohnlp.org/index.php/OHNLP_Publications
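To illustrate the idea of restricting the NLP input to a "Family History" section, here is a minimal Python sketch. It is not part of MedTagger or of the published algorithm. The header names in `KNOWN_HEADERS`, the assumption that a header starts a line and ends with a colon, and the function names are all hypothetical and would need to be adapted to each site's notes.

```python
import re

# Hypothetical header names; each site would substitute its own.
KNOWN_HEADERS = [
    "Family History",
    "Past Medical History",
    "Personal History",
    "Medications",
]

# Assumption: a header starts a line and is followed by a colon.
HEADER_RE = re.compile(
    r"^[ \t]*(" + "|".join(re.escape(h) for h in KNOWN_HEADERS) + r")[ \t]*:",
    re.IGNORECASE | re.MULTILINE,
)


def extract_sections(note, wanted):
    """Return {header: text} for each wanted header found in the note."""
    canonical = {h.lower(): h for h in KNOWN_HEADERS}
    matches = list(HEADER_RE.finditer(note))
    sections = {}
    for i, match in enumerate(matches):
        header = canonical[match.group(1).lower()]
        # A section runs until the next known header or the end of the note.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        if header in wanted:
            sections[header] = note[match.end():end].strip()
    return sections


def family_history_search_space(note):
    """Prefer the Family History section; fall back to the whole note."""
    sections = extract_sections(note, {"Family History"})
    return sections.get("Family History", note)
```

When no recognizable header is present, `family_history_search_space` returns the entire note, so the downstream NLP step still receives plain text.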

    For sites like ours that are not able to distinguish fasting from non-fasting glucose tests, can we use a regular glucose test as the exclusion criterion (Table 2A), and if so, what would the cutoff be (still >220 mg/dl)?

    Here are some examples that include genderless terms (child, kid, sibling, etc.):

    For family history of premature ASCVD: Patient has three siblings, one of whom had a stroke at age 56.

    For family history of premature CHD: The patient has two siblings total one died at age 49 of heart disease and another is living at age 70.

    Maya, Thanks for your response to my initial question. In our EHR, we do not have a separate family history section. Does your program work if there is no clear designation of a section?  Have any of the other sites implemented your program in unstructured notes? Thanks, Beth

    Hi Beth, In EHRs with no designated family history section (structured or unstructured), this system will scan the whole text. In this scenario, we anticipate an increased probability of false-positive results.
    However, there is no feedback to share yet from sites whose EHRs lack note sections/types. We look forward to learning from your experience. P.S. One thought to consider could be demarcating a search space for, e.g. keywords, negation --> age brackets and relatedness, within 2-3 sentences. Thank you! Maya
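The suggestion of demarcating a search space (keyword, negation, then age and relatedness within a few sentences) could be sketched roughly as follows. This is an illustrative Python sketch only, not the MedTagger rule set. The term lists, the same-sentence negation check, the `max_age=60` cutoff, and the `window=2` sentence radius are assumptions chosen for the example and would need clinical review.

```python
import re

# Illustrative term lists (assumptions, far from exhaustive).
RELATIVE_RE = re.compile(
    r"\b(?:siblings?|child(?:ren)?|kids?|parents?|mother|father|"
    r"brothers?|sisters?|sons?|daughters?)\b",
    re.IGNORECASE,
)
EVENT_RE = re.compile(
    r"\b(?:stroke|heart disease|heart attack|myocardial infarction)\b",
    re.IGNORECASE,
)
NEGATION_RE = re.compile(r"\b(?:no|denies|negative for|without)\b", re.IGNORECASE)
AGE_RE = re.compile(r"\baged?\s+(\d{1,3})\b", re.IGNORECASE)


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_premature_family_events(text, max_age=60, window=2):
    """Return (sentence, youngest_age) pairs for candidate mentions.

    A sentence qualifies when it names an event without a negation cue,
    and a relative term plus an age below max_age both occur within
    `window` sentences on either side.
    """
    sentences = split_sentences(text)
    hits = []
    for i, sentence in enumerate(sentences):
        if not EVENT_RE.search(sentence) or NEGATION_RE.search(sentence):
            continue
        context = " ".join(sentences[max(0, i - window):i + window + 1])
        if not RELATIVE_RE.search(context):
            continue
        early = [int(a) for a in AGE_RE.findall(context) if int(a) < max_age]
        if early:
            hits.append((sentence, min(early)))
    return hits
```

Because any age in the window can satisfy the check, this approach trades precision for recall, so flagged sentences would still need manual review.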

    Does MedTagger run only against files, or can it be run against a database as well? From reading the User Guide, it looks as though it runs against either a single file or multiple files. Is that correct?

    Thanks, Barbara