Drug Induced Liver Injury

An algorithm to identify inpatients who have had an acute episode of drug induced liver injury (DILI).

Summary of drug-induced liver injury algorithm

Inclusion criteria

A. Suspect DILI? (NOTE: baseline population is institution specific.  See institution implementation details)

1.     Liver injury AND Exposure to drug (NOTE: medications are institution specific. See institution implementation details)

2.     Temporal relationship of exposure to drug and liver injury diagnosis.

o    Exposure to drug within 3 months prior to liver injury diagnosis
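The temporal check in A.2 could be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the 90-day approximation of "3 months" are assumptions, and a site may prefer calendar months.

```python
from datetime import date, timedelta

# Assumption: "3 months" is approximated as 90 days.
EXPOSURE_WINDOW = timedelta(days=90)


def exposed_before_injury(drug_start: date, injury_dx: date,
                          window: timedelta = EXPOSURE_WINDOW) -> bool:
    """Return True if the drug was started on or before the liver injury
    diagnosis date and no more than `window` before it."""
    gap = injury_dx - drug_start
    return timedelta(0) <= gap <= window


# Example: drug started 60 days before the diagnosis.
print(exposed_before_injury(date(2012, 1, 1), date(2012, 3, 1)))
```

Whether a drug started on the diagnosis date itself should count is a local decision; the sketch includes it.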

B. Acute liver injury?

C. Does it reach the threshold to qualify as DILI?

1.     ALP >= 2x ULN OR ALT >= 5x ULN OR (ALT >= 3x ULN AND Bilirubin >= 2x ULN)

2.     Temporal relationship of laboratory measurements and acute liver injury diagnosis.

o    Normal laboratory measurements prior to exposure to drug.

o    Laboratory measurements reach threshold for DILI following drug administration.

Exclusion criteria

D.  Patients with other diagnoses (cancer, gallbladder disease, pancreatic disease, heart failure, alcohol abuse/liver damage/toxic effects, HIV infection, rheumatoid arthritis, sarcoidosis, systemic lupus, viral hepatitis, sclerosing cholangitis, chronic liver disease, organ transplantation, liver operation, death, overdose). (NOTE: excluded diagnoses are institution specific. See institution implementation details).

Do Not List on the Collaboration Phenotypes List
Authors: Chunhua Weng and Casey Overby
Date Created: Friday, December 7, 2012

Suggested Citation

Chunhua Weng and Casey Overby. Columbia University. Drug Induced Liver Injury. PheKB; 2012. Available from: https://phekb.org/phenotype/135

PubMed References

24303321 23837993


    At Marshfield, roughly 92% of our PMRP population has one or more of the exclusion diagnoses listed in notes 16-29 on pages 8 to 25 of the DILI algorithm.

    Are you certain you want to exclude subjects who have had cancer screenings, non-malignant melanomas, or benign neoplasms? What about vaccinations?

    Roughly 88% of our population have diagnoses in your group of cancer codes. Some of the heavy hitter cancer exclusion diagnosis codes (at Marshfield) are as follows:

    (%'s are the percent of PMRP population having code on 1 or more dates)

    1. V76.51 Spec Screen for Malignant Neoplasms, Colon (7.5%)
    2. 211.3 Benign Neoplasm Lg Bowel (6.8%)
    3. V76.2 Screen Mal Neop-Cervix (6.1%)
    4. V76.44 Screening for Malignant Neoplasm Prostate (5.7%)
    5. V76.12 Other Screening Mammogrm, Malig Neop Breast (5.5%)
    6. V76.10 Screening Malig Neoplasm, Breast Unspec (5.2%)
    7. 216.5 Benign Neo Skin Trunk (2.8%)
    8. 216.8 Benign Neoplasm Skin Nec (2.4%)
    9. 216.3 Benign Neo Skin Face Nec (2.2%)

    What about V05.03 Viral Hepatitis Vaccination (18.3%)?

    How about V42.5 Cornea transplant?

    Hi Jim, Thanks for sharing your experiences. The list of exclusions in the documentation was informed by the International Serious Adverse Events Consortium (iSAEC) and the Drug Induced Liver Injury Network (DILIN). In practice, though, the excluded diagnoses should be institution specific. For example, through our experience and discussions with Mayo (and their consultation with DILI experts), we decided on a list of high-priority diagnoses to exclude: (a) sclerosing cholangitis, (b) organ transplantation or liver operation, (c) alcohol abuse/liver damage/toxic effects, and (d) viral hepatitis. I hope this helps. Casey

    Hi Jane,

    Thank you for your question about the medications. The medications in the documentation were those of interest to the Drug Induced Liver Injury Network (DILIN). I have updated the documentation to include a link to other medications to choose from, since meds are institution specific (See LiverTox.nih.gov).


    I am working on this algorithm for Geisinger and have a few questions concerning the pseudocode:


    Question 1:

    Part 1C of the pseudocode (page 4) says that one of the thresholds to qualify as DILI is ALT>=5x ULN.  If I’m not mistaken, this corresponds to node C2 in the flowchart on page 2, but this node says Peak ALT>=3x ULN.   Furthermore, Note 4 also says for A1-Yes patients, we’re supposed to look at ALT>=3x ULN.  Should we look at patients with ALT >=5x ULN or >=3x ULN here?


    Question 2:

    Note 10 says to consider minocycline as a medication preparation, however the table for Note 10 doesn’t include minocycline.  Should it be included?

    Also, I’m assuming the medication listed as valporic acid is supposed to be valproic acid, correct?


    Question 3:

    Note 30 says that for E1-No patients, we exclude patients that were ever administered one of the Note 10 medications.  However, the flowchart on page 3 suggests that in part E2, we exclude patients that were never administered one of the Note 10 medications.  Which one is it supposed to be?


    Question 4:

    For part B (Note 8), what do you mean by “consider chronicity”?  Do we exclude patients with chronic liver injury?



    Jonathan Bock

    Q1: The protocol states that to qualify as DILI, one of the following must hold: (1) ALP >= 2x ULN; (2) ALT >= 5x ULN; or (3) ALT >= 3x ULN AND Bilirubin >= 2x ULN. For efficiency, C1-C3 in the algorithm first exclude patients with no labs above the lowest threshold to qualify for DILI (for ALT that's >= 3x ULN).

    Q2: Minocycline was missed previously, updated in the documentation

    Q3: In Note 30 "ever" should be "never", updated in the documentation.

    Q4: Chronic liver injury is not covered by this algorithm so they are excluded.
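The answer to Q1 describes a two-stage design: a cheap pre-screen at the lowest cut-off, then the full rule. A rough sketch of that idea follows. The function names and patient values are hypothetical, each tuple is assumed to hold values from a single lab draw expressed as multiples of ULN, and the pre-screen shown here is one reading of C1-C3 rather than a copy of the flowchart.

```python
def passes_prescreen(alp_x: float, alt_x: float) -> bool:
    """Keep anyone at or above the lowest cut-off for ALT (3x ULN) or ALP (2x ULN)."""
    return alp_x >= 2 or alt_x >= 3


def qualifies_as_dili(alp_x: float, alt_x: float, bili_x: float) -> bool:
    """Full rule: ALP >= 2x OR ALT >= 5x OR (ALT >= 3x AND bilirubin >= 2x)."""
    return alp_x >= 2 or alt_x >= 5 or (alt_x >= 3 and bili_x >= 2)


# Hypothetical patients: (ALP, ALT, bilirubin), all as multiples of ULN.
patients = {
    "p1": (1.0, 2.5, 3.0),  # ALT below 3x ULN: dropped by the pre-screen
    "p2": (1.0, 3.5, 1.0),  # passes the pre-screen but fails the full rule
    "p3": (1.0, 3.5, 2.5),  # ALT >= 3x ULN with bilirubin >= 2x ULN: qualifies
}

candidates = {pid: labs for pid, labs in patients.items()
              if passes_prescreen(labs[0], labs[1])}
cases = sorted(pid for pid, labs in candidates.items()
               if qualifies_as_dili(*labs))
print(sorted(candidates), cases)
```

The pre-screen never removes a patient who would pass the full rule, which is why the flowchart node can say 3x ULN while the final threshold for ALT alone is 5x ULN.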

    Q1: For those of us capturing acute liver injury dxs (in list on p. 4), is the intent to capture these codes ONLY if they occur in an inpatient setting, to correspond with searching (inpatient) discharge summary notes when using NLP to look for mention of liver injury?

    Q2: I suspect that a large # of our cohort will have died at some point.  I'm not sure I understand the reason for including death in exclusion criteria on p. 4.  I'm assuming all of the exclusion dxs listed (other than death) should apply only if they occur at some point in time prior to the qualifying DILI dx date.




    Q1: In our experience, acute liver injury dxs should probably be used as inclusion criteria whether or not NLP is used & whether or not only inpatients are considered; Q2: We found that many of our false-positive results were patients who had died, so we included death as an exclusion in our local implementation.

    Note 8 lists ICD9 571.40    Chronic hepatitis

    However, in our database I am seeing:

    571.4    Chronic hepatitis
    571.40    Unspecified chronic hepatitis

    Should I be capturing both of these codes since they both seem relevant to chronic liver injury? Thanks.

    Thanks, Casey. One other question - all our bilirubin labs use mg/dL, but a tiny proportion of them go as high as 39.6, which seems more likely to be umol/L. With the bilirubin threshold at 2.34 mg/dL and 2xULN then at 4.68 mg/dL, is there an upper limit to the lab values that should be considered correct? For instance, should I throw out any values over 10 or 20 because they are probably using the wrong units?

    I don't know of a threshold for classifying a lab value as having the wrong units, so I don't think you should throw them out. One approach is to use the manufacturer-specified ULN instead of the one we specify. That way you don't have to worry about units.
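One way to act on this suggestion is to divide each result by the ULN reported with that same result, so that the comparison is unit-free. The sketch below assumes each lab record carries its own reference-range upper limit; the field names and the numeric reference ranges are hypothetical.

```python
from typing import Optional


def multiple_of_uln(value: float, uln: Optional[float]) -> Optional[float]:
    """Express a lab value as a multiple of the ULN reported with that result.

    Returns None when no usable ULN is attached to the record.
    """
    if uln is None or uln <= 0:
        return None
    return value / uln


# Hypothetical bilirubin results, each with the ULN from its own report.
results = [
    {"value": 39.6, "uln": 20.5},  # assumed umol/L reference range
    {"value": 2.6, "uln": 1.2},    # assumed mg/dL reference range
]
for r in results:
    ratio = multiple_of_uln(r["value"], r["uln"])
    print(r["value"], ratio is not None and ratio >= 2)
```

Under these assumed ranges the 39.6 result is below 2x ULN, while the 2.6 result is above it, even though the raw numbers suggest the opposite.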

    Hi everyone, I made a minor update to the algorithm. To add some clarity to the algorithm for controls, we've made the following updates:

    • Note 30: "...exclude patients that were never administered (started on) a medication represented in algorithm selected cases..."
    • Note 31: "...excluded “other diagnoses” should be the same “other diagnoses” that are used in the DILI cases algorithm..." 

    Please note that the medications and exclusions we used at Columbia are described in our implementation details. For biobank populations (that are much smaller than our CDW population), we suggest considering any medication if possible and limiting exclusions to chronic liver injury diagnoses. If unable to consider any medication, common medications associated with liver injury (see http://www.livertox.nih.gov) should be considered.

    Validation of controls should be performed by randomly selecting 50 population controls and manually reviewing them to confirm they don't have DILI (record #TNs and #FNs for the NPV calculation).
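The sampling and NPV arithmetic for control validation might look like the sketch below. The helper names and the `seed` parameter are assumptions added for reproducibility; the counts in the example are made up.

```python
import random


def sample_controls(control_ids, n=50, seed=None):
    """Draw up to `n` controls at random, without replacement, for manual review."""
    ids = list(control_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(n, len(ids)))


def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """NPV = TN / (TN + FN), computed over the manually reviewed controls."""
    reviewed = true_negatives + false_negatives
    if reviewed == 0:
        raise ValueError("no reviewed controls")
    return true_negatives / reviewed


# Hypothetical review outcome: 49 of 50 sampled controls confirmed DILI-free.
to_review = sample_controls(range(1000), n=50, seed=1)
print(len(to_review), negative_predictive_value(49, 1))
```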

    Also, please send me an email offline if you plan to validate the cases algorithm by March 15th, 2013.


    Re: post from Casey on 26feb2013. Our understanding is that only secondary site(s) perform full validation on cases and controls, and that all sites other than secondary do a simple "verification" of 2-3 each for cases and controls.


    For the demographic_data in your data dictionary, do you want us to add a "case_control" status indicator (i.e. C49152=Case, C28143=Control)?

    Would you also like "sex" added (i.e. C46109=male, C46110=Female)?



    We don't need to collect these data for controls. Since DILI is so rare, matching cases with controls doesn't add much power to the analysis.



    I'll add sex to the data dictionary.


    I'm cutting and pasting other questions I've had about the data dictionary below.




    For demographic data 


        1. If a case subject qualifies using diagnoses, medications, and labs, at what time point do you want “Age” determined? 

        2. For controls would “Age” be determined as Age at first liver injury drug date or “Age” at most recent liver injury drug date? 

    **In our implementation we queried notes and our CDW separately. Age was something that was calculated for our query of clinical notes only (age at the time the clinical note was created). This may be something that's local to how we implemented the algorithm at CU. If it isn't relevant, I think just including date of birth is appropriate.



    For lab_data

        1. For lab_code do you want the UMLS code (i.e. ALP would be C0201853)?

    **UMLS codes would be best for looking at labs across sites. If you're like us, we had several types of ALP labs, ALT labs and Bilirubin labs with unique local codes. So these would require mapping to UMLS codes. If this isn't feasible I'd include your own codes.

        2. For lab_char_value, do you want the full lab name (i.e. “Intestinal alkaline phosphatase measurement”) or will “ALP” suffice?

    **If you use local codes I'd use the full lab name. If you use UMLS codes, I think either way works.
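The mapping step described for lab_code could be sketched as a lookup with a fallback to the local code. The local code strings below are hypothetical placeholders. Only the ALP concept identifier (C0201853) is taken from the question above.

```python
# Hypothetical local lab codes mapped to a UMLS concept identifier.
# C0201853 is the ALP code quoted in the question above.
LOCAL_TO_UMLS = {
    "LAB_ALP_SERUM": "C0201853",
    "LAB_ALP_PLASMA": "C0201853",
}


def to_lab_code(local_code: str) -> tuple:
    """Return ("UMLS", cui) when a mapping exists, else ("LOCAL", local_code)."""
    cui = LOCAL_TO_UMLS.get(local_code)
    if cui is not None:
        return ("UMLS", cui)
    return ("LOCAL", local_code)


print(to_lab_code("LAB_ALP_SERUM"))
print(to_lab_code("LAB_UNMAPPED"))
```

Recording which coding system was used alongside the code keeps mapped and unmapped rows distinguishable when data from several sites are pooled.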


    For med_ data

        1. Is med_order_code the UMLS source_id code (i.e. “Cipro” would be “C0282104”)? 

    **Again, UMLS codes would be best for looking at medications across sites. We had several local codes for each medication, so this would require mapping to a UMLS code. If this isn't feasible, I'd include your local codes.


    For liver_injury_diagnoses/visit data 

        1. Do you want ICD 9 diagnoses in three separate files with patient_id, ICD 9 code, and time point as follows?: 



        1. one file with acute liver injury diagnoses (required for cases) 


        2. one file with chronic liver injury diagnoses (possible for controls, but not cases) 

    **yes, but excluded diagnoses for cases should also be excluded for controls (including chronic liver injury).

        3. one file with other diagnoses 

    **at CU we did one file for each of the other diagnosis.




    Q1. in tab Demographic Data, age is listed as a repeated measure - I had assumed that this tab was structured as one obs per person, rather than one obs per person/age. Would you not want Age to be "age at first DILI dx" in this file? Or do you want multiple records per person here, one for each DILI dx date?

    ** It's a repeated measure because one patient can have multiple clinical notes (at CU age was calculated at the time a note is created)


    Q2. in tab Visit Data, it looks like the intention is to include in this one file all dx code events , be they case definition (acute liver injury dx codes) or exclusion definition (chronic liver disease dx codes), or "flagged" dx exclusion (the long list of exclusion dx codes).

    Can you confirm?

    **This is correct.


    Q3. for the tabs: Visit data, Lab data, and Med data - is the intent to include "over all time", or ONLY the events that fall within the time frame examined in defining case/control status :


    - for lab, the 3 mos after the (first) DILI dx date and an unknown time period before RX date (to assess normal lab prior to exposure) 

    **Only within time frame. But the time frame we implemented at CU was 3 mo after med and 1 mo before med. MSSM anchored on acute liver injury dx. The latter seems like it may be more feasible to implement, if you want to do that. Thoughts?




    - for meds, the 3 mo before the (first) acute liver injury dx date 

    **Only within time frame. Time frame is correct


    - for visits, a little more complex, if this tab is meant to include dili dx visits, exclusionary visits, and flagged visits.

    **Over all time. I'm not sure if this helps, but in our implementation we had a separate file for each dx
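The lab time frame described for CU (1 month before to 3 months after the medication start) could be applied with a filter like the one below. This is a sketch only: months are approximated as 30 and 90 days, and the tuple layout for lab rows is an assumption. Anchoring on the acute liver injury diagnosis date instead, as MSSM did, would only change the anchor argument and the window sizes.

```python
from datetime import date, timedelta


def labs_in_window(labs, anchor: date,
                   before: timedelta = timedelta(days=30),
                   after: timedelta = timedelta(days=90)):
    """Keep lab rows dated within [anchor - before, anchor + after], inclusive.

    Each lab row is assumed to be a tuple (lab_date, lab_name, value).
    """
    earliest, latest = anchor - before, anchor + after
    return [row for row in labs if earliest <= row[0] <= latest]


# Hypothetical ALT results around a medication started on 2012-01-15.
labs = [
    (date(2011, 12, 1), "ALT", 30.0),   # more than 30 days before: dropped
    (date(2012, 1, 1), "ALT", 32.0),    # baseline, kept
    (date(2012, 4, 14), "ALT", 210.0),  # exactly 90 days after: kept
    (date(2012, 4, 15), "ALT", 220.0),  # 91 days after: dropped
]
kept = labs_in_window(labs, anchor=date(2012, 1, 15))
print([row[0].isoformat() for row in kept])
```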


    Q4. You and I corresponded via PheKB alert about the setting in which the dx occurred, i.e. inpatient and outpatient (I'd asked if you wanted only acute liver injury dxs occurring in Inpatient encounters) 

    - you responded that we should consider ALL encounters. Just double-checking that this still stands. 

    **It should be inpatient and outpatient if feasible at your institution. This might be a local decision. For example, at CU we used structured and unstructured data, but unstructured data was only available for discharge summaries at the time. Because of this we limited ourselves to inpatients with discharge summaries so that we were looking at the same set of patients.


    Q5. An FYI about the Meds tab - we don't have all of the variables you request, and our meds are from outpatient encounters only. What we can deliver are: Med_order_code (NDC code) and med_order_name (generic drug name), and med_order_time (date the med was prescribed) - we capture date but not time.



    Q6. I wonder about time frame for collecting labs and meds for controls. Meds and labs - "ever" ? since there is no acute liver injury dx to anchor to?

    **We don't need to collect these data for controls. DILI is so rare that matching controls doesn't add much power to the analysis. 



    Would you be able to post (from the primary site) implementation numbers for DILI that show the absolute numbers of true positives, false positives, etc.? A good example of how these numbers have been reported for other phenotypes is Marshfield's implementation of their cataract phenotype: http://www.phekb.org/node/9/implementations



    When I enter patient counts there's a ".00" on the end, so I assumed it should be a percentage. Does anyone know if there is a reason for the decimal? If it's for counts can the decimal be removed? We've reported the counts in the pdf currently.

    I've looked through the 2 posted implementations, but am unable to find a list of NLP terms used. Can you direct me to them or supply some terms? I understand they'll end up being site-specific, but would be great to know what you used.

    Thank You!

    ~~~~ sarah

    At Columbia we used our MedLEE NLP engine, which allows us to run queries using UMLS codes. I don't have a list of the NLP terms these codes map to.

    Thanks, Casey.

    Does anyone else have NLP terms they can share?

    I missed the last phenotype workgroup call but see in minutes that we are to deliver data to Chunhua.  can someone provide me with this person's email?  i will use that as the recipient in DataHippo.   thanks.

    Q1:  If a patient qualifies as a case at a certain point in time, yet at a later date, has elevated lab values within 1 month of starting another one of the medications listed, does the patient still qualify as a case?


    Q2:  Do we exclude patients with exclusion diagnoses (other than death) that occur before DILI, or are patients excluded if they've had an exclusion diagnosis at any point in time?





    Re Q1: Our implementation assessed when lab values were elevated and when a medication was prescribed in relation to when a patient was diagnosed with liver injury. A patient didn't qualify as a DILI case unless they were within the specified timeframes.

    Re Q2: We excluded diagnoses at any point in time.
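The two readings of Q2 differ only in whether a date cut-off is applied. The sketch below makes that difference explicit. The function name and event layout are assumptions, and 571.40 (chronic hepatitis, mentioned earlier in this thread) is used purely as an example exclusion code.

```python
from datetime import date


def has_exclusion(dx_events, exclusion_codes, before=None) -> bool:
    """Return True if any diagnosis event carries an exclusion code.

    dx_events: iterable of (icd9_code, dx_date) tuples.
    before: when None, exclusion diagnoses at any point in time count
            (the approach described in the reply above); when a date is given,
            only exclusion diagnoses dated earlier than it count.
    """
    for code, dx_date in dx_events:
        if code in exclusion_codes and (before is None or dx_date < before):
            return True
    return False


# Hypothetical patient: exclusion diagnosis recorded after the DILI date.
events = [("571.40", date(2013, 5, 1))]
dili_date = date(2013, 1, 1)
print(has_exclusion(events, {"571.40"}))                    # any point in time
print(has_exclusion(events, {"571.40"}, before=dili_date))  # prior to DILI only
```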