Vanderbilt Implementation - Adult - BioVU - eMERGE subset

Algorithm Type: 
Case, Control
Implementation Details: 

Case definition:
1- Limit the set to the eMERGE subset within BioVU.
2- Identify all radiology reports containing the keyword ‘pneumonia’. Run negation natural language processing (Chapman et al., 2001). Group radiology reports into a 6-month window around the first non-negated mention (index time zero): 1 month prior through 5 months after (see Figure 1). Each such window is a ‘Radiology PNA event’. Report in PNA_Data_Dictionary_1:
  - All ‘Radiology PNA events’ per subject.
  - Per ‘Radiology PNA event’, the number of negated and non-negated imaging reports.
3- Around each ‘Radiology PNA event’, use a window of 31 days prior through 31 days following to identify at least 2 mentions of ICD9/10 codes from Appendix A. Report the count of each code per subject on unique days in the window in PNA_Data_Dictionary_2.
4- Around each ‘Radiology PNA event’, use the same window of 31 days prior through 31 days following to identify at least 1 mention of an antibiotic treatment listed in Appendix B. Report the count of antibiotic mentions on unique days in the window in PNA_Data_Dictionary_3.
5- Remove cases with two instances of exclusion codes in Appendix C (two of the same code or two from the same bin) occurring in time frames A or B as below, per ‘Radiology PNA event’. Record exclusions by bin in PNA_Data_Dictionary_4.
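The event grouping in step 2 can be sketched as follows. This is a minimal illustration, not the Vanderbilt implementation: the function name, the input shape (one subject's reports as date and negation-flag pairs), and the 30-day and 150-day approximations of "1 month" and "5 months" are all assumptions. It also assumes that a new event begins at the first non-negated report dated after the previous event's window has closed.

```python
from datetime import date, timedelta

PRE = timedelta(days=30)    # assumption: "1 month prior" taken as 30 days
POST = timedelta(days=150)  # assumption: "5 months after" taken as 150 days

def group_pna_events(reports):
    """Group one subject's radiology reports into 'Radiology PNA events'.

    reports: list of (report_date, is_negated) tuples (hypothetical input shape).
    Returns one dict per event with the index date and report counts.
    """
    reports = sorted(reports)
    events = []
    window_end = None
    for report_date, is_negated in reports:
        if is_negated:
            continue  # only a non-negated mention can open an event
        if window_end is not None and report_date <= window_end:
            continue  # still inside the current event's window
        start, end = report_date - PRE, report_date + POST
        # Negation flags of every report that falls inside this event's window
        flags = [neg for d, neg in reports if start <= d <= end]
        events.append({
            "index_date": report_date,          # index time zero
            "negated": sum(flags),
            "non_negated": len(flags) - sum(flags),
        })
        window_end = end
    return events
```

Under this reading, a report dated in the month before a later event can also fall inside an earlier event's window; whether such reports should be counted once or twice is not stated in the steps above.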
Report covariates in PNA_Data_Dictionary_1:
- By Subject:
    * Gender
    * Race
    * Ethnicity
- By ‘Radiology PNA event’, as close as available to index time zero:
    * Non-pregnant BMI closest to event
    * Non-pregnant BMI averaged in adult life
    * Admitted? y/n
    * Day of hospitalization (if known)
    * Length of hospitalization (if known)
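
The 31-day checks in case steps 3 and 4 both reduce to counting mentions on unique days around the event's index date. A minimal sketch, assuming the window is inclusive at both ends and that mentions arrive as date and code pairs; the function names and the placeholder codes in the usage note are hypothetical, and the real code lists live in Appendix A and Appendix B. The threshold here sums per-code unique-day counts, which is only one possible reading of "at least 2 mentions".

```python
from datetime import date, timedelta

WINDOW = timedelta(days=31)  # 31 days prior and 31 days following

def count_unique_days(index_date, mentions, window=WINDOW):
    """Count, per code, the unique days with a mention inside the window.

    mentions: iterable of (mention_date, code) pairs (hypothetical input shape).
    Returns a dict mapping code -> number of unique days.
    """
    days_by_code = {}
    for mention_date, code in mentions:
        # Assumption: boundary days (exactly 31 days away) are included
        if abs(mention_date - index_date) <= window:
            days_by_code.setdefault(code, set()).add(mention_date)
    return {code: len(days) for code, days in days_by_code.items()}

def meets_threshold(counts, minimum):
    """True when the per-code unique-day counts sum to at least `minimum`."""
    return sum(counts.values()) >= minimum
```

For example, `meets_threshold(count_unique_days(index, icd_mentions), 2)` would correspond to step 3 and `meets_threshold(count_unique_days(index, antibiotic_mentions), 1)` to step 4, where `index`, `icd_mentions`, and `antibiotic_mentions` are the caller's own data.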

Control definition:
1- Include only subjects who meet the medical home definition: 3 or more primary care visits in 2 years (from the published eMERGE BPH algorithm).
2- Exclude any subject with two occurrences of any code from the same bin in Appendix C on unique dates (e.g., two codes from heart failure on different dates exclude the subject; one from heart failure and one from malignancy do not, because they fall in different bins). Complete PNA_Data_Dictionary_4.csv with control counts excluded by bin.
3- Identify:
    - Subjects without any single mention of any pneumonia code (Appendix A)
    - Subjects with no positive reports (can have negated only or none).
Report covariates in PNA_Data_Dictionary_1:
- By Subject:
    * Year of birth
    * Gender
    * Race
    * Ethnicity
    * Non-pregnant BMI averaged in adult life
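
Control steps 1 and 2 can be sketched as two small checks. This is an illustration under stated assumptions rather than the published logic: "2 years" is taken as any rolling 730-day span, visits are counted on unique dates, and the function names and the code-to-bin mapping are hypothetical stand-ins for Appendix C.

```python
from datetime import date, timedelta

def meets_medical_home(visit_dates, min_visits=3, span=timedelta(days=730)):
    """True if at least `min_visits` primary care visit dates fall within `span`.

    Assumption: any rolling 730-day span qualifies, and same-day visits count once.
    """
    days = sorted(set(visit_dates))
    for i in range(len(days) - min_visits + 1):
        if days[i + min_visits - 1] - days[i] <= span:
            return True
    return False

def excluded_bins(code_events, code_to_bin):
    """Return the bins in which the subject has codes on two or more unique dates.

    code_events: iterable of (event_date, code) pairs (hypothetical input shape).
    code_to_bin: dict mapping an exclusion code to its bin name.
    """
    dates_by_bin = {}
    for event_date, code in code_events:
        bin_name = code_to_bin.get(code)
        if bin_name is not None:
            dates_by_bin.setdefault(bin_name, set()).add(event_date)
    return {b for b, dates in dates_by_bin.items() if len(dates) >= 2}
```

A subject would be dropped from the control pool when `excluded_bins(...)` is non-empty, and the returned bin names give the per-bin counts to record in PNA_Data_Dictionary_4.csv.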


Chapman, W. W., et al. (2001). "A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries." Journal of Biomedical Informatics 34(5): 301-310.
