Natural Language Processing
Systemic lupus erythematosus (SLE)
We used Vanderbilt’s Synthetic Derivative (SD), a de-identified version of the EHR with 2.5 million subjects. We selected all individuals with at least one SLE ICD-9 code (710.0), yielding 5959 individuals. To create a training set, 200 of these individuals were randomly selected for chart review. A subject was defined as a case if diagnosed with SLE by a rheumatologist, nephrologist, or dermatologist.
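The cohort selection and training-set sampling described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `(subject_id, icd9_code)` record layout, the function names, and the fixed random seed are all assumptions made for the example.

```python
import random

def select_sle_candidates(code_records, target_code="710.0"):
    """Return sorted IDs of subjects with at least one record carrying target_code.

    code_records is a hypothetical iterable of (subject_id, icd9_code) pairs.
    """
    return sorted({subject_id for subject_id, code in code_records if code == target_code})

def sample_for_chart_review(candidates, n=200, seed=0):
    """Randomly pick up to n candidates for manual chart review.

    The seed is an assumption added here so the sample is reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(candidates, min(n, len(candidates)))

# Toy usage with made-up subject IDs
records = [("a", "710.0"), ("b", "250.00"), ("a", "710.0"), ("c", "710.0")]
candidates = select_sle_candidates(records)
review_set = sample_for_chart_review(candidates, n=200)
```

Deduplicating by subject before sampling ensures that a subject with many 710.0 codes is no more likely to be drawn than a subject with one.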
Type 2 Diabetes - Demonstration Project
Type 2 Diabetes phenotype algorithm for the DNA Databank Demonstration Project.
Type 1 or Type 2 Diabetes Mellitus
Phenotyping algorithm for the identification of patients with type 1 or type 2 diabetes mellitus (DM) preoperatively using routinely available clinical data from electronic health records.
Urinary Incontinence
Description of a weakly supervised machine learning approach for extracting treatment-related side effects (urinary incontinence) following prostate cancer therapy from multiple types of free-text clinical narratives, including progress notes, discharge summaries, and history and physical notes. Prostatectomy and radiation therapy are the prostate cancer treatments of interest.
Venous Thromboembolism (VTE)
A recently published GWAS of VTE conducted by Mayo: http://www.ncbi.nlm.nih.gov/pubmed/22672568
Warfarin dose/response
This algorithm identifies patients who have a stable, within-range INR (assuming a target INR of 2-3) over a period of at least three weeks and correlates that stable INR with their weekly warfarin dose. It is used to identify the pharmacogenetic factors behind the stable warfarin dose.
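One possible reading of the stability criterion is sketched below. It assumes that "stable" means a run of consecutive INR measurements, all within 2-3, whose first and last dates are at least 21 days apart. The function name, the `(date, inr)` input layout, and that interpretation are assumptions for illustration; the actual algorithm may define stability differently and also handles the dose correlation, which is not shown here.

```python
from datetime import date, timedelta

def find_stable_window(measurements, low=2.0, high=3.0, min_days=21):
    """Return (start, end) of the first qualifying in-range run, or None.

    measurements is a hypothetical list of (date, inr) pairs for one patient.
    A run qualifies when consecutive in-range values span at least min_days.
    """
    run_start = None
    for day, inr in sorted(measurements):
        if low <= inr <= high:
            if run_start is None:
                run_start = day  # a new in-range run begins here
            if (day - run_start) >= timedelta(days=min_days):
                return run_start, day
        else:
            run_start = None  # an out-of-range value breaks the run
    return None

# Toy usage with made-up values
inr_history = [
    (date(2020, 1, 1), 2.5),
    (date(2020, 1, 10), 2.2),
    (date(2020, 1, 25), 2.8),
]
window = find_stable_window(inr_history)
```

A patient's weekly dose during the returned window could then be taken as the stable dose, under the same assumptions.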

