Systemic Lupus Erythematosus (SLE) is a chronic, systemic autoimmune disease that can affect many parts of the body, including the skin, lungs, brain, heart, kidneys, joints, and blood vessels. Presentation varies significantly between patients, which makes it challenging to identify a patient as having SLE. Between 300,000 and 2,000,000 people in the US are estimated to have SLE; an exact number is hard to determine because of the diverse presentations and the length of time symptoms may take to appear. Electronic Health Records (EHRs) are widely used in healthcare settings and are a rich source of patient information that can be mined to classify SLE and identify it earlier.
For the eMERGE network*, we are using the SLICC (Systemic Lupus International Collaborating Clinics) criteria to determine whether patients have SLE. A group of rheumatologists developed the SLICC criteria in 2012 to improve clinical relevance and to incorporate new knowledge of SLE immunology.
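As a rough illustration of the top-level SLICC classification rule as published in 2012 (at least 4 criteria, including at least 1 clinical and 1 immunologic criterion, or biopsy-proven lupus nephritis together with positive ANA or anti-dsDNA), a minimal Python sketch might look like the following. The class and field names are hypothetical and are not taken from the pseudocode; the SQL pseudocode document remains the definitive source for codes, cutoff values, and logic.

```python
from dataclasses import dataclass


@dataclass
class SliccFindings:
    """Per-patient summary; all field names are illustrative assumptions."""
    clinical_count: int                       # distinct SLICC clinical criteria met
    immunologic_count: int                    # distinct SLICC immunologic criteria met
    biopsy_lupus_nephritis: bool = False      # biopsy-proven lupus nephritis
    ana_or_anti_dsdna_positive: bool = False  # positive ANA or anti-dsDNA


def meets_slicc(f: SliccFindings) -> bool:
    """Return True if the patient satisfies either SLICC classification path."""
    # Path 1: >= 4 criteria in total, with at least 1 clinical and 1 immunologic.
    by_count = (
        f.clinical_count + f.immunologic_count >= 4
        and f.clinical_count >= 1
        and f.immunologic_count >= 1
    )
    # Path 2: biopsy-proven lupus nephritis plus positive ANA or anti-dsDNA.
    by_biopsy = f.biopsy_lupus_nephritis and f.ana_or_anti_dsdna_positive
    return by_count or by_biopsy
```

For example, a patient with 3 clinical criteria and 1 immunologic criterion is classified as SLE under this sketch, while a patient with 4 clinical criteria and no immunologic criterion is not.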
non-NLP: Version 4.6 of the pseudocode uses structured EHR data only. NOTE: the SQL pseudocode document is the definitive list of codes, cutoff values, and logic to use. If that document contradicts other files here, please let us know and use the SQL pseudocode document (https://phekb.org/sites/phenotype/files/FINAL_SLE_SLICC_SQL_code_w_vocab...)
w/ NLP: Version 5.x and later of the pseudocode use the output of version 4.x and add NLP (natural language processing) for some of the SLICC criteria that are identified more accurately with NLP. Three files have been added for the NLP:
- pseudocode with NLP portions in red in a Word doc;
- the types of text documents needed highlighted in an Excel file;
- a zip file with a README Word doc containing NLP instructions, the Python & R code in Jupyter notebooks, and the format of the 2 inputs to the NLP:
  - output from version 4.x of the algorithm, i.e. this input DD matches the output DD for version 4.x
  - text from notes from in- & out-patient encounters with any primary care, rheumatology, or nephrology departments, and from kidney biopsy pathology reports
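To make the second NLP input concrete, the note selection described above could be sketched roughly as follows. This is only an illustration: the dictionary keys (`patient_id`, `department`, `note_type`), the label strings, and the choice to keep only patients present in the version 4.x output are all assumptions, not the format defined in the README in the zip file.

```python
# Hypothetical labels; real EHR department and note-type values will differ by site.
ELIGIBLE_DEPARTMENTS = {"primary care", "rheumatology", "nephrology"}
KIDNEY_BIOPSY_NOTE_TYPE = "kidney biopsy pathology report"


def select_nlp_notes(notes, v4_patient_ids):
    """Keep notes from eligible departments, plus kidney biopsy pathology reports.

    notes: iterable of dicts with assumed keys 'patient_id', 'department',
           'note_type', and 'text'.
    v4_patient_ids: set of patient IDs present in the version 4.x output
           (restricting to these patients is an assumption of this sketch).
    """
    selected = []
    for note in notes:
        if note["patient_id"] not in v4_patient_ids:
            continue
        # Normalize labels so matching ignores case and stray whitespace.
        department = (note.get("department") or "").strip().lower()
        note_type = (note.get("note_type") or "").strip().lower()
        if department in ELIGIBLE_DEPARTMENTS or note_type == KIDNEY_BIOPSY_NOTE_TYPE:
            selected.append(note)
    return selected
```

Matching on normalized labels keeps the sketch simple; a real extraction would more likely map site-specific department and document-type codes to these categories.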
* For other studies/projects/networks we have also used the ACR/EULAR criteria, which have some classification attributes that overlap with SLICC. We have therefore also posted 2 additional files with information on all the classification criteria:
- Supplementary tables with:
  - Data Definitions, which include corrections to a few typos in the ICD9/10 codes
  - Sensitivity and Specificity of each of the Classification Attributes
- more information about the specific codes in an Excel file