Cardiorespiratory Fitness Algorithm (eMERGE Mayo Network Phenotype)

 

Suggested Citation

Ning Sunny Shang. Mayo Clinic. Cardiorespiratory Fitness Algorithm (eMERGE Mayo Network Phenotype). PheKB; 2012. Available from: https://phekb.org/phenotype/125

Comments

Hi,

Any information on when the NLP component for the CRF algorithm will be ready?  Or, if it is ready now, where can we download it?

Thanks,

David Carrell (Group Health)

Can you please upload a data dictionary that follows the format eMERGE uses, which includes:

  • min/max expected values, esp. for labs such as glucose
  • varnames that are computable
  • if variable required Yes/No
  • what values to use for missing and not assessed
  • etc.

This will help us to know where to cut off our values, and will help you to have standardized variable names.

:)

Thank you!

Northwestern

 

  1. For the following variables, are there ICD-9 codes and/or algorithms we should use (none were specified in the posted documents that we could find)?
    Hypertension
    Hyperlipidemia
    Diabetes
  2. For Comorbid Condition flags F1 & F2,  does "in different years" mean at least 1 yr. apart?
  3. For fasting glucose, we only have fasting specified for <100 pts, for the rest we have glucose which could either be random or fasting, so would you like us to use the fasting glucose where we have it, and where we don't, to use the nearest glucose?

  1. Just to clarify, for Other calcium channel blockers, in the Meds_list_CRF_1_6_2013.doc there is an "Others" section at the end which lists only 1 drug which I don't see in our list at NU of calcium channel blockers, so is this correct, i.e., is digoxin the drug you're looking for, or are you looking for other calcium channel blockers?  
  2. Can you please clarify what "As assessed closest to the time of the reference date" for medications means?   Do you want us to say Yes for a given medication if prescribed within a year before, and/or after, the reference date, or something else?  

Hi everyone, we have updated the CRF data dictionary consistent with the standardized data elements in eleMAP. That said, there are many data elements specific to CRF that are not available via caDSR or other resources. Please review, and let us know if there are any further questions. Thanks, Jyoti

Medication data are to be provided for the period +/- 6 months from the reference date (which is the earliest CRF test).  Are the data for each individual drug dispensed to be provided as "ever/never", or do you want multiple records indicating each time each individual drug was dispensed in that 12-month period?

Since these data are indicating "ever/never" exposure to the drug within the period 6 months +/- the exercise test date, it would seem that you do not really need the subject's age at medication date, since you already have the subject's age at exercise test date.   Do you agree?  If you still want subject age at medication date, which date should we use when multiple medication orders/fills are present within the period?  The earliest?  The latest?  Thanks,  -David

Hi David,

Since we already have the age given the test date and date of birth, there's no need to look at age at the time of the medication. Basically, we need to know whether a patient was on any of these medications in the time period of +/- 6 months around the exercise test date.

Thanks,

Hayan
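
The "ever/never" rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the extraction code used by any site: the function name on_medication is hypothetical, and "6 months" is approximated as 183 days because the thread does not define the window in days.

```python
from datetime import date, timedelta

# Assumption: "6 months" is approximated as 183 days.
WINDOW = timedelta(days=183)

def on_medication(test_date, med_dates):
    """Return True if any order/fill date is within +/- 6 months of the test date."""
    return any(abs(d - test_date) <= WINDOW for d in med_dates)

# Hypothetical example: one fill about 4.5 months before the exercise test.
print(on_medication(date(2010, 6, 1), [date(2010, 1, 15)]))
```

Because the result is a single yes/no per drug, no age at medication date is needed, which matches the reply above.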

Jen, thanks for the questions. I have added the list of codes for the co-morbidities. Please see updated Data Dictionary (and the new tabs in the XLS file). Thanks, Jyoti

We had earlier used the code list in the algorithm document (Word file); the code lists in the updated data dictionary (XLS) differ for a few co-morbidities. Do you want us to use the data dictionary code list?

We have noticed that matching for the explicit codes (e.g., 411.81, 411.89, etc.) is a more precise method than matching for the code with wildcard characters (e.g., 441.xx).  I mention this because the wildcard versions of the codes are in the pseudocode document, but it is probably best not to use them.

Hi,

There are eight comorbidities listed in the data dictionary (and pasted below).  The first five in this list use a "rule of two" (i.e., the dx codes must appear on at least two separate calendar dates during the specified 12-month period in order for the comorbidity flag to be set to true).  The last three comorbidities simply say the dx must be "present" in the period.  Should the last three comorbidity flags really be calculated differently than the first five? 

Coronary heart disease
Congestive heart failure
Cardiomyopathy
Peripheral arterial disease
Chronic Obstructive Pulmonary Disease
Hypertension
Dyslipidemia
Diabetes_Mellitus

Thanks,

David

Hi,

Most of the variable names in the CRF data dictionary seem reasonable.  However, some are quite long and have embedded spaces (as shown in the lists below).  Do you really want us to use these exact variable names in the datasets we provide?

VARNAMEs on Basic data tab:
=====================
SUBJID
Sex
Year_Birth
Observation_age
Race
Ethnicity
Coronary heart disease
Congestive heart failure
Cardiomyopathy
Peripheral arterial disease
Chronic Obstructive Pulmonary Disease
Hypertension
Dyslipidemia
Diabetes_Mellitus
Smoking_Status

VARNAMEs on Exercise test data tab:
==========================
SUBJID
Exercise_test_date
Test type
Exercise type
Protocol type
METs
Exercise time
Blood pressure at REST
Blood pressure at PEAK
Heart rate at REST
Heart rate at PEAK
Peak VO2
VE/VCO2 Nadir
Anaerobic threshold

Thanks,

David

Hi,

The pseudocode algorithm document says we should provide the flag variable for each subject defined as S1, S2, S3, S4 or S5 (per the pasted text below).  What do you want to name that field?  I don't see it listed in the data dictionary.

Thanks,

David

  1. In the remaining medical records, flag any record indicating cardiac stress test by using combinations of relevant codes detected simultaneously on the same date as listed below. Then set the first detect date as the test date for that patient.

S1: ECG (any of the CPT-4 codes: 93015; 93016; 93017; 93018) + Echo (CPT-4 code 93350); OR Echo code 93351 only;

S2: ECG (any of the CPT-4 codes: 93015; 93016; 93017; 93018)  + Cardiac nuclear test (CPT-4 codes: A9500);

S3: ECG (any of the CPT-4 codes: 93015; 93016; 93017; 93018) + Oxygen uptake (any of the CPT-4 codes: 94680; 94681);

S4: ECG (any of the CPT-4 codes: 93015; 93016; 93017; 93018) only.

S5: If the patient is flagged as S1 and S3 simultaneously, set that patient as S5.
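
The S1 to S5 rules above can be sketched as follows. This is one possible reading, not the official algorithm: the function names are hypothetical, "ECG only" (S4) is assumed to mean that no other modality was coded on that date, and S1 and S3 are assumed to remain set when S5 is set.

```python
from datetime import date

ECG = {"93015", "93016", "93017", "93018"}
ECHO_WITH_ECG = "93350"
ECHO_ALONE = "93351"
NUCLEAR = "A9500"
OXYGEN = {"94680", "94681"}

def stress_flags(codes_on_date):
    """Classify the codes recorded on one calendar date into S1-S5 flags."""
    codes = set(codes_on_date)
    ecg = bool(codes & ECG)
    s1 = (ecg and ECHO_WITH_ECG in codes) or ECHO_ALONE in codes
    s2 = ecg and NUCLEAR in codes
    s3 = ecg and bool(codes & OXYGEN)
    # Assumption: "ECG only" means no echo, nuclear, or oxygen uptake code that day.
    s4 = ecg and not (s1 or s2 or s3)
    # Assumption: S1 and S3 stay set when both are present and S5 is set.
    s5 = s1 and s3
    return {"S1": s1, "S2": s2, "S3": s3, "S4": s4, "S5": s5}

def first_test_date(codes_by_date):
    """Return the earliest date on which any flag is set, or None."""
    hits = [d for d, codes in codes_by_date.items()
            if any(stress_flags(codes).values())]
    return min(hits) if hits else None
```

Calling first_test_date with a mapping of date to codes gives the first detected date, which the rule above uses as the test date for that patient.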

Hi,

The data dictionary calls for some calendar dates (e.g., Exercise_test_date).  What is the format of these dates?  The dictionary simply says "numeric" but more details are needed.  Do you want to use the SAS numeric representation of a date?  Excel's numeric representation?  MS SQL Server's?  Or, perhaps, do you want us to provide dates in a standardized text format?  I believe we discussed this in the past and were leaning toward a standardized string representation of dates (e.g., YYYYMMDD or YYYY-MM-DD).  Please clarify.

Thanks,

David
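
To illustrate why "numeric" alone is ambiguous: a SAS date value counts days since 1960-01-01, so the same number means a different day in other systems. The sketch below converts a SAS date value to the YYYY-MM-DD string form mentioned above. The function name sas_to_iso is hypothetical, and the choice of YYYY-MM-DD is an assumption, since the thread does not record the final decision.

```python
from datetime import date, timedelta

# SAS stores dates as the number of days since 1 January 1960.
SAS_EPOCH = date(1960, 1, 1)

def sas_to_iso(sas_days):
    """Convert a SAS numeric date value to a YYYY-MM-DD string."""
    return (SAS_EPOCH + timedelta(days=sas_days)).isoformat()

print(sas_to_iso(0))  # prints 1960-01-01
```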

Hi David,

Thanks for your help with this phenotype. Please find responses below in bold.

Please feel free to let us know if you have any other questions.

Thanks again!

Hayan

 

1. We have noticed that matching for the explicit codes (e.g., 411.81, 411.89, etc.) is a more precise method than matching for the code with wildcard characters (e.g., 441.xx).  I mention this because the wildcard versions of the codes are in the pseudocode document, but it is probably best not to use them.

Let's stick to the codes listed in the data dictionary since we already extracted the data based on them.

2. Most of the variable names in the CRF data dictionary seem reasonable.  However, some are quite long and have embedded spaces (as shown in the lists below).  Do you really want us to use these exact variable names in the datasets we provide?

Agree, you can use abbreviated terms as below:

VARNAMEs on Basic data tab:
=====================
SUBJID                                  OK
Sex                                     OK
Year_Birth                              OK
Observation_age                         Obs_age
Race                                    OK
Ethnicity                               OK
Coronary heart disease                  CAD or CHD
Congestive heart failure                CHF or HF
Cardiomyopathy                          CMP
Peripheral arterial disease             PAD
Chronic Obstructive Pulmonary Disease   COPD
Hypertension                            HTN
Dyslipidemia                            HPL
Diabetes_Mellitus                       DM
Smoking_Status                          Smoking

VARNAMEs on Exercise test data tab:
==========================
SUBJID
Exercise_test_date       Test_dt
Test type                Test_type
Exercise type            Exer_type
Protocol type            Protocol
METs                     OK
Exercise time            Exer_time
Blood pressure at REST   Please separate into Rest_SBP (systolic blood pressure) and Rest_DBP (diastolic blood pressure)
Blood pressure at PEAK   Again, please separate into Peak_SBP and Peak_DBP
Heart rate at REST       Rest_HR
Heart rate at PEAK       Peak_HR
Peak VO2                 Peak_vo2
VE/VCO2 Nadir            VE_VCO2_nad
Anaerobic threshold      Anaerob_thresh

3. There are eight comorbidities listed in the data dictionary (and pasted below).  The first five in this list use a "rule of two" (i.e., the dx codes must appear on at least two separate calendar dates during the specified 12-month period in order for the comorbidity flag to be set to true).  The last three comorbidities simply say the dx must be "present" in the period.  Should the last three comorbidity flags really be calculated differently than the first five? 

Coronary heart disease
Congestive heart failure
Cardiomyopathy
Peripheral arterial disease
Chronic Obstructive Pulmonary Disease
Hypertension
Dyslipidemia
Diabetes_Mellitus

For the last three variables (hypertension, dyslipidemia, and diabetes), a single code within +/- 6 months of the test date is sufficient to mark the condition as present. The other variables, F1 through F5, need 2 codes.
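
The two counting rules can be sketched together as follows. This is an illustration under assumptions, not the reference implementation: the function name comorbidity_flag is hypothetical, the abbreviated names come from the list earlier in this reply, "6 months" is approximated as 183 days, and the same +/- 6 month window is assumed for the rule-of-two conditions.

```python
from datetime import date, timedelta

# Assumption: "6 months" is approximated as 183 days.
WINDOW = timedelta(days=183)
RULE_OF_TWO = {"CHD", "CHF", "CMP", "PAD", "COPD"}  # F1 through F5
SINGLE_CODE = {"HTN", "HPL", "DM"}

def comorbidity_flag(name, test_date, dx_dates):
    """Return True if the condition counts as present around the test date."""
    if name not in RULE_OF_TWO | SINGLE_CODE:
        raise ValueError(f"unknown comorbidity: {name}")
    # Count distinct calendar dates, so repeated codes on one day count once.
    in_window = {d for d in dx_dates if abs(d - test_date) <= WINDOW}
    required = 2 if name in RULE_OF_TWO else 1
    return len(in_window) >= required
```

Counting distinct dates rather than raw code occurrences follows the "two separate calendar dates" wording in the question.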

4. The pseudocode algorithm document says we should provide the flag variable for each subject defined as S1, S2, S3, S4 or S5 (per the pasted text below).  What do you want to name that field?  I don't see it listed in the data dictionary.

Yes please. We will need to know the exact types of tests. Please create fields S1 through S5 and mark as Y for every patient based on the stress modality they had.

5. Regarding the test date, I agree with Jyoti.

 

Thank you for providing data dictionaries in the new format. I have some questions about eMERGE_DD_CRF_Exercise_data_5_24_2016_v6_0. First, peak blood pressure is listed as one field. Could you please split that into two fields? Second, the field Exercise_ECG doesn't have a description. What does that field represent? Could you add a description please?

Thank you,

Ken

We apologize for the late reply. Peak blood pressure would be peak systolic blood pressure. If the peak diastolic pressure is easily obtainable, please include that as well. Exercise ECG should be coded as one of: 0=negative; 1=positive; 2=abnormal & non-diagnostic for ischemia; .=missing/unknown/unassessed/not applicable.

Thanks,

Hayan
