Body Mass Index (BMI)

eMERGE-IV BMI Algorithm adapted from Geisinger Extreme Obesity Algorithm (2013). BMI is being implemented as a quantitative trait. PheKB maintains a catalog of the Geisinger Extreme Obesity algorithm, on which this is based (Phenotpe 121). This algorithm is for analysis. Sites only contributing covariates can simply compile the designated data.

Phenotype ID: 
List on the Collaboration Phenotypes List
Steve Gilhool, Frank Mentch, Shana McCormack, John Connolly, Heather Hain, Elizabeth Bhoj, Patrick Sleiman, Hakon Hakonarson
Contact Author: 
Date Created: 
Tuesday, March 9, 2021

Suggested Citation

Steve Gilhool, Frank Mentch, Shana McCormack, John Connolly, Heather Hain, Elizabeth Bhoj, Patrick Sleiman, Hakon Hakonarson . CHOP. Body Mass Index (BMI). PheKB; 2021 Available from:


Hi Cong, thanks for raising this for clarification. The PheKB website allows descriptions/labels for the links to specific files.
Currently, the file described in ReadMe as "EM_BMI_covariate_algorithm_<date>" has an actual file name of "EM_BMI_covariate_algorithm_20210307.txt" and the label that appears on the PheKB page is "Covariate Algorithm (.sql saved as .txt)".  Hovering over the link may show the hidden filename.  The file will be downloaded with the given file name by default.  The ReadMe file has been updated with this additional detail.

Submitted by Chad Dorn on

In the section of the script copied below, I think that I may be overlooking something. Can you please help me understand where data for #inclusion_events are coming from? I see the CREATE statement for #inclusion_events, but I am not sure where it gets populated with data. Thank you! Chad


create table #inclusion_events (inclusion_rule_id bigint,
person_id bigint,
event_id bigint

with cteIncludedEvents(event_id, person_id, start_date, end_date, op_start_date, op_end_date, ordinal) as
  SELECT event_id, person_id, start_date, end_date, op_start_date, op_end_date, row_number() over (partition by person_id order by start_date ASC) as ordinal
    select Q.event_id, Q.person_id, Q.start_date, Q.end_date, Q.op_start_date, Q.op_end_date, SUM(coalesce(POWER(cast(2 as bigint), I.inclusion_rule_id), 0)) as inclusion_rule_mask
    from Phen_BMI_qualified_events Q
    LEFT JOIN #inclusion_events I on I.person_id = Q.person_id and I.event_id = Q.event_id
    GROUP BY Q.event_id, Q.person_id, Q.start_date, Q.end_date, Q.op_start_date, Q.op_end_date
  ) MG -- matching groups

)  --end with cteIncludedEvents

select event_id, person_id, start_date, end_date, op_start_date, op_end_date
into Phen_BMI_included_events
FROM cteIncludedEvents Results
WHERE Results.ordinal = 1

Hi Chad, Great observation! The short answer is that there is no data to be loaded in this code snippet.  #inclusion_events is a placeholder table created so the next query that has an outer join to #inclusion_events will succeed.  The longer answer is that we're using mostly unmodified SQL as generated by Atlas and this table is part of the generic template.  As this portion of the algorithm doesn't have secondary conditions for including events, the table is not used to store data but must still be present.

The ReadMe file has been updated to note that if the OMOP person_id field is not the eMERGE subjid, then the identifier should be changed before uploading data to PheKB.

We are trying to implement this code in PostgresSQL on which our OMOP is based. So I need to convert some functions into PostgresSQL .

May I know why we are adding 0 to this which makes no difference ? :



Why are we adding this function to add "0"?

Can't we just use :


2. DATEADD(day,-1 * 0, event_date)  as end_date 

This is again adding "0"

Can we just use event_date as end_date? Why add 0?

Yes, by all means if you are rewriting the code please feel free to simplify it. 

However, we recommend that you also check out the SQLRender R package ( The BMI code has oddities like you found because we used mostly unmodified Atlas generated code. Atlas uses a generic template no matter how simple or complex the query. One reason we used Atlas was for situations like yours.  Atlas ensures that the generated code can be machine translated to other dialects, including PostgresSQL, using the SQLRender tool. Both Atlas and SQLRender are supported by OHDSI.

It is my understanding, from the eMERGE 3 OMOP testing project, that use of Atlas (or otherwise restricting development to the "OHDSI SQL" subset) is recommended and is key to each eMERGE site not having to recode an algorithm.