Hepatocellular carcinoma (HCC), the most common form of primary liver cancer, is one of the leading causes of cancer-related death worldwide. Treatment strategies are numerous and complex, and patient populations are heterogeneous, differing in genetics, lifestyle, and comorbidities.
Here we describe the algorithm used to identify HCC stages under the AJCC, BCLC, and CLIP staging systems.
Step 1) Patient files and laboratories
We selected new patients and, for each patient, obtained all clinical reports from a single day (the day of visit). The purpose was to isolate the files relevant to each patient at onset. Radiology reports were restricted to CT or MRI with contrast of the abdomen, abdomen/pelvis, or chest/abdomen/pelvis, from within 3 months before to 1 month after the day of visit.
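The radiology time window can be sketched as a simple date check. This is a minimal illustration, not the original pipeline: the function name is hypothetical, and because the text does not say how months were counted, 3 months and 1 month are approximated here as 90 and 30 days.

```python
from datetime import date, timedelta

# Assumption: months are approximated in days; the source does not
# specify whether calendar months or fixed day counts were used.
DAYS_BEFORE_VISIT = 90  # "3 months prior"
DAYS_AFTER_VISIT = 30   # "1 month following"


def in_radiology_window(report_date: date, visit_date: date) -> bool:
    """Return True if a radiology report falls inside the window around the visit."""
    earliest = visit_date - timedelta(days=DAYS_BEFORE_VISIT)
    latest = visit_date + timedelta(days=DAYS_AFTER_VISIT)
    return earliest <= report_date <= latest
```

For example, with a visit on 2015-03-01, a report from 2015-01-01 is inside the window, while one from 2014-11-01 is not.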
Required laboratory values: (1) Albumin, (2) Alpha Fetoprotein, Nonmaternal, (3) Bilirubin (Total), (4) Prothrombin INR
The cohort included only patients with at least 1 clinical report, at least 1 radiology report, and all 4 laboratory values.
Patients who did not have HCC or who had already started treatment were excluded.
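The inclusion and exclusion criteria of Step 1 can be expressed as a single eligibility check. The sketch below assumes a hypothetical `Patient` record; its field names and the lab-name strings used as keys are illustrative and are not taken from the original data format.

```python
from dataclasses import dataclass, field

# The four required laboratory values (key spellings are illustrative).
REQUIRED_LABS = frozenset({
    "Albumin",
    "Alpha Fetoprotein, Nonmaternal",
    "Bilirubin (Total)",
    "Prothrombin INR",
})


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only.
    n_clinical_reports: int
    n_radiology_reports: int
    labs: set = field(default_factory=set)  # names of available lab values
    has_hcc: bool = True
    treatment_started: bool = False


def is_eligible(p: Patient) -> bool:
    """Apply the cohort criteria: reports present, all labs present, untreated HCC."""
    has_reports = p.n_clinical_reports >= 1 and p.n_radiology_reports >= 1
    has_all_labs = REQUIRED_LABS <= p.labs  # subset test
    return has_reports and has_all_labs and p.has_hcc and not p.treatment_started
```

A patient missing any one of the four laboratory values, or with no radiology report in the window, fails the check and is dropped from the cohort.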
Step 2) File type exclusion. Only relevant file types were kept.
Step 3) Identification of text evidence for stage parameters
Step 4) Patient stage parameter determination
Step 5) Patient stage determination according to staging logic.
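The staging logic itself is not spelled out in this description. As one illustration of what Step 5 can look like, the sketch below computes a CLIP score from four already-determined stage parameters, using the commonly published CLIP point assignments (Child-Pugh class, tumor morphology, AFP with a cutoff of 400, usually quoted in ng/mL, and portal vein thrombosis). The function name and the category labels are hypothetical, and this is not necessarily how the original pipeline encodes the rules.

```python
# Points per parameter under the published CLIP scoring scheme.
CHILD_PUGH_POINTS = {"A": 0, "B": 1, "C": 2}
MORPHOLOGY_POINTS = {
    "uninodular_le_50": 0,    # single nodule, extension <= 50% of liver
    "multinodular_le_50": 1,  # multiple nodules, extension <= 50% of liver
    "massive_or_gt_50": 2,    # massive tumor or extension > 50% of liver
}
AFP_CUTOFF = 400.0  # assumed to be in the same units as the lab value


def clip_score(child_pugh: str, morphology: str, afp: float,
               portal_vein_thrombosis: bool) -> int:
    """Sum the four CLIP components into a total score from 0 to 6."""
    return (
        CHILD_PUGH_POINTS[child_pugh]
        + MORPHOLOGY_POINTS[morphology]
        + (1 if afp >= AFP_CUTOFF else 0)
        + (1 if portal_vein_thrombosis else 0)
    )
```

Note that three of the four required laboratory values (albumin, total bilirubin, INR) feed the Child-Pugh class, while AFP enters the CLIP score directly, which is consistent with the laboratory requirements in Step 1.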
W. Yim, S. Kwan, M. Yetisgen. In-depth annotation for patient level liver cancer staging. In Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis at EMNLP 2015, Lisbon, Portugal, September 2015.