From: Natural language processing for populating lung cancer clinical research data
Data elements | Number of patients in existing dataset (A) | Number of patients with true NLP results (B) | Number of patients with NLP results (C) | Precision1 (B/A) | Precision2 (B/C) | Recall | Time window |
---|---|---|---|---|---|---|---|
Stage | 2127 | 1330 | 1883 | 0.625 | 0.706 | 0.885 | 90 days |
Stage | 2127 | 1328 | 1883 | 0.624 | 0.705 | 0.885 | 60 days |
Stage | 2127 | 1325 | 1883 | 0.623 | 0.704 | 0.885 | 30 days |
Histology | 2208 | 1918 | 1989 | 0.869 | 0.885 | 0.982 | 90 days |
Histology | 2208 | 1914 | 2164 | 0.867 | 0.884 | 0.980 | 60 days |
Histology | 2208 | 1889 | 2154 | 0.856 | 0.877 | 0.976 | 30 days |
Tumor grade | 1635 | 1182 | 1203 | 0.723 | 0.902 | 0.801 | 90 days |
Tumor grade | 1635 | 1170 | 1300 | 0.716 | 0.900 | 0.795 | 60 days |
Tumor grade | 1635 | 1143 | 1274 | 0.700 | 0.897 | 0.779 | 30 days |
Chemotherapy | 1674 | 1674 | 1674 | 1 | 1 | 1 | 365 days |
Radiotherapy | 769 | 769 | 769 | 1 | 1 | 1 | 365 days |
Surgery | 312 | 312 | 312 | 1 | 1 | 1 | 365 days |
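
The ratio columns can be recomputed from the counts A, B, and C. The following is a minimal sketch, not the authors' code. The names `ROWS`, `metrics`, and `mismatches` are hypothetical. Treating Recall as C/A is an assumption inferred from the reported values, since the table header gives no formula for it. The tolerance of 0.0015 is also an assumption, chosen to absorb three-decimal rounding.

```python
# Rows transcribed from the table above:
# (data element, time window, A, B, C, reported Precision1, Precision2, Recall)
ROWS = [
    ("Stage", "90 days", 2127, 1330, 1883, 0.625, 0.706, 0.885),
    ("Stage", "60 days", 2127, 1328, 1883, 0.624, 0.705, 0.885),
    ("Stage", "30 days", 2127, 1325, 1883, 0.623, 0.704, 0.885),
    ("Histology", "90 days", 2208, 1918, 1989, 0.869, 0.885, 0.982),
    ("Histology", "60 days", 2208, 1914, 2164, 0.867, 0.884, 0.980),
    ("Histology", "30 days", 2208, 1889, 2154, 0.856, 0.877, 0.976),
    ("Tumor grade", "90 days", 1635, 1182, 1203, 0.723, 0.902, 0.801),
    ("Tumor grade", "60 days", 1635, 1170, 1300, 0.716, 0.900, 0.795),
    ("Tumor grade", "30 days", 1635, 1143, 1274, 0.700, 0.897, 0.779),
    ("Chemotherapy", "365 days", 1674, 1674, 1674, 1.0, 1.0, 1.0),
    ("Radiotherapy", "365 days", 769, 769, 769, 1.0, 1.0, 1.0),
    ("Surgery", "365 days", 312, 312, 312, 1.0, 1.0, 1.0),
]


def metrics(a, b, c):
    """Return (Precision1, Precision2, Recall) as (B/A, B/C, C/A).

    Recall = C/A is an assumption inferred from the reported numbers.
    """
    return b / a, b / c, c / a


def mismatches(rows, tol=0.0015):
    """List entries whose recomputed ratio differs from the reported one by more than tol."""
    labels = ("Precision1", "Precision2", "Recall")
    found = []
    for name, window, a, b, c, p1, p2, rec in rows:
        for label, reported, value in zip(labels, (p1, p2, rec), metrics(a, b, c)):
            if abs(value - reported) > tol:
                found.append((name, window, label, reported, round(value, 3)))
    return found


if __name__ == "__main__":
    for entry in mismatches(ROWS):
        print(entry)
```

Under these assumptions, every row reproduces its reported ratios except the 90-day Histology and Tumor grade rows, where the listed C values (1989 and 1203) do not match the reported Precision2 and Recall. Those two counts may be worth checking against the original publication.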