From: Natural language processing for populating lung cancer clinical research data
Histological types | Number (%) in training data set | Number (%) in testing data set |
---|---|---|
Adenocarcinoma | 897 (44.7%) | 37 (37%) |
Adenosquamous | 16 (0.8%) | 2 (2%) |
Carcinoid | 1 (0.05%) | 0 |
Carcinoid typical / atypical | 15 (0.75%) | 1 (1%) |
Large / larger neuroendocrine | 23 (1.1%) | 1 (1%) |
Non-small cell | 342 (17.0%) | 15 (15%) |
Other cell type / Unknown | 1 (0.05%) | 0 |
Other NSCLC | 14 (0.70%) | 1 (1%) |
Small cell | 339 (16.9%) | 21 (21%) |
Squamous | 358 (17.8%) | 22 (22%) |