Table 6 Number of each histological cell type in training and testing data

From: Natural language processing for populating lung cancer clinical research data

| Histological type | Number (%) in training data set | Number (%) in testing data set |
| --- | --- | --- |
| Adenocarcinoma | 897 (44.7%) | 37 (37%) |
| Adenosquamous | 16 (0.8%) | 2 (2%) |
| Carcinoid | 1 (0.05%) | 0 |
| Carcinoid typical / atypical | 15 (0.75%) | 1 (1%) |
| Large / larger neuroendocrine | 23 (1.1%) | 1 (1%) |
| Non-small cell | 342 (17.0%) | 15 (15%) |
| Other cell type / Unknown | 1 (0.05%) | 0 |
| Other NSCLC | 14 (0.70%) | 1 (1%) |
| Small cell | 339 (16.9%) | 21 (21%) |
| Squamous | 358 (17.8%) | 22 (22%) |