Fig. 2From: Automatic detection of actionable radiology reports using bidirectional encoder representations from transformersDetection of actionable reports with BERT (a) without and (b) with order information. Each sequence was fed into the BERT classifier after adding special tokens (e.g., [CLS] and [SEP]) and padding with [PAD] tokens that are required by the standard specification of BERT. Generally, BERT models output a 512 × 768 hidden state matrix (indicated by **), part of which is a 768-dimensional feature vector for classification tasks (indicated by *). We used only the feature vector for classification tasks and discarded the rest of the hidden state matrix, as in the standard procedureBack to article page