- Research article
- Open Access
Screening pregnant women for suicidal behavior in electronic medical records: diagnostic codes vs. clinical notes processed by natural language processing
BMC Medical Informatics and Decision Making volume 18, Article number: 30 (2018)
We examined the comparative performance of structured, diagnostic codes vs. natural language processing (NLP) of unstructured text for screening suicidal behavior among pregnant women in electronic medical records (EMRs).
Women aged 10–64 years with at least one diagnostic code related to pregnancy or delivery (N = 275,843) from Partners HealthCare were included as our “datamart.” Diagnostic codes related to suicidal behavior were applied to the datamart to screen women for suicidal behavior. Among women without any diagnostic codes related to suicidal behavior (n = 273,410), 5880 women were randomly sampled, of whom 1120 had at least one mention of terms related to suicidal behavior in clinical notes. NLP was then used to process clinical notes for the 1120 women. Chart reviews were performed for subsamples of women.
Using diagnostic codes, 196 pregnant women were screened positive for suicidal behavior, among whom 149 (76%) had confirmed suicidal behavior by chart review. Using NLP among those without diagnostic codes, 486 pregnant women were screened positive for suicidal behavior, among whom 146 (30%) had confirmed suicidal behavior by chart review.
The use of NLP substantially improves the sensitivity of screening suicidal behavior in EMRs. However, the prevalence of confirmed suicidal behavior was lower among women who did not have diagnostic codes for suicidal behavior but screened positive by NLP. NLP should be used together with diagnostic codes for future EMR-based phenotyping studies for suicidal behavior.
Suicide, a devastating event, is one of the leading causes of maternal death during pregnancy and the peripartum period [1, 2]. Early detection of pregnant women with nonfatal suicidal thoughts and behavior (hereafter referred to as suicidal behavior) presents an important opportunity for directing suicide prevention efforts to those at high risk for suicide and, therefore, can help to prevent maternal mortality [3,4,5]. However, low-cost, highly scalable methods to identify suicidal behavior are lacking. To date, studies have primarily relied on the International Classification of Diseases (ICD) billing codes in administrative or claims data to identify instances of suicidal behavior [5,6,7,8,9]. Suicidal behavior is often “under-coded,” with only a small proportion of all suicidal cases being detected by ICD codes (i.e., low sensitivity) [10,11,12,13]. For example, a systematic review reported that the sensitivity of one widely used ICD-9 code category, suicide and self-inflicted injury (E950–E959), ranged from 13.8 to 65%. Using a large primary care database from the United Kingdom (UK), Thomas et al. reported that the use of diagnostic codes to detect suicidal cases missed approximately three-quarters of the cases. The reported low sensitivity of billing codes for identifying suicidal behavior implies that a sizable portion of suicidal cases may be missed when case-finding relies on ICD codes alone. Therefore, expanded data collection methods for suicidal behavior are urgently needed to provide a foundation for prevention efforts [9, 14].
The increasing utilization of electronic medical records (EMRs) has provided unprecedented opportunities for identifying pregnant women with suicidal behavior. EMRs contain a ready repository of clinical and phenotypic information consisting of structured and unstructured data that can enable low-cost population-based studies [15, 16]. Structured data are entered by “clicking” on choices in lists, forms, or templates, and include demographic data, laboratory test results, and diagnostic billing codes such as the aforementioned ICD codes [16,17,18]. Unstructured data (clinical data extracted from free text such as physicians’ notes and radiology reports) offer a valuable resource for defining clinical phenotypes [19,20,21,22]. The automated examination of a large volume of clinical notes requires the use of natural language processing (NLP), a field of computational linguistics that allows computers to extract relevant information from unstructured human language. NLP has been used successfully to identify patient cohorts for different phenotypes including treatment-resistant depression, bipolar disorder, cerebral aneurysms, rheumatoid arthritis, Crohn’s disease, ulcerative colitis, and diabetes [15, 23,24,25,26,27,28,29,30,31,32]. However, very few studies have used NLP to identify suicidal behavior in EMRs [10, 33, 34], and no study has reported any classification algorithm that is highly predictive of suicidal behavior.
Because of the low prevalence of suicidal behavior [4, 35], developing a phenotyping algorithm using the full EMR population would likely result in low positive predictive values (PPV). To address this, we first screened for patients with medical record information (structured or unstructured) suggestive of suicidal behavior and excluded those with no evidence of suicidal behavior. The patients who screened positive for suicidal behavior would serve as a highly sensitive datamart that can then be used to develop highly predictive classification algorithms for suicidal behavior. Here, using EMRs from a large healthcare system (Partners HealthCare), we demonstrate that using diagnostic codes together with NLP can more effectively screen for pregnant women with a higher potential of suicidal behavior. We also compare the characteristics of patients identified by these two methods.
Data source and study population
We extracted data from the Partners HealthCare System Research Patient Data Registry (RPDR). The RPDR is a centralized clinical data warehouse for 4.6 million patients from two large academic medical centers (Massachusetts General Hospital [MGH] and Brigham and Women’s Hospital [BWH]), as well as community and specialty hospitals in the Boston area. The RPDR includes structured and unstructured EMR information, including socio-demographic data, vital signs, laboratory and test results, problem list entries, prescribed medications, billing codes, and clinical notes for healthcare services provided within the system. The Institutional Review Board of Partners HealthCare (Protocol Number: 2016P000775/BWH) and Harvard T.H. Chan School of Public Health (Protocol Number: IRB16–0899) approved all aspects of this study.
We initially identified women aged 10–64 years with at least one diagnostic code related to pregnancy or delivery (International Classification of Diseases-10 [ICD-10]: Z3A.*, O0.*- O9.*; ICD-9: 640.*- 679.*, V22.*, V23.*, V24.*, V27.*, V28.*; Diagnosis-Related Group [DRG]: 370–384) in the EMRs from January 1, 1996 to March 31, 2016. A total of 275,843 women met these criteria and formed the study population, hereafter referred to as the “datamart” (Fig. 1).
Suicidal behavior screened positive by diagnostic codes
We first screened for suicidal behavior based on diagnostic codes, including the ICD codes and the Longitudinal Medical Record (LMR) codes (Additional file 1: Table S1). The LMR codes were assigned to problem list conditions in the ambulatory EMR system used across Partners HealthCare System. In addition to the explicit diagnostic codes for suicidal ideation (e.g., ICD-9 V62.84) and suicide attempt (e.g., ICD-9 E95*), we also included additional sets of ICD code categories (poisoning by analgesics, antipyretics, and antirheumatics; poisoning by sedatives and hypnotics; and poisoning by psychotropic agents) with positive predictive value ≥0.8 for suicidal behavior, based on a previous study. Among the 275,843 women with at least one diagnostic code related to pregnancy or delivery, 2433 women had at least one diagnostic code related to suicidal behavior, of whom 196 had a diagnostic code that occurred during pregnancy or within 42 days after abortion or delivery. These 196 women, who screened positive for suicidal behavior based on diagnostic codes, are hereafter referred to as the “diagnostic codes group” (Fig. 1).
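The time-window criterion above (a code recorded during pregnancy or within 42 days after abortion or delivery) can be illustrated with a minimal sketch. How pregnancy start and end dates were derived is not described in this section, so the function name `in_pregnancy_window`, its arguments, and the inclusive treatment of day 42 are assumptions made for illustration only.

```python
from datetime import date, timedelta

# 42-day window after abortion or delivery, as stated in the text.
POSTPARTUM_WINDOW = timedelta(days=42)


def in_pregnancy_window(code_date: date,
                        pregnancy_start: date,
                        pregnancy_end: date) -> bool:
    """Return True if a diagnostic code falls during pregnancy or within
    42 days after its end (abortion or delivery).

    Assumption: both boundaries are inclusive, and the pregnancy start and
    end dates are already known for each woman (hypothetical inputs).
    """
    return pregnancy_start <= code_date <= pregnancy_end + POSTPARTUM_WINDOW


# Hypothetical example: pregnancy from 2010-01-01 to delivery on 2010-09-30.
start, end = date(2010, 1, 1), date(2010, 9, 30)
during = in_pregnancy_window(date(2010, 5, 15), start, end)
last_day = in_pregnancy_window(date(2010, 11, 11), start, end)  # day 42 after delivery
too_late = in_pregnancy_window(date(2010, 11, 12), start, end)  # day 43 after delivery
```

A woman would then count toward the diagnostic codes group if at least one of her suicidal behavior codes satisfies this check.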
Suicidal behavior screened positive by NLP-processed clinical notes
Among the 273,410 women without any diagnostic codes related to suicidal behavior, we randomly sampled a subset of women (N = 5880) who were matched for age (10-year intervals), race, and comparative health with the diagnostic codes group using a 1:30 matching ratio. We chose the 1:30 ratio for subsequent NLP for two reasons: (1) to provide a sample size that was large enough for a general view of the distributions of concept unique identifiers (CUIs), and (2) to minimize the NLP processing time. Comparative health, a proxy for healthcare utilization, was defined as the total number of observations in the medical records, which included diagnostic codes for diseases, medications, and specific test results from hospital visits for each patient. To comply with the IRB, Partners HealthCare employees (N = 598) were excluded, leaving 5282 women in the matched set. We then searched women’s clinical notes and identified 1120 (21.2%) women with at least one mention of the terms related to suicidal behavior (Additional file 1: Table S2) during pregnancy or within the 42 days after abortion or delivery.
We further processed the clinical notes of the 1120 women using the clinical Text Analysis and Knowledge Extraction System (cTAKES 3.2.3, http://ctakes.apache.org/). Based on the Unstructured Information Management Architecture (UIMA), cTAKES is a comprehensive clinical NLP tool that processes clinical notes and identifies terms. cTAKES maps the terms to a subset of the Unified Medical Language System (UMLS) Metathesaurus, the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT), and assigns each term a UMLS concept unique identifier (CUI). cTAKES also extracts qualifying attributes (including negation, temporality, and subject status) associated with each CUI. As determined by the cTAKES negation module, each CUI can be either affirmed (e.g., “patient reports feeling suicidal”) or negated (e.g., “suicidal behavior: none”). Affirmed CUIs were considered relevant for this analysis. cTAKES has a temporality module, DocTimeRel (Document Time Relation), to discover the temporal relation between a term and the document creation time. The values for DocTimeRel include “before” (e.g., “patient attempted suicide when she was 14”), “after” (e.g., “She would not consider suicide an option if symptoms were to arise”), “overlap” (e.g., “patient states that she wants to kill herself”), and “before/overlap” (terms that started before document creation time and continue to the present [e.g., “patient endorses passive suicidal ideation since the birth of her baby”]). Terms tagged as “overlap” or “before/overlap” were considered temporally relevant for this analysis. The Subject module indicates whether the patient or someone else (e.g., “mom attempted suicide”) experiences the event. The values for the Subject module include “patient,” “family member,” “other,” and “null.” Terms tagged as “patient” were considered subject relevant for this analysis.
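The three relevance criteria described above (affirmed, temporally relevant, and subject relevant) can be combined into a single check per extracted mention. The sketch below is illustrative only and does not use the cTAKES API: the `Mention` record, its field names, and the lowercase attribute values are assumptions that mirror the values quoted in the text, and the CUI string is a placeholder.

```python
from dataclasses import dataclass

# DocTimeRel values treated as temporally relevant in the text.
RELEVANT_DOC_TIME_REL = {"overlap", "before/overlap"}


@dataclass(frozen=True)
class Mention:
    """One extracted term with its qualifying attributes (hypothetical record)."""
    cui: str            # UMLS concept unique identifier (placeholder values here)
    negated: bool       # True if the negation module marked the term as negated
    doc_time_rel: str   # "before", "after", "overlap", or "before/overlap"
    subject: str        # "patient", "family member", "other", or "null"


def is_relevant(mention: Mention) -> bool:
    """Affirmed AND temporally relevant AND about the patient herself."""
    return (
        not mention.negated
        and mention.doc_time_rel in RELEVANT_DOC_TIME_REL
        and mention.subject == "patient"
    )


current = Mention("CUI_PLACEHOLDER", False, "overlap", "patient")
historical = Mention("CUI_PLACEHOLDER", False, "before", "patient")
family = Mention("CUI_PLACEHOLDER", False, "overlap", "family member")
denied = Mention("CUI_PLACEHOLDER", True, "overlap", "patient")
```

Only `current` passes all three criteria; each of the other examples fails exactly one of them.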
We created an expert-defined list of CUIs considered relevant to suicidal behavior (Additional file 1: Table S3). We included the distributions of attributes of the CUIs relevant to suicidal behavior in Additional file 1: Table S4.
To compensate for errors introduced by the NLP system, we calculated, for each woman, the proportion of affirmed, temporally relevant, and subject relevant CUIs related to suicidal behavior among all CUIs related to suicidal behavior, and selected women with proportions greater than or equal to 0.25. This threshold was determined empirically with the aim of decreasing false positives while keeping false negatives relatively low. From the NLP-processed clinical notes, we identified 486 pregnant women (hereafter referred to as the “NLP group”) with CUIs related to suicidal behavior. Of note, the NLP group was screened positive by both term mentions related to suicidal behavior and cTAKES. The remaining women (N = 634), who had at least one mention of the terms related to suicidal behavior during pregnancy or within the 42 days after abortion or delivery but were not screened positive by NLP, are referred to as the “NLP not relevant group.”
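The per-woman threshold rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each woman is represented by a list of booleans, one per suicidal behavior CUI mention, where True means the mention was affirmed, temporally relevant, and subject relevant. The function name and the input layout are hypothetical.

```python
from typing import Dict, List, Set

# Empirically chosen threshold reported in the text.
THRESHOLD = 0.25


def screen_positive_by_nlp(flags_by_woman: Dict[str, List[bool]],
                           threshold: float = THRESHOLD) -> Set[str]:
    """Return the IDs of women whose proportion of relevant CUI mentions
    among all suicidal behavior CUI mentions is >= threshold.

    Women with no suicidal behavior CUI mentions are skipped, since the
    proportion is undefined for them (an assumption of this sketch).
    """
    positive = set()
    for woman_id, flags in flags_by_woman.items():
        if not flags:
            continue
        proportion = sum(flags) / len(flags)
        if proportion >= threshold:
            positive.add(woman_id)
    return positive


# Hypothetical toy data: three women with 4, 6, and 0 mentions.
example = {
    "w1": [True, False, False, False],                 # 1/4 = 0.25, meets threshold
    "w2": [True, False, False, False, False, False],   # 1/6, below threshold
    "w3": [],                                          # no mentions, skipped
}
nlp_group = screen_positive_by_nlp(example)
```

In this toy example only `w1` is screened positive; `w2` would fall into the "NLP not relevant" category.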
We randomly sampled a subset of women aged 10–64 years with at least one diagnostic code related to pregnancy or delivery as the reference group. The reference group was matched on comparative health to the diagnostic codes group using a 1:100 matching ratio. Because we did not need to process the clinical notes for the reference group, we included a relatively larger sample. After excluding Partners HealthCare employees, 17,183 women were included in the reference group.
Chart review to obtain estimates for prevalence of confirmed suicidal behavior
After the screening process, one of the authors (QYZ) manually reviewed the clinical notes for random samples of (1) 50 women from the diagnostic codes group (N = 196); (2) 100 women from the NLP group (N = 486); (3) 100 women from the NLP not relevant group (N = 634); and (4) 100 women who had neither diagnostic codes nor term mentions related to suicidal behavior (N = 4162). Based on the Columbia Classification Algorithm of Suicide Assessment (C-CASA), the reviewer assigned each woman a classification of either “with” or “without suicidal behavior.” Women who had (1) completed suicide, (2) suicide attempt, (3) preparatory acts toward imminent suicidal behavior, or (4) suicidal ideation were considered as “with” suicidal behavior.
We compared the demographic and provider characteristics of pregnant women screened positive for suicidal behavior by the diagnostic codes versus NLP during encounters with suicidal behavior. We compared the distributions of demographic characteristics between the two groups using the Chi-square test for categorical variables and Student’s t-test for continuous variables. We reported the proportions of women who received diagnoses of psychiatric comorbidities at least once during or before the most recent encounter with suicidal behavior. Psychiatric comorbidities were defined using the ICD codes in Additional file 1: Table S5. All analyses were done using R.
We identified 682 pregnant women who screened positive for suicidal behavior, of whom 196 (28.73%) were identified by diagnostic codes and 486 (71.26%) were identified by NLP. Based on manual chart review, the prevalence of confirmed suicidal behavior among women screened positive (PPV) was 76.00% for the diagnostic codes and 30.00% for NLP in women without the diagnostic codes. The estimated numbers of women with confirmed suicidal behavior among those screened positive by the diagnostic codes and by NLP would therefore be 149 and 146, respectively. The prevalence of confirmed suicidal behavior was 1.00% among the NLP not relevant group and 0.00% among women who had neither diagnostic codes nor term mentions related to suicidal behavior. The approximate estimated prevalence of suicidal behavior in the reference group would be 2.76% (486 × 0.3/5282).
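The estimates above follow from simple arithmetic on the chart-review samples, which can be reproduced with a short sketch. The confirmed counts used here (38 of 50 and 30 of 100) are not stated directly in the text; they are inferred from the reported percentages and sample sizes and should be read as an assumption. Function and variable names are illustrative.

```python
def ppv(confirmed: int, reviewed: int) -> float:
    """Positive predictive value: confirmed cases among reviewed screen positives."""
    return confirmed / reviewed


def estimated_confirmed(screen_positive: int, ppv_value: float) -> int:
    """Extrapolate the chart-review PPV to the whole screen-positive group."""
    return round(screen_positive * ppv_value)


# Assumption: counts implied by the reported 76.00% (n = 50) and 30.00% (n = 100).
ppv_codes = ppv(38, 50)
ppv_nlp = ppv(30, 100)

est_codes = estimated_confirmed(196, ppv_codes)   # diagnostic codes group, N = 196
est_nlp = estimated_confirmed(486, ppv_nlp)       # NLP group, N = 486

# Approximate prevalence among the 5282 matched women without diagnostic codes.
prevalence_pct = round(100 * 486 * ppv_nlp / 5282, 2)

print(est_codes, est_nlp, prevalence_pct)
```

Because each estimate scales a PPV from a sample of 50 or 100 charts to the full group, the resulting counts carry the sampling uncertainty of those reviews.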
The demographic characteristics of women who screened positive for suicidal behavior by the diagnostic codes and NLP, respectively, are presented in Table 1. Compared with the NLP group, the diagnostic codes group was less likely to be Hispanic (33.33% vs. 28.57%), be married/common-law married/partnered (29.63% vs. 21.43%), report religious affiliation as Christian (45.47% vs. 38.27%), and have private insurance (44.65% vs. 32.14%); these women were more likely to be Black or African American (16.46% vs. 20.92%), be single (65.02% vs. 71.43%), and be insured by Medicaid (43.21% vs. 49.49%) and Medicare (6.17% vs. 9.18%).
Table 2 shows provider characteristics for participants’ encounters (inpatient or outpatient visits) with suicidal behavior. For encounters with suicidal behavior, more than two-thirds of women in the diagnostic codes group (69.39%) visited the Emergency Department, whereas only 17.49% of women in the NLP group visited the Emergency Department. The proportion of women screened positive for suicidal behavior who were treated in an inpatient setting was higher in the diagnostic codes group (39.29%) than in the NLP group (19.55%).
Psychiatric comorbidities were common among women with suicidal behavior (Table 3). Women screened positive for suicidal behavior by the diagnostic codes had a higher prevalence of psychiatric comorbidities, including depression, schizophrenia, bipolar disorder, post-traumatic stress disorder (PTSD), and substance abuse. The distribution of care providers according to clinical specialties (Department of Psychiatry/Mental Health/Behavioral Health and Emergency Department) was similar across psychiatric comorbidities (Table 3).
We demonstrated that the use of NLP along with term search substantially improved the sensitivity of screening for suicidal behavior among pregnant women from a large EMR system. More than two-thirds of potential suicidal behavior and nearly half of confirmed suicidal behavior would have been missed if screening had relied solely on ICD codes. However, we observed that the PPV of NLP, the probability that a suicidal case identified by NLP was truly suicidal, was lower (30.00%) than that of the diagnostic codes (76.00%). We found that women in the diagnostic codes group had more risk factors for suicidal behavior, including low socioeconomic status, being single, and psychiatric comorbidities, as compared with women in the NLP group.
Prior studies have attempted to identify patients with suicidal behavior in unstructured clinical notes. Using the UK Clinical Practice Research Datalink, Thomas et al. found that searching for terms related to suicide in general practice consultation records identified 10.7% of the suicidal cases that were missed by ICD diagnostic codes. Anderson et al. processed the History of Present Illness notes of 15,761 patients with at least one diagnostic code of depression in primary care clinical organizations. A rule-based NLP system was developed to search for positive mention or negation of suicidal behavior using a list of terms related to suicidal behavior. Among patients whose notes indicated suicidal ideation and suicide attempt, the proportions with corresponding ICD diagnostic codes were 3% and 19%, respectively. Haerian et al. used an NLP tool, the Medical Language Extraction and Encoding System (MedLEE), to identify suicidal behavior in the EMRs of pediatric and adult inpatients. Of note, they used a list of CUIs with a specific focus on suicidal behavior by drug overdose, which was different from the CUI list we used in our study. In their study, 469 potential cases were identified by the ICD diagnostic codes, and 4087 were identified by the NLP algorithm after filtering out CUIs that were negated or associated with family history. The intersection of both ICD diagnostic codes and the NLP algorithm identified 260 potential cases. The positive predictive values for the ICD diagnostic codes and the NLP algorithm were similar (55% for ICD and 60% for NLP). Despite the different NLP tools used across EMR systems, these results consistently suggest that suicidal behavior is often documented in clinical notes without being assigned any diagnostic codes, which were designed for billing purposes.
Suicidal behavior is a complex phenotype coupled with many psychosocial problems, where clinical notes are often used to capture the complexity and diagnostic uncertainty [47, 48]. Incorporating information from unstructured clinical notes through NLP in our study, we were able to screen a significant number of patients with potential suicidal behavior that would otherwise not be found using structured data alone. However, the PPV of NLP used in the current study was lower than that of the diagnostic codes. Nonetheless, we identified a comparable number of suicidal cases (149 for diagnostic codes vs. 146 for NLP) when using only a subsample of women (5880 out of 273,410) without any diagnostic codes related to suicidal behavior for NLP. Despite the low PPV of NLP, considering the large number of pregnant women without diagnostic codes related to suicidal behavior (N = 273,410) and the fact that suicidal behavior was often documented in clinical notes, we maintain that NLP procedures may be used to identify more suicidal cases. Therefore, for future studies using EMR-based phenotyping for suicidal behavior, an optimal approach to increase screening sensitivity may best involve combining the application of NLP procedures with the diagnostic codes.
Only 30% of the women who screened positive for suicidal behavior by NLP were confirmed to be suicidal by chart review (PPV = 0.30). A large proportion of women who were not suicidal were screened positive for suicidal behavior by NLP. Our error analysis, a manual review of the clinical notes from 100 women in the NLP group, showed that, similar to one previous study, the majority of the false positives came from incorrect qualifying attributes, in particular the negation associated with CUIs. Negation is a well-known challenge for processing unstructured clinical notes. One study showed that approximately half of the conditions indexed in dictated reports were negated [50, 51]. For suicidal behavior, clinicians are likely to document both the presence and absence of suicidal behavior. In the Partners HealthCare EMRs, we observed a major negation structure for suicidal behavior: terms related to suicidal behavior were followed by a colon and a negation word without any sentence punctuation (e.g., “suicidal behavior: none,” “suicidal behavior: none reported,” and “suicidal behavior: denied”) (Additional file 1: Table S6). However, the standard cTAKES negation module NegEx [40, 52], a regular expression pattern-matching algorithm that searches for predefined negation words around terms, was initially trained using Intensive Care Unit discharge summaries and is not able to recognize such a negation structure. Consequently, a considerable number of suicidal behavior terms that were negated were incorrectly identified as “affirmed.” Further enhancement of the negation algorithm with training data pertaining specifically to suicidal behavior is required to decrease the false positives [49, 55].
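To make the colon-based negation structure concrete, the sketch below shows a regular expression that would flag the three example phrases quoted above. It is not part of cTAKES or NegEx and was not used in this study; the pattern, the function name, and the short list of negation words (taken only from the examples in the text) are illustrative assumptions, and a real rule set would need validation against the notes.

```python
import re

# A suicide-related term, optional trailing words on the same line, a colon,
# then one of the negation words seen in the quoted examples.
COLON_NEGATION = re.compile(
    r"\bsuicid\w*[^:.\n]*:\s*(?:none(?:\s+reported)?|denied)\b",
    re.IGNORECASE,
)


def is_colon_negated(text: str) -> bool:
    """Return True if the text contains a 'term: negation word' structure."""
    return COLON_NEGATION.search(text) is not None


examples = [
    "suicidal behavior: none",
    "suicidal behavior: none reported",
    "Suicidal behavior: denied",
    "patient reports feeling suicidal",
    "Suicide attempt/gesture: history of",
]
flags = [is_colon_negated(e) for e in examples]
```

The first three phrases are flagged as negated, while the affirmed statement and the "history of" template entry are not, so a rule of this kind would only address the negation errors, not the temporality errors.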
Other common reasons leading to cTAKES miscoding women without suicidal behavior as suicidal (Additional file 1: Table S6) included (1) incorrect recognition of “before” as “overlap” by the DocTimeRel module (e.g., DocTimeRel module treated history of suicidal behavior as current suicidal behavior: “Suicide attempt/gesture: history of, hospitalized inpatient psych unit for suicide attempt in 1996”); (2) incorrect recognition of “family member” as “patient” by the Subject module (e.g., Subject module treated the suicidal behavior of patient’s father as patient’s: “Pt also identifies strongly with father, who was often aggressive toward others and threatened suicide”); (3) failure to identify section titles (e.g., “Suicidal Behavior Hx of Suicidal Behavior:”) that do not describe the behavior of patients; and (4) failure to handle hypothetical conditions that temporally are neither recent nor historical (e.g., “If she has significant side effects from it such as lethargy/depression/irritability/suicidal thought, we will change it to LTG.”).
We found that women in the diagnostic codes group had different characteristics as compared to women in the NLP group. On the one hand, these differences could be due to the lower prevalence of confirmed suicidal behavior in the NLP group. Therefore, highly predictive classification algorithms need to be developed for the NLP group. On the other hand, the differences between women screened positive for suicidal behavior by the diagnostic codes and by NLP suggest that the two groups may differ with respect to the degree of suicide intent, methods used, and subsequent clinical management. Because a larger proportion of women screened positive by the diagnostic codes received inpatient care and were seen in the Emergency Department, they were likely to present as more severe cases of suicidal behavior with high suicide intent, requiring hospital admission and immediate care. In addition, the diagnostic code for suicidal ideation (ICD-9: V62.84) was not used until October 2005, when it was introduced. Even after the code became available, one study showed that suicidal ideation was less likely to be coded than suicide attempt. These two factors (i.e., source of inpatient care and timing of availability of diagnostic codes) might have contributed to a disproportionate representation of more severe cases of suicidal behavior in the diagnostic codes group. In this scenario, women screened positive by the diagnostic codes may be a more relevant cohort for assessing patients at high risk for completed suicide, whereas women screened positive by NLP may be more relevant for investigating early identification of high-risk groups and suicide prevention interventions. Another possible explanation for the observed differences in characteristics between the diagnostic codes group and the NLP group, especially for psychiatric comorbidities, is differential bias in coding: women with more risk factors were more likely to be coded for suicidal behavior.
There are several limitations of this study. First, the prevalence of confirmed suicidal behavior among women screened positive by NLP was only 30%. However, given that the purpose of our study was to screen pregnant women with a higher potential of suicidal behavior and to develop a highly sensitive datamart for suicidal behavior, this low PPV may be tolerable. Nevertheless, accurate classification algorithms built on this highly sensitive datamart using different machine learning techniques [58, 59] are clearly needed to identify true cases of suicidal behavior. Second, given the small sample size of women screened positive for suicidal behavior by the diagnostic codes, we did not further classify patients according to subtypes of suicidal behavior such as suicidal ideation and suicide attempt. Third, because a woman was considered screened positive for suicidal behavior only if she was screened positive by both term mention and cTAKES, we might have missed some women who did not pass the screening by term mentions related to suicidal behavior but would have been screened positive by cTAKES. Fourth, we used 20 years of data from a single urban-regional EMR system that did not include patient visits outside this geographical area, time period, or network of hospitals. The generalizability of our results to patients in other healthcare systems may vary depending on the informatics infrastructure and local documentation practices. Fifth, we focused on extracting facts expressed directly in the clinical notes (i.e., terms of suicidal behavior) using NLP. Beyond extracting these basic facts, further research on other linguistic features, such as sentiment expressed in clinical notes (e.g., positive and negative emotions), and on capturing the meaning of texts (e.g., word embedding [60,61,62]) may also be beneficial in identifying suicidal patients [63,64,65,66,67].
Our results illuminated the advantage of using NLP along with term search in EMRs to screen pregnant women for a complex, rare psychiatric phenotype. NLP substantially improved the sensitivity of screening for suicidal behavior in an obstetric population. We captured a group of pregnant women with potential suicidal behavior otherwise not reflected in the structured data. We also highlight the challenges of using NLP in screening pregnant women for suicidal behavior. Of note, NLP had lower PPV as compared with diagnostic codes. Improvement in the cTAKES modules, especially the negation module, may help to increase the PPV. For future studies using EMR-based phenotyping for suicidal behavior, an optimal approach may include combining NLP procedures with the diagnostic codes.
Our approach is the first to examine the large-scale use of NLP in suicidal behavior among pregnant women. The current study in our population of pregnant women was particularly challenging given the rarity of suicidal behavior, the stigma attached, the complexity of phenotypic assessment, and the historical misconception of the protective role of pregnancy in suicidal behavior [3, 68]. Because pregnancy is a time when women have frequent interactions with the healthcare system, EMR-based identification of pregnant women with suicidal behavior may be useful for future genetic, epidemiological, and clinical studies, presenting a valuable opportunity for healthcare providers to intervene promptly [5, 69].
Abbreviations
BWH: Brigham and Women’s Hospital
C-CASA: Columbia Classification Algorithm of Suicide Assessment
cTAKES: clinical Text Analysis and Knowledge Extraction System
CUI: Concept unique identifier
EMR: Electronic medical records
ICD: International Classification of Diseases
LMR: Longitudinal Medical Record
MedLEE: Medical Language Extraction and Encoding System
MGH: Massachusetts General Hospital
NLP: Natural language processing
PPV: Positive predictive values
RPDR: Research Patient Data Registry
SNOMED-CT: Systematized Nomenclature of Medicine-Clinical Terms
UMLS: Unified Medical Language System
Oates M. Suicide: the leading cause of maternal death. Br J Psychiatry. 2003;183:279–81.
Oates M. Perinatal psychiatric disorders: a leading cause of maternal morbidity and mortality. Br Med Bull. 2003;67:219–29.
Gelaye B, Kajeepeta S, Williams MA. Suicidal ideation in pregnancy: an epidemiologic review. Arch Womens Ment Health. 2016;19:741–51.
Barak-Corren Y, Castro VM, Javitt S, Hoffnagle AG, Dai Y, Perlis RH, et al. Predicting Suicidal Behavior From Longitudinal Electronic Health Records. Am J Psychiatry. 2017;174:154–62.
Zhong Q-Y, Gelaye B, Miller M, Fricchione GL, Cai T, Johnson PA, et al. Suicidal behavior-related hospitalizations among pregnant women in the USA, 2006-2012. Arch Womens Ment Health. 2016;19:463–72.
Gandhi SG, Gilbert WM, McElvy SS, El Kady D, Danielson B, Xing G, et al. Maternal and neonatal outcomes after attempted suicide. Obstet Gynecol. 2006;107:984–90.
Patrick AR, Miller M, Barber CW, Wang PS, Canning CF, Schneeweiss S. Identification of hospitalizations for intentional self-harm when E-codes are incompletely recorded. Pharmacoepidemiol Drug Saf. 2010;19:1263–75.
Simon GE, Savarino J. Suicide attempts among patients starting depression treatment with medications or psychotherapy. Am J Psychiatry. 2007;164:1029–34.
Walkup JT, Townsend L, Crystal S, Olfson M. A systematic review of validated methods for identifying suicide or suicidal ideation using administrative or claims data. Pharmacoepidemiol Drug Saf. 2012;21(Suppl 1):174–82.
Haerian K, Salmasian H, Friedman C. Methods for identifying suicide or suicidal ideation in EHRs. AMIA Annu Symp Proc. 2012;2012:1244–53.
Colman I, Yiannakoulias N, Schopflocher D, Svenson LW, Rosychuk RJ, Rowe BH, et al. Population-based study of medically treated self-inflicted injuries. CJEM. 2004;6:313–20.
Thomas KH, Davies N, Metcalfe C, Windmeijer F, Martin RM, Gunnell D. Validation of suicide and self-harm records in the Clinical Practice Research Datalink. Br J Clin Pharmacol. 2013;76:145–57.
Lu CY, Stewart C, Ahmed AT, Ahmedani BK, Coleman K, Copeland LA, et al. How complete are E-codes in commercial plan claims databases? Pharmacoepidemiol Drug Saf. 2014;23:218–20.
US Public Health Service. The Surgeon General's Call to Action to Prevent Suicide. Washington, DC: US Public Health Service; 1999.
Castro VM, Minnier J, Murphy SN, Kohane I, Churchill SE, Gainer V, et al. Validation of electronic health record phenotyping of bipolar disorder cases and controls. Am J Psychiatry. 2015;172:363–72.
Jensen PB, Jensen LJ, Brunak S. Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet. 2012;13:395–405.
Sinnott JA, Dai W, Liao KP, Shaw SY, Ananthakrishnan AN, Gainer VS, et al. Improving the power of genetic association tests with imperfect phenotype derived from electronic medical records. Hum Genet. 2014;133:1369–82.
de Lusignan S, van Weel C. The use of routinely collected computer data for research in primary care: opportunities and challenges. Fam Pract. 2006;23:253–63.
Bates DW, Evans RS, Murff H, Stetson PD, Pizziferri L, Hripcsak G. Detecting adverse events using information technology. J Am Med Inform Assoc. 2003;10:115–28.
Pakhomov SVS, Shah ND, Van Houten HK, Hanson PL, Smith SA. The role of the electronic medical record in the assessment of health related quality of life. AMIA Annu Symp Proc. 2011;2011:1080–8.
Fischer LR, Rush WA, Kluznik JC, O’Connor PJ, Hanson AM. Abstract C-C1-06: Identifying Depression Among Diabetes Patients Using Natural Language Processing of Office Notes. Clin Med Res. 2008;6:125–6.
Jha AK. The promise of electronic records: around the corner or down the road? JAMA. 2011;306:880–1.
Sarmiento RF, Dernoncourt F. Improving Patient Cohort Identification Using Natural Language Processing. In: MIT Critical Data, editor. Secondary Analysis of Electronic Health Records. Berlin, Germany: Springer International Publishing; 2016. p. 405–17.
Lin C, Karlson EW, Dligach D, Ramirez MP, Miller TA, Mo H, et al. Automatic identification of methotrexate-induced liver toxicity in patients with rheumatoid arthritis from the electronic medical record. J Am Med Inform Assoc. 2015;22:e151–61.
Castro VM, Dligach D, Finan S, Yu S, Can A, Abd-El-Barr M, et al. Large-scale identification of patients with cerebral aneurysms using natural language processing. Neurology. 2017;88:164–8.
Perlis RH, Iosifescu DV, Castro VM, Murphy SN, Gainer VS, Minnier J, et al. Using electronic medical records to enable large-scale studies in psychiatry: treatment resistant depression as a model. Psychol Med. 2012;42:41–50.
Castro V, Shen Y, Yu S, Finan S, Pau CT, Gainer V, et al. Identification of subjects with polycystic ovary syndrome using electronic health records. Reprod Biol Endocrinol. 2015;13:116.
Castro VM, Apperson WK, Gainer VS, Ananthakrishnan AN, Goodson AP, Wang TD, et al. Evaluation of matched control algorithms in EHR-based phenotyping studies: a case study of inflammatory bowel disease comorbidities. J Biomed Inform. 2014;52:105–11.
Liao KP, Cai T, Gainer V, Goryachev S, Zeng-treitler Q, Raychaudhuri S, et al. Electronic medical records for discovery research in rheumatoid arthritis. Arthritis Care Res. 2010;62:1120–7.
Liao KP, Ananthakrishnan AN, Kumar V, Xia Z, Cagan A, Gainer VS, et al. Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts. PLoS One. 2015;10:e0136651.
Ananthakrishnan AN, Cai T, Savova G, Cheng S-C, Chen P, Perez RG, et al. Improving case definition of Crohn’s disease and ulcerative colitis in electronic medical records using natural language processing: a novel informatics approach. Inflamm Bowel Dis. 2013;19:1411–20.
Carroll RJ, Thompson WK, Eyler AE, Mandelin AM, Cai T, Zink RM, et al. Portability of an algorithm to identify rheumatoid arthritis in electronic health records. J Am Med Inform Assoc. 2012;19:e162–9.
Anderson HD, Pace WD, Brandt E, Nielsen RD, Allen RR, Libby AM, et al. Monitoring suicidal patients in primary care using electronic health records. J Am Board Fam Med. 2015;28:65–71.
Downs JM, Velupillai S, Gkotsis G, Holden R, Kikoler M, Dean H, et al. Detection of Suicidality in Adolescents with Autism Spectrum Disorders: Developing a Natural Language Processing Approach for Use in Electronic Health Records. Proc AMIA Symp [Internet]. 2017 [cited 2018 Mar 19]; Available from: https://kclpure.kcl.ac.uk/portal/en/publications/detection-of-suicidality-in-adolescents-with-autism-spectrum-disorders(2e703fc1-2f87-448e-abfc-14e36036c471)/export.html
Smoller JW. The use of electronic health records for psychiatric phenotyping and genomics. Am J Med Genet B Neuropsychiatr Genet [Internet]. 2017; Available from: https://doi.org/10.1002/ajmg.b.32548
Liao KP, Cai T, Savova GK, Murphy SN, Karlson EW, Ananthakrishnan AN, et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. BMJ. 2015;350:h1885.
Wang SV, Rogers JR, Jin Y, Bates DW, Fischer MA. Use of electronic healthcare records to identify complex patients with atrial fibrillation for targeted intervention. J Am Med Inform Assoc. 2017;24:339–44.
World Health Organization. International Statistical Classification of Diseases and Related Health Problems. World Health Organization; 2004.
Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, et al. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc. 2010;17:124–30.
Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, et al. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. J Am Med Inform Assoc. 2010;17:507–13.
Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004;32:D267–70.
Donnelly K. SNOMED-CT: The advanced terminology and coding system for eHealth. Stud Health Technol Inform. 2006;121:279–90.
Chikka VR, Mariyasagayam N, Niwa Y, Karlapalem K. Information Extraction from Clinical Documents: Towards Disease/Disorder Template Filling. In: Experimental IR Meets Multilinguality, Multimodality, and Interaction. Cham: Springer; 2015. p. 389–401.
Posner K, Oquendo MA, Gould M, Stanley B, Davies M. Columbia Classification Algorithm of Suicide Assessment (C-CASA): classification of suicidal events in the FDA’s pediatric suicidal risk analysis of antidepressants. Am J Psychiatry. 2007;164:1035–43.
R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014.
Turecki G, Brent DA. Suicide and suicidal behaviour. Lancet. 2016;387:1227–39.
Ford E, Carroll JA, Smith HE, Scott D, Cassell JA. Extracting information from the text of electronic medical records to improve case detection: a systematic review. J Am Med Inform Assoc. 2016;23:1007–15.
Ford E, Campion A, Chamles DA, Habash-Bailey H. “You don’t immediately stick a label on them”: a qualitative study of influences on general practitioners’ recording of anxiety disorders. BMJ Open [Internet]. 2016; Available from: http://bmjopen.bmj.com/content/6/6/e010746.short
Wu S, Miller T, Masanz J, Coarr M, Halgrim S, Carrell D, et al. Negation’s not solved: generalizability versus optimizability in clinical natural language processing. PLoS One. 2014;9:e112774.
Harkema H, Dowling JN, Thornblade T, Chapman WW. ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports. J Biomed Inform. 2009;42:839–51.
Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. Evaluation of negation phrases in narrative clinical reports. Proc AMIA Symp. 2001:105–9.
Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform. 2001;34:301–10.
Sohn S, Wu S, Chute CG. Dependency Parser-based Negation Detection in Clinical Narratives. AMIA Jt Summits Transl Sci Proc. 2012;2012:1–8.
Garla V, Lo Re V III, Dorey-Stein Z, Kidwai F, Scotch M, Womack J, et al. The Yale cTAKES extensions for document classification: architecture and application. J Am Med Inform Assoc. 2011;18:614–20.
Gkotsis G, Velupillai S, Oellrich A, Dean H, Liakata M, Dutta R. Don’t Let Notes Be Misunderstood: A Negation Detection Method for Assessing Risk of Suicide in Mental Health Records. In: Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology. Association for Computational Linguistics; 2016. p. 95–105.
Crandall C, Fullerton-Gleason L, Aguero R, LaValley J. Subsequent suicide mortality among emergency department patients seen for suicidal behavior. Acad Emerg Med. 2006;13:435–42.
Chock MM, Bommersbach TJ, Geske JL, Bostwick JM. Patterns of Health Care Usage in the Year before Suicide: A Population-Based Case-Control Study. Mayo Clin Proc. 2015;90:1475–81.
Metzger M-H, Tvardik N, Gicquel Q, Bouvry C, Poulet E, Potinet-Pagliaroli V. Use of emergency department electronic medical records for automated epidemiological surveillance of suicide attempts: a French pilot study. Int J Methods Psychiatr Res [Internet]. 2017;26. Available from: https://doi.org/10.1002/mpr.1522
Walsh CG, Ribeiro JD, Franklin JC. Predicting Risk of Suicide Attempts Over Time Through Machine Learning. Clin Psychol Sci. 2017;5:457–69.
Jagannatha AN, Yu H. Bidirectional RNN for Medical Event Detection in Electronic Health Records. Proc Conf. 2016;2016:473–82.
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed Representations of Words and Phrases and their Compositionality. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, editors. Advances in Neural Information Processing Systems 26. Curran Associates, Inc; 2013. p. 3111–9.
Mikolov T, Chen K, Corrado G, Dean J. Efficient Estimation of Word Representations in Vector Space [Internet]. arXiv [cs.CL]. 2013. Available from: http://arxiv.org/abs/1301.3781
McCoy TH Jr, Castro VM, Roberson AM, Snapper LA, Perlis RH. Improving Prediction of Suicide and Accidental Death After Discharge From General Hospitals With Natural Language Processing. JAMA Psychiatry. 2016;73:1064–71.
Leonard Westgate C, Shiner B, Thompson P, Watts BV. Evaluation of Veterans’ Suicide Risk With the Use of Linguistic Detection Methods. Psychiatr Serv. 2015;66:1051–6.
Roberts A. Language, Structure, and Reuse in the Electronic Health Record. AMA J Ethics. 2017;19:281–8.
Pestian J, Nasrallah H, Matykiewicz P, Bennett A, Leenaars A. Suicide Note Classification Using Natural Language Processing: A Content Analysis. Biomed Inform Insights. 2010;2010:19–28.
Pestian JP, Grupp-Phelan J, Bretonnel Cohen K, Meyers G, Richey LA, Matykiewicz P, et al. A Controlled Trial Using Natural Language Processing to Examine the Language of Suicidal Adolescents in the Emergency Department. Suicide Life Threat Behav. 2016;46:154–9.
Appleby L. Suicide during pregnancy and in the first postnatal year. BMJ. 1991;302:137–40.
Gold KJ, Singh V, Marcus SM, Palladino CL. Mental health, substance use and intimate partner problems among pregnant and postpartum suicide victims in the National Violent Death Reporting System. Gen Hosp Psychiatry. 2012;34:139–45.
The authors are very grateful for the help of Leslie Howes at Harvard T.H. Chan School of Public Health and the Harvard Catalyst Leadership Team during the planning and development of this research. The authors thank the Enterprise Research Infrastructure & Services at Partners HealthCare for the provision of computing resources. The authors also thank Laurie Bogosian and Stacey Duey of the Research Patient Data Repository at Partners HealthCare for their in-depth support. The authors thank Kathy Brenner for help with editing this manuscript. This research was done as partial fulfillment of the requirements of a Doctor of Science degree by one of the authors (QYZ) in the Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA. One of the authors (QYZ) expresses appreciation to Dr. Michael Napolitano for his non-expert comments and for his constant support and encouragement in completing this manuscript.
This research was supported by awards from the National Institutes of Health (the National Institute on Minority Health and Health Disparities: T37-MD001449; and the National Center for Research Resources (NCRR), the National Center for Advancing Translational Sciences (NCATS): 8UL1TR 000170–09). The NIH had no further role in study design; in the collection, analysis and interpretation of data; in the writing of the manuscript; and in the decision to submit the paper for publication.
Availability of data and materials
The data that support the findings of this study are available from Partners HealthCare but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Partners HealthCare.
Ethics approval and consent to participate
The Institutional Review Board (IRB) of Partners HealthCare (Protocol Number: 2016P000775/BWH) and Harvard T.H. Chan School of Public Health (Protocol Number: IRB16–0899) approved all aspects of this study. The IRB granted a waiver of consent/authorization.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Table S1. International Classification of Diseases (ICD) codes and other diagnostic codes used to screen suicidal behavior. Table S2. Terms used to screen suicidal behavior in clinical notes. Table S3. Concept Unique Identifiers (CUIs) related to suicidal behavior. Table S4. Distributions of attributes of the Concept Unique Identifiers (CUIs) related to suicidal behavior among 1120 women. Table S5. International Classification of Diseases (ICD) codes used to define psychiatric comorbidities. Table S6. Error analysis of false positive results from cTAKES to screen for suicidal behavior. (DOCX 36 kb)
Cite this article
Zhong, QY., Karlson, E.W., Gelaye, B. et al. Screening pregnant women for suicidal behavior in electronic medical records: diagnostic codes vs. clinical notes processed by natural language processing. BMC Med Inform Decis Mak 18, 30 (2018). https://doi.org/10.1186/s12911-018-0617-7
- Natural language processing
- Electronic medical records
- Suicidal behavior
- Diagnostic codes
- Clinical notes