Volume 15 Supplement 4
Predictive modeling of structured electronic health records for adverse drug event detection
© Zhao et al. 2015
Published: 25 November 2015
The digitization of healthcare data, resulting from the increasingly widespread adoption of electronic health records, has greatly facilitated its analysis by computational methods and thereby enabled large-scale secondary use thereof. This can be exploited to support public health activities such as pharmacovigilance, wherein the safety of drugs is monitored to inform regulatory decisions about sustained use. To that end, electronic health records have emerged as a potentially valuable data source, providing access to longitudinal observations of patient treatment and drug use. A nascent line of research concerns predictive modeling of healthcare data for the automatic detection of adverse drug events, which presents its own set of challenges: it is not yet clear how to represent the heterogeneous data types in a manner conducive to learning high-performing machine learning models.
Datasets from an electronic health record database are used for learning predictive models with the purpose of detecting adverse drug events. The use and representation of two data types, as well as their combination, are studied: clinical codes, describing prescribed drugs and assigned diagnoses, and measurements. Feature selection is conducted on the various types of data to reduce dimensionality and sparsity, while allowing for an in-depth feature analysis of the usefulness of each data type and representation.
Within each data type, combining multiple representations yields better predictive performance compared to using any single representation. The use of clinical codes for adverse drug event detection significantly outperforms the use of measurements; however, there is no significant difference over datasets between using only clinical codes and their combination with measurements. For certain adverse drug events, the combination does, however, outperform using only clinical codes. Feature selection leads to increased predictive performance for both data types, in isolation and combined.
We have demonstrated how machine learning can be applied to electronic health records for the purpose of detecting adverse drug events and proposed solutions to some of the challenges this presents, including how to represent the various data types. Overall, clinical codes are more useful than measurements and, in specific cases, it is beneficial to combine the two.
Keywords: pharmacovigilance, adverse drug events, electronic health records, machine learning, random forest, feature selection
With the adoption of computerized medication ordering and administration systems, the veil on the incidence of adverse drug events (ADEs) is slowly being removed. Unfortunately, ADEs are still considered to be heavily under-reported. Among the ADEs that are reported, around half are preventable, causing unnecessary suffering for patients and increased healthcare costs. According to one meta-analysis, ADEs are, in fact, responsible for around 4.9% of hospital admissions worldwide, and, in some cases, this number can be as high as 41.3%. There is thus no doubt that drug safety is an important public health problem. Unfortunately, the high rate of ADEs may continue unabated unless systems that provide decision support for drug selection and dosing are developed and more widely implemented at the point of care.
Pharmacovigilance using electronic health records
Efforts have been made in pharmacovigilance to improve drug safety. The World Health Organization (WHO) defines pharmacovigilance as "the science and activities relating to the detection, assessment, understanding and prevention of adverse effects or any other drug-related problem". The primary resources involved in pharmacovigilance are clinical trials, spontaneous reports and longitudinal healthcare databases. The use of these can be divided into pre-marketing and post-marketing pharmacovigilance activities. In the pre-marketing stage, prior to the launch of a drug, clinical trials are used to gather information on both the efficacy and safety of a drug. However, such a source of information comes with two inherent limitations, namely small samples of participants and short study duration. These limitations make it challenging to identify ADEs that are rare or occur with a long latency. In the post-marketing stage, after the drug has been launched, spontaneous reporting systems are used continuously to collect information on the safety of the drug. Examples of such systems are the US Food and Drug Administration's Adverse Event Reporting System and WHO's Global Individual Case Safety Reports Database, Vigibase. Spontaneous reports of suspected ADEs are made voluntarily by patients and physicians, which allows for monitoring of all drugs on the market at a fairly low cost. Unfortunately, such systems suffer heavily from under-reporting: it has been estimated that more than 94% of ADEs are not reported through spontaneous reports. Other limitations of spontaneous reports include selective reporting, incomplete patient information and indeterminate population information; for more details see . Indeterminate population information is particularly problematic since it prevents the calculation of the incidence of reported ADEs. As a result of these limitations, the need for alternative, complementary data sources is duly being acknowledged.
Among the possible alternative data sources, which also include social media and medical literature, are electronic health records (EHRs), since they capture and integrate patient data from all aspects of clinical observations over time. Although the main function of EHRs is to archive and manage patient data efficiently - in comparison to paper-based health record systems - secondary use of EHR data is currently being widely explored for various kinds of medical research, such as disease discovery and patient stratification [12, 13]; pharmacovigilance is among the uses that have received a lot of attention. There are various ways of utilizing EHRs for pharmacovigilance in a data-driven fashion, such as calculating correlations between drugs and diseases, clustering patients into different disease groups, and employing machine learning based prediction, of which the last is particularly nascent.
Predictive modeling of data from electronic health records
Machine learning based methods are data-driven approaches that can support discovery and exploitation of statistical patterns from large quantities of data. Given a large number of observations that are described by multiple variables, such methods have proven to be robust to random errors. In areas where there is a need to analyze large amounts of data, such as bioinformatics, machine learning is a key technique, particularly when analyzing "big data". This is also the case in post-marketing drug safety surveillance, where the discovery process typically relies on large samples; computational signal detection algorithms have in this context been developed to analyze data with the purpose of detecting signals of potential ADEs. Some of these algorithms detect signals according to a score function based on contingency tables, such as disproportionality analysis of spontaneous reports. However, a limitation of using contingency tables is that, by reducing the analysis to only two dimensions, the potential concomitant loss of clinically crucial information may result in arbitrary associations [17, 18]. This can be avoided by instead employing multivariate algorithms for signal detection, where machine learning methods can provide efficient and effective means of modeling high-dimensional data.
Due to limited access to EHR data, research on exploiting it for pharmacovigilance is still relatively scarce compared to using other data sources, despite its acknowledged potential. Among the published research on using EHRs for ADE detection, some studies have focused on using clinical notes [19–21], while how best to exploit the structured data remains under-explored. In some studies, however, clinical measurements or lab tests from EHRs have been utilized for (adverse) event detection by representing them as time series, aggregating them into categorical variables, or representing them from multiple perspectives. Other studies have used diagnoses and drugs instead [25, 26], while these data types have also been used in conjunction for signaling ADEs, albeit only in a case study and on a very limited scale.
Diagnoses and drugs are normally encoded by standard coding systems such as the International Statistical Classification of Diseases and Related Health Problems (ICD) and the Anatomical Therapeutic Chemical Classification System (ATC), respectively. These coding systems have their own concept hierarchies representing terms from general levels to more specific ones according to organ system or etiology. In a previous study, we investigated the possibility of exploiting these concept hierarchies to obtain improved predictive performance on the task of distinguishing between patients who have experienced a specific ADE and randomly selected patients who have not experienced that same ADE. It was shown that for such tasks, using only the more general levels of the codes is sufficient to maintain the predictive performance at a high level. We have also evaluated various ways of representing clinical measurements from EHRs and discovered that using such measurements alone still leads to the effective detection of ADEs; moreover, using only the number of times each clinical measurement has been taken, without considering the actual values, is the representation that results in the highest predictive performance for the most common learning algorithms.
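To make the notion of concept hierarchies concrete, the following minimal Python sketch expands a code into its more general levels. It is an illustration only, not the procedure used in the studies discussed here. It assumes the standard ATC level lengths (1, 3, 4, 5 and 7 characters) and treats an ICD-10 code as a three-character category plus the full code; ICD-10 chapters and blocks are not derived. The function names and the `ICD:`/`ATC:` prefixes are hypothetical.

```python
def atc_levels(code):
    """Return the prefixes of an ATC code at its five hierarchy levels."""
    cuts = (1, 3, 4, 5, 7)  # character lengths of ATC levels 1 to 5
    return [code[:n] for n in cuts if len(code) >= n]


def icd10_levels(code):
    """Return the 3-character category and the full code, e.g. for 'G44.4'."""
    compact = code.replace(".", "")
    levels = [compact[:3]]
    if len(compact) > 3:
        levels.append(compact)
    return levels


def hierarchy_features(icd_codes, atc_codes):
    """Collect every hierarchy level implied by a patient's codes."""
    features = set()
    for code in icd_codes:
        features.update("ICD:" + level for level in icd10_levels(code))
    for code in atc_codes:
        features.update("ATC:" + level for level in atc_levels(code))
    return features
```

Using only the first one or two entries of each list corresponds to keeping the more general levels, whereas using all entries keeps both general and specific levels.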
However, previous studies have either used a single data type from EHRs or a small number of pre-selected variables from different data types to signal a specific ADE. In this study, we explore if it is beneficial to combine various data types, on a large scale, by using all of the available variables for ADE detection, and also how best to represent them. In addition to detecting specific ADEs, this study aims to explore ways of using structured EHR data that can be exploited to detect a wide range of ADEs, which could be adopted in a general decision support system that alerts for potential ADEs.
In this study, we investigated the use of various data types in EHRs for drug safety surveillance. Here, we focused on using the structured data to build predictive models using machine learning based methods. Clinical measurements, diagnoses and drugs were extracted from a real EHR database. Besides the known problems of EHR data such as high dimensionality and sparsity, these data types have their own characteristics and hence lead to different challenges when fitting them into predictive models. For example, some clinical events here might be observed multiple times for one patient, while some might not be observed at all. Therefore, a series of experiments were conducted to explore the use of these heterogeneous data types separately and together when predicting ADEs with machine learning based methods: first, different representations of each data type were compared and the best representation of the corresponding data type was selected for merging with the other data types, i.e., to form a fused feature set, which was compared to using each data type separately; second, to reduce the high dimensionality and sparsity, feature selection was conducted on both the separate data types and the fused feature set, which also allowed for an in-depth feature analysis; and finally, various commonly used learning algorithms were applied and compared for the classification task.
Data was extracted from a Swedish EHR database, the Stockholm EPR Corpus (this research has been approved by the Regional Ethical Review Board in Stockholm with permission number 2012/834-31/5). This database contains health records of around 700,000 patients from 2009 to 2010, which were obtained from Karolinska University Hospital in Stockholm, Sweden. Here, large amounts of diagnosis information, drug administrations, clinical measurements, lab tests and free-text clinical notes from anonymized health records are available for research. In this study, we only extracted the structured data, i.e., diagnoses, drugs and clinical measurements.
In the Stockholm EPR Corpus, diagnoses are encoded with codes from the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), some of which indicate ADEs, e.g., G44.4 (drug-induced headache). To create training data for building machine learning models, we used these ADE-related diagnosis codes as class labels. The population is hence divided into patients who have been assigned an ADE-related diagnosis code and those who have not. In a study on the use of ICD-10 codes for ADE reporting, the ADE-related diagnosis codes were divided into categories according to the strength of their indication for ADEs; categories A.1 (a drug-related causation was noted in the diagnosis code) and A.2 (a drug- or other substance-related causation was noted in the diagnosis code) were used in this study, as they indicate the most certain causal drug-diagnosis relationship compared to the other categories.
The 27 selected ADE-related diagnosis codes.
Secondary sideroblastic anemia due to drugs and toxins
Drug-induced adrenocortical insufficiency
Mental and behavioural disorders (MBDs) due to use of opioids: acute intoxication
MBDs due to use of opioids: dependence syndrome
MBDs due to use of sedatives or hypnotics: acute intoxication
MBDs due to use of sedatives or hypnotics: dependence syndrome
MBDs due to use of other stimulants, including caffeine: acute intoxication
MBDs due to use of other stimulants, including caffeine: harmful use
MBDs due to use of other stimulants, including caffeine: dependence syndrome
MBDs due to multiple drug use: acute intoxication
MBDs due to multiple drug use: dependence syndrome
MBDs due to multiple drug use: unspecified mental and behavioural disorder
Drug-induced headache, not elsewhere classified
Cardiomyopathy due to drugs and other external agents
Hypotension due to drugs
Generalized skin eruption due to drugs and medicaments
Localized skin eruption due to drugs and medicaments
Maternal care for (suspected) damage to fetus by drugs
Adverse effects: anaphylactic shock, unspecified
Adverse effects: angioneurotic oedema
Adverse effects: allergy, unspecified
Other complications following infusion, transfusion and therapeutic injection
Anaphylactic shock due to correct drug or medicament properly administered
Unspecified adverse effect of drug or medicament
Statistical description of 27 datasets.
Number of features
The main underlying learning algorithm in this study is random forest, which is an ensemble learning method that generates a set of decision trees. Each tree in the forest is built with a bootstrapped sample from the original training examples and each node in the tree only considers a randomly selected subset of the original feature set. The trees carry out the learning task independently from each other and the forest eventually outputs the final result through voting, i.e., averaging the output of all constituent trees. The random forest learning algorithm has become one of the most popular machine learning methods, especially in bioinformatics where data is often high dimensional, as a result of its relatively low computational cost and robust predictive performance.
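The three ingredients described above (bootstrap sampling, a random feature subset per split, and voting) can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the implementation used in the study: each "tree" is a single split on one binary feature rather than a full decision tree, the number of candidate features per split is assumed to be the rounded square root of the feature count, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels, default):
    """Most common label, or `default` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(X, y, rng, n_candidates):
    """Fit a one-split 'tree' using a random subset of binary features."""
    candidates = rng.sample(range(len(X[0])), n_candidates)
    fallback = majority(y, 0)
    best = None
    for f in candidates:
        left = [yi for xi, yi in zip(X, y) if xi[f] == 0]
        right = [yi for xi, yi in zip(X, y) if xi[f] == 1]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or impurity < best[0]:
            best = (impurity, f, majority(left, fallback), majority(right, fallback))
    _, f, pred0, pred1 = best
    return f, pred0, pred1


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples; assumes sqrt(n_features) candidates."""
    rng = random.Random(seed)
    n_candidates = max(1, round(math.sqrt(len(X[0]))))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx],
                                rng, n_candidates))
    return forest


def predict_proba(forest, x):
    """Fraction of trees voting for the positive class."""
    votes = [pred1 if x[f] == 1 else pred0 for f, pred0, pred1 in forest]
    return sum(votes) / len(votes)
```

A full random forest would grow each tree recursively and draw a fresh feature subset at every node; the sketch keeps a single node per tree so that the sampling and voting logic stays visible.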
Evaluation was done through 10-fold cross validation with 10 iterations. The performance metrics used in this study are accuracy and area under the ROC curve (AUC). Accuracy, the most common and perhaps also the most intuitive metric to evaluate the performance of a predictive model, measures the percentage of examples that are predicted correctly. AUC can be used whenever the learning algorithm is able to rank the examples by decreasing probability of being positive. It measures the probability of ranking a randomly chosen positive example ahead of a randomly chosen negative example, and thus summarizes the trade-off between the rate of detecting true signals and the false alarm rate. Compared to accuracy, AUC is sometimes favored because it is not sensitive to changes in the class distribution between training and test data.
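As an illustration of the two metrics, the sketch below computes accuracy from hard predictions and AUC directly from its ranking interpretation, by comparing every positive example with every negative one. It is a minimal sketch under the assumption that labels are coded as 0/1 and that tied scores count as half a win; the function names are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than efficiency.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Probability that a positive example is scored above a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Each (positive, negative) pair contributes 1 for a win and 0.5 for a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, with labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.4, 0.1]`, three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75, while thresholding the scores at 0.5 yields an accuracy of 0.5.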
When more than two models were compared, a Friedman test was employed to test for statistical significance, using the rank of each model on each dataset. To examine pairwise differences between the models, a post-hoc test using the Bergmann-Hommel procedure was applied.
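The rank-based nature of the Friedman test can be illustrated with a short sketch that ranks the models on each dataset and computes the chi-square form of the Friedman statistic from the mean ranks. This is an assumption-laden illustration rather than the exact procedure used here: it omits the tie correction and the p-value (the statistic is approximately chi-square distributed with k - 1 degrees of freedom for k models), and it does not implement the Bergmann-Hommel post-hoc procedure. The function names are hypothetical.

```python
def average_ranks(values, higher_is_better=True):
    """Rank values 1..k (1 = best), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores[d][m] is the performance of model m on dataset d.

    Returns the mean rank of each model and the Friedman chi-square statistic.
    """
    n, k = len(scores), len(scores[0])
    rank_rows = [average_ranks(row) for row in scores]
    mean_ranks = [sum(r[m] for r in rank_rows) / n for m in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2
```

In the setting of this study, each row would hold the accuracy or AUC of the compared models on one of the 27 datasets.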
Using various data types
In the first experiment, different representations of clinical measurements, on the one hand, and diagnoses and drugs on the other (here we consider diagnoses and drugs as one data type, namely clinical codes, as they share the same characteristics), as well as their combination, were compared.
In a previous study, we proposed five representations (listed below) of clinical measurements to handle the problem that each measurement can be observed multiple times for a patient. Here, we re-evaluated the use of these representations, as well as their combination, on a slightly different task.
Mean - the average of the observed values
SD - the standard deviation of the observed values
Slope - the difference between the first and last observation over the time span
Existence - whether or not a measurement has been taken
Count - the number of times a measurement was taken
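The five representations listed above can be sketched as a single function that summarizes all observations of one measurement type for one patient. The sketch makes several assumptions that are not specified in the text: observations are given as (time, value) pairs, SD is the sample standard deviation (set to 0 for a single observation), Slope is the difference between the last and first value divided by the elapsed time, and value-based representations are left as `None` when the measurement was never taken. The function name is hypothetical.

```python
from statistics import mean, stdev


def represent(observations):
    """Summarize (time, value) pairs of one measurement for one patient."""
    if not observations:
        # Measurement never taken: only Existence and Count are defined.
        return {"mean": None, "sd": None, "slope": None,
                "existence": 0, "count": 0}
    obs = sorted(observations)  # order by time
    values = [v for _, v in obs]
    (t_first, v_first), (t_last, v_last) = obs[0], obs[-1]
    span = t_last - t_first
    return {
        "mean": mean(values),
        "sd": stdev(values) if len(values) > 1 else 0.0,
        "slope": (v_last - v_first) / span if span else 0.0,
        "existence": 1,
        "count": len(values),
    }
```

Combining representations then amounts to concatenating the selected entries of this dictionary for every measurement type into one feature vector per patient.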
After investigating representations of clinical measurements and clinical codes separately, we combined them using their respective best observed representation. It has previously been shown that, when an ensemble model is employed, building the model from a fused set of data types is preferable to fusing ensemble models built from the individual data types; we therefore combined the two data types by fusing them into one feature set before applying the random forest algorithm. The predictive performance of random forests using clinical measurements, clinical codes and a combination of the two was compared.
In this study, we explored the impact of feature selection on the predictive performance of the random forest algorithm with a set of thresholds starting from the top 10% of available features ranked according to their information gain scores and subsequently adding an extra 10% until the full feature set is included.
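The thresholding scheme described above can be sketched as follows: features are ranked by information gain with respect to the class label, and the top 10%, 20%, and so on up to 100% are retained in turn. The sketch assumes discrete feature values (for example, binary existence indicators); continuous features would first need to be discretized, and how that was handled in the study is not stated here. The function names are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def information_gain(column, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(column):
        subset = [y for x, y in zip(column, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def top_features(X, y, percent):
    """Indices of the top `percent` of features ranked by information gain."""
    n_features = len(X[0])
    gains = [information_gain([row[f] for row in X], y)
             for f in range(n_features)]
    ranked = sorted(range(n_features), key=lambda f: gains[f], reverse=True)
    keep = max(1, round(n_features * percent / 100))
    return ranked[:keep]


def selection_schedule(X, y):
    """Feature subsets for thresholds 10%, 20%, ..., 100%."""
    return {p: top_features(X, y, p) for p in range(10, 101, 10)}
```

Each subset returned by `selection_schedule` would then be used to train and evaluate a separate model, so that predictive performance can be compared across thresholds.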
Using various learning algorithms
Learning algorithms and their default settings.
CART decision tree - minimum 1 instance per leaf
Support Vector Machine - polynomial kernel of degree 3
Support Vector Machine - RBF kernel, gamma = 0.0
k nearest neighbors - k = 5
Decision trees, 50 base estimators
Bagging using CART tree - 10 base estimators
500 trees, inspected features =
In this section, we report on the predictive performance, in terms of accuracy and AUC, of models generated with the random forest algorithm that was provided with various representations of 27 clinical datasets, each one containing a different data type (clinical codes and measurements) and representation, as well as combinations of these - with and without feature selection. We present both results from individual datasets, as well as summary results, averaged over datasets. An in-depth feature analysis is moreover conducted and, finally, results from using various learning algorithms are summarized.
Using various data types
Comparing multiple representations of clinical measurements.
Comparing different levels of clinical codes.
Comparing random forests using clinical measurements (M), clinical codes (C) and their combination (M+C).
Using the most informative features
Using various learning algorithms
This study investigated the use of various types of structured EHR data - clinical measurements and clinical codes - both in isolation and in combination, to build machine learning models for ADE detection. The results show that using clinical codes alone, or together with clinical measurements, leads to significantly improved predictive performance compared to using only clinical measurements. In addition, feature selection based on information gain was conducted to remove relatively less informative variables, which also enables a deeper inspection of the informativeness of each data type and representation.
We evaluated different representations of clinical measurements and clinical codes using methods proposed in our previous studies, and slightly different results are observed here. In the previous study that explored the possibility of exploiting the concept hierarchies of clinical codes, it was demonstrated that using only the more general levels of the codes was sufficient to maintain the predictive performance at a high level; in this study, however, we observed that using all levels of the codes, including both the general and the more specific levels, yields the best predictive performance. A possible explanation for this is that the tasks in the two studies are different: in the previous study, the task was to distinguish patients with a specific ADE from randomly selected patients without the ADE; in this study, the task was to distinguish patients with a specific ADE from patients with a similar disease to the ADE. The latter is a much more difficult task than the former, as the positive and negative examples are more similar. It is thus not surprising that, in this task, more specific levels of codes are needed to improve the predictive performance. In the study that investigated various representations of clinical measurements, the model with a combination of multiple representations outperformed the ones with any single representation, which is consistent with the observation in this study; however, the ranking of the single representations is inconsistent between the two studies: Mean is the best in this study, while Count was the best in the previous one. This discrepancy might be due to slightly different settings of the tasks in the two studies. In the previous study, the task was also to distinguish patients with a specific ADE from patients with similar diseases to the ADE, but this was achieved by retrospectively analyzing the entire available patient history in the EHRs, i.e., clinical events that occurred after the target ADE were included in the predictive models; in this study, the task was instead designed for detecting ADEs at the point of care, which means that only the clinical events that occurred prior to the target ADE were allowed to be exploited in the predictive models.
Combining clinical measurements and clinical codes does not yield better predictive performance than using only clinical codes. In order to understand the reasons for this observation, we looked at the number of features selected from each data type and their corresponding relative informativeness by ranking features based on their information gain. In general, most of the selected features are clinical codes. This is partly an artifact of there being more codes than measurements in the feature set, but even when only the top 10% of features are selected, the majority of the top-ranked features are clinical codes. Since looking only at the quantity is not fair in this case, we instead inspected the relative informativeness, adjusted by the number of features, of codes and measurements. It turned out that clinical codes were consistently more informative than clinical measurements. Although using only clinical measurements yields predictive performance that is no worse than random guessing (average accuracy of 81.41% and AUC of 0.655), adding them to clinical codes does not seem to improve the predictive performance compared to using codes alone. This can partly be explained by how each tree is built in the random forest: the algorithm selects the most informative feature from a random subset of features as the node to split on when building each tree. In this case, clinical measurements are less likely to be selected as they are inferior to clinical codes in terms of both quantity and quality. As a result, they contribute very little when used in conjunction with clinical codes.
Besides the random forest algorithm, we also employed several other common learning algorithms. AdaBoost, Bagging and the decision tree show results similar to those of the random forest algorithm, while for the other learning algorithms, which are neither tree-based nor ensemble models, the results deviate from this pattern. For example, logistic regression favors the combination of clinical codes and measurements when no feature selection is conducted; a support vector machine with the RBF kernel using clinical measurements yields better predictive performance when only part of the features are selected; and the k nearest neighbor algorithm always achieves better performance by using clinical measurements alone. Moreover, feature selection has a different impact on these learning algorithms, which is largely consistent with what is known about their sensitivity to high dimensionality, e.g., adding feature selection clearly improves the predictive performance of the k nearest neighbor algorithm. It is worth noting that, among all of the investigated learning algorithms, the random forest classifier consistently outperforms the others for this task, which again supports its robustness in handling high-dimensional data.
In addition to the averaged results over the 27 datasets, we also presented results for each individual dataset. For most datasets, using only clinical measurements results in the worst performance; however, if we look at the results for accuracy, for some datasets, such as G251, F132 and L270, opposite results are observed; for the AUC results, we can see that for datasets D642, E273, F199, L270, T783, T784, T886 and T887, using a combination of clinical measurements and codes outperforms the others. These diverse results can perhaps be explained by the different nature of each ADE. For example, to detect D642 (drug induced anemia), using clinical codes only is probably not sufficient since such a diagnosis is often made after observing results from blood tests; to detect ADEs starting with F (mental and behavioural disorders), it is less likely that using clinical measurements is helpful, whereas clinical notes, in this case, might contain much more valuable information than the structured data.
Challenges of using electronic health records for adverse drug event detection
Although EHRs are increasingly considered a valuable resource for pharmacovigilance and machine learning based methods are often favored over other methods when analyzing large amounts of data from EHRs, it is difficult, with such purely data-driven methods, to distinguish clinically relevant signals from systematic biases in the data. Therefore, machine learning methods should serve primarily as tools for exploring the massive amounts of data and testing hypotheses; ultimately, human knowledge and experience are still necessary to evaluate the validity of the findings.
In addition to the challenges that have already been discussed in the background section, EHR data is also very noisy. On the one hand, the quality of the diagnosis encoding varies according to the experience and expertise of coders, making it difficult for data analysts to assess the validity and reliability of the reported events. According to a review by the Swedish National Board of Health and Welfare, around 20% of the assigned primary diagnosis codes were found to be erroneous. On the other hand, clinical codes can be influenced by various factors, such as the knowledge and experience of the clinicians, the amount of information available at admission and strategic billing, rendering the choice of reported codes biased. In such situations, when the codes are used to label the training data, we should proceed with caution as they cannot entirely be considered a gold standard. One expensive alternative here is to involve experts in reviewing training data and correcting incorrect labels.
Limitations and future work
One limitation of this study is that the labels in the training data are directly extracted from the EHR database without being scrutinized by clinical experts. This could lead to findings that do not entirely reflect reality. Moreover, both clinical codes and measurements are represented in certain ways in this study, and hence the results and findings are limited only to these representations. It is, for instance, conceivable that, with better representations, clinical measurements would be as informative as clinical codes for detecting ADEs. Therefore, in future work, representations that can further improve the informativeness of clinical measurements should be explored. This study only included two types of data, codes and measurements, from EHRs. A natural extension would thus be to include more data types, such as lab tests and notes.
We have here demonstrated how machine learning can be employed to analyze structured data in electronic health records for the purpose of supporting pharmacovigilance activities such as detecting adverse drug events. Predictive models learned from electronic health records could be incorporated into adverse drug event alerting systems at the point of care, primarily facilitating the correct encoding of adverse drug events, which, in turn, would address the problem of under-reporting of adverse drug events and lead to more reliable statistics. To create high-performing predictive models, it is essential to pay careful attention to which data to use and how to best represent it, especially so when faced with high-dimensional and extremely sparse data. We have here presented a detailed study and proposed solutions to the said challenges, focusing on two groups of data: measurements and clinical codes that encode drugs and diagnoses.
Within each data type, it is advantageous to combine multiple representations, effectively providing a more holistic view of the data. Across data types, providing all representations of each data type leads to improved predictive performance for some learning algorithms, while for the best-performing learning algorithm - random forest - this is beneficial in certain cases only, i.e., for specific adverse drug events. Generally speaking, clinical codes are more informative than measurements for the purpose of detecting adverse drug events, and it is not necessary in general to add measurements to clinical codes. Selecting a subset of the most informative features can, to some extent, lead to improved predictive performance, even with learning algorithms that are considered to effectively handle high-dimensional data.
This work was partly supported by the project High-Performance Data Mining for Drug Effect Detection at Stockholm University, funded by Swedish Foundation for Strategic Research under grant IIS11-0053.
Publication costs for this article were funded by the project High-Performance Data Mining for Drug Effect Detection at Stockholm University.
This article has been published as part of BMC Medical Informatics and Decision Making Volume 15 Supplement 4, 2015: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2014): Medical Informatics and Decision Making. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcmedinformdecismak/supplements/15/S4.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.