Skip to main content

Evaluation of the informatician perspective: determining types of research papers preferred by clinicians



To deliver evidence-based medicine, clinicians often reference resources that are useful to their respective medical practices. Owing to their busy schedules, however, clinicians typically find it challenging to locate these relevant resources out of the rapidly growing number of journals and articles currently being published. The literature-recommender system may provide a possible solution to this issue if the individual needs of clinicians can be identified and applied.


We thus collected from the CiteULike website a sample of 96 clinicians and 6,221 scientific articles that they read. We examined the journal distributions, publication types, reading times, and geographic locations. We then compared the distributions of MeSH terms associated with these articles with those of randomly sampled MEDLINE articles using two-sample Z-test and multiple comparison correction, in order to identify the important topics relevant to clinicians.


We determined that the sampled clinicians followed the latest literature in a timely manner and read papers that are considered landmarks in medical research history. They preferred to read scientific discoveries from human experiments instead of molecular-, cellular- or animal-model-based experiments. Furthermore, the country of publication may impact reading preferences, particularly for clinicians from Egypt, India, Norway, Senegal, and South Africa.


These findings provide useful guidance for developing personalized literature-recommender systems for clinicians.


Medicine is a field that is continually changing as knowledge of disease and health continues to advance. In the age of translational medicine, clinicians constantly face challenges in transforming scientific evidence into ordinary clinical practice. Peer-reviewed scientific literature is a major, minimally biased resource for scientists and other researchers to communicate their discoveries based on experiments and their findings from rigorously implemented trials and thoughtfully balanced clinical guidelines. Thus, remaining current on medical literature can help clinicians provide the best evidence-based care for their patients [1].

Nevertheless, a widely held belief is that most clinicians rarely read scientific literature on account of their hectic schedules evaluating patients and preparing the required paperwork. According to a survey by Saint et al. US internists reported that they read medical journals for an average of 4.4 h per week in 2000 [2]. Analysis of the web log of 55,000 Australian clinicians by Westbrook et al. revealed an average of 2.32 online literature accesses per clinician per month in a period from October 2000 to February 2001 [3]. Moreover, a 2007 survey report by Tenopir et al. reported that 666 pediatrician participants spent 49 to 61 h per year (equivalent to 0.94 to 1.2 h per week) reading journal articles [4]. In the same year, McKibbon and colleagues showed that primary care physicians not affiliated with an academic medical center in the US accessed an average of one online journal article per month, while specialists not affiliated with an academic medical center accessed an average of 1.9 online journal articles each month [5].

The time constraint of clinicians is the most commonly suggested reason for the small number of articles read. Moreover, access to journals and publication databases is sometimes limited to clinicians with academic affiliations, which can subsequently limit the number of articles accessed and read. Another consideration is whether scientific journal reading comprised everyday practice during training. The majority of clinicians presently in practice were trained in the pre-digital era and likely did not learn the literature-searching skills necessary to keep updated [6,7,8,9]. Furthermore, they may not have the advanced technical knowledge (e.g., statistical modeling) often required to understand and apply the findings in scientific articles to clinical practice [7, 10, 11].

Moreover, the aims of clinicians by reading are often discordant with the goals of researchers in their publishing endeavors. A clinician usually seeks information that is relevant to his/her respective medical practice, whereas many researchers do not emphasize the clinical relevance of their work to an extent that is sufficient to address the clinician’s needs [8, 10, 12]. Researchers often use specialized and nuanced language to describe complex medical discoveries in scientific literature. Such language can be opaque for practicing clinicians. In addition, clinicians may be frustrated by many discrepancies and knowledge gaps in the literature [13,14,15,16] because they require conclusive and actionable information in practice.

To address this issue, some researchers have strived to improve the user experience with literature search engines. They have added the functions of sorting and clustering search results, as well as extracting and displaying semantic relations among concepts [17]. In addition, considerable research efforts have been made in building intelligent recommender systems that automatically recommend literature to users by employing content-, collaborative-, or graph-based filtering methods [18,19,20,21,22]. Although these studies differed in the methods used, they all targeted scientists and other researchers for whom literature review is an integral part of their daily work.

Clinicians, on the other hand, who search for and read scientific literature, are transcending their daily routine to satisfy their intellectual curiosity, increase their awareness of the latest scientific advancements, and obtain knowledge for treating their patients. The distinction in this regard between the needs of clinicians versus professional researchers has inspired us to investigate the specific needs of clinicians and their preferences for reading literature. Our goal is to learn from clinicians what types of medical research papers they prefer to read and their modes of accessing and assimilating the knowledge.

In this study, we identified a group of clinicians and the scientific papers they were likely to read from the website. We investigated what type of job they perform, in what specialty they practice, in which country they live and work, the length of time they have practiced, and what scientific research was interesting to them. This type of systematic examination of clinician reading libraries is an important pre-market evaluation for developing a personalized literature-recommender system to improve clinician reading experiences and overall promote the reading of scientific literature.


The workflow of the entire study is illustrated in Fig. 1 with each step further described below.

Fig. 1

A workflow to determining types of research papers preferred by clinicians

Data collection

We employed [23] to identify a sample of clinicians who read scientific literature. Since 2004, has provided free online reference management services with the goal of promoting the sharing of scientific references and fostering communication among researchers. It enables registered users to add publications they like to their own libraries and identify their research fields in profiles. The system then groups users of the same research field.

Moreover, the libraries and basic information of users are openly shared on the website, making it an ideal data source choice for our study. We selected clinical medicine in the primary research field and retrieved 2,472 users on May 1, 2014. We manually verified whether each of these users is a clinician based on the combination of name, location, job title and affiliation information provided in the profile. In this process, we eliminated 91.7% of the users because significant amounts of information were missing from their profiles and we could not confirm that they were clinicians.

We further excluded 109 users who had fewer than five articles in their libraries because these users are less likely to be active users on CiteULike. Including them could complicate the analysis and results interpretation. After filtering according to these stringent selection criteria, our final sample of reading clinicians was comprised of 96 individuals, who claimed to be clinicians and had relatively complete information in their profiles. These 96 users cited 8,511 articles in their CiteULike libraries. For those articles with PubMed IDs (PMIDs), the unique identifier used by the PubMed search engine to access the MEDLINE bibliographic database of life sciences, we retrieved the complete abstracts in MEDLINE format. For the remainder, we searched PubMed using the article title and retrieved the publication if there was only one returned hit, which retrieved the exact match of the article most of the time, but not always. In this process, 2,290 articles (26.9%) could not be retrieved on account of missing bibliographic information, article exclusion from MEDLINE, or difficulty with the title search.

For example, some users did not list the complete article titles and other valid bibliographic information. Instead, they listed keywords identifiable only to themselves, such as sharing and managing data, aging and the brain, and public health and Web 2.0. These keywords were too general for PubMed to precisely locate corresponding articles. Other articles, such as “The Dos and Don’ts of PowerPoint Presentations” and “Inside Microsoft SQL Server 7.0 (Mps),” were not indexed by MEDLINE. Meanwhile, some articles were indexed in MEDLINE but could not be retrieved using the title search field in PubMed. We ultimately identified 6,221 publications that were cited in the user libraries, and we employed them in further analysis.

Content analysis by MeSH

Medical Subject Headings (MeSH) is used by the National Library of Medicine (NLM) to annotate biomedical concepts and supporting facts addressed in each article indexed in the MEDLINE bibliographic database for information-retrieval purposes [24]. Each MEDLINE article is assigned two types of MeSH terms: major terms, which represent the main topics of the article, and minor terms, which represent the concepts and facts that are related to the experimental subjects and design attributes, such as humans or animals, adults or children, gender, and countries. MeSH annotations have been used in text mining and data mining tasks [25,26,27] and provided unique and key information on the topic and content preferences of the sampled clinicians in our study. All MeSH terms were included in the MEDLINE format abstracts that we downloaded.

We were eager to understand what content and topics, if different, that a dummy tool, without prior knowledge of its clinician users, might recommend. Therefore, we randomly selected 6,000 publications by sampling from PMIDs. We compared the MeSH terms from the real clinician reading libraries to the randomly sampled MEDLINE articles. The MeSH terms over-represented in the real clinician reading libraries were expected to suggest contents and topics that were more relevant to the clinicians. This information can provide useful clues for designing a personalized literature-recommender system that targets clinicians. In this experiment, we wrote Python scripts to extract and count the MeSH terms indexed in each article.

Two-sample Z-test

The two-sample Z-test is often used for validating whether there is a significant difference between two groups based on a single categorical attribute. For example, it can validate whether there are more female vegetarians than male [28]. We chose this statistical test for our study because we intended to learn what contents and topics (in MeSH terms) are more interesting to clinicians. Our null hypothesis was that the frequency of MeSH term t in the clinician reading libraries was identical to that in the randomly sampled articles recommended by the dummy tool (H 0 : f t,clinician = f t,random). The frequency was therein defined as the number of articles that were assigned the term t divided by the total number of articles in that group. We calculated the z-scores and two-side p-values in Excel to validate the null hypothesis.

Multiple comparison correction

We considered the multiple comparison problem and avoided p-values that became ‘significant’ because of random effects [29] when performing Z-test for thousands of MeSH terms. We conducted the Benjamini-Hochberg false discovery rate (FDR) controlling procedure [30] in Excel to set more stringent p-value thresholds instead of 0.05.


We first examined the professional backgrounds of the 96 sampled clinicians. Of these clinicians, 58 were practicing doctors (60.4%), 19 were medical school faculty members (19.8%), 12 were medical doctors in atypical career paths, such as managerial/consulting positions in various healthcare organizations (12.5%), five were students advancing in post-graduate medical studies (5.2%), and two were practicing nurses (2.0%). In addition, we evaluated the time lengths of the clinicians’ medical practices. For each clinician, we calculated the number of years since he/she graduated from the professional school. The number of years range from 1 to 45, with an average of 16.0 (Fig. 2a).

Fig. 2

Demographic information for the sampled clinicians. a histogram of clinician practicing years after medical school graduation; b distribution of specialties; c distribution of countries of residence

In terms of specialty, 30 clinicians specialized in the internal medicine (31.3%), 12 in the surgery (12.5%), 6 in the pediatrics (6.3%), 5 in the psychiatry (5.2%), 25 in other medical specialty areas (26.0%, individual specialties with the number of clinicians in each were listed in Fig. 2b), 11 not actively seeing patients (11.5%) and 7 in an undisclosed specialty (7.3%). Geographically (Fig. 2c), 33 clinicians resided in the United States (34.4%), 16 in the United Kingdom (16.7%), five in the Germany (5.2%), five in the India (5.2%), four in the France (4.2%) and the remaining 23 clinicians were in other countries (24.0%, see Additional file 1: Table S1).

We then examined the clinician reading libraries. First, we plotted a histogram of the publication years for all 6,221 publications (Fig. 3a). Of the articles read by clinicians, 89.9% are published after 2000, with the peak centering between 2008 and 2010. In both 2013 and 2014, a significant decrease occurs. The 2013 decrease in the number of articles read by clinicians may be due to the fact that many users are no longer active on this website, whereas for 2014, we collected the data from CiteULike in May.

Fig. 3

Temporal analysis of articles read by clinicians. a histogram of articles published each year; b histogram of age of articles when being read by clinicians (age = year read - publication year)

To examine how soon after an article is published that a clinician reads it, we plotted a histogram for the age of the article at the reading time, which is defined by the year when the article was read by a clinician minus its publication year. Figure 3b shows that articles published and read by clinicians in the same year are the highest, and a steady decreasing trend is evident when the article age increases. This result is strong evidence that clinicians in fact read the latest publications.

Interestingly, nine clinicians additionally read 51 papers published more than 30 years ago. These papers were published on journals with an average impact factor of 10.84 and have been cited for an average of 573.88 times according to Google Scholar, and thus can be considered as landmark articles in medical research. For example, “Studies of Illness in the Aged. Index of ADL: A Standardized Measure of Biological and Psychological Function” and “Functional Evaluation: The Barthel Index” were published in 1963 in the Journal of the American Medical Association (JAMA) and the Maryland State Medical Journal, with a Google Scholar citation of more than 7,000 and 9,400 times. These are the original publications of the most appropriate and extensively adopted measurement for evaluating functional status in the elderly population.

We summarized the types of publications for 6,221 articles and found that the majority are journal articles, including original research articles (3,698 or 59.4%), reviews (1,093 or 17.6%), reports of clinical trials (508 or 8.2%), case reports (259 or 4.2%), evaluation and validation studies (185 or 3.0%), comments (147 or 2.4%), and clinical guidelines (29 or 0.5%). The remaining 4.9% of articles belong to opinion and announcement categories, such as letters, editorials, and breaking news (see Additional file 1: Table S2).

In addition, we investigated what journals are read most often by clinicians and found that 6,221 articles are widely distributed among 1,664 journals. Nearly 50% (823) of the journals are cited only once in the clinician reading libraries, and 53 journals are cited 20 or more times (Table 1). Such a sparse distribution of journals suggests a need to further evaluate the impact of journals in a reliable recommender system model.

Table 1 Top journals read by the sampled clinicians and article count

We later evaluated what journals that the clinicians of different specialty groups read (see Additional file 1: Table S3). The results indicate that prestige was not the most important factor when different specialty groups choose what scientific journals to read. Specialists tend to read journals closely related to their practice fields, rather than medical journals with high impact factors that target a broader readership. For instance, Arthroscopy: The Journal of Arthroscopic & Related Surgery is the most widely read journal among surgeons, while The Lancet ranks only in 108th place in their reading.

To determine whether the country of residence and language or culture of practice can affect the clinician reading preferences, we analyzed the association between the clinician and author countries of residence for the articles in their libraries. If an article has authors from different countries, we used the first author’s country of residence. In the heat map of Fig. 4, each row represents a country of residence for the clinicians; each column represents the country of residence for the authors. The cell color changes horizontally from green (minimum number of articles) to red (maximum number of articles). According to the findings, articles written by American and British authors are extensively read by many clinicians in our sample, given the fact that these two countries publish a considerable amount of medical research. However, clinicians residing in Egypt, India, Norway, Senegal, and South Africa prefer works by authors of their own countries.

Fig. 4

Clinician country of residence versus author country of residence in the reading libraries. Each row represents a country of residence of the sampled clinicians; each column represents the country of residence of the authors of the cited articles. The cell color changes from green (minimal count) to red (maximal count) for each row

We performed a comprehensive statistical analysis to examine whether the topics of articles read by the sampled clinicians, in MeSH terms, differed from those recommended by the dummy tool without prior knowledge of the users. We found that 119 major MeSH terms and 288 minor MeSH terms have significantly different occurrence frequencies in the two groups (see Additional file 1: Table S4 and S5). Among the MeSH terms with the highest frequency variations in the two groups (Table 2), clear distinctions exist. Clinicians read more topics relating to patient issues and needs, such as pain, hip joints, drug therapy, surgery, arthroscopy, and therapeutic uses and adverse effects of analgesics. They prefer meta-analyses, reviews of literature, and quality of life research. Moreover, they are interested in research on human subjects, instead of molecule-, cell-, or animal-based studies, likely because human-based research is more relevant to treating their patients.

Table 2 MeSH term comparison between clinician reading libraries and a random sample


The proliferation of scientific publications and reduction of clinician reading times warrants the need for a method of enabling clinicians to quickly identify the latest results from scientific publications to more effectively practice evidence-based medicine. In the past decade, many clinicians have employed digital resources for the most recent and relevant findings and guidelines. Previous studies show that 60 to 70% of US clinicians access the Internet for professional purposes, and searching for literature in journals and databases is one of their most frequent online activities [31]. However, an overwhelming amount of information, coupled with the inadequate search skills of readers, remain major obstacles for clinicians to access research literature.

In recent years, several literature reading applications have been developed for clinicians to access research articles on smartphones and tablets [32], so that fragmented time between patient care could be better utilized. Some of them provided paper recommendation. For example, the Read by QxMD [33] suggests the latest research papers based on the users’ specialties and their choice of key words and journals. Docphin [34] tracks new and landmark papers related to the medical topics and authors specified by users. The mobile application offered by [35] populates users’ reading list with articles picked by a board of medical experts. None of these implementations go beyond keyword-based recommendations and are not fully adopted by clinicians. A truly personalized literature-recommender system that alleviates the obstacles for clinician to access and read research literature requires more cognitive study to understand their reading preference and habits, which motivates us to carry out this work.

Our study advanced previous research [2,3,4,5, 7,8,9, 31, 36] by determining clinician reading preferences based on bibliographic and content aspects. We employed CiteULike, a contemporary data source, to identify the reading materials that are favored by the site’s clinician users. The publically available user profiles and reading library information on the website makes it more desirable for this study than other websites such as Mendeley. Moreover, compared to the widely used methods in previous studies [9] such as interviews, surveys, and tracking library access of limited samples within an institution, CiteULike offers two advantages. First, it is a non-invasive collection; the unnecessary response bias that is common in interview- or survey-based studies is avoided. Secondly, the sampled users on CiteULike represent clinicians from far more diverse geographic locations and medical specialties than in a specific hospital or institution.

In this study, we determined that research articles published in peer-reviewed journals are the most highly valued type; moreover, articles are usually read within the first 1 or 2 years after the publication date. Landmark papers in medical research history are also a significant category. Reviews, reports of clinical trials, meta-analysis studies, and case reports are likewise well represented across specialties and countries of practice.

In selection of the reading material, whether a paper is published in a prestigious journal with a high impact factor carries less weight than the topic of the paper and experimental design. The country of publication apparently also plays a role in reading preference. For example, some readers from Egypt, India, Norway, Senegal, and South Africa seem to prefer works by authors from their own countries (Fig. 4).

In content analysis, we determined that patient-oriented topics, meta-analyses, literature reviews, studies involving human subjects, and quality of life research are significantly more prevalent in clinician reading choices than in the overall publications.

These results provide important insights for designing a personalized literature recommender system that would be more welcome by clinicians, who are eager to learn the latest scientific discoveries relevant to patient care. For example, the system would possibly recommend not only latest research articles published in peer-reviewed journals, but also some landmark research works. The language and culture background, together with significantly over-represented MeSH terms, could be used in conjunction with the specialty information, so that recommendation could be optimized for different user groups.

However, the findings of this study must be considered in the context of the following limitations. First, the sample size of clinicians was small. The distribution of their demographic and professional attributes may not represent the entire clinician population. Secondly, we learned the paper reading preference of the sampled clinicians from articles cited in their CiteULike libraries. We assumed that the clinicians have read those articles in their libraries, which might not be true all the time. Thirdly, we randomly selected 6,000 papers from the MEDLINE bibliographic database based on PMID, and we used these papers to represent the possible recommendations by a dummy tool. This collection is not an ideal one for a comparison because it can include papers that clinicians may like to read. Consequently, we may have missed meaningful MeSH terms because the difference between groups was not statistically significant. In other words, we traded recall for precision when identifying the relevant MeSH terms. Finally, the content analysis was limited to user demographics and the bibliographic features documented by MEDLINE and CiteULike, such as practice specialty, publication year, and MeSH terms. On the other hand, analysis based on full-text articles is expected to provide a more comprehensive understanding of clinician preferences. Nevertheless, such a study would demand advanced text mining and natural language processing technologies.


Despite the limitations of the present study, our findings on clinician reading preferences can serve as useful information for developing a personalized literature-recommender system for clinicians who work at the front-line of patient care. In the future, further research and development should be performed in this area so that clinicians can more effectively and conveniently access the most relevant scientific results. In addition, connecting clinicians and researchers for collaborations through a publication-based social network is another interesting aspect to be explored. Existing social network sites such as ResearchGate and have attracted a great number of scientists, but an online research community connecting researchers and clinicians is not available yet. Such a social network can improve the communication and collaboration between clinicians and medical scientists so that scientific breakthroughs can be applied to clinical settings faster, while medical scientists can more effectively learn and focus on patient-relevant problems.



Medical subject headings


National Library of Medicine


The unique identifier number for each article entered into PubMed System


  1. 1.

    English RA, Lebovitz Y, Giffin RB. Transforming clinical research in the United States: challenges and opportunities: workshop summary. Washington DC: National Academies Press (US); 2010.

    Google Scholar 

  2. 2.

    Saint S, Christakis DA, Saha S, Elmore JG, Welsh DE, Baker P, et al. Journal reading habits of internists. J Gen Intern Med. 2000;15:881–4.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  3. 3.

    Westbrook JI, Gosling AS, Coiera E. Do Clinicians Use Online Evidence to Support Patient Care? A Study of 55,000 Clinicians. J Am Med Inform Assoc. 2004;11(2):113–20. doi:10.1197/jamia.M1385.

    Article  PubMed  PubMed Central  Google Scholar 

  4. 4.

    Tenopir C, King DW, Clarke MT, Na K, Zhou X. Journal reading patterns and preferences of pediatricians. J Med Libr Assoc. 2007;95(1):56–63.

    PubMed  PubMed Central  Google Scholar 

  5. 5.

    McKibbon KA, Haynes RB, McKinlay RJ, Lokker C. Which journals do primary care physicians and specialists access from an online service? J Med Libr Assoc. 2007;95(3):246–54. doi:10.3163/1536-5050.95.3.246.

    Article  PubMed  PubMed Central  Google Scholar 

  6. 6.

    Neale T. How doctors can stay up to date with current medical information. MedPage Today. 2009. Accessed 1 Aug 2014.

  7. 7.

    Majid S, Foo S, Luyt B, Zhang X, Theng Y-L, Chang Y-K, et al. Adopting evidence-based practice in clinical decision making: nurses’ perceptions, knowledge, and barriers. J Med Libr Assoc. 2011;99(3):229–36.

    Article  PubMed  PubMed Central  Google Scholar 

  8. 8.

    Basow D. Use of evidence-based resources by clinicians improves patient outcomes. Minneapolis: Wolters Kluwer Health,; 2010. Accessed 6 Apr 2014.

    Google Scholar 

  9. 9.

    Clarke MA, Belden JL, Koopman RJ, Steege LM, Moore JL, Canfield SM, et al. Information needs and information-seeking behaviour analysis of primary care physicians and nurses: a literature review. Health Info Libr J. 2013;30(3):178–90. doi:10.1111/hir.12036.

    Article  PubMed  Google Scholar 

  10. 10.

    Barraclough K. Why doctors don’t read research papers. BMJ. 2004;329(7479):1411. doi:10.1136/bmj.329.7479.1411-a.

    Article  PubMed Central  Google Scholar 

  11. 11.

    Homer-Vanniasinkam S, Tsui J. The continuing challenges of translational research: clinician-scientists’ perspective. Cardiol Res Pract. 2012;2012:246710. doi:10.1155/2012/246710.

    Article  PubMed  PubMed Central  Google Scholar 

  12. 12.

    O’Donnell M. Why doctors don’t read research papers. BMJ. 2005;330(7485):256. doi:10.1136/bmj.330.7485.256-a.

    Article  PubMed  PubMed Central  Google Scholar 

  13. 13.

    Becker JE, Krumholz HM, Ben-Josef G, Ross JS. Reporting of results in and high-impact journals. J Am Med Assoc. 2014;311(10):1063–5. doi:10.1001/jama.2013.285634.

    CAS  Article  Google Scholar 

  14. 14.

    Hoenselaar R. Saturated fat and cardiovascular disease: the discrepancy between the scientific literature and dietary advice. Nutrition. 2012;28(2):118–23.

    Article  PubMed  Google Scholar 

  15. 15.

    Sherwin BB. The critical period hypothesis: can it explain discrepancies in the oestrogen-cognition literature? J Neuroendocrinol. 2007;19(2):77–81. doi:10.1111/j.1365-2826.2006.01508.x.

    CAS  Article  PubMed  Google Scholar 

  16. 16.

    Glasziou P, Altman DG, Bossuyt P, Boutron I, Clarke M, Julious S, et al. Reducing waste from incomplete or unusable reports of biomedical research. Lancet. 2014;383(9913):267–76. doi:10.1016/S0140-6736(13)62228-X.

    Article  PubMed  Google Scholar 

  17. 17.

    Lu Z. PubMed and beyond: a survey of web tools for searching biomedical literature. J Biol Databases Curation. 2011;2011:baq036. doi:10.1093/database/baq036.

    Google Scholar 

  18. 18.

    Wang C, Blei DM. Collaborative topic modeling for recommending scientific articles, Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego: ACM; 2011. p. 448-56.

  19. 19.

    Huang Z, Chung W, Ong T-H, Chen H. A graph-based recommender system for digital library, Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries. Portland: ACM; 2002. p. 65-73.

  20. 20.

    Hess C, Schlieder C. Trust-based recommendations for documents. AI Commun. 2008;21(2):145–53.

    Google Scholar 

  21. 21.

    Gipp B, Beel J, Hentschel C. Scienstein: a research paper recommender system. International Conference on Emerging Trends in Computing. 2009. p. 309–15.

    Google Scholar 

  22. 22.

    Beel J, Gipp B, Langer S, Breitinger C. Research paper recommender systems: a literature survey. Int J Digit Libr. 2015;17:305–38. doi:10.1007/s00799-015-0156-0.

    Article  Google Scholar 

  23. 23.

    CiteULike. Frequently asked questions. 2014. Accessed 4 Apr 2014.

    Google Scholar 

  24. 24.

    Lowe HJ, Barnett GO. Understanding and using the medical subject headings (MeSH) vocabulary to perform literature searches. J Am Med Assoc. 1994;271(14):1103–8.

    CAS  Article  Google Scholar 

  25. 25.

    Li S, Shin HJ, Ding EL, Van Dam RM. Adiponectin levels and risk of type 2 diabetes: a systematic review and meta-analysis. J Am Med Assoc. 2009;302(2):179–88.

    CAS  Article  Google Scholar 

  26. 26.

    Hristovski D, Peterlin B, Mitchell JA, Humphrey SM. Using literature-based discovery to identify disease candidate genes. Int J Med Inform. 2005;74(2):289–98.

    Article  PubMed  Google Scholar 

  27. 27.

    Cooper GF, Miller RA. An experiment comparing lexical and statistical methods for extracting MeSH terms from clinical free text. J Am Med Inform Assoc. 1998;5(1):62–75.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  28. 28.

    Daniel WW, Cross CL. Biostatistics: a foundation for analysis in the health sciences. 10 ed. Hoboken (NJ): Wiley; 2013.

  29. 29.

    Miller RG. Simultaneous statistical inference. New York: Springer; 1966.

    Google Scholar 

  30. 30.

    Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B Stat Methodol. 1995;57(1):289–300.

  31. 31.

    Masters K. For what purpose and reasons do doctors use the Internet: a systematic review. Int J Med Inform. 2008;77(1):4–16.

    Article  PubMed  Google Scholar 

  32. 32.

    Capdarest-Arest N, Glassman NR. Keeping up to date: apps to stay on top of the medical literature. J Electron Res Med Libraries. 2015;12(3):171–81.

    Article  Google Scholar 

  33. 33.

    QxMD Software Inc. Read by QxMD. 2016. Accessed 12 Dec 2016.

  34. 34.

    Johnson T. Docphin. J Med Libr Assoc. 2014;102(2):137. doi:10.3163/1536-5050.102.2.022.

    Article  PubMed Central  Google Scholar 

  35. 35. About us. 2016. Accessed 12 Dec 2016.

  36. 36.

    Bennett NL, Casebeer LL, Kristofco R, Collins BC. Family physicians’ information seeking behaviors: a survey comparison with other specialties. BMC Med Inform Decis Mak. 2005;5(1):9.

    Article  PubMed  PubMed Central  Google Scholar 

Download references


The authors thank Lejla Hadzikadic Gusic for sharing her valuable perspectives on this topic as a practicing oncologist, Melanie Sorrell, the librarian in liaison to biological sciences and informatics subjects at the University of North Carolina at Charlotte, and Jennifer W. Weller, biological scientist, for her feedback on the manuscript.


Publication of this article was funded by LY’s startup funding at UNC Charlotte and Mayo Clinic, MN.

Availability of data and materials

The de-identified clinicians’ reading data from this study will be made available to academic users upon request.

Authors’ contributions

BR performed the data collection and data analysis. LY conceived the idea and supervised the entire project. BR and LY drafted the manuscript. XW offered advice on the study design and provided feedback on the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

In this study, we mined only publicly available information from CiteULike website, without interacting with, intervening, or manipulating/changing the website’s environment. The study does not include “human subject” data and is approved by the Office of Research and Compliance without IRB requirement at UNC Charlotte.

About this supplement

This article has been published as part of BMC Medical Informatics and Decision Making Volume 17 Supplement 2, 2017: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM) 2016: medical informatics and decision making. The full contents of the supplement are available online at

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information



Corresponding author

Correspondence to Lixia Yao.

Additional file

Additional file 1: Table S1.

Country of residence distributions of the clinicians. Table S2. Publication type distributions of the articles read by the clinicians. Table S3. Times of the journals read by the clinicians in each medical specialty group. Table S4. List of major MeSH terms having significant different frequencies between clinicians’ reading libraries and a random sample Table S5. List of minor MeSH terms having significant different frequencies between clinicians’ reading libraries and a random sample. (XLS 313 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Ru, B., Wang, X. & Yao, L. Evaluation of the informatician perspective: determining types of research papers preferred by clinicians. BMC Med Inform Decis Mak 17, 74 (2017).

Download citation


  • Clinicians’ reading preference
  • Medical subject headings
  • Literature recommender systems