An information model for computable cancer phenotypes

Background Standards, methods, and tools supporting the integration of clinical data and genomic information are an area of significant need and rapid growth in biomedical informatics. Integration of cancer clinical data and cancer genomic information poses unique challenges, because of the high volume and complexity of clinical data, as well as the heterogeneity and instability of cancer genome data when compared with germline data. Current information models of clinical and genomic data are not sufficiently expressive to represent individual observations and to aggregate those observations into longitudinal summaries over the course of cancer care. These models are acutely needed to support the development of systems and tools for generating the so-called clinical “deep phenotype” of individual cancer patients, a process which remains almost entirely manual in cancer research and precision medicine.
Methods Reviews of existing ontologies and interviews with cancer researchers were used to inform iterative development of a cancer phenotype information model. We translated a subset of the Fast Healthcare Interoperability Resources (FHIR) models into the OWL 2 Description Logic (DL) representation, and added extensions as needed for modeling cancer phenotypes with terms derived from the NCI Thesaurus. Models were validated with domain experts and evaluated against competency questions.
Results The DeepPhe information model represents cancer phenotype data at increasing levels of abstraction from mention level in clinical documents to summaries of key events and findings. We describe the model using breast cancer as an example, depicting methods to represent phenotypic features of cancers, tumors, treatment regimens, and specific biologic behaviors that span the entire course of a patient’s disease.
Conclusions We present a multi-scale information model for representing individual document mentions, document level classifications, episodes along a disease course, and phenotype summarization, linking individual observations to high-level summaries in support of subsequent integration and analysis.


Background
Our ability to deeply investigate the cancer genome is outpacing our ability to correlate genetic changes with the phenotypes that they produce. Advances in tumor genomic profiling allow for the possibility of detailed molecular classification of cancers, potentially including whole exome or whole genome sequences derived from multiple tumor locations and peripheral blood, collected at multiple time points during tumor progression.
However, methods and tools for linking these rich genomic data to relevant clinical information remain quite limited. Many key phenotypic variables in cancer including tumor morphology (e.g. histopathologic features), laboratory findings (e.g. gene amplification status), specific tumor behaviors (e.g. metastasis), and response to treatment (e.g. effect of a chemotherapeutic agent on tumor volume) are available only in clinical notes, or are fragmented across multiple data sources.
To better serve translational researchers, new techniques are needed to extract and represent these phenotypes from electronic health record (EHR) data. The set of features representing the clinical expression of the disease over time can be defined as the deep phenotype: "the precise and comprehensive analysis of phenotypic abnormalities in which the individual components of the phenotype are observed and described for the purposes of scientific examination of human disease" [1]. Cancer deep phenotypes will integrate data from both structured and unstructured clinical records, as well as patient-reported measures, to form longitudinal models of each patient's course.
The long-term goal of our research is to develop a generalizable computational infrastructure that will facilitate the extraction, manipulation, and use of these deep phenotypes, combining them with genomic data to drive discovery and precision medicine. As a first step towards this goal, we present an information model for cancer phenotypes, derived from translational cancer research programs and validated by cancer researchers working in three domains: breast cancer, ovarian cancer, and melanoma. We extend and complement evolving FHIR models [2], defining cancer-specific extensions for describing tumors, treatments, metastases, recurrences, and other key factors, at levels of abstraction varying from specific mentions in clinical notes to distinct episodes of care (e.g. staging, treatment, and follow-up) to summative descriptions of patients.
The information model will play a key role in a National Cancer Institute (NCI)-funded collaboration to develop new methods for extracting, representing, and visualizing cancer deep phenotypes. In the future, we expect to use these models to provide the foundation for expressive and interactive phenotype exploration tools [3] supporting cohort identification and analysis for cancer research.

Cancer deep phenotype extraction
Extraction and representation of cancer phenotypes is typically a manual curation process. For specific cancer diagnoses, hospital and state cancer registries provide retrospective manual abstraction of clinical observations including outcomes and some phenotypic attributes. However, cancer registries often lack treatment and recurrence information critical for addressing retrospective research questions [4]. Consequently, a major effort of many NCI-designated Cancer Centers, NCI Specialized Programs of Research Excellence (SPOREs), and Cancer Cooperative Groups has been to obtain detailed, structured phenotypic data [5,6]. The collection of TCGA clinical data is a well-known example of a cancer dataset requiring manual abstraction for phenotype representation [7]. The TCGA dataset includes data from more than 100 institutions contributing structured phenotype data along with biomaterials for high-throughput molecular classification on over ten thousand cancer cases.
Previous work in a number of NIH-funded translational science initiatives, such as eMERGE, has demonstrated the benefits of natural language processing (NLP) methods for cohort identification in both genome-wide [8][9][10] and phenome-wide [11] association studies. However, these initiatives have focused almost exclusively on non-cancer phenotypes, and have had the goal of dichotomizing patients for a particular phenotype of interest (for example, Type II Diabetes). Less focus has been given to identifying specific key variables such as response to treatment and extent of disease, or to extracting and representing the temporal aspects of disease progression and treatment.
Our ongoing work on natural language processing (NLP) systems provides important experience relevant to computable cancer phenotypes. The TIES project applies NLP techniques to the extraction of cancer phenotype data from clinical notes [12,13], but the resulting models lack necessary granular phenotype detail and summarization over time. The cTAKES system has also been used for annotation of a variety of cancer specific variables and has the advantage of annotating temporal expressions and relations [14][15][16][17][18][19][20], but similarly focuses on the extraction of mentions within documents, lacking a phenotype level representation.

DeepPhe cancer information model
Our goal was to build a cancer information model to provide a series of progressively more abstract representations suitable for aggregating individual observations from clinical text or structured data into summarizations of Documents, Episodes, and eventually individual patient Phenotypes [21]. For example, multiple mentions of a chemotherapeutic agent in a single clinical note (and corresponding medication administration record) might be combined to form a Document summary indicating the specific drugs and dosages. Several documents with similar records occurring over several weeks might be further summarized as a treatment Episode, with still further summarization listing the set of agents as a Treatment Regimen, associated with adjuvant therapy for the primary tumor, and producing radiographic evidence of response as part of the Tumor Phenotype. Mention, Document and Episode represent fundamentally different levels of abstraction, all of which must be considered to accurately assess the Phenotype when inferring from clinical data. The DeepPhe information model includes these multiple levels of representation, along with provenance information linking the higher-level abstractions to their underlying individual statements [22], as necessary for verifying the accuracy of summarized information. Our initial implementation operates on entities extracted via the cTAKES NLP system, providing both a functional implementation and a demonstration of how this approach might be adapted to work with other NLP tools.
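The first step of this layering, collapsing repeated mentions in a note into a Document-level summary while retaining provenance, can be sketched as follows. This is an illustrative sketch only and not the DeepPhe implementation: the class names `Mention` and `DocumentSummary`, their fields, the function `summarize_document`, and the rule of keeping one entry per distinct concept are all assumptions, and the drug names and offsets are invented example values.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mention:
    """A single text span recognized in a clinical note (hypothetical Level 1 record)."""
    doc_id: str
    concept: str  # normalized concept label, e.g. a drug name
    begin: int    # character offsets of the span in the note
    end: int


@dataclass
class DocumentSummary:
    """One entry per distinct concept in a note (hypothetical Level 2 record)."""
    doc_id: str
    # concept -> the mentions it was derived from (provenance links)
    derived_from: dict = field(default_factory=dict)


def summarize_document(doc_id, mentions):
    """Collapse repeated mentions of a concept into one summary entry."""
    summary = DocumentSummary(doc_id)
    for m in mentions:
        if m.doc_id != doc_id:
            continue  # ignore mentions that belong to other notes
        summary.derived_from.setdefault(m.concept, []).append(m)
    return summary


# Invented example: two mentions of one agent and one mention of another.
mentions = [
    Mention("note-1", "doxorubicin", 120, 131),
    Mention("note-1", "doxorubicin", 410, 421),
    Mention("note-1", "cyclophosphamide", 136, 152),
]
summary = summarize_document("note-1", mentions)
print(sorted(summary.derived_from))
```

Keeping the source mentions attached to each summary entry is what allows a higher-level abstraction to be traced back to the text that supports it.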
Because the sequence of events can influence the resulting phenotype, information models must provide informative representations of the temporal relationships between events. Previous efforts have proposed temporal models for clinical events [18][23][24][25][26]. Recent projects, including SemEval clinical TempEval [27] and THYME, provide insights into the automated annotation of temporal events, expressions and relations [14,28], which can support more sophisticated temporal reasoning. Ideally, temporal cancer phenotype models will facilitate the aggregation of such detail from individual healthcare encounters into abstractions corresponding to key epochs in cancer care such as diagnosis, surgery, treatment, and progression [29].
In the spirit of community efforts such as the OBO-Foundry [30], information models should build upon existing community standards and models wherever possible. Relevant efforts specific to cancer include the NCI Metathesaurus and NCI Thesaurus [31] as well as the Cancer Data Standards Repository (caDSR) [32]. Ontological efforts such as the Human-Phenotype ontology [33] and the Disease Ontology [34] provide well-organized terms and relationships for individual phenotypes; however, they do not provide the structure necessary for creating detailed descriptions of individual patients. Clinical element models (CEMs) [35][36][37][38] and the emerging Fast Healthcare Interoperability Resources (FHIR) [39][40][41] provide the necessary structure for representations but have thus far been focused on low-level elements and have not been used to develop summarizing abstractions for phenotypes. In this manuscript, we present a cancer deep phenotype information model that builds on underlying standards and terminologies to meet these and related requirements (Table 1).

Methods
The development of the cancer phenotype models involved sequential steps consistent with recently published process models for clinical information model development [42], including (1) review of prior schemata, (2) development of guiding requirements (Table 1), (3) interviews with domain experts, (4) selection of an appropriate standard and/or formal method framework, and (5) iterative model development, validation, and review (Fig. 1). We also used descriptions of user personae to inform model development. Additional methodological details can be found at https://github.com/deepphe/models/wiki.

Selection of modeling framework
We reviewed existing models of clinical and biomedical data to identify formalisms for representing cancer deep phenotypes, modeling languages with appropriately expressive semantics, and vocabularies sufficient for communicating details of cancer diagnosis and treatment. We used this review to develop a list of requirements for our information model, including use of appropriate terminology providing required coverage of cancer concepts (Requirement R1); flexibility and extensibility (R2); availability of tooling including validators and application programming interfaces (APIs) (R3); the possibility of using community input to drive the development and evolution of our models (R4); easy integration with existing NLP tools (R5); the need to support both structured and unstructured data (R6); modeling at multiple levels of granularity (ranging from text spans in documents to patient-level summaries) (R7); and inclusion of provenance linkages between individual data items and higher-level summaries (R8). We evaluated four possible formalisms against these requirements, including clinical element models [35][36][37][38], caDSR information models [32,43], OBO-Foundry biomedical ontology models [30] including the entity + quality framework [44], and FHIR [39][40][41].
FHIR offers significant strengths which include detailed schemata suitable for validation, reference implementations (R3), an extensive collection of software designs and tools, including proposed extensions to handle the inclusion of genomic information in EMRs [39], and an active community of developers (R4). The FHIR XML definitions are also easily convertible to a format compatible with the type system used by the cTAKES NLP Suite [45] (R5), which we plan to use to extract deep phenotypes from clinical notes. FHIR's models of observations, diagnostic reports and medications are well suited for representing available structured information (R6). For these reasons, we selected FHIR as the underlying modeling formalism for our cancer phenotype models. Our model development included the extension of FHIR resources to model cancer concepts such as tumors (R1, R2), as necessary for the required multi-level representation of cancer phenotypes (R7). Provenance relations between levels in the model enumerate the linkages between lower-level details and more abstract summaries (R8).
We considered the NCI-Thesaurus [31] and OBO-Foundry [30] ontologies as candidate cancer vocabularies. Although there is some coverage of cancer-related phenotypes, both in broader ontologies such as the disease ontology [34] and the human-phenotype ontology [33] and in some domain-specific cancer ontologies [46,47], we were not able to identify OBO ontologies that provided the detailed phenotype entities and attributes needed to represent the subtleties inherent in cancer progression and treatment. The NCI Thesaurus [31] was therefore chosen as the richest available set of curated cancer terms and concepts.
Our models are based on a translation of FHIR structure to an OWL 2 Description Logic (DL) representation [48]. OWL offers several advantages aligned with the goals of the DeepPhe project, including a semantic infrastructure suitable for representing both structured and unstructured data (R6); constraints appropriate for many of the domain-specific requirements of cancer modeling (R1, R2); the availability of reasoners and rule systems needed for managing summarization (R3); and the potential for compatibility with community ontology processes (R4, R5), especially those linking phenotypic and genomic information [25,37,49]. OWL also provides for the possibility of incorporating data provenance references (R8) [22].
Our next steps include expanding our model to include additional details necessary for the representation of cancer phenotypes for ovarian cancer and malignant melanoma, using data from interviews already collected, and then adding models for other solid tumors using the same basic methods. We also plan to align our OWL representations with ongoing community efforts to develop a FHIR representation. Although these efforts began in the fall of 2014 [50], community proposals were not complete at the time of this writing. We will align with HL7/W3C models for FHIR in RDF as they progress toward community consensus.

Construction of draft models
Development of initial draft models was based on an exploration of existing models from prior efforts, and discussion with collaborators. Cancer-specific attributes and corresponding terminologies and value sets were developed based on existing data models provided to us by multiple groups of collaborating cancer researchers. In a parallel process, we reviewed the emerging FHIR model definitions [2] to identify resources appropriate for modeling basic clinical content (medications, procedures, observations, etc.). As compatibility with existing NLP systems was a key goal, we also examined FHIR models in the context of existing elements in the cTAKES model [45,51], developing preliminary mappings sufficient for using cTAKES to populate FHIR models. The cTAKES model is based on the SHARP secondary use Clinical Element Models [38]. Published models of care trajectories [29] informed the development of models for episodes. Abstract classes summarizing phenotypes, tumors, and cancers were developed through graphical concept maps and refined through a series of design discussions.
Input from domain experts informed the selection of candidate models and model attributes including information related to diagnosis, staging, biomarker status, adequacy of surgical resection, therapy, outcome, and other cancer-specific factors. Initial drafts were produced by manual merging and mapping of multiple information models, data dictionaries, and spreadsheets obtained from each domain group. Outcomes of this process were collected in a spreadsheet grouped by content area (e.g.

Domain expert interviews
We conducted two different types of interviews to separately capture the process and content constraints for the models. For process, we conducted open-ended interviews using a modification of the Beyer and Holtzblatt Contextual Inquiry method [52]. For content, we conducted information modeling interviews that included card sorts of potential data elements. Information modeling interviews were conducted with funded collaborators; contextual inquiry interviews were classified as exempt by the University of Pittsburgh Human Research Protection Office (PRO13120154).
Contextual Inquiry Interviews with cancer researchers facilitated understanding of the information needs, workflows, and practices they use to identify cohorts and relate phenotypes to molecular characteristics. In interviews conducted by author HH, all participants were asked to describe their research goals and questions, and to either directly illustrate (when possible) or describe their use of informatics tools to meet those goals. Interviews were audio-coded and reviewed to extract descriptions of information needs, processes, and challenges [53]. Information needs identified through these discussions were used as input to the model development processes and to the development of competency questions; discussions of processes and challenges contributed to the development of work models that will inform the design of planned analytic tools (to be reported in a future publication).
Information Modeling Interviews with project collaborators involved in cancer research provided insights into the necessary content, relative importance and need for information extraction methods. For each of three cancer types (breast, ovarian, and melanoma), we separately interviewed one or more translational researchers actively engaged in using clinical data. For each domain, we also interviewed one or more data managers or abstractors, who were primarily responsible for obtaining clinical data from various sources including EMRs. In interviews conducted by author RJ, each participant was provided with the complete set of index cards representing all data elements in the draft model and asked to prioritize them on two axes.
For the first axis, they were asked to sort the cards based on whether they considered any given data element to be (a) important for their own research, (b) important for the research of colleagues, or (c) not important. They subsequently divided group (a) into elements that were very important and elements that were somewhat important. For the second axis, they were asked to re-sort the cards in groups (a) and (b) based on whether they typically obtained such data from (a) structured electronic sources, (b) unstructured electronically available sources, or (c) unstructured sources not amenable to automated processing (e.g. paper charts, PDF documents).
Individuals completed either both card sorts or only one card sort based on their roles on the research team. We also asked each participant to add data elements that were important to the research team, but were not represented in the card set, and to include these additional data elements in both sorts. Throughout the interview process, the interviewer and participant engaged in an ongoing refinement of the meaning and importance of various data elements. Cards were marked during the interviews to capture prioritization on both axes. Interviews were captured on audio recorder, and transcribed verbatim. Transcriptions and prioritizations were analyzed to further identify, refine and categorize data elements, value domains, valid values (for enumerated elements), priority, and availability. The results were used to guide revision of the models.

Model revision
Initial models were refined through an iterative process involving both domain expert feedback and review against sample clinical notes. Data elements, relationships, and value sets were revised based on feedback from the card sorting activities conducted during the information modeling interviews. Values and value domains provided by researchers were included. We manually populated instances of candidate models using a sample of de-identified clinical notes from cancer patients and compared those models to the original notes to verify sufficient expressivity. This review identified both new data elements and new linguistic modifiers describing negation, hedging, temporality, and other qualifiers for inclusion in the value sets. These items were added to the model and the process was repeated with an additional set of documents.

Model validation
Finally, candidate models generated from this sequence of activities were presented to domain experts to validate that the information of interest in a set of reports could be represented accurately. The process was conducted as a presentation by the original information modeling interviewer. For each major part of the domain model, we presented the expert with example text, highlighted to show entities and their relationships in tandem with the associated representation. For example, we depicted a cancer Phenotype as the sum of information deriving from the Primary Tumor as well as the subsequent Metastatic Tumor. For each major modeling decision, we also asked experts to comment on the appropriateness of this method for the specific example. For example, we defined an initial set of Episode types corresponding to important intervals in a patient's Disease Course and had the experts review them to confirm that they were correct. Results informed the final candidate models which were then prepared for release.

Informant interviews
A total of 13 interviews were performed with domain experts, including 6 contextual interviews and 7 information modeling interviews. Interviews were performed with translational researchers and their staff working in the areas of breast cancer, ovarian cancer, and melanoma between October 2014 and August 2015. Participants included principal investigators, research fellows, and clinical data managers. Interview lengths ranged from approximately 1 to 2 h.
For information modeling interviews, the total number of data elements considered in the card sort was 137 (breast cancer), 86 (ovarian cancer) and 97 (melanoma). Of the total, 16 (breast cancer), 15 (ovarian cancer) and 25 (melanoma) new data elements were added by the participants.

Prioritization
For the breast cancer model, informants prioritized 101/137 data elements as specifically important to them, 35/137 data elements as potentially important only to other researchers but not to themselves, and 1/137 data elements as not important to themselves or other researchers. For the ovarian cancer model, informants prioritized 81/86 data elements as specifically important to them, 4/86 data elements as potentially important to other researchers but not to themselves, and 1/86 data elements as not important to themselves or to other researchers. For the melanoma model, informants prioritized 86/97 data elements as specifically important to them, 9/97 data elements as potentially important to other researchers but not to themselves, and 2/97 data elements as not important to themselves or to other researchers.
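As a quick consistency check on these counts, the short script below confirms that the three categories account for every card in each sort and converts the first category to a percentage. The counts are taken directly from the text; the variable names are only illustrative.

```python
# Card-sort counts as reported in the text:
# (important to own research, important only to colleagues, not important)
counts = {
    "breast": (101, 35, 1),
    "ovarian": (81, 4, 1),
    "melanoma": (86, 9, 2),
}
totals = {"breast": 137, "ovarian": 86, "melanoma": 97}

shares = {}
for domain, (own, others, neither) in counts.items():
    # The three categories should account for every card in the sort.
    assert own + others + neither == totals[domain]
    shares[domain] = round(100 * own / totals[domain], 1)

for domain, share in shares.items():
    print(f"{domain}: {share}% of data elements rated important to own research")
```

The categories sum to the reported totals in all three domains, and roughly 74% (breast), 94% (ovarian), and 89% (melanoma) of the data elements fall in the first category.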

Availability
Of the total data elements, research staff currently tasked with collecting this data indicated that the large majority of these data elements could only be manually abstracted at present. This included 112/137 data elements for breast cancer, 79/86 data elements for ovarian cancer and 90/97 data elements for melanoma which are currently and routinely abstracted from free text electronic medical records. Structured data is available for only a small number of data elements in each model.

Overlap among individual models
Models for breast cancer, ovarian cancer and melanoma contained significant overlap, with 52 data elements shared by all three models. These included key variables such as tumor stage, treatment, and outcome. In contrast, 129 data elements were unique to a specific domain. These included specific types of somatic mutations (e.g. BRAF), germline mutations (e.g. BRCA1), biomarkers (e.g. CA125), risk factors (e.g. UV exposure), prognostic features of the tumor (e.g. tumor infiltrating lymphocytes), and specific clinical features (e.g. associated ascites). Figure 2 provides an overview of the information model including the four constituent levels: Mention, Document, Episode, and Phenotype.

Mentions (Level 1)
Mentions are represented using the cTAKES type system, which provides an interoperability standard based on the SHARP secondary use Clinical Element Models. This data provides essential building blocks for higher-level summarization of Documents, Episodes and Phenotypes. For temporal representation, we reuse entities articulated in the cTAKES temporal module, including events, document time relations (DocTimeRel) that build on the classic Allen temporal relations [54], and the notion of temporal containers.
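For readers unfamiliar with the Allen relations cited above, the following sketch classifies a pair of intervals into one of the thirteen relations. It illustrates the underlying interval algebra only; it is not the cTAKES temporal module and does not reproduce its DocTimeRel categories. The function name `allen_relation` and the use of integer day offsets are assumptions made for the example.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a to interval b.

    Intervals are (start, end) pairs with start < end.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Remaining case: partial overlap with no shared endpoints.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Invented example: a biopsy on days 3-4 falls within a workup on days 0-10,
# and the workup ends exactly where a treatment interval (days 10-40) begins.
print(allen_relation((3, 4), (0, 10)))
print(allen_relation((0, 10), (10, 40)))
```

A temporal container, in this picture, plays the role of the enclosing interval in the "during" and "contains" relations.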

Composition (Level 2)
Individual mentions and their relations are combined into a composition model representing all details from an individual clinical note. As an example, Document 1 in Fig. 3 includes multiple mentions of a Mass that are summarized into one event detail in the corresponding composition model. Data captured in Level 1 cTAKES types can be transformed to Level 2 FHIR resources which are aggregated to create FHIR compositions (R5), and stored as event details (R7). Resources selected as an initial subset of the FHIR Draft Standard for Trial Use 2 (DSTU2) include Condition, Patient, Observation, BodySite, Procedure, and MedicationStatement, which were sufficient for modeling a large number of concepts extracted from clinical text via NLP. OWL classes from an existing NLP information extraction schema [55] were modified to represent this subset of FHIR resources.
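The shape of this transformation can be illustrated with plain dictionaries. The sketch below is deliberately incomplete: it includes only a handful of fields (`resourceType`, `id`, `code.text`, `subject`, and `section[].entry[].reference`), omits elements that the FHIR specification requires, and has not been validated against any FHIR schema. The function names, identifiers, and the choice to use the first mention as display text are assumptions for illustration and do not describe the DeepPhe pipeline.

```python
def mentions_to_condition(resource_id, mention_texts):
    """Merge co-referring mentions into one simplified Condition-like resource."""
    return {
        "resourceType": "Condition",
        "id": resource_id,
        # Assumption: use the first mention as the display text.
        "code": {"text": mention_texts[0]},
    }


def build_composition(doc_id, patient_id, resources):
    """Wrap a note's resources in a simplified Composition-like resource."""
    entries = [
        {"reference": f"{r['resourceType']}/{r['id']}"} for r in resources
    ]
    return {
        "resourceType": "Composition",
        "id": doc_id,
        "subject": {"reference": f"Patient/{patient_id}"},
        "section": [{"entry": entries}],
    }


# Three hypothetical mentions of the same mass in one note collapse to one resource.
mass = mentions_to_condition("mass-1", ["mass", "the mass", "breast mass"])
composition = build_composition("document-1", "patient-1", [mass])
print(composition["section"][0]["entry"])
```

The composition holds references rather than copies, so each event detail remains a separately addressable resource.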

Episodes (Level 3)
At the Episode level, we model specific disease-relevant intervals with expected key events, as Episodes within a Disease Course, extending previous work on cancer trajectories [29]. Events are contained and ordered within these disease-relevant episodes. Episodes are hierarchical (containing other episodes), may overlap, and include start and end dates as well as start and end events. For example, a Primary Tumor episode is composed of constituent episodes (phases) including a Diagnostic episode. The diagnostic episode begins with the presentation of a complaint, symptom, or sign that initiates a diagnostic workup, and ends with a pathologic, laboratory, or radiologic diagnosis of a new Cancer. Episodes are defined as extensions of the FHIR Bundle class and can be ordered to form an abstracted timeline of a patient's Disease Course (Fig. 4). Thus, events from Documents 1, 2 and 3 in Fig. 3 (e.g. mammogram, mass, needle biopsy, invasive ductal carcinoma, T1, N0, M0, BRCA status) are classified as belonging to a Primary Tumor episode in the Diagnostic phase whereas the events from Document 4 (e.g. MRI, enhancing lesion, metastatic carcinoma) are classified as belonging to a Metastatic Tumor episode in the Diagnostic phase. Extraction of subsequent entities, events, and relations can thus be conditioned on the unique context of the disease-specific episode. Visualization and search methods can also leverage the context to return more relevant results.
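The structural properties described here (nesting, possible overlap, and ordering into a timeline) can be sketched with a small data class. This is a hypothetical illustration: the `Episode` class, its fields, and the dates are invented, start and end events are omitted, and the sketch does not model the FHIR Bundle extension used in the actual model.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Episode:
    """A disease-relevant interval that may contain other episodes."""
    name: str
    start: date
    end: date
    sub_episodes: list = field(default_factory=list)

    def overlaps(self, other):
        """True if the two episodes share at least one day."""
        return self.start <= other.end and other.start <= self.end


def timeline(episodes):
    """Order episodes by start date (then end date) to form a timeline."""
    return sorted(episodes, key=lambda e: (e.start, e.end))


# Invented dates for illustration only.
diagnostic = Episode("Diagnostic", date(2012, 3, 1), date(2012, 3, 20))
primary = Episode("Primary Tumor", date(2012, 3, 1), date(2012, 12, 15),
                  sub_episodes=[diagnostic])
metastatic = Episode("Metastatic Tumor", date(2014, 6, 2), date(2014, 9, 30))

print([e.name for e in timeline([metastatic, primary])])
print(primary.overlaps(metastatic))
```

Because each extracted event can be assigned to an episode, downstream extraction and search can condition on that episode rather than on the raw document date alone.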

Phenotypes (Level 4)
At the phenotype level, we model abstractions of key variables over time. Disease indicators are grouped  Table 2. Phenotypes also include comorbidities, including those that are relevant to cancer. Additional classes describe Treatments, Outcomes, Germline Sequence Variations, and Tumor Sequence Variations. Wherever possible, phenotype level entities are defined within existing biomedical ontologies, favoring those developed using OBO principles. For example, Germline Sequence Variation links to the Sequence Ontology class, sequence_variant, by referencing the class id (SO:0001060) in the rdfs:seeAlso annotation property.
Fig. 3 Example patient records and their representation as compositions
Linkages between mentions, documents, and episodes are accomplished through provenance extensions to the FHIR resources (R8). Each higher-level resource refers to one or more lower-level resources, using the "prov:wasDerivedFrom" relationship from the PROV provenance ontology [22]. The transitive closure of these relationships, along with direct relationships between concepts such as Cancer and Tumor, will form a complete derivation path for the abstracted models.
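The transitive closure over derivation links can be computed with a simple graph traversal. In the sketch below, the function name `derivation_path`, the dictionary encoding of the links, and all resource identifiers are hypothetical; only the idea of following "wasDerivedFrom" links from a summary down to its supporting observations comes from the text.

```python
def derivation_path(resource, was_derived_from):
    """Return every resource reachable from `resource` via wasDerivedFrom links."""
    seen = set()
    stack = [resource]
    while stack:
        current = stack.pop()
        for source in was_derived_from.get(current, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


# Hypothetical identifiers; each key was derived from the listed resources.
was_derived_from = {
    "phenotype/tumor-1": ["episode/diagnostic-1"],
    "episode/diagnostic-1": ["composition/doc-1", "composition/doc-2"],
    "composition/doc-1": ["mention/m1", "mention/m2"],
    "composition/doc-2": ["mention/m3"],
}

sources = derivation_path("phenotype/tumor-1", was_derived_from)
print(sorted(sources))
```

Starting from a phenotype-level resource, the traversal returns the episode, the compositions, and the individual mentions that support it, which is the audit trail needed to verify a summarized value.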
The DeepPhe model is defined in publicly available OWL files [56] distributed under a Creative Commons Attribution 4.0 International license. Readers are encouraged to use the GitHub version control and issue-tracking tools for the model repository to provide comments, suggest enhancements, and explore potential extensions and adaptations to the models (R4).

Use of the model for phenotyping
Construction of a patient phenotype is envisioned as a multi-step process. Currently we leverage the models described above (1) to produce dictionaries for a new concept recognition component of cTAKES [45] using the NobleCoder concept recognition tool [57], and (2) as the knowledge representation for developing phenotyping rules capable of combining individual observations from the mention level into appropriate instances of data elements at the phenotype level. Initial rules validating the approach were developed using the Semantic Web Rule Language [58], with subsequent rules implemented in the Drools system [59] (Fig. 5). In the future, we will extend our DeepPhe NLP pipeline to (3) leverage the  (4) interdigitate EMR data, cancer registry data, and data derived from NLP pipelines as input to the phenotyping rules.

Model validation
Both contextual inquiry and information modeling interviews (including card sorts) were used to develop a set of competency questions [60] suitable for validating the resulting cancer phenotype models. Competency questions reflect prototypical questions that might be asked by cancer investigators. Resulting questions encompass the identification of patients with specific clinical profiles, potentially including temporal relationships between sentinel events; comparison of patients by cohort; integration of information across multiple sources; and identification of available information. Sample competency questions are given in Table 3; full details are available at the project website [56].

Discussion
Proponents of "deep phenotyping" argue for the importance of detailed phenotypic descriptions, generally in a computable form, as prerequisites for fine-grained analyses that stratify patients into previously unknown classifications, thus enabling more precise investigation and characterization of human disease [1][61][62][63][64]. This is also a key goal of precision medicine, which will require much more sophisticated analysis of patient data to derive meaningful features for classification and prediction.
Achieving both of these goals will require advances in extraction of key details from patient records and also in assembling those details into sufficiently expressive and flexible representations. Although previous efforts such as eMERGE have shown the potential of large-scale extraction of phenotypic information from both structured and unstructured data sources [9], the resulting classifications have typically been dichotomous, describing patients in terms of the presence or absence of one or more specified diseases. More detailed models are needed to build phenotype descriptions that capture the inherent complexity and diversity in the manifestations of human disease. Specifically, these models must convert individual facts and observations into computable phenotypes, describing patients at a level of granularity appropriate for interpretation of individual cases, comparison between cases, cohort selection, and hypothesis generation through exploration of large datasets.
The DeepPhe information model builds on entities that can be extracted from structured or unstructured data in medical records and aggregated into individual documents, episodes, and eventually into high-level phenotypic descriptions. Provenance linkages tying higher-level phenotypic representations to constituent observational details provide audit trails suitable for verifying the abstractions, while also enabling analyses to move between levels of abstraction as necessary for specific tasks.
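The provenance linkages described above can be pictured as a chain of references from each higher-level element to the elements it was derived from. The following Python sketch is only an illustration of that idea, not the OWL representation used in the model; the `Element` class, its `derived_from` field, and the function `supporting_mentions` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical element at any level of abstraction.
@dataclass
class Element:
    level: str   # "mention", "document", "episode", or "phenotype"
    label: str
    derived_from: List["Element"] = field(default_factory=list)


def supporting_mentions(element: Element) -> List[Element]:
    """Walk provenance links down to the mention-level observations."""
    if element.level == "mention":
        return [element]
    found: List[Element] = []
    for source in element.derived_from:
        found.extend(supporting_mentions(source))
    return found
```

Walking the links in this way gives the audit trail for a phenotype-level summary: the list of individual mentions that support it.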
The use of the HL7 FHIR data model bridges the gap between two key applications of patient data: data resulting from direct clinical care and secondary use of clinical data for research purposes. Although FHIR was clearly developed to meet clinical interoperability needs, the simple, well-documented designs of FHIR resources and data types, particularly including extension mechanisms, simplified the process of developing phenotype abstractions needed for translational research. This approach provides a model for adapting FHIR to support secondary use of clinical data, similar to efforts that use FHIR to integrate genomics into clinical records [39].
Our development of cancer-specific attributes and value sets raised familiar design issues such as pre- and post-coordination of descriptors [36] and differences in domain perspectives. As definitive answers to these questions are often not possible, our modeling efforts relied on a combination of pragmatism and reference to existing best practices. For example, biomarker test results (e.g. Estrogen Receptor, Progesterone Receptor and Her2Neu Receptor status) presented a challenge, as initial attempts to pre-compose testing methods, marker, and interpretation led to an unwieldy number of potential combinations. A post-composed model was chosen instead, with the understanding that equivalence classes would be added as needed.
Genetic and molecular descriptors present additional challenges for cancer modeling. We have included classes for both sequence variations and molecular manifestations resulting from those variants, with relationships between associated classes as necessary. User-facing tools based on these models might choose to combine these factors if necessary to align with the preferences of users in specific domains. We have chosen to adopt an initial model of structural variation that is significantly less detailed than the GeneticObservation suggested by the SMART on FHIR Genomics effort [39]. Subsequent evolution of our model will align more closely with this effort.
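The trade-off in the biomarker example can be made concrete with a small Python sketch. The value sets below are illustrative assumptions rather than the model's actual value sets (and not every combination is clinically meaningful); `BiomarkerResult`, `precomposed_names`, and `is_er_positive` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import product
from typing import List

# Illustrative value sets; the real model draws its terms from the NCI Thesaurus.
METHODS = ("IHC", "FISH")
MARKERS = ("ER", "PR", "HER2")
INTERPRETATIONS = ("positive", "negative", "equivocal")


# Post-composed result: three independent attributes on one record.
@dataclass(frozen=True)
class BiomarkerResult:
    method: str
    marker: str
    interpretation: str


def precomposed_names() -> List[str]:
    """Enumerate the classes a fully pre-composed model would need."""
    return [
        f"{marker}_{interp}_by_{method}"
        for method, marker, interp in product(METHODS, MARKERS, INTERPRETATIONS)
    ]


def is_er_positive(result: BiomarkerResult) -> bool:
    """An 'equivalence class' defined on demand over post-composed attributes."""
    return result.marker == "ER" and result.interpretation == "positive"
```

With these small value sets, pre-composition needs one class per combination (the product of the set sizes), while post-composition needs only one term per value (the sum of the set sizes), and the gap widens as each value set grows.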
Our multi-level modeling approach identified, but did not fully resolve, modeling challenges associated with multiple uses of terms such as "tumor" or "cancer". At the mention level, these terms refer to specific statements from clinical notes, while at the phenotype level they refer to abstractions of complex pathophysiologic events. Thus, we have included entities in our representation at both Level 1 and Level 4 that share the same names, although they refer to different conceptualizations. This duplication was deemed preferable to the creation of alternative terms.
Our cancer information model is presented here as a computable representation of longitudinal phenotype and treatment data at multiple levels of abstraction. Realizing the complexity and diversity of cancer treatment and disease progression, we do not expect that this approach is in any way final or definitive. We plan to engage an even broader range of informaticians from the cancer research community to identify use cases suitable for extending the expressivity of the model and for guiding any necessary revisions. Extension of the models to handle non-solid cancers (e.g. leukemias) would be particularly useful for validation of the overall approach. Feedback can be provided at the project GitHub page, https://github.com/deepphe/models. We also plan to work with systems developers, cancer researchers engaged in complementary computational approaches such as tumor growth simulations [65,66], and with related efforts such as GA4GH [67], PhenoPackets [68], and others developing phenotype models [8] to identify additional use cases and opportunities for encouraging broader adoption and use of common methods and standards.

Limitations
Although our modeling process combined extensive interviews aimed at eliciting information needs from cancer researchers, review of relevant guidelines and standards for cancer care, and consideration of competency questions, our models are not expected to generalize to all use cases, in terms of either specific modeling decisions or scope of relevant concerns. We anticipate potential revisions to the model to accommodate the practicalities of working with functional NLP tools. Finally, FHIR RDF representations from the HL7/W3C community are evolving and (as of this writing) not fully complete. It is possible that completion of these efforts may lead to the adoption of some RDF models that are not directly compatible with our proposed information models. If any such inconsistencies arise, we will endeavor to ensure that subsequent revisions to our model are compatible with community-developed FHIR/RDF models to the greatest extent possible.
[Figure caption, beginning truncated] ... (5) and transforming them into a summary representation (6). This rule indicates that the value of a FISH test will take precedence over results of an IHC test. This rule is given in English (7), SWRL (8), and Drools (9).

Conclusion
Improved phenotypic descriptions of cancer and patient phenotypes are needed to advance translational research [69], quality care measures [4], and precision medicine [70]. We illustrate the potential benefits of using FHIR-compatible models, and offer a foundation suitable for extension to other domains. We present a multi-level information model designed to support capture of cancer clinical data at multiple levels, from specific mentions in clinical texts, to summarization at increasingly higher levels of abstraction including documents, episodes, and phenotypes. The model is designed to be used by computational systems that extract these representations. Our model also provides an early example of rich representational models for deep phenotypes [69], suitable for adaptation, generalization, and community comment.

Funding
This work was supported by NCI grant 1U24CA184407.
Availability of data and materials
OWL versions of the information models; descriptions of interview protocols; models of stakeholders and users; competency questions; and other supporting documents can be found at https://github.com/deepphe/models/wiki. Raw data and other materials will be made available upon request to the corresponding author.

Authors' contributions
HH conducted the contextual inquiry interviews and analyzed resulting data; researched modeling frameworks; participated in developing the information models; and co-led the writing of the manuscript. MC created the OWL models and coordinated the finer details of model building. DH contributed to discussions of model development. GS provided feedback on the data models and their integration with cTAKES. RJ conducted the information modeling interviews and analyzed the resulting data; participated in developing the information models; and co-led the writing of the manuscript. All authors read and approved the final manuscript.