BMC Medical Informatics and Decision Making
Interactive Decision Support in Hepatic Surgery

Background: Hepatic surgery is characterized by complicated operations with a significant peri- and postoperative risk for the patient. We developed a web-based, high-granular research database for comprehensive documentation of all relevant variables to evaluate new surgical techniques.


Background
For many years, hepatic surgery has been a research focus at the Department of Surgery of the University of Munich [1][2][3][4][5].
In the case of liver neoplasms, surgery is highly complicated and difficult: an operation on a malignant liver neoplasm sometimes takes 10 hours, and the perioperative fatality rate is as high as three to four percent, which is much higher than the usual operative risk.
For six years, a research database has covered in detail the surgical procedures performed, the relevant medical parameters of the patient and the postoperative course.
The primary goal of this documentation, past and present, is to collect detailed data about patient characteristics and outcomes in order to determine which kinds of patients benefit from specific procedures, based on survival and complication rates. The surgeons have asked for the capabilities to be extended from a pure research database towards a clinically integrated decision support system.
The database is currently used by five surgeons, one medical student, one member of the administrative staff, one statistician, one computer scientist and one anesthesiologist. In the future, two radiologists will also be involved. These users are spread across the hospital. In this context of a highly interdisciplinary, complex research topic, we focus on the following issues:
• Is a knowledge base for this medical domain with an appropriate data structure and high-quality content feasible from a technical point of view?
• Does a decision support system for this medical domain provide clinically relevant information?

Concept of risk assessment tool
The basic idea of the risk assessment tool is to find similar cases to a given patient. The prognosis of the matching subjects is aggregated and taken as an estimate for the risk of the individual patient.
From a methodological point of view, similarity search in complex, multidimensional records which are partially incomplete is a difficult problem [6]. To make matters worse, the number of variables is high in relation to the number of records (451 items and 766 patients altogether).
For this reason the risk assessment tool is based on selected parameters which are known predictors for the outcome of liver tumor resection [5]. A case is defined to be similar to the actual patient if all predictive parameters correspond within a given level of tolerance.
The final decision on surgery is taken by the patient and the surgeon; thus both the methodology and the result of the analysis should be transparent and easy to understand. For this reason the risk is visualized as a Kaplan-Meier plot [7], which is the established standard for visualizing survival data in medicine.
In addition, summarized data on all cases classified by the computer as similar to the actual patient are displayed to enable the physician to verify the analysis.
Cases which are inappropriate according to the physician's expertise can be excluded from the analysis interactively.

Database design
The software engineering approach was iterative. By means of regular user meetings and rapid prototyping, a suitable database structure was defined after approximately 10 iteration cycles.
The following design objectives were identified:
• high-granular documentation to enable detailed statistical analysis (e.g. association between laboratory parameters and outcomes)
• reports on data quality (e.g. lost follow-up)
• access from many locations within the clinic
• different access rights for each user group (surgeons, anaesthetists, administrative staff, statistician)

Automatic generation of web programs
Data entry is performed with a standard web browser. A dedicated software tool (see [8,9]) has been developed for rapid prototyping of ergonomic, highly adaptive web forms and for management of data transformations. It allows a data structure (e.g. a database table) to be defined interactively. A preview of the forms can be generated and presented to the clinical user. Once the data structure is defined, all PERL [10] programs and database tables are generated from templates, i.e. no line of code is programmed manually. The function of the tool is similar to the UltraDev™ extension of Macromedia Dreamweaver™ [http://www.macromedia.com/], but it is adapted to the needs of medical databases (e.g. specific templates).
The general documentation workflow is as follows: After login the patient is selected from a current list provided by the legacy system. The most recent document for this patient is displayed. The user can navigate within the documents, create new ones or edit the current page. The structure of each document is described in XML format [11].
The data structures themselves are created and edited with an Intranet based modelling tool. For each item a set of attributes is defined: Type of item (text, pulldown menu, checkbox, radio button, textarea, date, time), default values, constraints and layout. Each item has a unique object ID to enable data transformations when the data structure is updated.
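The role of the object ID can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the tool itself is generated PERL code; the class `Item`, the function `migrate` and all field names are hypothetical and are not taken from the modelling tool. The sketch assumes that a stored record maps item names to values, and that a revised data structure may rename, drop or add items while keeping the object IDs stable.

```python
from dataclasses import dataclass

# Item types as listed in the text.
ITEM_TYPES = {"text", "pulldown menu", "checkbox", "radio button",
              "textarea", "date", "time"}


@dataclass(frozen=True)
class Item:
    """One item of a data structure (hypothetical representation)."""
    object_id: int      # stable across revisions of the data structure
    name: str           # may change between revisions
    item_type: str
    default: str = ""

    def __post_init__(self):
        if self.item_type not in ITEM_TYPES:
            raise ValueError(f"unknown item type: {self.item_type}")


def migrate(record: dict, old_items: list, new_items: list) -> dict:
    """Carry a record over to a revised structure, matching items by object ID."""
    old_name_by_id = {item.object_id: item.name for item in old_items}
    migrated = {}
    for item in new_items:
        old_name = old_name_by_id.get(item.object_id)
        if old_name is not None and old_name in record:
            # Same object ID: keep the stored value, even if the name changed.
            migrated[item.name] = record[old_name]
        else:
            # Item is new in this revision: fall back to its default value.
            migrated[item.name] = item.default
    return migrated
```

In this sketch an item that is renamed keeps its stored value because the lookup goes through the object ID, while an item that is dropped from the new structure simply does not appear in the migrated record.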
The Intranet tool provides the following functions:
• generation of custom data entry forms
• combination of free text and structured data entry
Security and confidentiality of patient information are protected by means of individual logins and a firewall.

Hardware and software
Both the database program and the risk assessment tool are written in PERL (version 5.005_03) [10] and run on a Linux machine [12] with an Apache web server (version 1.3) [13] and a PostgreSQL database (version 6.5.3) [14].

Frontend and structure of the database
The database itself consists of eight tables (demographics, medical history, volumetrics, surgical documentation, histology, laboratory values, complications, outcome) with a total of 451 items; this high level of granularity is required for this complex research topic. Information on 766 patients is recorded. Because of the high number of items, missing values cannot be avoided; this fact must be taken into account by the decision support component.

The interactive decision support component
The decision support component is a server-side web application and is accessed using a standard web browser. The application itself is written in PERL and is invoked by an Apache web server. The knowledge base is stored in a PostgreSQL database which is located on the same server as the application and the web server. Standard techniques such as SQL are used to query the database via the appropriate interfaces.
After the program has been invoked from the web client, a form is presented (Fig. 2) which requires the user to provide demographic data of the patient for whom a suggestion is needed. Clinically relevant parameters which have been shown to be predictive for patient outcomes (see [5]) must be specified; for this reason we selected diagnosis, type of planned resection, PHRR (partial hepatic resection rate [15]), prothrombin activity (Quick, a blood clotting parameter) and gamma-GT (gamma-Glutamyltranspeptidase, a liver enzyme).
Sometimes not all parameters for an individual patient are available, e.g. estimation of the PHRR is very time consuming. The similarity search also includes datasets which have missing values. It is possible to specify a range for each of the numerical parameters.

Figure 1
Small section from the documentation of the surgical procedure. In addition to numerical and categorical items multimedia items (e.g. CT / MR scans) can be stored.
Because the number of documented cases is small in comparison to the number of possible parameter combinations, the algorithm was designed to avoid overlooking similar cases (high sensitivity, low specificity). For example, there are 35 different diagnoses, which yields approximately 22 cases per diagnosis (766 / 35 ≈ 21.9) under the assumption that the diagnoses are equally distributed among the 766 patients. Therefore all cases that do not contradict the query are considered similar.
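The matching rule described above (a case is similar unless it contradicts the query, and missing values never contradict) can be sketched as follows. This is an illustration in Python under stated assumptions, not the PERL code of the system: the field names, the dictionary layout and the functions `is_similar` and `find_similar` are hypothetical.

```python
# Hypothetical field names for the five predictors named in the text.
CATEGORICAL_FIELDS = ("diagnosis", "resection_type")
NUMERIC_FIELDS = ("phrr", "quick", "gamma_gt")


def is_similar(case: dict, query: dict) -> bool:
    """Return True unless the case contradicts the query.

    The query maps a categorical field to a value and a numeric field to a
    (low, high) range. A value that is absent or None on either side cannot
    contradict the query, so it never excludes a case.
    """
    for field in CATEGORICAL_FIELDS:
        wanted, found = query.get(field), case.get(field)
        if wanted is not None and found is not None and wanted != found:
            return False
    for field in NUMERIC_FIELDS:
        wanted, found = query.get(field), case.get(field)
        if wanted is None or found is None:
            continue  # missing value: not a contradiction
        low, high = wanted
        if not (low <= found <= high):
            return False
    return True


def find_similar(cases: list, query: dict) -> list:
    """Select all cases that do not contradict the query."""
    return [case for case in cases if is_similar(case, query)]
```

Treating a missing value as compatible is what gives the search its high sensitivity and low specificity: a sparsely documented case is retained and left to the physician to exclude by hand.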
After submitting the form to the system it connects to the database to retrieve the appropriate results based on the specified parameters and their ranges. The system then computes the data necessary for the Kaplan-Meier plot and generates a web page containing the plot and the underlying data (Fig. 3, Fig. 4).
The Kaplan-Meier plot presents the course of survival over time. It takes into account that, as long as a patient is alive, the information about this patient is limited with respect to time. For instance, if a patient had surgery three years ago and is alive, this information can be used only up to the third year of the plot. After that, the observation is censored, because it is unknown how long this patient will remain alive.
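How censoring enters the calculation can be shown with a short sketch of the standard product-limit (Kaplan-Meier) estimator. The Python function `kaplan_meier` and its input format are assumptions made for illustration; the sketch does not reproduce the PERL code of the system.

```python
def kaplan_meier(observations):
    """Product-limit estimate of the survival function.

    observations: list of (time, died) pairs; died=False marks a censored
    observation, i.e. the patient was alive when last seen at `time`.
    Returns a list of (time, survival) steps, one per distinct death time.
    """
    ordered = sorted(observations)
    at_risk = len(ordered)   # patients still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        deaths = 0
        leaving = 0
        # Collect all observations that end at time t.
        while i < len(ordered) and ordered[i][0] == t:
            deaths += 1 if ordered[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            # Only deaths lower the curve.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve
```

A censored patient contributes to the denominator up to the time last seen and is then removed from the risk set without lowering the curve, which corresponds to the three-year example above.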
By simply clicking on a similar case, the surgeon can go directly to the database and verify the source information. The physician then decides whether this case should be excluded from the analysis by selecting the 'exclude' checkbox. After exclusion of inappropriate cases, the analysis can be recalculated. By means of this interactive technique the physician becomes involved in the analysis and is able to verify and adjust it for an individual patient according to his or her expert knowledge.

Evaluation
To evaluate the decision support component we analyzed data provided by the clinical information system (CIS) concerning the period from January 1996 to September 2000.

Figure 2
Risk assessment form: Demographic data and five parameters can be entered. Diagnosis and type of resection can be selected on a pulldown menu. For the numerical data a range can be specified, which is applied to the similarity search. PHRR=partial hepatic resection rate; Quick=prothrombin activity (parameter of blood clotting); gamma-GT = gamma-Glutamyltranspeptidase (liver enzyme)

Figure 3
Kaplan-Meier-Analysis for HCC patients (HCC= hepatocellular carcinoma). The survival rate is based on the patient data presented in the next figure.
According to the CIS, 3269 surgical procedures were performed on 744 patients within this time frame, covering diagnoses of malignant and benign liver tumors. Of these patients, 165 had HCC (hepatocellular carcinoma) as their main diagnosis. These 165 patients comprise 93 without liver resection (conservative therapy, e.g. chemoembolisation) and 72 with liver resection. Of the 72 patients, 21 underwent liver transplantation. Of the remaining 51 patients with HCC and liver resection without transplantation, 23 are covered by the research database and 28 are not. It is therefore evident that a substantial number of patients treated in our hospital are not covered by the research database.
We used a batch version of our decision support system to calculate the Kaplan-Meier estimate for all 165 patients from the CIS diagnosed with HCC. Fig. 5 shows the distribution of the number of cases rated as similar to a given patient. It has two peaks, indicating that there are groups of similar patients as well as singular cases which are quite different from the average. This may be influenced by the small number of reference cases in the database (range: 1 to 25 matched cases), but it can also be interpreted as showing that some patients have exclusive features which are very different from "the average patient".
In the group of 51 patients who underwent liver resection, we analyzed the perioperative fatality rate (within 30 days) to assess whether there is a shift over time (e.g. due to medical progress) which would bias the survival estimation, but we did not find a significant change. In this group the risk assessment tool estimated a mean 1-year survival rate of 78 ± 5% (mean ± S.D.), which is consistent with an observed fatality rate of 20% (10 of 51 patients died within the first year after operation). This must be interpreted very carefully because of the limited number of reference cases. We could not evaluate the survival estimates in the group of 93 patients without liver resection, because we do not have reliable information about the follow-up status of these patients and therefore do not know the correct fatality rate in this group.
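The comparison for the resection group can be written out as a worked calculation. Treating the mean ± S.D. as a simple band around the estimate is an assumption made here for illustration only; it is not a formal statistical test.

```latex
\hat{S}_{\mathrm{obs}}(1\ \mathrm{year}) = 1 - \frac{10}{51} = \frac{41}{51} \approx 0.804,
\qquad
0.78 - 0.05 = 0.73 \;\le\; 0.804 \;\le\; 0.83 = 0.78 + 0.05
```

The observed 1-year survival of about 80% thus lies within one standard deviation of the estimated mean of 78%.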

Figure 4
Example of a similarity search (section): Selected clinically important characteristics for patients matching Fig. 2. By clicking on the case number, all information concerning a specific patient can be displayed. Individual patients can be excluded from the analysis and the Kaplan-Meier plot (Fig. 3) can be recalculated.

The problem of similarity search in medicine
The situation with the hepatic surgery dataset concerning number of items, number of records and data quality issues, i.e. many variables, limited number of patients and missing values, is typical for medical, patient-oriented research databases [16,17]. For this reason data mining and visualization techniques for large databases [18] cannot be applied easily.
Care must be taken in defining "similar patients" to avoid selection bias, either by too imprecise or too strict criteria. We decided to apply an interactive approach: the computer presents a list of similar patients, but the medical expert can exclude certain cases from the analysis. This provides a synthesis of formalized computer rules and empirical medical expertise.
One might ask why we applied so few parameters to the similarity search. From a technical point of view, very few variables are enough to split our dataset into small units, e.g. there are as many as 35 different diagnoses, resulting in approximately 22 cases per diagnosis under the theoretical assumption of an equal distribution among the 766 patients.
From a clinical viewpoint, the risk assessment tool must be fast and easy to use; for this reason we applied 5 parameters which have been shown to be predictive for patient outcomes (see [5]).

Data monitoring
The research database provides a set of specific reports, e.g. the number of patients per diagnostic category or a list of patients with lost-followup (i.e. followup status is unknown for more than 6 months).
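A lost-follow-up report of this kind could be sketched as follows. The Python function `lost_followup`, the record layout, the approximation of 6 months as 183 days and the exclusion of deceased patients are all assumptions made for illustration; the actual report is part of the generated PERL programs.

```python
from datetime import date, timedelta


def lost_followup(patients, today, max_gap_days=183):
    """Return the IDs of patients whose follow-up status is out of date.

    patients: list of dicts with the hypothetical keys 'id',
    'last_followup' (a date, or None if never recorded) and 'deceased'.
    max_gap_days: 183 days is used here as an approximation of 6 months.
    """
    cutoff = today - timedelta(days=max_gap_days)
    lost = []
    for patient in patients:
        if patient["deceased"]:
            continue  # outcome is known, no further follow-up expected
        last_seen = patient["last_followup"]
        if last_seen is None or last_seen < cutoff:
            lost.append(patient["id"])
    return lost
```

Passing the reference date explicitly, rather than reading the system clock inside the function, keeps the report reproducible for a given data snapshot.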
The new decision support component also has an impact on data quality: when the list of similar cases is displayed, it becomes evident which parameters are implausible or missing. Given the direct link to the individual patient record, missing data can be entered directly if they are available.
However, we did not verify whether all eligible patients were entered into the database; a selection bias may therefore occur.
By searching for similar cases in very common or clinically relevant situations, targeted data monitoring becomes feasible. This can improve data quality efficiently and might lead to better medical decisions in the long run. This hypothesis is hard to prove, but because the research database is the foundation for many clinical studies in this field, better data quality will enable faster and more reliable scientific results.

Risk assessment
The decision about whether or not an individual patient is eligible for a surgical procedure is essential for the routine work of any surgeon. A standalone research database is not suitable for supporting this difficult task. For this reason we developed a tool to extract and visualize information from the research database relevant for an individual patient. This aggregation enables specific insights, tailored to a specific patient, which are not available by simple database queries.
Decision support is provided stepwise: First, the surgeon and the patient get assistance in deciding whether or not a surgical procedure is appropriate. The risk can be calculated specifically for each patient from established predictors of outcome for this operation.
Second, the surgeon can select different procedure types (extent of liver resection) and analyze the risk.

Figure 5
Histogram describing the distribution of similar cases per patient. For 165 patients with HCC (hepatocellular carcinoma) we calculated the number of cases which are rated as similar to the individual case by the risk assessment tool. The similarity search yields a two-peak distribution, indicating that there are groups of similar patients as well as singular cases which are quite different from the average. This may be caused by the small number of reference cases in the database; another explanation might be that some patients have exclusive features which are very different from "the average patient".
Third, the clinical decision is stored into the database to enable statistical analysis for future improvements of the system.
The evaluation based on retrospective data from the clinical information system provides evidence that the risk assessment tool delivers clinically relevant information. However, we did not quantify the potential clinical impact of our system; this would require a controlled trial.
So far, surgeons have made decisions based on their personal expertise and intuition. We want to provide additional, objective information to facilitate this difficult task. The patient also benefits from this comprehensive information.

Visualization and aggregation of medical information
Our approach is visualization by means of the Kaplan-Meier plot, which is the established method for displaying survival data in the field of medicine. The surgeon, and also the patient, can get an idea of the risk of the surgery, which supports both of them in deciding whether or not to perform it. The benefit of the Kaplan-Meier plot compared to complex decision trees [19] is that a decision tree is often hard to understand, especially for persons not familiar with this kind of representation. However, due to limitations in sample size and length of follow-up, the Kaplan-Meier plot must be interpreted carefully.
No special training is needed to introduce the risk assessment tool. The surgeon's opinion is not steered by a single suggestion presented by the system; instead, the decision process is supported by data from documented surgeries. In addition, Kaplan-Meier plots for certain patient groups can be generated for scientific purposes, i.e. without regard to an individual patient.

High-granular database design
One might ask why we store so many different variables (451 items per patient) in our database. To answer a specific research question (e.g. which of two surgical procedures is more efficient?), far fewer items would be sufficient. But due to medical progress, for example the invention of the jet-cutter [3], new scientific questions have arisen in the past and will very probably arise in the future. Retrospective data collection is not only expensive but also prone to errors, and in many cases the data cannot be reconstructed precisely. For this reason we decided to implement a comprehensive medical record for liver resection patients in order to document all potentially relevant items prospectively. An alternative approach would be a smaller data model, which would be limited to a specific research question but could provide more complete data.
With the risk assessment tool we want to bridge the gap between a pure research and a clinical database and by this means improve the data quality.

Success factors for clinical decision support
There are few, but relevant, examples of effective clinical decision support, e.g. the classical "Computer reminders" of McDonald [20] and recent applications concerning the efficient use of antibiotics [21,22]. We tried to implement the following common characteristics in our system:

• Clinical integration
Clinical decisions are taken, and for the foreseeable future should be taken, by doctors and their patients, not by machines. For this reason the role of the computer is to enable the physician to reach the correct conclusions. In this setting, any system must provide a comprehensible benefit for the clinical user. Without workflow integration, the data in the computer will not correspond to the medical reality around it, and the system will not be able to provide clinically useful information.

• Transparent, up-to-date knowledge base
A "black box" approach is inappropriate in the medical setting, because many decisions are ambiguous. The medical evidence behind a specific statement must be available on demand to enable the doctor to verify the rationale behind it. In our case, we give the user the opportunity to view the full record of a case rated as similar to the actual patient. As a side effect of this approach, errors in the knowledge base are detected spontaneously.

• User feedback
At the outset, neither physicians nor computer scientists know an appropriate data model for a specific medical domain. In our experience, clinically adequate data structures cannot be defined in a purely theoretical setting; an iterative approach to software engineering is therefore required. User feedback is important not only for fine-tuning the data structures but also for general acceptance, because the physicians become involved. By collecting the real decisions taken in particular cases, we want to gather information on the clinical use of the system.

Future directions
As stated earlier, a major limitation of medical decision support in general, and of the hepatic surgery database specifically, is the limited number of fully documented patients. Even at a major university hospital only about 100-150 operations are performed per year. Given the many possible parameters which might influence patient outcomes, and the progress in surgery, interinstitutional cooperation is required to build a comprehensive knowledge base. Internet technology can provide the technical platform for this collaboration. Once the knowledge base and the update mechanisms for its content are established, a publicly accessible decision support tool is feasible.