PKreport: report generation for checking population pharmacokinetic model assumptions
© Sun and Li; licensee BioMed Central Ltd. 2011
Received: 29 October 2010
Accepted: 16 May 2011
Published: 16 May 2011
Graphics play an important and unique role in population pharmacokinetic (PopPK) model building by exploring hidden structure among data before modeling, evaluating model fit, and validating results after modeling.
This paper describes a new R package, PKreport, which generates a collection of plots and statistics for testing model assumptions, visualizing data and diagnosing models. A metric system serves as the currency for communication between data sets and the package when generating special-purpose plots. It provides ways to match output from diverse software such as NONMEM, Monolix and the R nlme package. The package is implemented with an S4 class hierarchy and offers an efficient way to access the output from NONMEM 7. The final reports use the web browser as the user interface to manage and visualize plots.
PKreport provides 1) a flexible and efficient R class to store and retrieve NONMEM 7 output; 2) automated plots for users to visualize data and models; 3) automatically generated R scripts that are used to create the plots; 4) an archive-oriented management tool for users to store, retrieve and modify figures; and 5) high-quality graphs based on the R packages lattice and ggplot2. The general architecture, running environment and statistical methods can be readily extended with the R class hierarchy. PKreport is free to download at http://cran.r-project.org/web/packages/PKreport/index.html.
The application of population pharmacokinetic (PopPK) modeling in drug development has grown over the past decade. It has numerous advantages over non-compartmental analysis: it accommodates unbalanced designs, models sparse data [1–3] and quantifies individual variability. However, these advantages increase model complexity, which demands additional care in interpreting the results and makes it harder to check how well the model fits the data. This paper describes an R package for generating reports for PopPK models that contain comprehensive summary statistics and graphics. Graphics play an important and unique role in PopPK model building through exploring hidden structure among data before modeling, evaluating model fit, and validating results after modeling [4–13].
The output of PKreport follows many of the recommendations in Ette's comprehensive tutorial on the application of graphics in PopPK modeling. By exploring distribution plots, scatter plots, residual plots, partial residual plots, pairs plots, conditional plots, contour plots and star plots, he extensively demonstrated the power of graphics in the field of PopPK. From a model perspective, Karlsson comprehensively investigated graphics-based assumption testing for PopPK models. In that paper, the authors described 22 assumptions for various situations during model development. Going through each stage of the model building process with graphics, Bonate gave a detailed demonstration of how graphics facilitate model building, using real PopPK examples.
In 1999, as a continuation of the work in 1998, Jonsson developed Xpose, a software tool that supports model building with graphics. Equipped with data set checkout plots, goodness of fit plots and tools for covariate model selection, this software has gained great popularity. Later, Wilkins created Census, a graphical user interface and management tool that helps Xpose diagnose models. In 2003, Monolix was developed as a Matlab program. Compared with NONMEM, it employs an alternative approach to calculating maximum likelihood estimators, based on the SAEM algorithm. Monolix provides a user-friendly graphical interface, a powerful and convenient PK/PD model library, goodness of fit plots, and a stand-alone version that does not require Matlab. PKreport further advances this work by providing automatically generated routine graphics, as required for example by the Food and Drug Administration (FDA).
PKreport provides 1) a flexible and efficient R class to store and retrieve NONMEM 7 output; 2) automated plots for users to visualize data and models; 3) automatically generated R scripts that are used to create the plots and that can be rerun later to reproduce the same or specific results; 4) an archive-oriented management tool for users to store, retrieve and modify figures; and 5) high-quality graphs based on the R packages lattice and ggplot2. The general architecture, running environment and statistical methods can be readily extended by the user.
The paper is organized as follows. The following section explains the methods implemented in the report. The third section focuses on the software implementation. The fourth section demonstrates how to use this package. The fifth section discusses the unique features of this package. The conclusions and future work are discussed in the final section.
Many authors [7, 8] have done extensive research in model assumption testing, and we follow their guidelines to automatically perform the following assumption testing: 1) exploratory data analysis; 2) goodness of fit plots; 3) parameter and random effects evaluation; 4) structural model diagnostics; 5) residual model diagnostics; 6) covariate model diagnostics. PKreport can be run on any subset of these methods, or on all of them.
Exploratory data analysis
Dose history, covariate information, and data from diverse clinical trials conducted in different arms or periods should be checked for correctness and accuracy before model construction. The data structure should be investigated to screen for hidden patterns, outliers and extreme observations linked to individuals for further analysis. Currently, histograms and scatter plots combined with conditional plots are implemented to help achieve these goals. Karlsson emphasized plots of each patient ID versus each variable in the data file, and Ette described exploratory examination of concentrations, distributions and correlations between covariates. All of these guidelines have been implemented in the PKreport package.
Goodness of fit plot
Goodness of fit plots play a key role in checking model fit. These plots give an overall perspective of model performance and include scatter plots of concentration versus PRED, concentration versus IPRED, PRED versus IDV and IPRED versus IDV. Most reports submitted to the FDA are required to explain the response of each patient. Individual plots of concentration/PRED/IPRED versus IDV can be explored for this purpose.
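The four goodness of fit scatter plots listed above can be sketched in base R as follows. This is a minimal illustration on simulated one-compartment data, not the code that PKreport itself generates; the column names (ID, TIME, DV, PRED, IPRED), the parameter values and the use of a temporary pdf file are all assumptions made for the example.

```r
# Simulated one-compartment data; every value here is illustrative only.
set.seed(1)
n_id  <- 12
times <- c(0.5, 1, 2, 4, 8, 12)
fit   <- expand.grid(TIME = times, ID = seq_len(n_id))

k_pop <- 0.2                                  # population elimination rate
k_ind <- k_pop * exp(rnorm(n_id, sd = 0.2))   # individual elimination rates
fit$PRED  <- 10 * exp(-k_pop * fit$TIME)              # population prediction
fit$IPRED <- 10 * exp(-k_ind[fit$ID] * fit$TIME)      # individual prediction
fit$DV    <- fit$IPRED * (1 + rnorm(nrow(fit), sd = 0.1))  # observed value

# Draw the four plots into a temporary pdf so the sketch also runs in batch mode.
gof_file <- tempfile(fileext = ".pdf")
pdf(gof_file)
op <- par(mfrow = c(2, 2))
plot(fit$PRED,  fit$DV, xlab = "PRED",  ylab = "DV"); abline(0, 1, lty = 2)
plot(fit$IPRED, fit$DV, xlab = "IPRED", ylab = "DV"); abline(0, 1, lty = 2)
plot(fit$TIME, fit$PRED,  xlab = "IDV (time)", ylab = "PRED")
plot(fit$TIME, fit$IPRED, xlab = "IDV (time)", ylab = "IPRED")
par(op)
invisible(dev.off())
```

Points scattered evenly around the dashed identity line in the first two panels indicate that predictions track the observations without systematic bias.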
Evaluate parameters and random effects
Generally, the modeling process makes assumptions about the distribution of parameters. Histograms are used to check these distributions. In addition, the correlation between parameters (clearance, volume of distribution, etc.) has a significant effect on model performance, and it is checked by scatter plots or a scatterplot matrix. The assumptions for random effects are likewise tested for distribution and correlation by histograms, scatter plots or a scatterplot matrix.
Diagnose structural models
The structural model describes the model without the covariates. In practice, three structural models are in popular use: 1-, 2-, and 3-compartment models with different absorption models. After determining the structural model, we can further build covariate models by incorporating relevant covariates. The structural model is diagnosed by PRED versus concentration conditioned on time, IPRED versus concentration conditioned on time, WRES versus time, WRES versus PRED, PRED versus concentration conditioned on covariates, and IPRED versus concentration conditioned on covariates.
Diagnose residual error models
Two assumptions are related to this submodel: 1) homoscedastic variability; 2) symmetrically distributed residuals. To test these assumptions, we apply the following techniques: 1) histogram for distributions of WRES; 2) histogram for individual distribution of WRES; 3) scatterplot of |WRES| versus PRED to check the shape of residual; 4) scatterplot of |WRES| versus PRED conditioned on covariates to screen the covariate effects; 5) autocorrelation of WRES.
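The numerical counterparts of these residual checks can be sketched in base R. The example below uses simulated residuals and assumed column names (ID, PRED, WRES); it illustrates the kind of quantity each plot displays and is not PKreport's own implementation.

```r
# Simulated residual table: 10 individuals with 8 observations each.
set.seed(2)
res <- data.frame(ID   = rep(1:10, each = 8),
                  PRED = rep(10 * exp(-0.2 * 1:8), times = 10))
res$WRES <- rnorm(nrow(res))   # ideal case: standard normal residuals

# 1) Distribution of WRES: bin counts should be roughly symmetric around zero.
h <- hist(res$WRES, plot = FALSE)

# 3) |WRES| versus PRED: a clear monotone trend would suggest
#    heteroscedastic variability.
trend <- cor(abs(res$WRES), res$PRED, method = "spearman")

# 5) Lag-1 autocorrelation of WRES, computed within each individual.
lag1 <- sapply(split(res$WRES, res$ID), function(w) {
  acf(w, lag.max = 1, plot = FALSE)$acf[2]
})

round(trend, 2)
round(lag1, 2)
```

The same quantities conditioned on a covariate (check 4) can be obtained by splitting `res` on that covariate before computing `trend`.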
Diagnose covariate models
In general, covariate models study how to incorporate covariates into the model such that the associated variability is reduced and the explanatory power of the model is enhanced. By linking subject-specific characteristics with model parameters, we can identify relevant covariates for the model. Parameters, ETA and WRES are of great use in screening for suitable covariates. We use the following methods to check covariate models: 1) scatter plots of parameters versus covariates, ETAs versus covariates, and WRES versus covariates; 2) a scatterplot matrix of covariates.
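A covariate screen of this kind can be sketched in base R as shown below. The data are simulated so that the random effect on clearance depends on body weight but not on age; the column names (WT, AGE, ETA_CL) and effect sizes are assumptions chosen for illustration.

```r
# Simulated covariates and a random effect that depends on weight only.
set.seed(3)
n <- 100
cov_data <- data.frame(WT  = rnorm(n, mean = 70, sd = 15),
                       AGE = rnorm(n, mean = 45, sd = 12))
cov_data$ETA_CL <- 0.02 * (cov_data$WT - 70) + rnorm(n, sd = 0.1)

# 1) ETA versus each covariate, summarized here by a correlation coefficient.
screen <- sapply(cov_data[c("WT", "AGE")],
                 function(z) cor(cov_data$ETA_CL, z))
round(screen, 2)

# 2) Scatterplot matrix of the covariates together with ETA,
#    written to a temporary pdf so the sketch runs in batch mode.
cov_file <- tempfile(fileext = ".pdf")
pdf(cov_file)
pairs(cov_data, main = "ETA and covariates")
invisible(dev.off())
```

A strong association between an ETA and a covariate, as for WT here, marks that covariate as a candidate for inclusion in the model.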
PKreport is an R package that aims to create an automatic pipeline for model assumption testing. Based on a hidden metric system that matches default modeling variables to data variables, this package turns the assumption testing discussed in the previous sections into a fast, convenient and comprehensive routine. With the support of two powerful R graphical packages (lattice and ggplot2), this software can generate high-quality figures for diagnosis, archive all figures in specific folders for report and review, and use the web browser as the interface for viewing, archiving and analyzing.
Package metric system
The metric system defines default modeling terms for quantities such as:
- Time after dose
- The concentration of drug in the body
- Prediction generated from model fitting
- Individual weighted residual
- Parameters, such as clearance, volume of distribution, etc.
- Independent variable (usually time)
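The idea of matching metric-system terms to the variables of a data set can be sketched as follows. The helper `match_terms` is hypothetical and written only for this illustration; it is not the package's PKdata function, and the example data and the choice of terms (ID, DV, IDV, COV) are assumptions.

```r
# Hypothetical helper: look up the data columns named for each term.
match_terms <- function(data, match.term) {
  wanted <- unlist(match.term, use.names = FALSE)
  missing_cols <- setdiff(wanted, names(data))
  if (length(missing_cols) > 0) {
    stop("columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  # One data frame per term; a term may map to several columns.
  lapply(match.term, function(cols) data[, cols, drop = FALSE])
}

# Example data whose column names differ from the default terms.
pdata <- data.frame(ID   = c(1, 1, 2, 2),
                    TIME = c(1, 2, 1, 2),
                    CONC = c(5.1, 3.2, 4.8, 2.9),
                    WT   = c(70, 70, 82, 82))

var.name <- list(ID = "ID", DV = "CONC", IDV = "TIME", COV = "WT")
matched  <- match_terms(pdata, var.name)
names(matched)
```

Once every term resolves to one or more columns, plotting code can be written against the terms alone and reused for output from any modeling software.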
The whole system is configured by three lists. 1) The graph list lets the user choose the figure format (jpg, pdf, png, etc.) and the graphical package; two popular packages for high-quality figures, lattice and ggplot2, are currently implemented. 2) The histogram list specifies the configuration of the histograms generated by this package, including the type of histogram and the layout setup. 3) The scatterplot list determines the type of scatter plot, the bandwidth of the smoother and the layout setup.
Architecture description and features
Design of nonmem class in R
nonmem class slots
nonmem class methods
Standard output in lst file.
Estimation method extracted from #METH tag in lst file.
Analysis information extracted between #TERM and #TERE tag in lst file.
Objective function extracted from #OBJT tag in lst file.
Objective function value extracted from #OBJV tag in lst file.
Objective function standard deviation extracted from #OBJS tag in lst file.
The title of tab file (the first line of the file) in tab file.
Output data starting from the second line in tab file. This is the main data for PKfigure.
Title (character) and data (data.frame) in cov file.
Title (character) and data (data.frame) in cor file.
Title (character) and data (data.frame) in coi file.
Title (character) and data (data.frame) in phi file.
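How an S4 class can hold values extracted from tagged lines of an lst file is sketched below. The class name `nmOutput`, its three slots, the accessor `nmMethod` and the simplified input lines are all hypothetical; the real nonmem class has more slots and methods, and its actual names are not reproduced here.

```r
library(methods)

# Hypothetical, reduced class: raw text, estimation method, objective value.
setClass("nmOutput",
         representation(lst    = "character",
                        method = "character",
                        objv   = "numeric"))

# Build an object by pulling the content that follows "#TAG:" on each line.
read_nm_output <- function(lines) {
  tagged <- function(tag) {
    hit <- grep(paste0("^\\s*#", tag, ":"), lines, value = TRUE)
    sub(paste0("^\\s*#", tag, ":\\s*"), "", hit)
  }
  objv_txt <- gsub("\\*", "", tagged("OBJV"))   # drop the asterisk padding
  new("nmOutput",
      lst    = lines,
      method = trimws(tagged("METH")),
      objv   = as.numeric(trimws(objv_txt)))
}

# Accessor method, in the style of an S4 generic.
setGeneric("nmMethod", function(object) standardGeneric("nmMethod"))
setMethod("nmMethod", "nmOutput", function(object) object@method)

# Simplified stand-in for two tagged lines of an lst file.
lst_lines <- c(" #METH: First Order Conditional Estimation",
               " #OBJV:********    123.456    ********")
out <- read_nm_output(lst_lines)
nmMethod(out)
out@objv
```

Storing each tagged section in its own typed slot means later code can ask for the objective function value directly instead of re-parsing the text output.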
The package will automatically store figures generated from graphic reports in the file system. The figures are categorized by the model diagnostics methods. If all methods are utilized for report, nine folders will be created with the proper figures. "univar" and "bivar" folders are for exploratory data analysis; "gof" folder is for goodness of fit; "struct" folder is for structural model diagnostics; "resid" folder is for residual model diagnostics; "para" folder is for parameter diagnostics; "cov" folder is for covariate model diagnostics; "eta" folder is for random effects diagnostics and "ind" folder is for individual plots.
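The folder layout described above can be reproduced with a few lines of base R. The folder names come from the text; the root location under `tempdir()` and the name `pk_archive` are assumptions for the sketch, since the package chooses its own location.

```r
# The nine archive folders, one per diagnostic category.
folders <- c("univar", "bivar", "gof", "struct", "resid",
             "para", "cov", "eta", "ind")

# Hypothetical archive root inside the session's temporary directory.
archive_root <- file.path(tempdir(), "pk_archive")
for (f in folders) {
  dir.create(file.path(archive_root, f),
             recursive = TRUE, showWarnings = FALSE)
}

# List what was created.
created <- list.dirs(archive_root, full.names = FALSE, recursive = FALSE)
sort(created)
```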
The figure format is specified by the save.format option of the PKconfig function, which currently supports png, bmp, jpeg, and tiff. png files are automatically generated for the html report. After analysis, the figures are stored in the proper folders in the specified file formats.
The figure archives can be deleted with the PKclean function. During analysis, if users work through the diagnostic methods step by step, the archives are cleaned automatically unless the clean option of the PKfigure function is set to FALSE.
All R code for the figures is automatically generated in the report. Each R code entry consists of two comments and one script. The first comment gives the folder name for the figure and the figure ID matching the graphical report. The second comment gives the title of the figure. The R script can be run to regenerate the figure for further use. In addition, all the R code is stored as a text file (PKcode.txt) in the current R working directory.
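The structure of one such code entry, two comments followed by one script, can be sketched as follows. The helper `make_entry` and the exact wording of the comments are hypothetical, and the file is written to the temporary directory rather than the working directory so that the sketch leaves no files behind.

```r
# Hypothetical helper that assembles one entry: two comments and one script.
make_entry <- function(folder, fig_id, title, script) {
  c(sprintf("# folder: %s, figure ID: %s", folder, fig_id),
    sprintf("# title: %s", title),
    script)
}

entry <- make_entry(folder = "resid",
                    fig_id = 1,
                    title  = "Histogram of WRES",
                    script = "hist(pdata$WRES, main = 'Histogram of WRES')")

# Store the entry as plain text, as the package does with PKcode.txt.
code_file <- file.path(tempdir(), "PKcode.txt")
writeLines(entry, code_file)

# The stored text is valid R: the comments are skipped by the parser
# and the script line is read as a single expression.
parsed <- parse(file = code_file)
length(parsed)
```

Because each entry is ordinary R source, an experienced user can copy it from the file, adjust the plotting arguments and rerun it.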
This package supports a flexible pipeline for reporting and analyzing outputs from NONMEM 7. It includes data input, data configuration, model diagnostics, report generation and data cleaning.
The raw data can be output from NONMEM, Monolix or SAS. For NONMEM 7, this package requires the standard output (lst file) and the fitting results (tab file). It also works with the new files generated only by this NONMEM version, such as the cor, cov, coi and phi files. For Monolix, SAS, and other versions of NONMEM, this package requires only the fitting results. The main function is as follows:
> myNonmemObj <- new("nonmem",
The objectives of this step are twofold. First, users can set up global parameters for this package, including the choice of graphic package, the figure configuration and the saving format. Second, users are required to link the package metric system to the variables in the data for further model diagnostics.
# First: setup global configuration
> PKconfig(general.list, hist.list, scatter.list)
# Second: match metric system
> PKdata(data=pdata, match.term=var.name)
The main goal of this step is to generate figures for model diagnostics. It performs the following model assumption testing: exploratory data analysis, goodness of fit plots, parameter and random effects evaluation, structural model diagnostics, residual model diagnostics, and covariate model diagnostics.
# residual model diagnostics
> PKfigure(pdata, 5)
This step generates two types of reports: a numeric report and a graphical report. Depending on the data available, the package can generate either the graphical report alone or both reports.
# generate both numeric report
# and graphical report
# generate only graphical report
This step helps to clean the R environment and delete the figure archives.
One data set from NONMEM was fitted with a one-compartment model and used to demonstrate PKreport. Three examples illustrate how to use this package. The first example describes how to generate a simple graphical report. It works for NONMEM, Monolix, and SAS. The second example demonstrates how to generate a complex report that includes both a graphical report and a numeric report. It only works for NONMEM 7. The last example focuses on the nonmem class and explains how to conveniently retrieve NONMEM 7 output.
This example demonstrates how to generate a simple report (a graphical report only, with no numeric report). By supplying a simple fitted result, users can generate model diagnostics as a graphical report.
> var.name <- list(ID="ID", DV="CONC",
> PKdata(pdata, match.term=var.name)
> PKfigure(pdata, c(3,6,8))
The NONMEM 7 output directory is c:\nnonmem7, and it includes lst, tab, cov, cor, coi and phi files. We would like to generate a complete report, including both the graphical report and the numeric report. To create this report, we first need to create an instance of the nonmem class.
> myclass <- new("nonmem",
> var.name <- list(ID="ID", DV="DV",
ETA=c("ETA5", "ETA1"), COV="TIME",
> pdata <- myclass@tabdata
> PKdata(data=pdata, match.term=var.name)
> PKfigure(pdata, 3)
> PKfigure(pdata, 7, FALSE)
In this example, we demonstrate how to use the nonmem class to access the NONMEM output.
 "First Order Conditional Estimation
# select lines from 50 to 56 in lst file
> exp.data <- non.select(myclass, c(50:56))
> options(scipen = 100)
[,1] [,2] [,3]
[1,] 1 44.80 1000000.00
[2,] 1 410.00 1000000.00
[3,] 0 0.25 1000000.00
[4,] 6 17.00 1000000.00
[5,] 0 0.28 1000000.00
[6,] 0 0.50 0.95
[7,] 0 0.50 1000000.00
> options(scipen = -100)
[,1] [,2] [,3]
[1,] 1e+00 4.48e+01 1.0e+06
[2,] 1e+00 4.10e+02 1.0e+06
[3,] 0e+00 2.50e-01 1.0e+06
[4,] 6e+00 1.70e+01 1.0e+06
[5,] 0e+00 2.80e-01 1.0e+06
[6,] 0e+00 5.00e-01 9.5e-01
[7,] 0e+00 5.00e-01 1.0e+06
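The two printouts above differ only in the `scipen` option, which biases R towards fixed notation when positive and towards scientific notation when negative. The sketch below isolates that effect with a small hypothetical helper, `show_with_scipen`, which restores the previous setting on exit. It uses -9 rather than -100 for the negative case; for the values shown, either setting selects scientific notation.

```r
# Hypothetical helper: format a value under a temporary scipen setting.
show_with_scipen <- function(x, scipen) {
  old <- options(scipen = scipen)
  on.exit(options(old))   # restore the caller's setting
  format(x)
}

show_with_scipen(1e6, 100)   # fixed notation:      "1000000"
show_with_scipen(1e6, -9)    # scientific notation: "1e+06"
show_with_scipen(0.5, -9)
show_with_scipen(44.8, -9)
```

Wrapping the option change in a function keeps the rest of the session's printing unaffected.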
In this study, we developed an R package, PKreport, as a comprehensive exploratory tool for diagnosing population pharmacokinetic models. It targets audiences working with population pharmacokinetic models, particularly professionals who have only basic knowledge of R and lack statistical expertise. PKreport is available in an open-source environment. Based on questions and suggestions from users, we will continue to update it and make it more useful to the community.
Although it is an R package similar to Xpose, PKreport has the following unique features. PKreport is an exploratory report tool rather than a fine-tuned graphical tool. The main objective of this software is to provide a comprehensive view of the data, the model, and the relationship between them through an automatic pipeline for generating reports. Pharmacologists often prefer a specialized graphical user interface, which in fact limits and can even contradict the spirit of discovery research. The idea of discovery is the motivation behind this package. Rather than following some assumed direction, a systematic and complete model report helps users gain a deep understanding of the project. Xpose, on the other hand, is more of a fine-tuned graphical tool for addressing specific research questions.
In addition, this package automatically generates the R scripts for the plots. This feature allows experienced users to refine the plots further and thus largely spares them repetitive work. For users who do not have expertise in statistics or R, this package can generate all required diagnostic plots with several commands and a few arguments. Any plot can be produced and any parameter calculated in R or Matlab; the real problem is the cost in time and energy, and no one wants to repeat this work for each new modeling project. Finally, the software design, including the report interface based on the web browser, the separate stand-alone diagnostic modules, and the flexible archive structure for plot management, makes the package convenient for users.
Furthermore, we proposed and developed an S4 class, nonmem, specifically to match the new release of NONMEM 7. In the new release, the standard result files are modified and formatted with particular tags to identify the various sections. Some additional files are also generated automatically, including the variance-covariance matrix (cov file), correlation matrix (cor file), inverse covariance matrix (coi file) and individual phi parameters and variances (phi file). The new nonmem class provides an efficient way to access these output files from NONMEM 7. It includes twelve slots and thirteen methods to access the estimation method, analysis information, objective function, and the title and data of the tab, cov, cor, coi and phi files. This package can accept model fits from diverse software, including NONMEM, Monolix and R nlme. By importing the model fit file (for example, the tab file in NONMEM) and matching software-specific variables to the default modeling variables in the metric system, PKreport can explore, visualize and diagnose models from all these software platforms. NONMEM and Monolix both provide some basic diagnostic plots for their fitting results; however, as emphasized before, PKreport serves as a comprehensive exploratory tool for the data and the model. It will be a beneficial and complementary tool to these software packages.
PKreport is an R package that generates a collection of plots and statistics for testing model assumptions, visualizing data and diagnosing models. It provides a flexible and efficient R class to store and retrieve NONMEM output. In addition, it can generate a numeric report and a graphical report for users to diagnose PopPK models. The general architecture, running environment and statistical methods can be easily extended to include more automatic diagnostics in the development of PopPK models.
Availability and requirements
Project name: PKreport
Project home page: http://cran.r-project.org/web/packages/PKreport/index.html
Operating system(s): Platform independent
Programming language: R
Other requirements: R packages (lattice, ggplot2)
License: GNU GPL
Any restrictions to use by non-academics: Licence needed
This work was fully funded by an unrestricted fellowship from Novartis. We thank Dr. Di Cook for helpful discussions and advice on this work.
- Samara E, Granneman R: Role of population pharmacokinetics in drug development - A pharmaceutical industry perspective. Clinical Pharmacokinetics. 1997, 32 (4): 294-312. 10.2165/00003088-199732040-00003.
- Sun H, Fadiran E, Jones C, Lesko L, Huang S, Higgins K, Hu C, Machado S, Maldonado S, Williams R, Hossain M, Ette E: Population pharmacokinetics - A regulatory perspective. Clinical Pharmacokinetics. 1999, 37 (1): 41-58. 10.2165/00003088-199937010-00003.
- Ette E, Williams P, Lane J: Population pharmacokinetics - III: Design, analysis, and application of population pharmacokinetic studies. Annals of Pharmacotherapy. 2004, 38 (12): 2136-2144. 10.1345/aph.1E260.
- Cleveland WS: Visualizing Data. 1993, Hobart Press, 1.
- Karlsson M, Beal S, Sheiner L: Three new residual error models for population PK/PD analyses. Journal of Pharmacokinetics and Biopharmaceutics. 1995, 23 (6): 651-672. 10.1007/BF02353466.
- Ette E, Ludden T: Population pharmacokinetic modeling: The importance of informative graphics. Pharm Res. 1995, 12 (12): 1845-1855. 10.1023/A:1016215116835.
- Karlsson M, Jonsson E, Wiltse C, Wade J: Assumption testing in population pharmacokinetic models: Illustrated with an analysis of moxonidine data from congestive heart failure patients. Journal of Pharmacokinetics and Biopharmaceutics. 1998, 26 (2): 207-246. 10.1023/A:1020561807903.
- Ette E: Statistical graphics in pharmacokinetics and pharmacodynamics: A tutorial. Annals of Pharmacotherapy. 1998, 32 (7-8): 818-828.
- Ette E, Williams P, Sun H, Fadiran E, Ajayi F, Onyiah L: The process of knowledge discovery from large pharmacokinetic data sets. Journal of Clinical Pharmacology. 2001, 41 (1): 25-34. 10.1177/00912700122009809.
- Petricoul O, Claret L, Barbolosi D, Iliadis A, Puozzo C: Information tools for exploratory data analysis in population pharmacokinetics. Journal of Pharmacokinetics and Pharmacodynamics. 2001, 28 (6): 577-599. 10.1023/A:1014464505261.
- Brendel K, Comets E, Laffont C, Laveille C, Mentre F: Metrics for external model evaluation with an application to the population pharmacokinetics of gliclazide. Pharmaceutical Research. 2006, 23 (9): 2036-2049. 10.1007/s11095-006-9067-5.
- Karlsson MO, Savic RM: Diagnosing model diagnostics. Clinical Pharmacology and Therapeutics. 2007, 82 (1): 17-20. 10.1038/sj.clpt.6100241.
- Ene I, Ette PJW: Pharmacometrics: the science of quantitative pharmacology. 2007, Wiley-Interscience, 1.
- Bonate P: Pharmacokinetic-Pharmacodynamic modeling and simulation. 2005, Springer, 1.
- Jonsson E, Karlsson M: Xpose - an S-PLUS based population pharmacokinetic/pharmacodynamic model building aid for NONMEM. Computer Methods and Programs in Biomedicine. 1999, 58 (1): 51-64.
- Wilkins J: Census. 2005, [Http://census.sourceforge.net/]
- The Monolix group: Monolix. 2010, [Http://www.monolix.org/]
- Sarkar D: Lattice: multivariate data visualization with R. 2008, Springer, [http://lmdvr.r-forge.r-project.org/]
- Wickham H: ggplot2: elegant graphics for data analysis. 2009, Springer New York, [http://had.co.nz/ggplot2/book]
- European Medicines Agency: Guideline for reporting the results of population pharmacokinetic analyses. 2007, [Http://www.deepdyve.com/lp/sage/european-medicines-agency-emea-committee-for-medicinal-products-for-M2HGjYqXWr]
- Boeckmann AJ, Sheiner LB, Beal SL: NONMEM users guide. NONMEM Project Group, University of California, San Francisco. 1994.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6947/11/31/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.