S-01 Gregory Ferl

Literate Programming Methods for Clinical M&S

Gregory Z. Ferl

Genentech, Inc.

Objectives: Develop an effective Literate Programming/Reproducible Research environment for Modeling and Simulation of clinical data that serves as a “back-end” to NONMEM analysis.

Methods: Literate Programming [1] (LP) is a computer programming methodology developed by Donald Knuth, the computer scientist who created the TeX typesetting system. His first implementation of LP combined the programming language Pascal with TeX and was called WEB, “…partly because it was one of the few three-letter words that hadn’t already been applied to computers” [1]. LP makes the process of writing, running and evaluating computer programs easier for people to understand. To see why, consider a typical data analysis workflow, which has many steps: 1) write a computer program, 2) run the program on data, 3) generate summary figures and tables, 4) paste the figures and tables into a final report written in a word processor. In effect, we write a program that instructs a computer what to do with our data; then, in a series of separate manual steps, we translate the relevant output into a form that a person can read (a final document containing text, figures and tables). The LP approach automates these translation steps by formally combining, in a single file, the code that performs the analysis with a document preparation system. With LP, the final report is regenerated each time a data set is analyzed and automatically reflects any changes made to the analysis methods or the data set. Key pieces of code can (and should) be included automatically within prose sections of the final report, where they are printed exactly as they appear in the source. Thus, a reader can verify exactly how the results presented in a report were generated and can easily update the report if the analysis method or data set changes.
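To make the idea of combining analysis code and report text in one file concrete, the following is a minimal sketch in Python of a toy "weave" step. It is not Sweave or WEB: the chunk delimiters (`<<name>>=` ... `@`) and the inline `\Sexpr{...}` marker are borrowed from Sweave's notation for illustration only, and the function name `weave`, the report text and the clearance values are all hypothetical.

```python
import contextlib
import io
import re

# One pattern for both kinds of embedded code, so they are handled in document order:
#   - a code chunk: a line "<<name>>=" up to a line containing only "@"
#   - an inline expression: \Sexpr{...}
TOKEN = re.compile(
    r"^<<[^>]*>>=\n(?P<chunk>.*?)^@\n|\\Sexpr\{(?P<expr>[^}]*)\}",
    re.S | re.M,
)


def weave(source):
    """Run the code embedded in `source` and return the finished report text."""
    env = {}  # namespace shared by all chunks and inline expressions

    def render(match):
        code = match.group("chunk")
        if code is not None:
            captured = io.StringIO()
            with contextlib.redirect_stdout(captured):
                exec(code, env)
            # Print the code exactly as written, followed by anything it printed.
            return code + captured.getvalue()
        # Inline expression: replace the marker with the computed value.
        return str(eval(match.group("expr"), env))

    return TOKEN.sub(render, source)


# Hypothetical report source with made-up clearance values.
REPORT = r"""Methods: clearance values are averaged.
<<summary>>=
cl = [4.0, 5.0, 6.0]
mean_cl = sum(cl) / len(cl)
@
Results: mean clearance was \Sexpr{mean_cl} L/h (n = \Sexpr{len(cl)}).
"""

print(weave(REPORT))
```

If a value in `cl` is changed and the script is rerun, the Results sentence changes with it, which is the property the LP approach relies on. A real tool would also typeset the output and handle figures and tables; this sketch only substitutes text.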

Results: Here, we describe an implementation of LP for population modeling analysis of clinical imaging data using Sweave [2], a tool that allows one to use NONMEM [3], R [4] and LaTeX [5] within the literate programming environment. Using simulated data, we illustrate how a detailed population PK/PD report and companion slide decks may be generated and updated on the fly as data and analysis methods change. The LP workflow is driven by a single Sweave master file containing code for population modeling (NONMEM), data post-processing and generation of graphs (R), and markup used to generate a comprehensive PDF report (LaTeX). If at any time we decide to alter the analysis, for example by adding or removing patients from the data set or changing an equation, all that is needed is to modify the data set and/or the master LP file accordingly and rerun it to generate a completely updated report.

Literate Programming workflow. The Sweave master file contains the NONMEM control file, R code and LaTeX markup for document text/layout:

Sweave Master File & NONMEM data file → Run Sweave → PDF report
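The "Run Sweave" step in the workflow above can be sketched as a small Python driver. This is an illustration under stated assumptions, not the authors' setup: the master file name `report.Rnw` and the function names are hypothetical, R and pdflatex are assumed to be on the PATH, and any NONMEM run is assumed to be launched from within the master file itself, which this sketch does not show.

```python
import subprocess
from pathlib import Path


def build_commands(master):
    """Return the two commands that turn a Sweave master file into a PDF."""
    master_path = Path(master)
    # R CMD Sweave writes the generated .tex file to the working directory.
    tex_name = master_path.with_suffix(".tex").name
    return [
        ["R", "CMD", "Sweave", str(master_path)],  # run the code chunks, emit LaTeX
        ["pdflatex", tex_name],                    # typeset the LaTeX into a PDF
    ]


def run_workflow(master):
    """Execute each step in order, stopping at the first failure."""
    for command in build_commands(master):
        subprocess.run(command, check=True)


# Show the commands for a hypothetical master file without executing them.
for command in build_commands("report.Rnw"):
    print(" ".join(command))
```

Calling `run_workflow("report.Rnw")` after editing the data set or the master file would regenerate the report from scratch. Documents with cross-references may need a second pdflatex pass.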

Conclusions: Our Literate Programming approach facilitates construction of generic templates that can be used to analyze any clinical data set with the appropriate format/structure, creating a report that can be easily reproduced, effectively archived and/or passed on to collaborators for further analysis.

References:
[1] Knuth D. Literate Programming. Computer Journal 1984;27(2):97–111.
[2] Leisch F. Sweave: Dynamic generation of statistical reports using literate data analysis. In: Härdle W, Rönz B, editors. COMPSTAT 2002: Proceedings in Computational Statistics. Heidelberg, Germany: Physica-Verlag 2002. p. 575–580.
[3] Beal S, Sheiner LB, Boeckmann A, Bauer RJ. NONMEM User’s Guides (1989-2009). Ellicott City, MD, USA 2009.
[4] R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria 2011. ISBN 3-900051-07-0.
[5] Lamport L. LaTeX – A Document Preparation System: User’s Guide and Reference Manual, Second Edition. Addison-Wesley Professional 1994.

Reference: PAGE 22 (2013) Abstr 2951 [www.page-meeting.org/?abstract=2951]

Poster: Software Demonstration