Henning Schmidt, Heinz Hegi
IntiQuan GmbH, Basel, Switzerland
Introduction: Pharmacometric reports typically contain a large number of figures, tables, text files, in-line text, and numerical values that have been generated during the conduct of analyses. While these results are often generated in a script-based manner, reporting them is complicated by corporate requirements to use Microsoft Word and a specific in-house style guide. As a result, scientists have to copy, paste, and format results manually, which is time-consuming and error-prone.
Wilkins and Jonsson [1] presented an approach to Reproducible Pharmacometrics that allowed full end-to-end scripting of the analysis and the creation of publication-ready reports. Their solution was based on LaTeX [2], knitr [3], and R [4]. While this approach is very powerful, LaTeX can be perceived as a cumbersome language. More importantly, the output is a PDF rather than a Word document, which often does not comply with corporate publication policy. In recent years, R Markdown [5] has been developed, which allows the more lightweight Markdown language to be used, and RStudio [6] supports the conversion to Word. However, R Markdown offers no support for critical Word features, such as cross-references, figure and table captions, and the choice of more complex Word styles, which impedes the generation of submission-ready Word documents.
In this poster, a Markdown-based framework is presented that leverages open-source tools to create submission-ready Word reports in any desired Word style and to integrate analysis results automatically, without the need for manual copy and paste.
Methods: The development of the reporting framework requires the combination of several elements. One of the most important is the definition of the language in which the report is written. Markdown [7] was selected as a basis, and its syntax was extended with additional elements, such as tags for page breaks, landscape and portrait mode, the definition of figures and tables with their captions and legends, and cross-references. The conversion of Markdown to Word DOCX is performed by Pandoc [8]. Before Pandoc can process the document, the figures and tables need to be pre-processed and the extended Markdown converted to normal Markdown. The figures are processed with Ghostscript [9] and ImageMagick [10], which involves scaling, cropping, and conversion from PDF to a format suitable for import into Word. After the Pandoc conversion, the resulting Word document needs to be post-processed to ensure that the correct styles are used throughout the final Word report. The mapping of styles to report elements is defined in a template settings file. Simple interface functions were developed for R and other scripting environments. They export matrices, tables, and data frames into a text-based format that serves as a container for tables to be imported into Word, and they execute the framework on a user-provided extended Markdown document. In addition, these interfaces generate information that links the report elements to the scripts and outputs generated during the analysis, leading to full traceability.
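The pre-processing step described above can be illustrated with a minimal Python sketch. It is not the framework's actual implementation: the abstract does not specify the extended tag syntax, so the `{{figure: ...}}` and `{{ref: ...}}` tags, the function names, and the file names are all hypothetical. The sketch only shows the general idea of numbering figures, resolving cross-references, and emitting normal Markdown that Pandoc can convert.

```python
import re

# Hypothetical tag syntax, invented for this sketch (not the framework's own):
#   {{figure: path | caption: text | label: name}}
#   {{ref: name}}
FIGURE_TAG = re.compile(
    r"\{\{figure:\s*(?P<path>[^|}]+?)\s*"
    r"\|\s*caption:\s*(?P<caption>[^|}]+?)\s*"
    r"\|\s*label:\s*(?P<label>[^|}]+?)\s*\}\}"
)
REF_TAG = re.compile(r"\{\{ref:\s*(?P<label>[^}]+?)\s*\}\}")


def preprocess(extended_md: str) -> str:
    """Convert the hypothetical extended tags to normal Markdown."""
    numbers = {}  # label -> figure number, in order of appearance

    # First pass: number all figures so references may precede them.
    for match in FIGURE_TAG.finditer(extended_md):
        numbers.setdefault(match.group("label"), len(numbers) + 1)

    def figure_to_md(match):
        n = numbers[match.group("label")]
        return f"![Figure {n}: {match.group('caption')}]({match.group('path')})"

    def ref_to_md(match):
        label = match.group("label")
        if label not in numbers:
            raise KeyError(f"unknown cross-reference label: {label}")
        return f"Figure {numbers[label]}"

    # Second pass: replace figure tags first, then cross-reference tags.
    plain = FIGURE_TAG.sub(figure_to_md, extended_md)
    return REF_TAG.sub(ref_to_md, plain)


def pandoc_command(md_file: str, docx_file: str, reference_docx: str) -> list:
    """Build (but do not run) a Pandoc call for Markdown to DOCX.

    Using --reference-doc for styling is an assumption of this sketch;
    the abstract describes a custom post-processing step instead.
    """
    return ["pandoc", md_file, "-o", docx_file,
            f"--reference-doc={reference_docx}"]
```

For example, `preprocess("See {{ref: gof}}.")` raises a `KeyError` because no figure with that label exists, while a document that defines the figure later still resolves the reference, since figure numbers are assigned in a first pass. The command list returned by `pandoc_command` could be passed to `subprocess.run` once Pandoc is installed.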
Results: The resulting Markdown-based framework has been applied in the creation of several Word reports for submissions to Health Authorities. Its user-friendliness has been greatly improved by employing the text editor Notepad++ [11], which provides syntax highlighting for the extended Markdown language and a customizable context menu that appears on a right-click. This context menu makes it possible to insert frequently used report text and (extended) Markdown elements into the report document, which speeds up both report writing and learning of the syntax.
Conclusions: The approach uses existing technology that is not particularly difficult to use but does require some custom scripting to link the different tools. This scripting, however, does not need to be performed by the end-user. The work demonstrates that increased accuracy, efficiency, and credibility, the elimination of transcription errors, and full traceability and reproducibility, extending to the resulting Word report, are possible.
References:
[1] Wilkins J, Jonsson EN (2013) Reproducible pharmacometrics, PAGE Meeting, Glasgow, Scotland
[2] Lamport L (1986) LaTeX: A Document Preparation System, Addison-Wesley, Reading, Mass.
[3] knitr, https://yihui.name/knitr/
[4] R Development Core Team (2008) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0
[5] R Markdown, https://rmarkdown.rstudio.com/
[6] RStudio, https://www.rstudio.com/
[7] Markdown, https://daringfireball.net/projects/markdown/
[8] Pandoc: a universal document converter, https://pandoc.org/index.html
[9] Ghostscript: an interpreter for the PostScript language and for PDF, https://www.ghostscript.com/
[10] ImageMagick, https://www.imagemagick.org
[11] Notepad++, https://notepad-plus-plus.org/
Reference: PAGE 27 (2018) Abstr 8498 [www.page-meeting.org/?abstract=8498]
Poster: Methodology - Other topics