Developing a Reliable R Package for Regulatory Purposes: The RPACT Case Study

Research Colloquium, IMBS, Lübeck

Dr. Friedrich Pahlke and Dr. Gernot Wassmer

RPACT

2024-06-19

My Career Started Here

  • IMBS - Institut für Medizinische Biometrie und Statistik (Institute of Medical Biometry and Statistics)
  • 2004 - 2008: Scientific assistant
  • My office at the IMBS, 2005:

  • Gernot Wassmer
  • Friedrich Pahlke
  • Carrie Li
  • Daniel Sabanés Bové

Motivation: Why validate an R package?

Why validate an R package?

Software validation is essential for the approval of new drugs and treatments.

  • Regulatory authorities require organizations to ensure that their computerized systems are reliable and accurate
  • Software used in clinical trials must be validated to meet these regulatory requirements

Why is GxP-compliant validation important?

  • Non-validated software is more likely to be faulty
  • What are the technical risks of faulty software?
    • Incorrect planning and/or analysis of trials or trial data
    • Non-reproducible results
    • Unreliable results

  • What are the non-technical risks of faulty software?
    • Low scientific acceptance of the results
    • Incorrect decisions about the safety and efficacy of drugs or treatments
    • The clinical trial may be delayed and may result in additional costs
    • Existing effectiveness may not be proven \(\rightarrow\) wasted money
    • Existing non-effectiveness may not be recognized \(\rightarrow\) missed futility stop

What are the targets of GxP-Compliant Validation?

  • Risk management and minimization: Identify and mitigate potential risks associated with the use of the software
  • Increase the trust of authorities and stakeholders:
    • By adhering to GxP standards, the organization demonstrates its commitment to quality and accuracy
    • Strengthens the trust of regulatory authorities, sponsors, and other stakeholders in the reliability of the trial results

R Package Validation

What does GxP-compliant validation mean for R packages?

  • GxP-compliant validation of R packages used for the planning and analysis of clinical trials is a critical step
  • The FDA has stated that the R packages used must be “reliable”
  • This can be demonstrated through traditional formal validation, as shown with “rpact”
  • Or through a less formal “Risk Assessment” approach, as promoted by the R Validation Hub (pharmar.org/risk)

The “rpact” Case Study: Developing a Reliable R Package for Regulatory Purposes

  • 2017: Idea

  • 2024: “rpact” and RPACT are trusted and widely accepted

Status 2024: RPACT in Numbers

  • 1,500,000 € sponsorship money
  • 61,000 CRAN downloads total; 1,400 downloads per month (June 2024)
  • 36,593 lines of code
  • 34,427 unit tests covering 29,701 lines of code (see codecov.io)
  • 20,000 hours of work
  • 98 separate source code files: Modular software design
  • 29 vignettes published on rpact.org/vignettes
  • 28 releases on CRAN since 2018

  • 20 SLA sponsors: “The RPACT User Group”
  • 8 websites: CRAN, rpact.com, rpact.org, …
  • 4 CI/CD pipelines on GitHub to automate checks, tests, code coverage calculation, and GitHub pages creation
  • 2 main developers; 3 contributors; feedback and feature requests from many different users and companies
  • 2 required dependencies: Rcpp and R6
  • 2 rpact Shiny apps: rpact Shiny app and RPACT Cloud
  • 1 developer platform: GitHub is used for issues, comments, feature requests

Status 2024 - (#29) RPACT Vignettes

Status 2024 - (#20) The RPACT User Group

  • Boehringer Ingelheim
  • Metronomia Clinical Research
  • F. Hoffmann-La Roche
  • Dr. Willmar Schwabe
  • Bayer
  • Merck
  • AbbVie
  • Dr. Falk Pharma
  • Klifo
  • FGK Clinical Research
  • UCB
  • GKM
  • Parexel
  • Nestlé
  • Janssen (Johnson & Johnson)
  • Novartis
  • PPD (Thermo Fisher Scientific)
  • Sanofi
  • Pfizer
  • Gilead

Status 2024 - (#8) Websites

Further information, installation, and usage:

Project Challenges 2017

1. Funding of an open source project

  • How to earn money with open source software?
  • Is there a sustainable business model?
  • Contract drafting?

2. Realization with a small team

  • Collaboration:
    • Planning, coordination and development over long distances at different locations
    • How to manage the project efficiently?
    • How to track bugs and feature requests?
  • What is the best integrated development environment (IDE)?
    \(\rightarrow\) RStudio, Eclipse, VS Code, or …?
  • What clean code rules to use for R packages?
    \(\rightarrow\) Generally accepted guidelines were missing

3. Development of a sustainable user concept

  • Usability:
    • The software must be easy to learn and use
    • Goal: A very high user acceptance
  • Consistency:
    • How to name functions and arguments?
    • What is the best output format?
      A list() is not enough…

2024: What Our Users Say About RPACT

  • “One of the best software and team in the field of adaptive design!”
    (Senior Director of Statistics)
  • “rpact is by far the easiest to use.”
    (Professor, Human-Technology Interaction Group)
  • “RPACT is just amazing.” (Biostatistician)
  • “We are impressed by the high quality of the package and the excellent support by rpact.” (Biostatistics director of a pharmaceutical company)
  • “[We] exclusively use rpact, complemented with a huge internal webportal of supporting code, documentation, internal case studies, repository of health authority questions, etc. for all clinical trial design purposes” (see DOI)
  • “Excellent package! Many thanks.” (Biostatistician)

4. Development of a reliable and sustainable validation concept

  • High download rates are a common quality criterion for open source software
  • Our initial expectation: rpact would remain a niche software with low download rates
  • Application of good software engineering rules was not clear for R packages

Challenge 1: Funding

  • Crowdfunding
    \(\rightarrow\) 10 pharma companies and CROs agreed to sponsor the project
  • Service Level Agreement
    \(\rightarrow\) Together we developed a simple contract: software support and training; no software development!
  • RPACT was founded as a GbR (German civil-law partnership) \(\rightarrow\) Easiest and fastest solution for freelancers

Challenge 2: Realization with a small team

  • Project management
    \(\rightarrow\) Atlassian JIRA, GitHub
  • Bug tracking
    \(\rightarrow\) Online form, GitHub Issues
  • Validation
    \(\rightarrow\) Automation wherever possible

Challenge 3: Sustainable user concept

  • Usability:
    • Many default arguments \(\rightarrow\) Getting started is much easier
    • Support of R generics \(\rightarrow\) print(), plot(), summary(), names(), …
    • Inline help and documentation \(\rightarrow\) roxygen2
    • Vignettes \(\rightarrow\) Practical examples and tutorials
  • Consistency:
    • Lower camel case names
    • Clear output format: structured and meaningful
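The usability principles above can be sketched with plain base-R S3 generics (an illustration only: rpact itself is built on R6, and getToyDesign() and its fields are invented for this sketch):

```r
# Illustrative sketch of the rpact user concept with base-R S3 generics.
# rpact itself uses R6 classes; the class and field names here are invented.

getToyDesign <- function(kMax = 3, alpha = 0.025) {
  # Many default arguments -> getting started is easy: getToyDesign() works
  structure(list(kMax = kMax, alpha = alpha), class = "ToyDesign")
}

# Support of R generics: print(), summary(), names() behave as users expect
print.ToyDesign <- function(x, ...) {
  cat("Toy design with", x$kMax, "stages, alpha =", x$alpha, "\n")
  invisible(x)
}

summary.ToyDesign <- function(object, ...) {
  cat("Sequential design:", object$kMax, "looks at level", object$alpha, "\n")
  invisible(object)
}

d <- getToyDesign()   # defaults only: nothing to configure for a first try
print(d)
names(d)              # "kMax" "alpha"
```

Because print(), summary(), and names() work exactly as they do elsewhere in R, there is almost nothing new to learn before the first design can be computed.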

Challenge 4: Validation Concept

  • Formal validation inspired by GAMP 5 principles
  • As few dependencies as possible because we cannot validate other R packages
  • We assume that base/core R is validated/reliable (see R-FDA.pdf)
  • What does formal validation mean?
    • Experience from other projects: extensive documentation, huge paperwork, lots of manual work, lots of signatures
  • High levels of automation \(\rightarrow\) In parallel, a robust validation framework had to be developed

Our R Package Validation Concept

RPACT Validation Concept

  • CRAN release as part of the validation concept
    • Strict CRAN package checks
  • High test coverage:
    • Generally accepted threshold is \(\geq\) 80%
    • Usage of covr and codecov.io
  • Comprehensive validation documentation
  • Automatic SOP checks
  • CI/CD pipeline on GitHub to automate checks, tests, code coverage calculation, and GitHub pages creation

  • Template-based unit tests:
    • Automatic generation of testthat test cases
    • Automatic creation of test plans and references to function specifications
    • Automatic creation of test protocols directly linking to individual test cases

  • Risk Assessment
  • Levels:
    • High Risk
    • Medium Risk
    • Low Risk

Risk Assessment Level “High Risk”

High Risk: Functions and calculations that directly or indirectly affect the planning and analysis of clinical trials. These involve decisions regarding the safety and effectiveness of drugs and treatments. Errors or inaccuracies in these functions could lead to incorrect conclusions about the efficacy or safety of a clinical intervention, potentially causing harm to patients or misguiding regulatory decisions.

Risk Assessment Level “Medium Risk”

Medium Risk: Functions and operations that support the main analytical procedures but do not directly influence the critical outcomes of clinical trial planning and analysis. These could include data preparation, intermediate statistical methods that inform but do not determine the final analysis, or other support functions that facilitate the primary objectives of the package without being directly tied to decision-making about treatments.

Risk Assessment Level “Low Risk”

Low Risk: Utility functions for specific output formats (e.g., R Markdown) or user-specific customizations that do not fall into the other risk categories. These functions assist in the presentation and documentation of results but do not impact the core analytical procedures or outcomes of clinical trial analysis.

Risk Assessment - Risk of Dependencies

  • Check the nature of the imported and suggested packages
  • rpact: ggplot2, Rcpp, R6, rmarkdown, knitr, testthat
    • Given the roles of these packages, it is assumed that any malfunctioning behavior would be detected during the testing of rpact functions, as these packages support technical rather than methodological aspects of the package

Formal validation

Documentation structure inspired by GAMP 5

  1. User requirements specification (URS)
  2. Functional specification (FS)
  3. Software design specification (SDS)
  4. Verification
    • Test plan (TP)
    • Test protocol (TL)
  5. Appendix

Validation documentation of rpact 4.0.0:
7,470 pages

Automation of recurring validation processes/activities

Validation Utility Package rpact.validator

  1. User requirements specification (URS)
    \(\rightarrow\) Manual work
  2. Functional specification (FS)
    \(\rightarrow\) createOutputSpecification(), createPlotTypeSpecification()
  3. Software design specification (SDS)
    \(\rightarrow\) createBaseRPackagesTable(), createBusinessObjectStructure(), createSourceDirectoryStructure()

  1. Verification
    • Test plan (TP) \(\rightarrow\) createTestPlan()
    • Test protocol (TL)
      • Installation qualification (IQ) \(\rightarrow\) CRAN Package Check Results
      • Operational qualification (OQ)
        \(\rightarrow\) createUnitTestResultDocumentation()
      • Performance qualification (PQ)
        \(\rightarrow\) createPerformanceQualification()
  2. Appendix
    • Release notes \(\rightarrow\) createNewsTexFile()

Automation of recurring validation processes/activities

library(rpact.validator)

createNewsTexFile()
createBaseRPackagesTable()
createOutputSpecification()
createPlotTypeSpecification()
createSourceDirectoryStructure()
createBusinessObjectStructure()
createUnitTests()
createTestPlan()
createUnitTestResultDocumentation()
createPerformanceQualification()
runSopCheck()

\(\rightarrow\) Documentation work

\(\rightarrow\) Quality control of the process steps up to the final validation documentation

\(\rightarrow\) The SOPs ensure that no step is forgotten or overlooked

\(\rightarrow\) Efficient unit test case generation

Basic Idea of Template-Based Unit Testing

  • Step 1: Compare software results manually, e.g., with simulation results and results from the literature and/or other programs
    \(\rightarrow\) Reference point is correct and trustworthy
  • Step 2: Fix the validated state, i.e., generate unit tests which test the software systematically and reproducibly
    \(\rightarrow\) Further development and refactoring do not cause undetected side effects
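Step 2 can be illustrated with a toy generator in base R that freezes the validated state by emitting expect_equal() calls with the current values hard-coded (the real generator, getUnitTestObject() from rpact.validator, is shown later; makeExpectations() here is a simplified stand-in):

```r
# Toy generator that "fixes the validated state": for each numeric field of a
# validated reference object it emits an expect_equal() call with the current
# values hard-coded. Illustration only; rpact.validator::getUnitTestObject()
# is far more complete.
makeExpectations <- function(obj, objName, tolerance = 1e-07) {
  lines <- character(0)
  for (field in names(obj)) {
    if (is.numeric(obj[[field]])) {
      values <- paste0(deparse(signif(obj[[field]], 8)), collapse = "")
      lines <- c(lines, sprintf(
        "expect_equal(%s$%s, %s, tolerance = %g)",
        objName, field, values, tolerance))
    }
  }
  lines
}

# A stand-in for a validated result object:
x0 <- list(alphaSpent = c(0.00025917372, 0.0071600594, 0.02499999), kMax = 3)
cat(makeExpectations(x0, "x0"), sep = "\n")
```

Once such expectations are committed to the test suite, any later refactoring that changes the numbers is detected immediately.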

Advantages of the Template-Based Approach

  1. Automation and Consistency:
    • Uniform Test Structure: Test templates ensure a uniform structure for the tests, improving maintainability and readability.
    • Automated Test Generation: Automatically generating tests from templates reduces manual effort and minimizes errors that can occur with manual test script creation.

  2. Flexibility and Extensibility:
    • Easy Switch of Test Packages: Since the tests are defined in templates, these templates can be easily adapted to work with other test packages. This facilitates the transition to new or improved test frameworks without significant effort.
    • Modularity: Using templates allows tests to be modular, making it easier to add, remove, or modify specific tests.

  3. Traceability and Documentation:
    • Granular and Traceable Tests: Each test case is clearly defined and traceable, making debugging easier. This avoids creating “black-box tests” and improves understanding of the functionality being tested.
    • Documentation: References to functional specifications and software design specifications can be defined in the templates. This promotes better documentation and traceability of the tests, which is particularly advantageous for audits and reviews.

  4. Efficiency Improvement:
    • Time Savings: The one-time creation of test templates saves time as they can be reused to generate tests automatically.
    • Scalability: With test templates, tests can easily be scaled to new functions and modules, increasing test coverage and improving software quality.

Example: Test Template

test_template_f_design_group_sequential.R:

#' @exit Do not create the unit tests again
#' @context Testing the Group Sequential and Inverse Normal Design Functionality

#' @test_that 'getDesignInverseNormal' with default parameters: 
#'             parameters and results are as expected

#' @refFS[Tab.]{fs:tab:output:getDesignInverseNormal}
#' @refFS[Formula]{fs:criticalValuesOBrienFleming}
x0 <- getDesignInverseNormal()
getUnitTestObject(x0, "x0")

Each annotation in the template has a defined role:

  • @exit \(\rightarrow\) Regeneration is disabled (“Do not create the unit tests again”)
  • @context \(\rightarrow\) Section title in the document (context for testthat version <3)
  • @test_that \(\rightarrow\) Unit test title/description
  • @refFS[Tab.]{...} \(\rightarrow\) Reference to a table in the Functional Specification
  • @refFS[Formula]{...} \(\rightarrow\) Reference to a formula in the Functional Specification
  • getDesignInverseNormal() \(\rightarrow\) The function call to be tested
  • getUnitTestObject(x0, "x0") \(\rightarrow\) Create a unit test for each field of the object x0

Example: Test Template

library(rpact)
library(rpact.validator)
x0 <- getDesignInverseNormal()
getUnitTestObject(x0, "x0")

results in:

## Comparison of the results of TrialDesignInverseNormal 
## object 'x0' with expected results
expect_equal(x0$alphaSpent, c(0.00025917372, 0.0071600594, 0.02499999), 
    tolerance = 1e-07)
expect_equal(x0$criticalValues, c(3.4710914, 2.4544323, 2.0040356), 
    tolerance = 1e-07)
expect_equal(x0$stageLevels, c(0.00025917372, 0.0070553616, 0.022533125), 
    tolerance = 1e-07)

invisible(capture.output(expect_error(print(x0), NA)))
expect_output(print(x0)$show())
invisible(capture.output(expect_error(summary(x0), NA)))
expect_output(summary(x0)$show())
x0CodeBased <- eval(parse(text = 
    getObjectRCode(x0, stringWrapParagraphWidth = NULL)))
expect_equal(x0CodeBased$alphaSpent, x0$alphaSpent, tolerance = 1e-07)
expect_equal(x0CodeBased$criticalValues, x0$criticalValues, tolerance = 1e-07)
expect_equal(x0CodeBased$stageLevels, x0$stageLevels, tolerance = 1e-07)
expect_type(names(x0), "character")
df <- as.data.frame(x0)
expect_s3_class(df, "data.frame")
expect_true(nrow(df) > 0 && ncol(df) > 0)
mtx <- as.matrix(x0)
expect_true(is.matrix(mtx))
expect_true(nrow(mtx) > 0 && ncol(mtx) > 0)
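The tolerance = 1e-07 used in the generated expectations is a relative tolerance, essentially in the sense of base R's all.equal() (the values below are arbitrary):

```r
# expect_equal() compares numbers with a relative tolerance, much like
# all.equal(). A deviation in the 9th significant digit passes ...
isTRUE(all.equal(3.4710914, 3.47109141, tolerance = 1e-07))  # TRUE
# ... while a deviation in the 6th significant digit fails:
isTRUE(all.equal(3.4710914, 3.4710950, tolerance = 1e-07))   # FALSE
```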

Conclusion

  • Validation of an R package is challenging, time consuming, and expensive
  • The template-based unit testing approach offers a structured, flexible, and efficient method for software quality assurance.
  • By automating test generation, using traceable test cases, and enabling comprehensive documentation, software development is not only accelerated but also the quality of the final products is significantly enhanced.
  • Especially the combination of manual validation and automated verification ensures that the software remains stable and reliable, even as it is further developed or refactored.

Thank you!