Open-source software such as R and Python can be used in clinical research and has huge potential, because it is inexpensive and can be customized to meet diverse visualization needs. Additionally, open-source software has a large and active user community that can support enthusiasts who want to climb the learning curve. For example, GitHub has about 2.5 million R projects and Stack Overflow has about 1.3 million R-related questions and discussions, which users can tap into to answer their questions and get their own R projects rolling.
Although SAS has always been the standard for submission work, the pharmaceutical industry has been investing in R for more than ten years, and the platform's steady adoption is bound to lead to a paradigm shift in the near future. For instance, during 2021-22 the R Consortium completed pilots to submit R-based packages through the FDA eCTD gateway and collaborated with industry and regulators to pilot R Shiny for the assessment of new medicinal products.
Regulators moving away from SAS and towards open source is good for three reasons:
- Lower cost. Although the professional versions of open-source tools are often not free, they are far cheaper than SAS. Also, because open source is more accessible than SAS, the industry could draw from a larger talent pool to prepare submissions.
- Relief from the data-standards pain that the SAS Transport File Format has historically imposed. For example, the rule that data files sent to regulators cannot contain variable values exceeding 200 characters is an artifact of the SAS Transport File Format that the FDA requires. To comply with this rule, sponsors need to split any value exceeding 200 characters into supplemental domains and come up with a mapping schema to put the values back together. This is an error-prone task and a waste of programmers' creative energy that could be put to better use.
- More efficient review process. Open-source solutions like R Shiny allow sponsors and reviewers to share reproducible, interactive visualizations. R Shiny is a framework for building interactive web applications and web-based data visualization tools in the R programming language. Data can thus be analyzed and visualized, and results can be displayed in an interactive and customizable way. For example, the youlldie web app leverages R Shiny to statistically predict cause and age of death based on inherited risk factors and lifestyle choices, using data pulled from peer-reviewed literature and public data libraries.
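To make the 200-character workaround from the second point more concrete, here is a minimal Python sketch of the kind of split-and-reassemble logic a sponsor might write. It is an illustration under stated assumptions, not a regulatory-grade implementation: the function names, the `AETERM`-style variable name, and the numbered-suffix naming scheme (`AETERM1`, `AETERM2`, ...) are hypothetical, and the limit is counted in characters, which only matches a byte limit for plain ASCII text.

```python
from typing import Dict, List

MAX_LEN = 200  # the character limit discussed above


def split_long_value(text: str, limit: int = MAX_LEN) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Cuts just after the last space inside the window when there is one,
    so words stay whole; otherwise falls back to a hard cut. Joining the
    chunks with "" gives back the original text exactly.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = []
    rest = text
    while len(rest) > limit:
        cut = rest.rfind(" ", 0, limit) + 1  # 0 when no space was found
        if cut == 0:
            cut = limit  # no word boundary available: hard cut
        chunks.append(rest[:cut])
        rest = rest[cut:]
    chunks.append(rest)
    return chunks


def to_supplemental(variable: str, text: str, limit: int = MAX_LEN) -> Dict[str, str]:
    """Map names to chunks: the parent variable keeps the first chunk and
    hypothetical names VAR1, VAR2, ... hold the overflow."""
    chunks = split_long_value(text, limit)
    names = [variable] + [f"{variable}{i}" for i in range(1, len(chunks))]
    return dict(zip(names, chunks))


def reassemble(variable: str, parts: Dict[str, str]) -> str:
    """Put the chunks back together in order: VAR, VAR1, VAR2, ..."""
    pieces = [parts[variable]]
    i = 1
    while f"{variable}{i}" in parts:
        pieces.append(parts[f"{variable}{i}"])
        i += 1
    return "".join(pieces)
```

Calling `to_supplemental("AETERM", long_text)` returns a dictionary whose values each fit within the limit, and `reassemble("AETERM", parts)` restores the original string. Even this toy version hints at why the task is error-prone: the sketch keeps the separating space at the end of each chunk so that reassembly is lossless, but a real pipeline would also have to decide how to treat trailing blanks, multi-byte characters, and the naming and ordering of the overflow pieces.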
The shift to open-source solutions in the clinical research industry has been anticipated for years, and hopefully we will live to see it. In the meantime, learning new R or Python skills is a clever use of the waiting time.