By Sam Haynes and Edward Wallace
Quantitative Polymerase Chain Reaction, or qPCR, is a highly adaptable experimental technique used across biology and medicine. We have developed the R package tidyqpcr to encourage best practices in qPCR experimental design and analysis.
tidyqpcr began as a series of in-house R scripts to analyse our 384-well RT-qPCR data. The steady demand within our lab for a consistent qPCR analysis pipeline led us to organise our functions into a portable and upgradeable R package. As part of the eLife Open Innovation Leaders program 2020, we decided to contribute to a more general need for a best-practice, open source qPCR analysis pipeline by adding documentation and tests to our package, creating tidyqpcr.
Conceptually, the basics of qPCR are intuitive to understand. It consists of a cycle of melting double-stranded DNA into single strands, annealing specifically designed primers to the target sequences and duplicating every single strand by elongation between the primers. The amount of duplicated DNA is measured at each cycle by fluorescent probes.
The process of duplicating target sequences and measuring fluorescence can be used to deduce the amount of DNA in the original sample. Starting with an excess of primers, enzymes, and nucleotides, the number of DNA copies can at most double in each cycle, so growth follows an exponential curve whose scale is set by the number of copies of the target sequence in the original sample. Real-time quantification of fluorescence can then deduce the absolute number of copies in the original sample by comparing it to a standard curve, or the relative number of copies by normalising to reference DNA targets with stable expression. As qPCR only requires well-designed primers, it can measure DNA abundance in a variety of samples, and RNA abundance too if the RNA is first reverse transcribed [1, 2, 3].
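The exponential reasoning above can be made concrete with a minimal sketch. It assumes an idealised model in which the copy number after c cycles is N_0 × (1 + E)^c, where E is the amplification efficiency (E = 1 means perfect doubling). The function names, the threshold and the starting copy numbers are hypothetical and chosen only for illustration; none of this is part of tidyqpcr.

```python
import math


def copies_after(n0, cycles, efficiency=1.0):
    """Idealised copy number after `cycles` cycles: N_0 * (1 + E)^c."""
    return n0 * (1.0 + efficiency) ** cycles


def cycles_to_threshold(n0, threshold, efficiency=1.0):
    """Fractional cycle at which the idealised curve reaches `threshold` copies."""
    return math.log(threshold / n0) / math.log(1.0 + efficiency)


# Two hypothetical samples that differ 10-fold in starting copy number.
threshold = 1e10  # assumed detection threshold, in copies
cq_high = cycles_to_threshold(1e4, threshold)
cq_low = cycles_to_threshold(1e3, threshold)

# With perfect doubling, a 10-fold dilution delays the threshold
# crossing by log2(10) cycles, roughly 3.32.
delta = cq_low - cq_high
print(round(delta, 2))
```

Under these assumptions the difference in threshold cycle depends only on the ratio of starting copies, which is why relative quantification is possible without knowing absolute amounts.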
Unfortunately, the versatility of qPCR assays is undermined by poor experimental design, inconsistent analysis and inadequate reporting. For example, controls need to be included for contamination and, if applicable, for errors in reverse transcription. PCR efficiency needs to be calculated for every primer set, to quantify what fraction of each target amplicon is duplicated in each cycle. The normalisation technique used in the analysis also needs to be clearly reported.
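As an illustration of the efficiency calculation, one common approach is to run a dilution series, fit a straight line to Cq against log10 of relative concentration, and convert the slope to an efficiency with E = 10^(-1/slope) - 1. The sketch below assumes that approach; the data are made up and the helper names are hypothetical, not tidyqpcr functions.

```python
import math


def slope_intercept(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_from_dilutions(relative_conc, cq_values):
    """Estimate PCR efficiency from a dilution series.

    A slope of about -3.32 cycles per 10-fold dilution corresponds to
    perfect doubling (efficiency 1.0).
    """
    log_conc = [math.log10(c) for c in relative_conc]
    slope, _ = slope_intercept(log_conc, cq_values)
    return 10 ** (-1.0 / slope) - 1.0


# Made-up 10-fold dilution series for one hypothetical primer set.
relative_conc = [1, 0.1, 0.01, 0.001]
cq_values = [18.0, 21.4, 24.8, 28.2]

efficiency = efficiency_from_dilutions(relative_conc, cq_values)
print(round(efficiency, 2))  # a little below 1, i.e. slightly less than doubling
```

Reporting the dilution data and the fitted slope alongside the efficiency lets readers check the estimate for themselves.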
Infamously, Science retracted one of 2005’s runners-up for “Breakthrough of the Year” due to incorrect conclusions drawn from qPCR results [4, 5]. The report used cherry-picked summary statistics to argue for an incorrect mechanism for how plants initiate flowering. The raw data and the analysis script were not included in the publication, so reviewers did not detect the error.
To promote best practices, a team of qPCR experts created advice for publishing qPCR results (and implicitly for designing experiments) called the Minimum Information for the Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines [7]. They provide a thorough checklist of every detail that needs to be reported to enable other researchers to accurately repeat your work. However, although the guidelines have been available since 2009, they are still not widely known or implemented: as few as 4% of qPCR articles cite them [8].
We want to empower colleagues to conduct reproducible, flexible, and MIQE best-practice compliant quantitative PCR experiments and analysis, starting with our research group. Enter tidyqpcr, an R package that combines powerful generic data science tools from the tidyverse with documentation for designing experiments using best practices. Analysis scripts written in tidyqpcr can be released alongside a qPCR publication, or as part of interactive articles, so anyone can recreate published qPCR figures straight from raw data. You can download tidyqpcr from our GitHub page.
We wrote tidyqpcr to take advantage of improvements to the usability and intuitiveness of open-source data analysis tools in the tidyverse. These centre on tidy data (spreadsheet-like rectangular data frames) and generic functions that build up complex analyses in a series of simple steps. tidyqpcr is built around the tidy paradigm to enable data to be easily imported from and exported to spreadsheets.
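To show what a tidy plate layout might look like, here is a minimal sketch with one row per well and one column per variable, exported as CSV so it can be opened in a spreadsheet. The block size, sample names and target names are hypothetical, and the code uses only the Python standard library; it does not reproduce tidyqpcr's own plate-design functions.

```python
import csv
import io
from itertools import product

# Hypothetical 4-row by 6-column block of a microwell plate.
rows = "ABCD"
columns = range(1, 7)

# Assumed design: one sample per row, three technical replicates
# of each target across the columns.
sample_by_row = {
    "A": "control_rep1",
    "B": "control_rep2",
    "C": "treated_rep1",
    "D": "treated_rep2",
}
target_by_column = ["ref_gene"] * 3 + ["target_gene"] * 3

# Tidy layout: every well is one record, every variable is one field.
plate = [
    {
        "well": f"{row}{column}",
        "row": row,
        "column": column,
        "sample_id": sample_by_row[row],
        "target_id": target_by_column[column - 1],
    }
    for row, column in product(rows, columns)
]

# Export to CSV text, as one might before opening it in a spreadsheet.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(plate[0].keys()))
writer.writeheader()
writer.writerows(plate)
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])
```

Because every well carries its own sample and target labels, the layout can later be joined to the measured Cq values by well name without any manual bookkeeping.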
Currently, tidyqpcr contains functions for:
- designing the layout of microwell plates
- loading data from Roche LightCycler®
- calculating ΔCq and ΔΔCq given reference target amplicons
- visualising amplification and melt curves for quality control.
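The ΔCq and ΔΔCq calculations listed above can be sketched as follows. This is a generic illustration rather than tidyqpcr's implementation: it assumes the convention ΔCq = Cq(target) minus Cq(reference) within each sample, averages technical replicates with a mean, and converts ΔΔCq to a fold change as 2^(-ΔΔCq), which assumes perfect doubling. All data and names are made up.

```python
from statistics import mean

# Made-up tidy measurements: (sample_id, target_id, cq), one per well.
measurements = [
    ("control", "ref_gene", 18.0), ("control", "ref_gene", 18.2),
    ("control", "target_gene", 24.0), ("control", "target_gene", 24.2),
    ("treated", "ref_gene", 17.9), ("treated", "ref_gene", 18.1),
    ("treated", "target_gene", 21.9), ("treated", "target_gene", 22.1),
]


def mean_cq(rows, sample_id, target_id):
    """Mean Cq across technical replicates of one target in one sample."""
    return mean(cq for s, t, cq in rows if s == sample_id and t == target_id)


def delta_cq(rows, sample_id, target_id, ref_target_id):
    """Cq of the target minus Cq of the reference, within one sample."""
    return mean_cq(rows, sample_id, target_id) - mean_cq(rows, sample_id, ref_target_id)


def delta_delta_cq(rows, sample_id, control_sample_id, target_id, ref_target_id):
    """Delta Cq of a sample minus delta Cq of the control sample."""
    return (
        delta_cq(rows, sample_id, target_id, ref_target_id)
        - delta_cq(rows, control_sample_id, target_id, ref_target_id)
    )


ddcq = delta_delta_cq(measurements, "treated", "control", "target_gene", "ref_gene")

# Assuming perfect doubling per cycle; a ddcq near -2 gives a fold change near 4.
fold_change = 2 ** (-ddcq)
print(round(ddcq, 2), round(fold_change, 2))
```

If primer efficiencies differ noticeably from perfect doubling, the base of 2 in the final step is no longer appropriate, which is one reason efficiencies should be measured and reported.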
We are currently focusing on adding functionality for including primer efficiencies in quantification, calculating quantification cycle (Cq) from amplification curves, and integrating experimental metadata using the Real-Time PCR Data Markup Language, or RDML [9]. As more users start to analyse their data with tidyqpcr, we also hope to include functions for easy data input and plate design for other available qPCR machines. We continue to conduct user interviews to ensure comprehensive function documentation, intuitive analysis structure and accessible walkthroughs.
- If you regularly teach students qPCR we would like to hear from you! tidyqpcr is an educational tool as well as a functional one. Hopefully, it will aid you in teaching best practices.
- We want to conduct more user interviews and help other research groups to use tidyqpcr. Give it a try and tell us what you think.
- We want to develop a community of qPCR users, with tidyqpcr as a starting point for a dialogue that improves the software and spreads best practices. Post questions as issue tickets on our GitHub page and we will use them to improve our software.
We welcome comments, questions and feedback. Please annotate publicly on the article or contact us at innovation [at] elifesciences [dot] org.