Data Analysis and BioInformatics in real-time qPCR

Bioinformatics is a multidisciplinary approach to describe, model and understand biological processes on the basis of information on genes, transcripts (mRNA and microRNA), proteins and metabolism. It uses computers, databases and algorithms to link all the information and translate it back into biology, physiology and pathophysiology.

BioInformatics  =>  Database Management Systems, Data Mining, Sample Tracking, Information Management, Data Acquisition, Data Analysis, Statistics, Pattern Recognition & Classification, Simulation & Modeling, Biomarker Discovery and Validation

Bioinformatics initially centered on sequence, genome, and transcript analysis, but the extensive use of microarrays, mass spectrometry, qPCR and RT-qPCR, and RNA-Seq has stimulated bioinformatic work in data acquisition, signal processing, and data mining. Simulation and modeling are also becoming increasingly important areas of focus in bioinformatics, which will finally lead to a new level of understanding of the networks in metabolism: Genomics, Epigenomics, Transcriptomics, Splicomics, Proteomics, Metabolomics, integrated analysis of microRNA and mRNA expression, etc.

A useful review of qPCR data analysis tools:
A survey of tools for the analysis of quantitative PCR data  BDQ 2014(1): 23-33

Big biological datasets map life's networks -- Multi-omics offers a new way of doing biology.
Michael Snyder’s genes were telling him that he might be at increased risk for type 2 diabetes. The Stanford University geneticist wasn’t worried: He felt healthy and didn’t have a family history of the disease. But as he monitored other aspects of his own biological data over months and years, he saw that diabetes was indeed emerging, even though he showed no symptoms.
Snyder’s story illustrates the power of looking beyond the genome, the complete catalog of an organism’s genetic information. His tale turns the genome’s one-dimensional view into a multidimensional one. In many ways, a genome is like a paper map of the world. That map shows where the cities are. But it doesn’t say anything about which nations trade with each other, which towns have fierce football rivalries or which states will swing for a particular political candidate.
Open one of today’s digital maps, though, and numerous superimposed data sources give a whole lot of detailed, real-time information. With a few taps, Google Maps can show how to get across Boston at rush hour, offer alternate routes around traffic snarls and tell you where to pick up a pizza on the way.
Now, scientists like Snyder are developing these same sorts of tools for biology, with far-reaching consequences. To figure out what’s really happening within an organism — or within a particular organ or cell — researchers are linking the genome with large-scale data about the output of those genes at specific times, in specific places, in response to specific environmental pressures.
While the genome remains mostly stable over time, other “omes” change based on what genes are turned on and off at particular moments in particular places in the body. The proteome (all an organism’s proteins) and the metabolome (all the metabolites, or small molecules that are the outputs of biological processes) are two of several powerful datasets that become more informative when used together in a multi-omic approach. They show how that genomic instruction manual is actually being applied.
“The genome tells you what can happen,” says Oliver Fiehn, a biochemist at the University of California, Davis. The proteome and the metabolome can show what’s actually going on. And just as city planners use data about traffic patterns to figure out where to widen roads and how to time stoplights, biologists can use those entwined networks to predict at a molecular level how individual organisms will respond under specific conditions.
By linking these layers and others to expand from genomics to multi-omics, scientists might be able to meet the goals of personalized medicine: to figure out, for example, what treatment a particular cancer patient will best respond to, based on the network dynamics responsible for a tumor. Or predict whether an experimental vaccine will work before moving into expensive clinical tests. Or help crops grow better during a drought. And while many of those applications are still in the future, researchers are laying the groundwork right now. “Biology is being done in a way that’s never been done before,” says Nitin Baliga, director of the Institute for Systems Biology in Seattle.

Katsuyuki Yugi, Hiroyuki Kubota, Atsushi Hatano, Shinya Kuroda
Trends in Biotechnology 2016 34(4): 276-290

We propose 'trans-omic' analysis for reconstructing global biochemical networks across multiple omic layers by use of both multi-omic measurements and computational data integration. We introduce technologies for connecting multi-omic data based on prior knowledge of biochemical interactions and characterize a biochemical trans-omic network by concepts of a static and dynamic nature. We introduce case studies of metabolism-centric trans-omic studies to show how to reconstruct a biochemical trans-omic network by connecting multi-omic data and how to analyze it in terms of the static and dynamic nature. We propose a trans-ome-wide association study (trans-OWAS) connecting phenotypes with trans-omic networks that reflect both genetic and environmental factors, which can characterize several complex lifestyle diseases as breakdowns in the trans-omic system.

Bioinformatics Made Easy
Search bioinformatics tools and run genomic analysis in the cloud

We are excited to invite you to beta test the InsideDNA platform, which provides:
  • over 600 of the most used bioinformatics tools, including TopHat, Bowtie2, OrthoMCL, samtools, bamtools, BEAST, phyml, abyss and SOAPdenovo
  • powerful compute nodes with up to 208 GB of RAM and 32 cores
  • an unlimited number of compute nodes for each user
  • an effortless way to launch any bioinformatics tool
How does it work?
Currently, our service is free and we are thrilled to provide 10 GB of storage space and 10 compute credits to each new user. These 10 credits are roughly equal to 260 hours of computational work on different compute nodes. We hope that you will be pleasantly surprised by how much analysis you can do during these hours. In addition, if you fill in our entry survey, we will give you an extra 10 compute credits. The survey aims to make the InsideDNA application better and more user friendly.

How will it work in the future?
While we are trying to make this service as affordable as possible for researchers, compute nodes are provided to us by a third party, and we can only keep the current service free of charge for several months and for a limited number of users. After that we will have to charge for computing at a price of USD 10 per 10 compute credits (~260 hours of work). We only deduct credits when you actually run an analysis, not when you are idle.

Bugs, errors and problems
Although we have been testing InsideDNA internally for several months, it is still likely to have bugs. We therefore kindly ask you to report any issues or problems you may experience with InsideDNA. Please provide any feedback to this email:

Next releases and forthcoming features
Currently we are working on more exciting features including provisioning of a much bigger storage space for each user. Vote for different features in our application to get them done quicker or talk to us and suggest other features which you think may be useful!

Enjoy happy sequence crunching with InsideDNA!

GenEx offers advanced methods to analyze real-time qPCR data with simple clicks of the mouse

GenEx is a popular software package for qPCR data processing and analysis. Built in a modular fashion, GenEx provides a multitude of functionalities for the qPCR community, ranging from basic data editing and management to advanced cutting-edge data analysis.

Basic data editing and management
Arguably the most important part of a qPCR experiment is pre-processing the raw data into shape for subsequent statistical analyses. The pre-processing steps need to be performed consistently, in the correct order and with confidence. GenEx Standard's streamlined and user-friendly interface is designed to prevent data-handling mistakes. Intuitive and powerful presentation tools allow professional illustrations of even the most complex experimental designs.

Advanced cutting-edge data analysis
When you need more advanced analyses, GenEx 6 is the product for you. Powerful enough to demonstrate feasibility, it often proves sufficient for most users' demands. Current features include parametric and non-parametric statistical tests, Principal Component Analysis, and Artificial Neural Networks. New features are continuously added to GenEx with close attention to customers' needs.

New features
Sample handling and the samples' individual biology often contribute to confounding experimental variability. By using the new nested ANOVA feature in GenEx, a user can evaluate the variance contributions from each step in the experimental procedure. With good knowledge of the variance contributions, an appropriate distribution of experimental replicates can be selected to minimize confounding variance and maximize the power of the experimental design! For experiments with complex features, such as multifactorial diseases, analytical relationships and classifications may not be readily available. The support vector machine feature in the new version of GenEx is so easy to use that it makes this advanced supervised classification method available to novice users, while providing access to advanced parameters for experts.
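To make the idea of variance contributions concrete, here is a minimal sketch of a balanced nested random-effects ANOVA. It is not GenEx's implementation; the function name, the three levels (sample, RT replicate, qPCR replicate) and the data layout are assumptions chosen for illustration. Variance components are estimated from the mean squares by the classical method of moments.

```python
from statistics import mean


def nested_variance_components(data):
    """Estimate variance components for a balanced nested design.

    data[i][j] is the list of qPCR replicate Cq values measured on
    RT replicate j of sample i (hypothetical layout). Every sample must
    have the same number of RT replicates, and every RT replicate the
    same number of qPCR replicates.
    """
    a = len(data)          # number of samples
    b = len(data[0])       # RT replicates per sample
    n = len(data[0][0])    # qPCR replicates per RT replicate

    cell_means = [[mean(cell) for cell in sample] for sample in data]
    sample_means = [mean(cells) for cells in cell_means]
    grand_mean = mean(sample_means)

    # Sums of squares for each level of the hierarchy
    ss_sample = b * n * sum((m - grand_mean) ** 2 for m in sample_means)
    ss_rt = n * sum(
        (cell_means[i][j] - sample_means[i]) ** 2
        for i in range(a) for j in range(b)
    )
    ss_qpcr = sum(
        (y - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for y in data[i][j]
    )

    # Mean squares with the degrees of freedom of a balanced nested design
    ms_sample = ss_sample / (a - 1)
    ms_rt = ss_rt / (a * (b - 1))
    ms_qpcr = ss_qpcr / (a * b * (n - 1))

    # Method-of-moments estimates, truncated at zero
    return {
        "sample": max(0.0, (ms_sample - ms_rt) / (b * n)),
        "rt": max(0.0, (ms_rt - ms_qpcr) / n),
        "qpcr": ms_qpcr,
    }


# Made-up Cq values: 2 samples x 2 RT replicates x 2 qPCR replicates
example = [
    [[20.0, 20.2], [20.4, 20.6]],
    [[22.0, 22.2], [22.4, 22.6]],
]
print(nested_variance_components(example))
```

If one step dominates the total variance, replicates are better spent at that level than at the others.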

The methods are suitable to select and validate reference genes, classify samples, group genes, monitor time dependent processes and much more.

Please see the GenEx web page or Online Tutorials

Learn more - For further information on the analyses in GenEx, see the GenEx online help manual or

A survey of tools for the analysis of quantitative PCR (qPCR) data
Stephan Pabinger, Stefan Rödiger, Albert Kriegner, Klemens Vierlinger, Andreas Weinhäusel
Biomolecular Detection and Quantification 1 (2014) 23–33

Real-time quantitative polymerase-chain-reaction (qPCR) is a standard technique in most laboratories used for various applications in basic research. Analysis of qPCR data is a crucial part of the entire experiment, which has led to the development of a plethora of methods. The released tools either cover specific parts of the workflow or provide complete analysis solutions. Here, we surveyed 27 open-access software packages and tools for the analysis of qPCR data. The survey includes 8 Microsoft Windows, 5 web-based, 9 R-based and 5 tools from other platforms. Reviewed packages and tools support the analysis of different qPCR applications, such as RNA quantification, DNA methylation, genotyping, identification of copy number variations, and digital PCR. We report an overview of the functionality, features and specific requirements of the individual software tools, such as data exchange formats, availability of a graphical user interface, included procedures for graphical data presentation, and offered statistical methods. In addition, we provide an overview about quantification strategies, and report various applications of qPCR. Our comprehensive survey showed that most tools use their own file format and only a fraction of the currently existing tools support the standardized data exchange format RDML. To allow a more streamlined and comparable analysis of qPCR data, more vendors and tools need to adapt the standardized format to encourage the exchange of data between instrument software, analysis tools, and researchers.

For each tool its corresponding application area is specified, divided into: Cq calculation, normalization, quantification, CNV, and dPCR. The input type can either be precalculated Cq values (Cq) or raw fluorescence values (Raw). For each tool the supported operating system or the underlying framework is specified. Frameworks are often available on different operating systems, allowing the package to run on several platforms. GUI specifies the existence of a graphical user interface for data input and output. Abbreviations: ABI, Applied Biosystems format; ABT, LightCycler export format; CSV, comma-separated values; FLO, LightCycler export format; REX, Rotor-Gene export format; R format, encompasses all import and export formats provided by the default R installation and auxiliary R packages (e.g., PDF, SVG, HTML, and XLS).

Tool [Web] | Feature | Cq/Raw | Input | Output | OS/Framework | GUI | Last update | Ref
CAmpER [76] | Cq calculation, Normalization, Quantification | Raw | FLO, ABT, CSV, REX, TXT | CSV, TXT | Web based | Yes | 2009-06-01 | [77]
chipPCR [34] | Cq calculation | Raw | Native R format | Native R format | R based | Yes | 2014-06-25 | [34]
CopyCaller [78] | CNV | Cq | ABI | CSV, TXT, XLS | Windows | Yes | 2009-02-01 | [79]
Cy0 Method [80] | Cq calculation | Raw | XLS, TXT, DOC | XLS | Web based | Yes | 2010-01-01 | [81] and [82]
DART-PCR [83] | Cq calculation, Normalization, Quantification | Raw | XLS | XLS | Windows, Excel based | Yes | 2002-12-16 | [84]
ddCT [85] | Normalization, Quantification | Cq | TXT, native R format | TXT, PDF, native R format | R based | No | 2013-10-14 | [86]
Deconvolution [87] | Quantification | Raw | TXT | TXT | Perl based | No | 2010-04-29 | [88]
dpcR [89] | dPCR, Quantification, CNV, Genotyping | Cq, Raw | TXT, CSV, native R format | TXT, native R format | R based | No | 2013-09-08 | [90]
EasyqpcR [91] | Normalization, Quantification | Cq | TXT, CSV | TXT | R based | Yes | 2013-11-24 | [92]
FPK-PCR [93] | Cq calculation | Raw | CSV, TXT | TXT | R based | No | 2012-01-20 | [94]
HTqPCR [95] | Normalization, Quantification, Statistics | Cq | TXT, native R format | TXT, PDF, native R format | R based | No | 2013-10-14 | [96]
LinRegPCR [97] | Cq calculation, Quantification | Raw | XLS, RDML | XLS, RDML | Windows | Yes | 2014-02-19 | [98]
LRE Analysis [99] | Quantification | Raw | XLS | XLS | MATLAB based | Yes | 2012-02-21 | [100]
LRE Analyzer [101] | Quantification | Raw | XLS | XLS | Java based | Yes | 2014-01-07 | [102]
MAKERGAUL [103] | Cq calculation, Quantification | Raw | CSV | HTML | Web based | Yes | 2013-08-27 | [104]
NormqPCR [105] | Normalization, Quantification | Cq | TXT | TXT | R based | No | 2013-03-23 | [73]
PCR-Miner [106] | Cq calculation | Raw | TXT | TXT | Web based | Yes | 2011-10-21 | [107]
pyQPCR [108] | Normalization, Quantification | Cq | TXT, CSV | TXT, PDF | Python based | Yes | 2012-01-03 | [109]
qBase [110] | Normalization, Quantification | Cq | XLS, RDML | XLS, RDML | Windows, Excel based | Yes | 2007 | [26]
qCalculator [111] | Normalization, Quantification | Cq | XLS | XLS | Windows, Excel based | Yes | 2004-01-26 | [112]
QPCR [113] | Cq calculation, Normalization, Quantification, Statistics | Raw | CSV, RDML | CSV, RDML, XLS, SVG, PNG | Web based | Yes | 2013-06-10 | [114]
qpcR [115] | Cq calculation, Normalization, Quantification, Melting curve analysis | Cq, Raw | CSV, native R format | TXT, PDF, native R format | R based | No | 2014-06-02 | [116]
qPCR-DAMS [117] | Normalization, Quantification | Cq | XLS | XLS | Windows | Yes | 2006-02-18 | [118]
qpcrNorm [119] | Normalization, Statistics | Cq | CSV | TXT | R based | No | 2013-10-14 | [120]
REST [121] | Normalization, Quantification, Statistics | Cq | TXT | TXT | Windows 32 Bit | Yes | 2009 | [122]
SARS [123] | Normalization, Statistics | Cq | XLS, TXT | TXT | Windows | Yes | 2011-05-01 | [124]
SASqPCR [125] | Normalization, Quantification, Statistics | Cq | XLS, CSV | TXT | SAS based | No | 2011-06-01 | [126]

On non-detects in qPCR data.
McCall MN, McMurray HR, Land H, Almudevar A
Bioinformatics. 2014 Aug 15;30(16): 2310-2316

MOTIVATION: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. Despite extensive research in qPCR laboratory protocols, normalization and statistical analysis, little attention has been given to qPCR non-detects (those reactions failing to produce a minimum amount of signal).
RESULTS: We show that the common methods of handling qPCR non-detects lead to biased inference. Furthermore, we show that non-detects do not represent data missing completely at random and likely represent missing data occurring not at random. We propose a model of the missing data mechanism and develop a method to directly model non-detects as missing data. Finally, we show that our approach results in a sizeable reduction in bias when estimating both absolute and differential gene expression.
AVAILABILITY AND IMPLEMENTATION: The proposed algorithm is implemented in the R package, nondetects. This package also contains the raw data for the three example datasets used in this manuscript. The package is freely available at and as part of the Bioconductor project.
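The following sketch does not reproduce the missing-data model of the nondetects package. It only illustrates, under assumptions, why the choice of ad hoc treatment matters: replacing each non-detect with the maximum cycle number and discarding non-detects give different group means for the same wells. The function name, the use of `None` to mark a non-detect, and the 40-cycle limit are all hypothetical.

```python
from statistics import mean


def summarize_cq(cq_values, max_cycles=40.0):
    """Compare two ad hoc treatments of non-detects in one group of wells.

    cq_values: list of Cq values, with None marking a non-detect.
    max_cycles: cycle count substituted for a non-detect (assumed run length).
    """
    detected = [cq for cq in cq_values if cq is not None]
    substituted = [max_cycles if cq is None else cq for cq in cq_values]
    return {
        "n_nondetects": len(cq_values) - len(detected),
        # Treatment 1: set every non-detect to the maximum cycle number
        "mean_substituted": mean(substituted),
        # Treatment 2: drop non-detects and average the remaining wells
        "mean_detected_only": mean(detected) if detected else None,
    }


# Made-up low-expression gene measured in four wells, two of which failed
print(summarize_cq([34.0, 36.0, None, None]))
```

The gap between the two means grows with the fraction of non-detects, which is one way to see why neither shortcut is a neutral choice.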

Reverse transcription quantitative real-time PCR (RT-qPCR) is a key method for measurement of relative gene expression. Analysis of RT-qPCR data requires many iterative computations for data normalization and analytical optimization. Currently no computer program for RT-qPCR data analysis is suitable for analytical optimization and user-controllable customization based on data quality, experimental design as well as specific research aims. Here I introduce an all-in-one computer program, SASqPCR, for robust and rapid analysis of RT-qPCR data in SAS. This program has multiple macros for assessment of PCR efficiencies, validation of reference genes, optimization of data normalizers, normalization of confounding variations across samples, and statistical comparison of target gene expression in parallel samples. Users can simply change the macro variables to test various analytical strategies, optimize results and customize the analytical processes. In addition, it is highly automatic and functionally extendable. Thus users are the actual decision-makers controlling RT-qPCR data analyses. SASqPCR and its tutorial are freely available at

Determinants of expression variability
Alemu EY, Carl JW Jr, Corrada Bravo H, Hannenhalli S
Nucleic Acids Res. 2014 Apr;42(6): 3503-3514

The amount of tissue-specific expression variability (EV) across individuals is an essential characteristic of a gene and believed to have evolved, in part, under functional constraints. However, the determinants and functional implications of EV are only beginning to be investigated. Our analyses based on multiple expression profiles in 41 primary human tissues show that a gene's EV is significantly correlated with a number of features pertaining to the genomic, epigenomic, regulatory, polymorphic, functional, structural and network characteristics of the gene. We found that (i) EV of a gene is encoded, in part, by its genomic context and is further influenced by the epigenome; (ii) strong promoters induce less variable expression; (iii) less variable gene loci evolve under purifying selection against copy number polymorphisms; (iv) genes that encode inherently disordered or highly interacting proteins exhibit lower variability; and (v) genes with less variable expression are enriched for house-keeping functions, while genes with highly variable expression tend to function in development and extra-cellular response and are associated with human diseases. Thus, our analysis reveals a number of potential mediators as well as functional and evolutionary correlates of EV, and provides new insights into the inherent variability in eukaryotic gene expression.

Select the right Reference gene with Genevestigator

Genevestigator is a high quality and manually curated expression database and meta-analysis system. It allows biologists to study the expression and regulation of genes in a broad variety of contexts by summarizing information from hundreds of microarray experiments into easily interpretable results. A user-friendly interface allows you to visualize gene expression in many different tissues, at multiple developmental stages, or in response to large sets of stimuli, diseases, drug treatments, or genetic modifications. This type of meta-analysis is core to understanding the spatio-temporal-response regulation of genes, to identify or validate biomarkers, and to find out which subnetworks are commonly affected in different diseases and conditions.

Screenshots       Video Tutorials

Graphical user interface. The different tools are presented as icons and grouped by tool sets. The Genevestigator tools help you to find relevant conditions for your genes of interest, to find genes having special properties (e.g. biomarkers), or to identify gene expression modules that are co-regulated over selected conditions. The tools let you analyze individual experiments or thousands of experiments simultaneously.

RefGenes tool.
Identification of genes having the smallest expression variance across 26,075 human samples (Affymetrix 133 Plus 2 arrays). The two boxplots in the upper section represent, as a comparison, the expression distribution of PPIA and B2M (two commonly used reference genes for RT-qPCR) across the same set of samples.
=> RefGenes tutorial
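The selection principle described for RefGenes, ranking genes by how little their expression varies across a chosen set of samples, can be sketched as follows. This is not Genevestigator's code; the function name, the gene labels and the assumption that expression values are already on a log scale are illustrative only.

```python
from statistics import stdev


def rank_by_expression_stability(expression, top=None):
    """Rank genes from most to least stable across samples.

    expression: dict mapping a gene label to its (log-scale) expression
    values across the same set of samples (hypothetical input format).
    top: optional number of genes to return.
    """
    ranked = sorted(expression, key=lambda gene: stdev(expression[gene]))
    return ranked if top is None else ranked[:top]


# Made-up expression values for three candidate reference genes
candidates = {
    "geneA": [10.0, 10.1, 9.9],
    "geneB": [8.0, 12.0, 10.0],
    "geneC": [5.0, 5.0, 5.0],
}
print(rank_by_expression_stability(candidates, top=2))
```

Candidates selected this way still need to be validated by RT-qPCR in the experimental system under study.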

ExpressionData - A public resource of high quality curated datasets representing gene expression across anatomy, development and experimental conditions.
Zimmermann P, Bleuler S, Laule O, Martin F, Ivanov NV, Campanoni P, Oishi K, Lugon-Moulin N, Wyss M, Hruz T, Gruissem W.
BioData Min. 2014 7: 18 -- eCollection 2014.

Reference datasets are often used to compare, interpret or validate experimental data and analytical methods. In the field of gene expression, several reference datasets have been published. Typically, they consist of individual baseline or spike-in experiments carried out in a single laboratory and representing a particular set of conditions. Here, we describe a new type of standardized datasets representative for the spatial and temporal dimensions of gene expression. They result from integrating expression data from a large number of globally normalized and quality controlled public experiments. Expression data is aggregated by anatomical part or stage of development to yield a representative transcriptome for each category. For example, we created a genome-wide expression dataset representing the FDA tissue panel across 35 tissue types. The proposed datasets were created for human and several model organisms and are publicly available at

A multilevel gamma-clustering layout algorithm for visualization of biological networks.
Hruz T, Wyss M, Lucas C, Laule O, von Rohr P, Zimmermann P, Bleuler S.
Adv Bioinformatics. 2013: 920325

Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ -clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs.

Global regulatory architecture of human, mouse and rat tissue transcriptomes.
Prasad A, Kumar SS, Dessimoz C, Bleuler S, Laule O, Hruz T, Gruissem W, Zimmermann P.
BMC Genomics. 2013 14: 716

BACKGROUND: Predicting molecular responses in human by extrapolating results from model organisms requires a precise understanding of the architecture and regulation of biological mechanisms across species.
RESULTS: Here, we present a large-scale comparative analysis of organ and tissue transcriptomes involving the three mammalian species human, mouse and rat. To this end, we created a unique, highly standardized compendium of tissue expression. Representative tissue specific datasets were aggregated from more than 33,900 Affymetrix expression microarrays. For each organism, we created two expression datasets covering over 55 distinct tissue types with curated data from two independent microarray platforms. Principal component analysis (PCA) revealed that the tissue-specific architecture of transcriptomes is highly conserved between human, mouse and rat. Moreover, tissues with related biological function clustered tightly together, even if the underlying data originated from different labs and experimental settings. Overall, the expression variance caused by tissue type was approximately 10 times higher than the variance caused by perturbations or diseases, except for a subset of cancers and chemicals. Pairs of gene orthologs exhibited higher expression correlation between mouse and rat than with human. Finally, we show evidence that tissue expression profiles, if combined with sequence similarity, can improve the correct assignment of functionally related homologs across species.
CONCLUSION: The results demonstrate that tissue-specific regulation is the main determinant of transcriptome composition and is highly conserved across mammalian species.

Investigation of variation in gene expression profiling of human blood by extended principle component analysis.
Xu Q, Ni S, Wu F, Liu F, Ye X, Mougin B, Meng X, Du X.
Fudan University Shanghai Cancer Center - Institut Mérieux Laboratory, Fudan University Shanghai Cancer Center, Shanghai, People's Republic of China.
PLoS One. 2011;6(10): e26905

BACKGROUND: Human peripheral blood is a promising material for biomedical research. However, various kinds of biological and technological factors result in a large degree of variation in blood gene expression profiles.
METHODOLOGY/PRINCIPAL FINDINGS: Human peripheral blood samples were drawn from healthy volunteers and analysed using the Human Genome U133Plus2 Microarray. We applied a novel approach using the Principle Component Analysis and Eigen-R(2) methods to dissect the overall variation of blood gene expression profiles with respect to the interested biological and technological factors. The results indicated that the predominating sources of the variation could be traced to the individual heterogeneity of the relative proportions of different blood cell types (leukocyte subsets and erythrocytes). The physiological factors like age, gender and BMI were demonstrated to be associated with 5.3% to 9.2% of the total variation in the blood gene expression profiles. We investigated the gene expression profiles of samples from the same donors but with different levels of RNA quality. Although the proportion of variation associated to the RNA Integrity Number was mild (2.1%), the significant impact of RNA quality on the expression of individual genes was observed.
CONCLUSIONS: By characterizing the major sources of variation in blood gene expression profiles, such variability can be minimized by modifications to study designs. Increasing sample size, balancing confounding factors between study groups, using rigorous selection criteria for sample quality, and well controlled experimental processes will significantly improve the accuracy and reproducibility of blood transcriptome study.

Download the free version of the GenEx software!
Multidimensional qPCR data analysis with the GenEx analysis software (MultiD)

Real-time PCR gene expression profiling

Mikael Kubista, Björn Sjögreen, Amin Forootan, Radek Sindelka, Jiri Jonák and José Manuel Andrade

Real-time PCR has rapidly become the preferred technique for quantitative analysis of nucleic acids. Its superior sensitivity, reproducibility and dynamic range make it the preferred choice for expression profiling in scientific, as well as routine, applications.    => Link to GenEx software

Real-Time PCR: Current Technology and Applications
Publisher: Caister Academic Press
Editors: Julie Logan, Kirstin Edwards and Nick Saunders; Applied and Functional Genomics, Health Protection Agency, London (2009)
ISBN: 978-1-904455-39-4

Chapter 4 - Reference Gene Validation Software for Improved Normalization
J. Vandesompele, M. Kubista and M. W. Pfaffl  (2009)

Real-time PCR is the method of choice for expression analysis of a limited number of genes. The measured gene expression variation between subjects is the sum of the true biological variation and several confounding factors resulting in non-specific variation. The purpose of normalization is to remove the non-biological variation as much as possible. Several normalization strategies have been proposed, but the use of one or more reference genes is currently the preferred way of normalization. While these reference genes constitute the best possible normalizers, a major problem is that these genes have no constant expression under all experimental conditions. The experimenter therefore needs to carefully assess whether a certain reference gene is stably expressed in the experimental system under study. This is not trivial and represents a circular problem. Fortunately, several algorithms and freely available software have been developed to address this problem. This chapter aims to provide an overview of the different concepts.
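One widely used concept for assessing reference gene stability is the pairwise-variation measure M popularized by geNorm. The sketch below is written in that style and is not the geNorm software itself: for each candidate gene it averages the standard deviation of its log2 expression ratios with every other candidate, so a lower M suggests more stable expression. The function name and the assumption of 100% amplification efficiency (so that a log2 ratio is simply a Cq difference) are illustrative.

```python
from statistics import mean, stdev


def stability_m_values(cq_by_gene):
    """Compute a geNorm-style stability measure M for each candidate gene.

    cq_by_gene: dict mapping a gene label to its Cq values, with the same
    sample order for every gene (hypothetical input format).
    Assumes an amplification factor of 2 per cycle, so
    log2(quantity_j / quantity_k) equals Cq_k - Cq_j.
    """
    genes = list(cq_by_gene)
    m_values = {}
    for gene_j in genes:
        pairwise_sds = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            log2_ratios = [
                cq_k - cq_j
                for cq_j, cq_k in zip(cq_by_gene[gene_j], cq_by_gene[gene_k])
            ]
            # Pairwise variation: SD of the log ratios across samples
            pairwise_sds.append(stdev(log2_ratios))
        m_values[gene_j] = mean(pairwise_sds)
    return m_values


# Made-up Cq values for three candidates in three samples
cq = {
    "refA": [20.0, 21.0, 22.0],
    "refB": [22.0, 23.0, 24.0],
    "refC": [25.0, 24.0, 27.0],
}
print(stability_m_values(cq))
```

Because every gene is judged only against the other candidates, the measure sidesteps the circular problem of needing an already validated normalizer, but co-regulated candidates can still look deceptively stable.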

Chapter 5 - Data Analysis Software
M. W. Pfaffl, J. Vandesompele and M. Kubista  (2009)

Quantitative real-time RT-PCR (qRT-PCR) is widely and increasingly used in any kind of mRNA quantification, because of its high sensitivity, good reproducibility and wide dynamic quantification range. While qRT-PCR has a tremendous potential for analytical and quantitative applications, a comprehensive understanding of its underlying principles is important. Besides the classical RT-PCR parameters, e.g. primer design, RNA quality, RT and polymerase performances, the fidelity of the quantification process is highly dependent on a valid data analysis. This review will cover all aspects of data acquisition (trueness, reproducibility, and robustness), potentials in data modification and will focus particularly on relative quantification methods. Furthermore, useful bioinformatical, biostatistical as well as multi-dimensional expression software tools will be presented.
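As an illustration of relative quantification, the sketch below computes an efficiency-corrected expression ratio of a target gene against a single reference gene. It is a minimal example under assumptions, not the chapter's software: the function and parameter names are hypothetical, and efficiencies are given as amplification factors per cycle (2.0 corresponds to 100% efficiency). With both factors at 2.0 it reduces to the familiar 2^(-ΔΔCq) calculation.

```python
def relative_expression_ratio(
    cq_target_control,
    cq_target_sample,
    cq_ref_control,
    cq_ref_sample,
    e_target=2.0,
    e_ref=2.0,
):
    """Efficiency-corrected ratio of target expression, sample vs. control.

    e_target and e_ref are amplification factors per cycle (between 1 and 2).
    """
    # Cq shift of each gene between control and treated sample
    delta_target = cq_target_control - cq_target_sample
    delta_ref = cq_ref_control - cq_ref_sample
    # Normalize the target change by the reference gene change
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Made-up example: the target appears 2 cycles earlier in the treated
# sample while the reference gene is unchanged.
print(relative_expression_ratio(25.0, 23.0, 20.0, 20.0))
```

Using measured amplification factors instead of the default 2.0 matters most when target and reference assays differ in efficiency, because the error compounds with every cycle of difference.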

Real-Time PCR: Current Technology and Applications - Book reviews:

"... a comprehensive overview of the RT-PCR technology, which is as up-to-date as a book can be ..." Mareike Viebahn in Current Issues in Molecular Biology (2009)

"... a useful book for students ..." from J. Microbiological Methods

"provides a dual focus by aiming, in the early chapters, to provide both the theory and practicalities of this diverse and superficially simple technology, counter-balancing this in the later chapters with real-world applications, covering infectious diseases, biodefence, molecular haplotyping and food standards." from Microbiology Today

"a reference work that should be found both in university libraries and on the shelves of experienced applications specialists."   from Microbiology Today

"a comprehensive guide to real-time PCR technology and its applications" from Food Science and Technology Abstracts (2009) Volume 41 Number 6

"This volume should be of utmost interest to all investigators interested and involved in using RT-PCR ... the RT-PCR protocols covered in this book will be of interest to most, if not all, investigators engaged in research that uses this important technique ... a well balanced book covering the many potential uses of real-time PCR ... valuable for all those interested in RT-PCR." from Doodys reviews (2009)

"provide the novice and the experienced user with guidance on the technology, its instrumentation, and its applications" from SciTech Book News 2009 p. 64

"... written by international authors expert in specific technical principles and applications ... a useful compendium of basic and advanced applications for laboratory scientists. It is an ideal introductory textbook and will serve as a practical handbook in laboratories where the technology is employed." from Christopher J. McIver, Microbiology Department, Prince of Wales Hospital, New South Wales, Australia writing in Australian J. Med. Sci. 2009. 30(2): 59-60


Biogazelle is the real-time PCR data-analysis company, founded in 2007 as a Ghent University spin-off. Its founders have more than 10 years of experience in real-time PCR experiment design, assay development and data analysis. They wrote one of the most influential papers on normalization of gene expression and on data analysis (together cited more than one thousand times in international peer-reviewed articles).

Biogazelle's flagship product qBase+ is the most powerful, flexible, and user-friendly real-time PCR data-analysis software based on the proven geNorm and qBase technology, enhanced with proprietary algorithms and innovative features. qBase+ is truly accelerating your research.

Based on years of experience, Biogazelle is also offering hands-on courses on experiment design and data-analysis, starting June 2008.

qBase has now been phased out and the professional successor qBase+ is now available from the real-time PCR data-analysis company Biogazelle.

Statistical analysis of real-time PCR data.
Yuan JS, Reed A, Chen F, Stewart CN Jr.   BMC Bioinformatics. 2006 (7): 85.
Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA.

BACKGROUND: Even though real-time PCR has been broadly applied in biomedical sciences, data processing procedures for the analysis of quantitative real-time PCR are still lacking; specifically in the realm of appropriate statistical treatment. Confidence interval and statistical significance considerations are not explicit in many of the current data analysis approaches. Based on the standard curve method and other useful data analysis methods, we present and compare four statistical approaches and models for the analysis of real-time PCR data. 
RESULTS: In the first approach, a multiple regression analysis model was developed to derive DeltaDeltaCt from estimation of the interaction of gene and treatment effects. In the second approach, an ANCOVA (analysis of covariance) model was proposed, and the DeltaDeltaCt can be derived from analysis of the effects of variables. The other two models involve calculation of DeltaCt followed by a two-group t-test and its non-parametric analogue, the Wilcoxon test. SAS programs were developed for all four models and data output for analysis of a sample set are presented. In addition, a data quality control model was developed and implemented using SAS. 
CONCLUSION: Practical statistical solutions with SAS programs were developed for real-time PCR data and a sample dataset was analyzed with the SAS programs. The analysis using the various models and programs yielded similar results. Data quality control and analysis procedures presented here provide statistical elements for the estimation of the relative expression of genes using real-time PCR.
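The DeltaCt-plus-t-test approach described in the abstract can be illustrated with a short Python sketch. This is not the authors' SAS code: the Ct values are invented, the function names are hypothetical, and Welch's unequal-variance form of the two-group t statistic is an assumption made here for simplicity. Converting the statistic to a p-value would additionally need a t-distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def delta_ct(target_ct, reference_ct):
    """Per-sample DeltaCt = Ct(target gene) - Ct(reference gene)."""
    return [t - r for t, r in zip(target_ct, reference_ct)]


def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented Ct values for four control and four treated samples.
control_dct = delta_ct([24.1, 24.3, 23.9, 24.2], [18.0, 18.2, 17.9, 18.1])
treated_dct = delta_ct([22.0, 22.4, 21.9, 22.1], [18.1, 18.0, 18.2, 17.9])

# DeltaDeltaCt is the difference of the group means of DeltaCt.
ddct = mean(treated_dct) - mean(control_dct)
t_stat, df = welch_t(treated_dct, control_dct)

# Fold change, assuming the amount doubles every cycle.
fold_change = 2 ** (-ddct)
```

Testing on DeltaCt rather than on fold change keeps the data on the log scale, where replicate variation is closer to symmetric.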

Data Analysis Methods

There are two methods, both equally valid, for analyzing data obtained from real-time PCR: the Relative Standard Curve Method and the Comparative CT Method. The relative standard curve method is useful for investigators who have a limited number of cDNA samples and a large number of genes of interest. The comparative CT method is useful for investigators who have a large number of cDNA samples and a limited number of genes of interest (RRC Core Genomics Facility, University of Illinois at Chicago).
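As a rough illustration of the arithmetic behind the two methods, the following Python sketch computes a relative quantity from a standard curve and a fold change from the comparative CT formula. The function names, the slope and intercept, and the CT values are all hypothetical, and the comparative CT form assumes that product doubles every cycle.

```python
def quantity_from_standard_curve(ct, slope, intercept):
    """Relative standard curve: invert Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def comparative_ct(ct_target_sample, ct_ref_sample, ct_target_calibrator, ct_ref_calibrator):
    """Comparative CT: fold change = 2 ** (-DeltaDeltaCt), assuming doubling per cycle."""
    ddct = (ct_target_sample - ct_ref_sample) - (ct_target_calibrator - ct_ref_calibrator)
    return 2 ** (-ddct)


# Hypothetical standard curve with slope -3.32 and intercept 30.
quantity = quantity_from_standard_curve(23.36, slope=-3.32, intercept=30.0)

# Hypothetical CT values: the target appears two cycles earlier in the sample
# than in the calibrator, while the reference gene is unchanged.
fold = comparative_ct(22.0, 18.0, 24.0, 18.0)
```

The standard curve route needs a dilution series for every gene, whereas the comparative route needs none, which is consistent with the sample-versus-gene trade-off described above.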

qPCR Bioinformatik: Neue Entwicklungen in der post-qPCR Datenanalyse [qPCR bioinformatics: new developments in post-qPCR data analysis] (article in German)
Michael W. Pfaffl (2006), Laborwelt (1): 10-13, ISSN 1611–0854 (Editor:  T. Gabrielczyk)
The development of the polymerase chain reaction (PCR) in the 1980s is undoubtedly one of the greatest achievements in molecular biology. Classical PCR allows highly sensitive qualitative and semi-quantitative detection of gene segments or DNA fragments. To quantify a specific mRNA, the PCR is preceded by a reverse transcription (RT) step. The use of RT-PCR to quantify specific mRNA has become a routine tool in expression analysis. The results are of exceptional value in molecular biology research and molecular diagnostics, in comparative expression analysis, and in elucidating "functional genomics".
Detection can be performed qualitatively in classical thermocyclers, or quantitatively in real time by real-time PCR (qPCR). The results are available immediately, so the use of qPCR saves considerable time. Because the increase in fluorescence and the amount of newly synthesized PCR product are proportional to each other over a wide range, the starting amount of DNA or RNA can be determined from the fluorescence data. A reliable quantitative assay requires a working analytical procedure and data analysis that deliver exact quantification results with sufficient accuracy and high repeatability.

QPCR DEMO - real-time PCR data management and analysis
Developed by Stephan Pabinger
QPCR is a versatile web-based Java application that lets you store, manage, analyze, and display data from quantitative real-time polymerase chain reaction (qPCR) experiments. You can try out the application by using the demo account at QPCR Demo

It is strongly recommended to use a private account which guarantees confidentiality and security of your data.
To request an account please contact
To get started:
Read  the tutorial which leads you through all important steps of the application.
For more information download the user guide which covers all aspects of the application.

BACKGROUND: Since its introduction quantitative real-time polymerase chain reaction (qPCR) has become the standard method for quantification of gene expression. Its high sensitivity, large dynamic range, and accuracy led to the development of numerous applications with an increasing number of samples to be analyzed. Data analysis consists of a number of steps, which have to be carried out in several different applications. Currently, no single tool is available which incorporates storage, management, and multiple methods covering the complete analysis pipeline.
RESULTS: QPCR is a versatile web-based Java application that allows to store, manage, and analyze data from relative quantification qPCR experiments. It comprises a parser to import generated data from qPCR instruments and includes a variety of analysis methods to calculate cycle-threshold and amplification efficiency values. The analysis pipeline includes technical and biological replicate handling, incorporation of sample or gene specific efficiency, normalization using single or multiple reference genes, inter-run calibration, and fold change calculation. Moreover, the application supports assessment of error propagation throughout all analysis steps and allows conducting statistical tests on biological replicates. Results can be visualized in customizable charts and exported for further investigation.
CONCLUSION: We have developed a web-based system designed to enhance and facilitate the analysis of qPCR experiments. It covers the complete analysis workflow combining parsing, analysis, and generation of charts into one single application. The system is freely available at
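One step of such a pipeline, normalization against several reference genes, can be sketched in a few lines of Python. This is not code from the QPCR application. It assumes the common convention of dividing the target quantity by the geometric mean of the reference-gene quantities, and every name and value below is hypothetical.

```python
from math import prod


def relative_quantity(cq, cq_calibrator, amplification=2.0):
    """Quantity relative to a calibrator; amplification=2.0 means doubling per cycle."""
    return amplification ** (cq_calibrator - cq)


def normalization_factor(reference_quantities):
    """Geometric mean of the relative quantities of the reference genes."""
    return prod(reference_quantities) ** (1 / len(reference_quantities))


def normalized_expression(target_quantity, reference_quantities):
    """Target quantity divided by the multi-reference normalization factor."""
    return target_quantity / normalization_factor(reference_quantities)


# Hypothetical sample: the target gene is 4-fold up relative to the calibrator,
# and the two reference genes are 1-fold and 4-fold up respectively.
target_rq = relative_quantity(cq=22.0, cq_calibrator=24.0)
reference_rqs = [relative_quantity(18.0, 18.0), relative_quantity(19.0, 21.0)]
normalized = normalized_expression(target_rq, reference_rqs)
```

A geometric rather than arithmetic mean is used so that one highly abundant reference gene does not dominate the normalization factor.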


pyQPCR is open-source software for qPCR analysis. It may be used, copied and modified under the terms of the GPLv3 (or later) licence.

pyQPCR is a GUI application written in Python that deals with quantitative PCR (QPCR) raw data. Using quantification cycle values extracted from QPCR instruments, it uses a proven and universally applicable model to give finalized quantification results.

Import QPCR raw data / open existing file

During this first step, you can:
  • Create a new project: you give a project name, choose the PCR device (for now only Eppendorf ones are supported, but others can be easily added) and import your raw data (TXT or CSV files) of one or several plates. Some examples of these files are given with the source of pyQPCR.
  • Open an existing one: pyQPCR has its own file format which is XML based. You can directly open these files (examples are in the source code of pyQPCR).
At any time, you can add or remove a plate from your project thanks to the corresponding icons.

Plate settings

You can edit the data of each well separately or select and modify a group of wells. You can also change the properties of targets and samples (name, primer efficiency), and remove or add new ones. You can disable wells so that they are excluded from calculations.

Standard curve calculation

You can define as "standards" the wells that contain dilutions of DNA in order to calculate PCR efficiency. You then specify the amount of DNA (arbitrary units) in each of these wells, and the program plots the standard curve and calculates the PCR efficiency for this set of primers. This efficiency is taken into account in subsequent relative quantifications.
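The standard curve calculation can be sketched as an ordinary least-squares fit of quantification cycle against log10 of the DNA amount, with the efficiency derived from the slope as E = 10^(-1/slope) - 1. The Python below is an illustration only, not pyQPCR's own code; the function name and the dilution data are hypothetical.

```python
from math import log10


def fit_standard_curve(amounts, cq_values):
    """Least-squares fit of Cq = slope * log10(amount) + intercept.

    Returns (slope, intercept, efficiency), where efficiency = 10 ** (-1 / slope) - 1,
    so a value of 1.0 corresponds to perfect doubling in every cycle.
    """
    x = [log10(a) for a in amounts]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    efficiency = 10 ** (-1 / slope) - 1
    return slope, intercept, efficiency


# Hypothetical ten-fold dilution series (arbitrary units) and measured Cq values.
amounts = [1000, 100, 10, 1]
cq_values = [20.0, 23.32, 26.64, 29.96]
slope, intercept, efficiency = fit_standard_curve(amounts, cq_values)
```

A slope near -3.32 corresponds to an efficiency near 1.0 (100%); shallower slopes give apparent efficiencies above 100%, which usually points to inhibition or pipetting problems in the dilution series.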

Reference target and sample

For relative quantification calculations, you must define a reference target (gene) and a reference sample. They can be either shared across all plates or specific to each plate.

Relative quantification

The wells defined as "unknown" are used to calculate relative quantifications. An improved ΔΔCt method yields reliable quantifications together with their errors. The confidence level is adjustable, and the confidence interval can be either Gaussian or calculated using a t-test. The program plots the results as histograms that are easy to customize.
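The exact formula behind pyQPCR's improved ΔΔCt method is not given here, so the Python sketch below uses a generic efficiency-corrected ratio as a stand-in, together with a Gaussian confidence interval on the mean of technical replicates. All names and values are hypothetical, and the amplification factors are expressed as 1 + efficiency (2.0 means doubling per cycle).

```python
from math import sqrt
from statistics import NormalDist, stdev


def efficiency_corrected_ratio(amp_target, amp_ref,
                               cq_target_control, cq_target_sample,
                               cq_ref_control, cq_ref_sample):
    """Relative expression using a separate amplification factor for each gene."""
    target_term = amp_target ** (cq_target_control - cq_target_sample)
    reference_term = amp_ref ** (cq_ref_control - cq_ref_sample)
    return target_term / reference_term


def gaussian_ci_halfwidth(replicates, confidence=0.95):
    """Half-width of a Gaussian confidence interval on the mean of replicates."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * stdev(replicates) / sqrt(len(replicates))


# Hypothetical values: target primers amplify 1.9-fold per cycle, reference 2.0-fold.
ratio = efficiency_corrected_ratio(1.9, 2.0, 24.0, 22.0, 18.0, 18.0)

# Hypothetical technical triplicate of Cq values for one well group.
halfwidth = gaussian_ci_halfwidth([22.0, 22.1, 21.9])
```

With only three replicates a t-based interval would be noticeably wider than the Gaussian one, which is presumably why both options are offered.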

Results, export and save

Results can be printed or exported to a PDF file containing a table with all the data and plots for standard curves and/or relative quantifications. You can also save your project in the pyQPCR XML file format, which keeps the entire project, with its plates and settings, easily recoverable.

qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J. Genome Biol. 2007;8(2): R19.

qBASE Talk at the qPCR 2007 symposium

The qpcR library - Analysis of real-time PCR data using R

The qpcR library is an extension to the R environment that assists in the modelling and analysis of quantitative real-time PCR data.

With the qpcR library you can:
  • Fit sigmoidal (three-, four-, five- and six-parameter) models to the raw fluorescence data and display the curves with various options.
  • Calculate essential PCR parameters (efficiency, threshold cycles, initial template fluorescence F0) from the sigmoidal fits and display comprehensive graphics.
  • Conduct a model selection process in which the best sigmoidal model is chosen by nested F-tests on the residual variance or other criteria such as Akaike weights.
  • Derive values from more classical quantitation methods, such as the ‘window-of-linearity’ method, exponential fitting of the identified exponential region or a calibration curve from diluted samples.
  • In calibration curve analysis, find the threshold fluorescence value which maximizes the linearity of the dilution curve 'threshold cycles'.
  • Further optimize the fitting process by eliminating cycles in the ground and plateau phase, using all possible combinations.
  • Calculate many measures for the goodness-of-fit, such as the residual variance, R-squared, adjusted R-squared, Akaike Information Criterion (AIC), corrected AIC (AICc), Bayesian Information Criterion (BIC), root-mean-squared-error (RMSE) and Allen's PRESS statistic.
  • Make goodness-of-fit tests such as 'lack-of-fit' or Neill's test for nonreplicates.
  • Do a batch analysis of many runs with all methods (this often reveals dramatic differences in the estimated parameters!).
  • Predict either fluorescence or cycle values from data.
  • Calculate the goodness-of-fit (by means of RMSE) of all different sigmoidal models within the exponential region of the qPCR curve.
  • Conduct gaussian error propagation with Monte Carlo simulation using multivariate normal distributions if a covariance matrix is given.
  • Calculate ratios and their propagated errors for qPCR runs, using single or replicated data. If reference PCRs are supplied, the ratios are normalized against these.
  • Calculate ratios with a permutation approach such as in the popular REST software.
  • Build an averaged model from several housekeeping PCRs.
  • Calculate model selection measures such as Likelihood Ratios (nested) or Akaike weights (non-nested).
  • Calculate the Cy0 value as described in Guescini et al. and do a maxRatio analysis as in Shain et al.
  • Bootstrap qPCR data and obtain confidence intervals for all estimated parameters, including those from efficiency and threshold cycle analysis.
  • Simulate qPCR curves starting from a fitted curve and including defined homo/heteroscedastic noise.
  • Do automatic plotting of large-scale batch PCRs by using 3D-plots or plot matrices.
  • Identify deviating qPCR runs within a group of replicates by Kinetic Outlier Detection and non-replicated runs by Sigmoidal Outlier Detection.
  • Conduct batch ratio analysis from 96- or 384-well plates that contain different numbers of control/treatment samples or genes-of-interest/reference genes with automatic sample recognition from the column headers.
  • Do a complete melting curve analysis of qPCR runs, including graphical display of melt curves and automatic Tm identification of the products.
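To make the idea of a sigmoidal model concrete, here is a small Python sketch of a generic four-parameter logistic curve, its inverse (predicting a cycle from a fluorescence value), and the cycle at which its second derivative peaks. This is not the qpcR API and the parameterization is an assumption chosen for readability; the package's own model definitions may differ.

```python
from math import exp, log, sqrt


def logistic4(cycle, f_min, f_max, midpoint, slope):
    """Generic four-parameter logistic model of fluorescence versus cycle."""
    return f_min + (f_max - f_min) / (1 + exp(-(cycle - midpoint) / slope))


def cycle_at(fluorescence, f_min, f_max, midpoint, slope):
    """Inverse of logistic4: the cycle at which a given fluorescence is reached."""
    fraction = (fluorescence - f_min) / (f_max - f_min)
    return midpoint + slope * log(fraction / (1 - fraction))


def second_derivative_max_cycle(midpoint, slope):
    """Cycle where the second derivative of logistic4 is maximal.

    For this parameterization the maximum lies at midpoint - slope * ln(2 + sqrt(3)).
    """
    return midpoint - slope * log(2 + sqrt(3))


# Hypothetical fitted parameters: baseline 1, plateau 11, midpoint at cycle 20.
params = (1.0, 11.0, 20.0, 1.5)
threshold_cycle = cycle_at(2.0, *params)
cp_d2 = second_derivative_max_cycle(params[2], params[3])
```

Fitting the parameters to raw data requires a nonlinear least-squares routine, which is outside the scope of this sketch.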

PowerNest -  illuminating error in qPCR experiment design

PowerNest is a software tool that enables experimenters to explore the effect of sampling on noise propagation throughout qPCR assays. The sampling process is assumed to comprise a number of levels: the acquisition of a sample and the preparation of extracted material, reverse transcription of the mRNA, and the qPCR itself. Given a small set of data, representative of a larger assay, the error at each stage of the experiment is profiled using a nested ANOVA.
Armed with this information, PowerNest allows the experimenter to explore the effects of modifications to the experimental design on the expected total error of the assay.  When given the financial cost of replicates at each level, PowerNest will calculate a cost-optimal sampling-plan, delivering an experiment design that will minimise processing error and maximise the statistical resolution of the assay.
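The reasoning behind a cost-optimal sampling plan can be sketched in Python. This is not PowerNest's algorithm: it assumes a balanced three-level nested design (sample, reverse transcription, qPCR), takes the variance components as already estimated, and searches by brute force. All variances, costs, and names are hypothetical.

```python
from itertools import product


def variance_of_mean(var_sample, var_rt, var_qpcr, n_sample, n_rt, n_qpcr):
    """Variance of the overall mean in a balanced three-level nested design."""
    return (var_sample / n_sample
            + var_rt / (n_sample * n_rt)
            + var_qpcr / (n_sample * n_rt * n_qpcr))


def plan_cost(cost_sample, cost_rt, cost_qpcr, n_sample, n_rt, n_qpcr):
    """Total cost when each level is replicated within the level above it."""
    return (cost_sample * n_sample
            + cost_rt * n_sample * n_rt
            + cost_qpcr * n_sample * n_rt * n_qpcr)


def cheapest_plan(variances, costs, target_variance, max_replicates=10):
    """Search all replicate counts and return (cost, plan) for the cheapest
    plan whose variance of the mean does not exceed the target, or None."""
    best = None
    for plan in product(range(1, max_replicates + 1), repeat=3):
        if variance_of_mean(*variances, *plan) <= target_variance:
            cost = plan_cost(*costs, *plan)
            if best is None or cost < best[0]:
                best = (cost, plan)
    return best


# Hypothetical variance components and per-replicate costs (sample, RT, qPCR).
variances = (0.4, 0.1, 0.02)
costs = (10, 3, 1)
best = cheapest_plan(variances, costs, target_variance=0.1)
```

With these numbers the search favours six samples with a single RT and a single qPCR each, because sampling carries most of the variance and extra qPCR replicates cannot reduce it.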

The software is currently undergoing final testing, during which time it has been made available as a free download

PowerNest Poster

The registration of the accumulation of polymerase chain reaction (PCR) products in the course of amplification (real-time PCR) requires specific equipment, i.e., detecting amplifiers capable of recording the level of fluorescence in the reaction tube during amplicon formation. When the time of the reaction is complete, researchers are able to obtain DNA accumulation graphs. This review discusses the most promising algorithms of the analysis of real-time PCR curves and possible errors, caused by the software used or by operators' mistakes. The data included will assist researchers in understanding the features of a method to obtain more reliable results.

Evaluation of real-time PCR data.

Vaerman JL, Saussoy P, Ingargiola I.   J Biol Regul Homeost Agents. 2004 18(2): 212-214.
UCL, Cliniques Saint Luc, Bruxelles, Belgium.

If real-time PCR is to be of much worth to its user, some idea regarding the reliability of its data is essential. We discuss here some of the problems associated with interpreting numerical real-time PCR data that lend themselves to analytical evaluation. We translate into the language of molecular biology some of the criteria which are used to evaluate the performance of any new method (linearity, precision, specificity, limit of detection and quantification).

Statistical practice in high-throughput screening data analysis.

Malo N, Hanley JA, Cerquozzi S, Pelletier J, Nadon R.
Nat Biotechnol. 2006 24(2): 167-75.
McGill University and Genome Quebec Innovation Centre, 740 avenue du Docteur Penfield, Montreal, Quebec, Canada

High-throughput screening is an early critical step in drug discovery. Its aim is to screen a large number of diverse chemical compounds to identify candidate 'hits' rapidly and accurately. Few statistical tools are currently available, however, to detect quality hits with a high degree of confidence. We examine statistical aspects of data preprocessing and hit identification for primary screens. We focus on concerns related to positional effects of wells within plates, choice of hit threshold and the importance of minimizing false-positive and false-negative rates. We argue that replicate measurements are needed to verify assumptions of current methods and to suggest data analysis strategies when assumptions are not met. The integration of replicates with robust statistical methods in primary screens will facilitate the discovery of reliable hits, ultimately improving the sensitivity and specificity of the screening process.
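As one example of a robust method for hit identification, the Python sketch below scores each well with a robust z-score based on the median and the median absolute deviation (MAD). The abstract does not prescribe this particular statistic, so the choice, the threshold of 3, the function names, and the plate readings are all assumptions made for illustration.

```python
from statistics import median


def robust_z_scores(values):
    """Robust z-scores: (value - median) / (1.4826 * MAD).

    The factor 1.4826 makes the MAD comparable to a standard deviation for
    normally distributed data. A MAD of zero would cause a division by zero.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    scale = 1.4826 * mad
    return [(v - center) / scale for v in values]


def find_hits(values, threshold=3.0):
    """Indices of wells whose absolute robust z-score reaches the threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) >= threshold]


# Hypothetical plate readings with one strong inhibitor and one strong activator.
readings = [100, 102, 98, 101, 99, 55, 100, 103, 97, 150]
hits = find_hits(readings)
```

Because the median and MAD are barely affected by the two extreme wells, the remaining wells keep small scores, whereas a mean and standard deviation computed from the same plate would be inflated by the hits themselves.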