BioInformatics content - page 2:
Bioinformatics Made Easy
Search bioinformatics tools and run genomic analysis in the cloud
We are excited to invite you to beta test the InsideDNA platform, which provides:
How it works?
Currently, our service is free, and we are thrilled to provide 10 GB of storage space and 10 compute credits to each new user. These 10 credits are roughly equal to 260 hours of computational work on different compute nodes*. We hope that you will be pleasantly surprised by how much analysis you can do during these hours. In addition, if you fill in our entry survey, we will give you an extra 10 compute credits. The survey aims to make the InsideDNA application better and more user-friendly.
How will it work in the future?
While we are trying to make this service as affordable as possible for researchers, compute nodes are provided to us by a third party, and we can only keep the current service free of charge for several months and for a limited number of users. After that, we will have to charge for computing at a price of $10 USD per 10 compute credits (~260 hours of work). We only deduct credits when you actually run an analysis, not when you are idle.
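The pricing arithmetic above can be sketched as a small calculator. This is only an illustration under the figures quoted in the text (10 credits, roughly 260 hours, $10 USD); the function names are hypothetical, and the real hours per credit depend on the compute node used.

```python
# Figures taken from the text; the hours value is approximate and node-dependent.
CREDITS_PER_BUNDLE = 10
HOURS_PER_BUNDLE = 260.0
PRICE_PER_BUNDLE_USD = 10.0


def hours_for_credits(credits):
    """Approximate compute hours covered by a number of credits."""
    return credits * HOURS_PER_BUNDLE / CREDITS_PER_BUNDLE


def cost_for_hours(hours):
    """Approximate price in USD for a number of compute hours."""
    return hours * PRICE_PER_BUNDLE_USD / HOURS_PER_BUNDLE


# A new user who also fills in the survey holds 20 credits.
print(hours_for_credits(20))
# Approximate cost of 100 hours of analysis once charging starts.
print(round(cost_for_hours(100), 2))
```

Under these assumptions, one hour of computation costs a little under four US cents.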
Bugs, errors and problems
Although we have been testing InsideDNA internally for several months, it is still likely to have bugs. We therefore kindly ask you to report any issues or problems you experience with InsideDNA. Please send any feedback to this email: InsideDNA@gmail.com
Next releases and forthcoming features
Currently we are working on more exciting features including provisioning of a much bigger storage space for each user. Vote for different features in our application to get them done quicker or talk to us and suggest other features which you think may be useful!
Enjoy happy sequence crunching with InsideDNA!
GenEx offers advanced methods to analyze real-time qPCR data with simple clicks of the mouse
GenEx is a popular software for qPCR data processing and analysis. Built in a modular fashion GenEx provides a multitude of functionalities for the qPCR community, ranging from basic data editing and management to advanced cutting-edge data analysis.
Basic data editing and management
Arguably the most important part of a qPCR experiment is pre-processing the raw data into shape for subsequent statistical analyses. The pre-processing steps need to be performed consistently, in the correct order, and with confidence. The streamlined and user-friendly interface of GenEx Standard helps prevent mistakes in data handling. Intuitive and powerful presentation tools allow professional illustrations of even the most complex experimental designs.
Advanced cutting-edge data analysis
When you need more advanced analyses, GenEx 6 is the product for you. Powerful enough to demonstrate feasibility, it often proves sufficient for most users' demands. Current features include parametric and non-parametric statistical tests, Principal Component Analysis, and Artificial Neural Networks. New features are continuously added to GenEx with close attention to customers' needs.
Sample handling and the individual biology of the samples often contribute confounding experimental variability. Using the new nested ANOVA feature in GenEx, a user can evaluate the variance contribution from each step in the experimental procedure. With good knowledge of the variance contributions, an appropriate distribution of experimental replicates can be selected to minimize confounding variance and maximize the power of the experimental design. For experiments with complex features, such as multifactorial diseases, analytical relationships and classifications may not be readily available. The support vector machine feature in the new version of GenEx makes this advanced supervised classification method easily available to novice users, while providing access to advanced parameters for experts.
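To make the idea of variance contributions concrete, here is a minimal sketch for the simplest nested case: technical qPCR replicates nested within samples, with a balanced design. It is not GenEx's implementation; the function name and the Cq values are hypothetical, and real designs usually have more nesting levels (for example sampling, extraction, RT and qPCR).

```python
from statistics import mean


def nested_variance_components(groups):
    """Estimate variance components for a balanced one-level nested design.

    groups: list of equally sized lists, one list of replicate Cq values
    per sample. Returns the between-sample and within-sample variances.
    """
    a = len(groups)
    n = len(groups[0])
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 samples with the same number (>= 2) of replicates")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)  # valid because the design is balanced

    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)

    ms_within = ss_within / (a * (n - 1))
    ms_between = ss_between / (a - 1)

    # The between-sample mean square contains both components, so subtract
    # the replicate noise; clamp at zero for noisy data.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return {"between_samples": var_between, "within_samples": ms_within}


# Hypothetical Cq values: three samples, two qPCR replicates each.
cq = [[20.0, 20.2], [21.0, 21.2], [22.0, 22.2]]
print(nested_variance_components(cq))
```

When the between-sample component dominates, as in this made-up data, adding more samples reduces the overall uncertainty more than adding more technical replicates per sample.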
The methods are suitable to select and validate reference genes, classify samples, group genes, monitor time dependent processes and much more.
Please see the GenEx web page or Online Tutorials
Learn more - For further information on the analyses in GenEx, see the GenEx online help manual or www.qPCRforum.com
A survey of tools for the analysis of quantitative PCR (qPCR) data
Stephan Pabinger, Stefan Rödiger, Albert Kriegner, Klemens Vierlinger, Andreas Weinhäusel
Biomolecular Detection and Quantification 1 (2014) 23–33
Real-time quantitative polymerase-chain-reaction (qPCR) is a standard technique in most laboratories used for various applications in basic research. Analysis of qPCR data is a crucial part of the entire experiment, which has led to the development of a plethora of methods. The released tools either cover specific parts of the workflow or provide complete analysis solutions. Here, we surveyed 27 open-access software packages and tools for the analysis of qPCR data. The survey includes 8 Microsoft Windows, 5 web-based, 9 R-based and 5 tools from other platforms. Reviewed packages and tools support the analysis of different qPCR applications, such as RNA quantification, DNA methylation, genotyping, identification of copy number variations, and digital PCR. We report an overview of the functionality, features and specific requirements of the individual software tools, such as data exchange formats, availability of a graphical user interface, included procedures for graphical data presentation, and offered statistical methods. In addition, we provide an overview about quantification strategies, and report various applications of qPCR. Our comprehensive survey showed that most tools use their own file format and only a fraction of the currently existing tools support the standardized data exchange format RDML. To allow a more streamlined and comparable analysis of qPCR data, more vendors and tools need to adapt the standardized format to encourage the exchange of data between instrument software, analysis tools, and researchers.
On non-detects in qPCR data.
McCall MN, McMurray HR, Land H, Almudevar A
Bioinformatics. 2014 Aug 15;30(16): 2310-2316
MOTIVATION: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. Despite extensive research in qPCR laboratory protocols, normalization and statistical analysis, little attention has been given to qPCR non-detects: those reactions failing to produce a minimum amount of signal.
RESULTS: We show that the common methods of handling qPCR non-detects lead to biased inference. Furthermore, we show that non-detects do not represent data missing completely at random and likely represent missing data occurring not at random. We propose a model of the missing data mechanism and develop a method to directly model non-detects as missing data. Finally, we show that our approach results in a sizeable reduction in bias when estimating both absolute and differential gene expression.
AVAILABILITY AND IMPLEMENTATION: The proposed algorithm is implemented in the R package, nondetects. This package also contains the raw data for the three example datasets used in this manuscript. The package is freely available at http://mnmccall.com/software and as part of the Bioconductor project.
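The bias from common non-detect handling can be illustrated with a toy example. This sketch does not reproduce the algorithm in the nondetects package; it only compares two simple conventions (setting non-detects to the maximum cycle number, or dropping them) against made-up "true" values. The 40-cycle limit, the function names and the data are all assumptions.

```python
from statistics import mean

MAX_CYCLES = 40.0  # assumed number of PCR cycles run


def observe(true_cq, max_cycles=MAX_CYCLES):
    """Reactions whose true Cq exceeds the cycle limit become non-detects (None)."""
    return [cq if cq <= max_cycles else None for cq in true_cq]


def mean_set_to_max(observed, max_cycles=MAX_CYCLES):
    """Convention 1: replace every non-detect with the maximum cycle number."""
    return mean(max_cycles if cq is None else cq for cq in observed)


def mean_dropping(observed):
    """Convention 2: discard non-detects before averaging."""
    return mean(cq for cq in observed if cq is not None)


# Hypothetical true Cq values for five replicates of a weakly expressed gene.
true_cq = [33.0, 35.0, 37.0, 39.0, 41.0]
observed = observe(true_cq)

print("true mean:      ", mean(true_cq))
print("set to 40:      ", mean_set_to_max(observed))
print("drop non-detect:", mean_dropping(observed))
```

In this toy data both conventions give a mean Cq below the true mean of 37, so both overstate expression. The effect arises because whether a reaction is missing depends on its own value, which is the situation the abstract describes as missing not at random.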
SASqPCR: robust and rapid analysis of RT-qPCR data in SAS
PLoS One. 2012; 7(1): e29788
Reverse transcription quantitative real-time PCR (RT-qPCR) is a key method for measurement of relative gene expression. Analysis of RT-qPCR data requires many iterative computations for data normalization and analytical optimization. Currently no computer program for RT-qPCR data analysis is suitable for analytical optimization and user-controllable customization based on data quality, experimental design as well as specific research aims. Here I introduce an all-in-one computer program, SASqPCR, for robust and rapid analysis of RT-qPCR data in SAS. This program has multiple macros for assessment of PCR efficiencies, validation of reference genes, optimization of data normalizers, normalization of confounding variations across samples, and statistical comparison of target gene expression in parallel samples. Users can simply change the macro variables to test various analytical strategies, optimize results and customize the analytical processes. In addition, it is highly automatic and functionally extendable. Thus users are the actual decision-makers controlling RT-qPCR data analyses. SASqPCR and its tutorial are freely available at http://code.google.com/p/sasqpcr/downloads/list
Determinants of expression variability
Alemu EY, Carl JW Jr, Corrada Bravo H, Hannenhalli S
Nucleic Acids Res. 2014 Apr;42(6): 3503-3514
The amount of tissue-specific expression variability (EV) across individuals is an essential characteristic of a gene and believed to have evolved, in part, under functional constraints. However, the determinants and functional implications of EV are only beginning to be investigated. Our analyses based on multiple expression profiles in 41 primary human tissues show that a gene's EV is significantly correlated with a number of features pertaining to the genomic, epigenomic, regulatory, polymorphic, functional, structural and network characteristics of the gene. We found that (i) EV of a gene is encoded, in part, by its genomic context and is further influenced by the epigenome; (ii) strong promoters induce less variable expression; (iii) less variable gene loci evolve under purifying selection against copy number polymorphisms; (iv) genes that encode inherently disordered or highly interacting proteins exhibit lower variability; and (v) genes with less variable expression are enriched for house-keeping functions, while genes with highly variable expression tend to function in development and extra-cellular response and are associated with human diseases. Thus, our analysis reveals a number of potential mediators as well as functional and evolutionary correlates of EV, and provides new insights into the inherent variability in eukaryotic gene expression.
Select the right Reference gene with Genevestigator
Genevestigator is a high quality and manually curated expression database and meta-analysis system. It allows biologists to study the expression and regulation of genes in a broad variety of contexts by summarizing information from hundreds of microarray experiments into easily interpretable results. A user-friendly interface allows you to visualize gene expression in many different tissues, at multiple developmental stages, or in response to large sets of stimuli, diseases, drug treatments, or genetic modifications. This type of meta-analysis is core to understanding the spatio-temporal-response regulation of genes, to identify or validate biomarkers, and to find out which subnetworks are commonly affected in different diseases and conditions.
Screenshots Video Tutorials
Graphical user interface. The different tools are presented as icons and grouped by tool sets. The Genevestigator tools help you to find relevant conditions for your genes of interest, to find genes having special properties (e.g. biomarkers), or to identify gene expression modules that are co-regulated over selected conditions. The tools let you analyze individual experiments or thousands of experiments simultaneously.
RefGenes tool. Identification of genes having the smallest expression variance across 26,075 human samples (Affymetrix 133 Plus 2 arrays). The two boxplots in the upper section represent, as a comparison, the expression distribution of PPIA and B2M (two commonly used reference genes for RT-qPCR) across the same set of samples.
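The selection principle behind this caption, ranking genes by expression variance across samples, can be sketched in a few lines. This is not Genevestigator code; the function name, gene labels and values are hypothetical, and the input is assumed to be log-scale expression values.

```python
from statistics import variance


def most_stable_genes(expression, top_n=3):
    """Return the top_n genes with the smallest sample variance.

    expression: dict mapping a gene name to its list of log-scale
    expression values across the same set of samples.
    """
    ranked = sorted(expression, key=lambda gene: variance(expression[gene]))
    return ranked[:top_n]


# Made-up log2 expression values for three placeholder genes.
expression = {
    "gene_1": [10.0, 10.1, 9.9],
    "gene_2": [8.0, 12.0, 10.0],
    "gene_3": [5.0, 5.5, 4.5],
}
print(most_stable_genes(expression, top_n=2))
```

In practice the candidate set would first be restricted to samples relevant to the experiment, since a gene that is stable across all tissues is not necessarily the most stable one in a given condition.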
=> RefGenes tutorial
ExpressionData - A public resource of high quality curated datasets representing gene expression across anatomy, development and experimental conditions.
Zimmermann P, Bleuler S, Laule O, Martin F, Ivanov NV, Campanoni P, Oishi K, Lugon-Moulin N, Wyss M, Hruz T, Gruissem W.
BioData Min. 2014 7: 18 -- eCollection 2014.
Reference datasets are often used to compare, interpret or validate experimental data and analytical methods. In the field of gene expression, several reference datasets have been published. Typically, they consist of individual baseline or spike-in experiments carried out in a single laboratory and representing a particular set of conditions. Here, we describe a new type of standardized datasets representative for the spatial and temporal dimensions of gene expression. They result from integrating expression data from a large number of globally normalized and quality controlled public experiments. Expression data is aggregated by anatomical part or stage of development to yield a representative transcriptome for each category. For example, we created a genome-wide expression dataset representing the FDA tissue panel across 35 tissue types. The proposed datasets were created for human and several model organisms and are publicly available at http://www.expressiondata.org
A multilevel gamma-clustering layout algorithm for visualization of biological networks.
Hruz T, Wyss M, Lucas C, Laule O, von Rohr P, Zimmermann P, Bleuler S.
Adv Bioinformatics. 2013: 920325
Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ -clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs.
Global regulatory architecture of human, mouse and rat tissue transcriptomes.
Prasad A, Kumar SS, Dessimoz C, Bleuler S, Laule O, Hruz T, Gruissem W, Zimmermann P.
BMC Genomics. 2013 14: 716
BACKGROUND: Predicting molecular responses in human by extrapolating results from model organisms requires a precise understanding of the architecture and regulation of biological mechanisms across species.
RESULTS: Here, we present a large-scale comparative analysis of organ and tissue transcriptomes involving the three mammalian species human, mouse and rat. To this end, we created a unique, highly standardized compendium of tissue expression. Representative tissue specific datasets were aggregated from more than 33,900 Affymetrix expression microarrays. For each organism, we created two expression datasets covering over 55 distinct tissue types with curated data from two independent microarray platforms. Principal component analysis (PCA) revealed that the tissue-specific architecture of transcriptomes is highly conserved between human, mouse and rat. Moreover, tissues with related biological function clustered tightly together, even if the underlying data originated from different labs and experimental settings. Overall, the expression variance caused by tissue type was approximately 10 times higher than the variance caused by perturbations or diseases, except for a subset of cancers and chemicals. Pairs of gene orthologs exhibited higher expression correlation between mouse and rat than with human. Finally, we show evidence that tissue expression profiles, if combined with sequence similarity, can improve the correct assignment of functionally related homologs across species.
CONCLUSION: The results demonstrate that tissue-specific regulation is the main determinant of transcriptome composition and is highly conserved across mammalian species.
Investigation of variation in gene expression profiling of human blood by extended principle component analysis.
Xu Q, Ni S, Wu F, Liu F, Ye X, Mougin B, Meng X, Du X.
Fudan University Shanghai Cancer Center - Institut Mérieux Laboratory, Fudan University Shanghai Cancer Center, Shanghai, People's Republic of China.
PLoS One. 2011;6(10): e26905
BACKGROUND: Human peripheral blood is a promising material for biomedical research. However, various kinds of biological and technological factors result in a large degree of variation in blood gene expression profiles.
METHODOLOGY/PRINCIPAL FINDINGS: Human peripheral blood samples were drawn from healthy volunteers and analysed using the Human Genome U133Plus2 Microarray. We applied a novel approach using the Principle Component Analysis and Eigen-R(2) methods to dissect the overall variation of blood gene expression profiles with respect to the interested biological and technological factors. The results indicated that the predominating sources of the variation could be traced to the individual heterogeneity of the relative proportions of different blood cell types (leukocyte subsets and erythrocytes). The physiological factors like age, gender and BMI were demonstrated to be associated with 5.3% to 9.2% of the total variation in the blood gene expression profiles. We investigated the gene expression profiles of samples from the same donors but with different levels of RNA quality. Although the proportion of variation associated to the RNA Integrity Number was mild (2.1%), the significant impact of RNA quality on the expression of individual genes was observed.
CONCLUSIONS: By characterizing the major sources of variation in blood gene expression profiles, such variability can be minimized by modifications to study designs. Increasing sample size, balancing confounding factors between study groups, using rigorous selection criteria for sample quality, and well controlled experimental processes will significantly improve the accuracy and reproducibility of blood transcriptome study.
Download a free version of the GenEx software! Multi-dimensional qPCR data analysis via the GenEx analysis software (MultiD)
Real-time PCR gene expression profiling
Mikael Kubista, Björn Sjögreen, Amin Forootan, Radek Sindelka, Jiri Jonák and José Manuel Andrade
Real-time PCR has rapidly become the preferred technique for quantitative analysis of nucleic acids. Its superior sensitivity, reproducibility and dynamic range make it the preferred choice for expression profiling in scientific, as well as routine, applications. => Link to GenEx software
Real-Time PCR: Current Technology and Applications
Publisher: Caister Academic Press
Editor: Julie Logan, Kirstin Edwards and Nick Saunders Applied and Functional Genomics, Health Protection Agency, London (2009)
Chapter 4 - Reference Gene Validation Software for Improved Normalization
J. Vandesompele, M. Kubista and M. W. Pfaffl (2009)
Real-time PCR is the method of choice for expression analysis of a limited number of genes. The measured gene expression variation between subjects is the sum of the true biological variation and several confounding factors resulting in non-specific variation. The purpose of normalization is to remove the non-biological variation as much as possible. Several normalization strategies have been proposed, but the use of one or more reference genes is currently the preferred way of normalization. While these reference genes constitute the best possible normalizers, a major problem is that these genes have no constant expression under all experimental conditions. The experimenter therefore needs to carefully assess whether a certain reference gene is stably expressed in the experimental system under study. This is not trivial and represents a circular problem. Fortunately, several algorithms and freely available software have been developed to address this problem. This chapter aims to provide an overview of the different concepts.
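One common way to assess reference gene stability is a pairwise-variation measure in the style of geNorm: for each candidate, average the standard deviation of its log2 expression ratio against every other candidate. The sketch below illustrates that idea only; that the chapter covers this specific measure is an assumption, and the function name, gene labels and quantities are hypothetical.

```python
from math import log2
from statistics import mean, stdev


def stability_m(quantities):
    """Pairwise-variation stability value per gene (lower is more stable).

    quantities: dict mapping a gene name to relative quantities on a
    linear scale, measured in the same samples in the same order.
    """
    genes = list(quantities)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")

    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            # A pair of ideal reference genes has a constant ratio in all samples.
            log_ratios = [log2(a / b) for a, b in zip(quantities[gene], quantities[other])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_sd)
    return m_values


# Made-up relative quantities in three samples.
candidates = {
    "ref_a": [1.0, 2.0, 4.0],
    "ref_b": [2.0, 4.0, 8.0],  # constant ratio to ref_a
    "ref_c": [1.0, 4.0, 1.0],  # varies independently
}
for gene, m in sorted(stability_m(candidates).items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

Because the measure compares candidates only with each other, it sidesteps part of the circular problem mentioned above, but it can be misled by co-regulated candidates.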
Chapter 5 - Data Analysis Software
M. W. Pfaffl, J. Vandesompele and M. Kubista (2009)
Quantitative real-time RT-PCR (qRT-PCR) is widely and increasingly used in any kind of mRNA quantification, because of its high sensitivity, good reproducibility and wide dynamic quantification range. While qRT-PCR has a tremendous potential for analytical and quantitative applications, a comprehensive understanding of its underlying principles is important. Beside the classical RT-PCR parameters, e.g. primer design, RNA quality, RT and polymerase performances, the fidelity of the quantification process is highly dependent on a valid data analysis. This review will cover all aspects of data acquisition (trueness, reproducibility, and robustness), potentials in data modification and will focus particularly on relative quantification methods. Furthermore useful bioinformatical, biostatical as well as multi-dimensional expression software tools will be presented.
Current Technology and Applications - Book reviews:
"... a useful book for students ..." from J. Microbiological Methods
"provides a dual focus by aiming, in the early chapters, to provide both the theory and practicalities of this diverse and superficially simple technology, counter-balancing this in the later chapters with real-world applications, covering infectious diseases, biodefence, molecular haplotyping and food standards." from Microbiology Today
"a reference work that should be found both in university libraries and on the shelves of experienced applications specialists." from Microbiology Today
"a comprehensive guide to real-time PCR technology and its applications" from Food Science and Technology Abstracts (2009) Volume 41 Number 6
"This volume should be of utmost interest to all investigators interested and involved in using RT-PCR ... the RT-PCR protocols covered in this book will be of interest to most, if not all, investigators engaged in research that uses this important technique ... a well balanced book covering the many potential uses of real-time PCR ... valuable for all those interested in RT-PCR." from Doodys reviews (2009)
"provide the novice and the experienced user with guidance on the technology, its instrumentation, and its applications" from SciTech Book News 2009 p. 64
"... written by international authors expert in specific technical principles and applications ... a useful compendium of basic and advanced applications for laboratory scientists. It is an ideal introductory textbook and will serve as a practical handbook in laboratories where the technology is employed." from Christopher J. McIver, Microbiology Department, Prince of Wales Hospital, New South Wales, Australia writing in Australian J. Med. Sci. 2009. 30(2): 59-60
Statistical analysis of real-time PCR data.
Yuan JS, Reed A, Chen F, Stewart CN Jr. BMC Bioinformatics. 2006 (7): 85.
Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA.
BACKGROUND: Even though real-time PCR has been broadly applied in biomedical sciences, data processing procedures for the analysis of quantitative real-time PCR are still lacking; specifically in the realm of appropriate statistical treatment. Confidence interval and statistical significance considerations are not explicit in many of the current data analysis approaches. Based on the standard curve method and other useful data analysis methods, we present and compare four statistical approaches and models for the analysis of real-time PCR data.
RESULTS: In the first approach, a multiple regression analysis model was developed to derive DeltaDeltaCt from estimation of interaction of gene and treatment effects. In the second approach, an ANCOVA (analysis of covariance) model was proposed, and the DeltaDeltaCt can be derived from analysis of effects of variables. The other two models involve calculation of DeltaCt followed by a two-group t-test and the analogous non-parametric Wilcoxon test. SAS programs were developed for all four models and data output for analysis of a sample set are presented. In addition, a data quality control model was developed and implemented using SAS.
CONCLUSION: Practical statistical solutions with SAS programs were developed for real-time PCR data and a sample dataset was analyzed with the SAS programs. The analysis using the various models and programs yielded similar results. Data quality control and analysis procedures presented here provide statistical elements for the estimation of the relative expression of genes using real-time PCR.
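For readers unfamiliar with the quantities named in this abstract, the following sketch computes DeltaCt, DeltaDeltaCt and the resulting fold change. It is a plain Python illustration, not the authors' SAS programs; the function names and Ct values are hypothetical, and the fold-change formula assumes 100% amplification efficiency.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """DeltaCt per sample: target gene Ct minus reference gene Ct."""
    return [t - r for t, r in zip(target_ct, reference_ct)]


def delta_delta_ct(treated_dct, control_dct):
    """DeltaDeltaCt: mean DeltaCt of treated minus mean DeltaCt of control."""
    return mean(treated_dct) - mean(control_dct)


def fold_change(ddct):
    """Relative expression, assuming the product doubles every cycle."""
    return 2 ** (-ddct)


# Made-up Ct values, three biological replicates per group.
control_dct = delta_ct([25.0, 25.2, 24.8], [18.0, 18.1, 17.9])
treated_dct = delta_ct([23.0, 23.1, 22.9], [18.0, 18.0, 18.0])

ddct = delta_delta_ct(treated_dct, control_dct)
print("DeltaDeltaCt:", round(ddct, 3))
print("fold change: ", round(fold_change(ddct), 3))
```

The per-sample DeltaCt lists are the values on which a two-group t-test or Wilcoxon test would then be run to attach a significance level to the fold change.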
Data Analysis Methods
There are two equally valid methods for analyzing data obtained from real-time PCR: the Relative Standard Curve Method and the Comparative CT Method. The relative standard curve method is useful for investigators who have a limited number of cDNA samples and a large number of genes of interest. The comparative CT method is useful for investigators who have a large number of cDNA samples and a limited number of genes of interest (RRC Core Genomics Facility, University of Illinois at Chicago).
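As a rough illustration of the relative standard curve method, the sketch below fits CT against log10 of the input quantity for a dilution series, derives the amplification efficiency from the slope, and reads an unknown sample off the curve. The function names and the dilution data are hypothetical; this is a sketch of the general technique, not the facility's protocol.

```python
from math import log10


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency(slope):
    """Amplification efficiency; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the quantity of an unknown sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Made-up 10-fold dilution series (arbitrary units) and measured Ct values.
slope, intercept = fit_standard_curve([1000, 100, 10, 1], [20.0, 23.32, 26.64, 29.96])
print("slope:", round(slope, 2), "efficiency:", round(efficiency(slope), 3))
print("unknown at Ct 25:", round(quantity_from_ct(25.0, slope, intercept), 1))
```

Each gene needs its own standard curve, which is why this method costs more wells per gene than the comparative CT method but does not rely on an assumed efficiency.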
qPCR Bioinformatik: Neue Entwicklungen in der post-qPCR Datenanalyse (qPCR bioinformatics: new developments in post-qPCR data analysis; article in German)
Michael W. Pfaffl (2006), Laborwelt (1): 10-13, ISSN 1611–0854 (Editor: T. Gabrielczyk)
The development of the polymerase chain reaction (PCR) in the 1980s is undoubtedly one of the greatest achievements in molecular biology. Classical PCR allows highly sensitive qualitative and semi-quantitative detection of gene segments or DNA fragments. To quantify specific mRNA, the PCR is preceded by reverse transcription (RT). The use of RT-PCR to quantify specific mRNA has become a routine tool in expression analysis. The results obtained are of disproportionately high value in molecular biology research and molecular diagnostics, in comparative expression analysis, and in elucidating "functional genomics".
Detection can be performed qualitatively in classical thermocyclers or quantitatively in real time by real-time PCR (qPCR). The results are available immediately, so the use of qPCR saves considerable time. Because the increase in fluorescence and the amount of newly synthesized PCR product are proportional to each other over a wide range, the starting amount of DNA or RNA can be determined from the fluorescence data obtained. A prerequisite for reliable quantitative detection is a working analytical procedure and data evaluation that deliver exact quantification results with sufficient accuracy and high repeatability.
QPCR DEMO - real-time PCR data management and analysis
Developed by - Stephan Pabinger http://genome.tugraz.at/QPCR or https://esus.genome.tugraz.at/rtpcr
QPCR is a versatile web-based Java application for storing, managing, analyzing, and displaying data from quantitative real-time polymerase chain reaction (qPCR) experiments. You can try out the application by using the demo account at QPCR Demo.
It is strongly recommended to use a private account which guarantees confidentiality and security of your data.
To request an account please contact email@example.com
To get started:
Read the tutorial which leads you through all important steps of the application.
For more information download the user guide which covers all aspects of the application.
QPCR: Application for real-time PCR data management and analysis.
Pabinger S, Thallinger GG, Snajder R, Eichhorn H, Rader R, Trajanoski Z.
BMC Bioinformatics 2009, 10:268
BACKGROUND: Since its introduction quantitative real-time polymerase chain reaction (qPCR) has become the standard method for quantification of gene expression. Its high sensitivity, large dynamic range, and accuracy led to the development of numerous applications with an increasing number of samples to be analyzed. Data analysis consists of a number of steps, which have to be carried out in several different applications. Currently, no single tool is available which incorporates storage, management, and multiple methods covering the complete analysis pipeline.
RESULTS: QPCR is a versatile web-based Java application that allows to store, manage, and analyze data from relative quantification qPCR experiments. It comprises a parser to import generated data from qPCR instruments and includes a variety of analysis methods to calculate cycle-threshold and amplification efficiency values. The analysis pipeline includes technical and biological replicate handling, incorporation of sample or gene specific efficiency, normalization using single or multiple reference genes, inter-run calibration, and fold change calculation. Moreover, the application supports assessment of error propagation throughout all analysis steps and allows conducting statistical tests on biological replicates. Results can be visualized in customizable charts and exported for further investigation.
CONCLUSION: We have developed a web-based system designed to enhance and facilitate the analysis of qPCR experiments. It covers the complete analysis workflow combining parsing, analysis, and generation of charts into one single application. The system is freely available at http://genome.tugraz.at/QPCR
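Two of the pipeline steps listed in this abstract, gene-specific efficiency and normalization against multiple reference genes, can be sketched as follows. This is not code from the QPCR application; it assumes the widely used model in which each gene's relative quantity is its amplification factor raised to the Ct difference from a calibrator sample, and the normalization factor is the geometric mean of the reference genes. All names and values are hypothetical.

```python
from statistics import geometric_mean  # available from Python 3.8


def relative_quantity(ct_sample, ct_calibrator, amplification_factor=2.0):
    """Quantity relative to the calibrator; 2.0 means 100% efficiency."""
    return amplification_factor ** (ct_calibrator - ct_sample)


def normalized_expression(target, references):
    """Target quantity divided by the geometric mean of reference quantities.

    target and each entry of references are tuples of
    (ct_sample, ct_calibrator, amplification_factor).
    """
    normalization_factor = geometric_mean(
        [relative_quantity(*reference) for reference in references]
    )
    return relative_quantity(*target) / normalization_factor


# Made-up Ct values: the target appears 2 cycles earlier than in the
# calibrator, and both reference genes appear 1 cycle earlier.
target = (22.0, 24.0, 2.0)
references = [(17.0, 18.0, 2.0), (19.0, 20.0, 2.0)]
print(round(normalized_expression(target, references), 3))
```

With these values the raw target quantity is 4 and the normalization factor is 2, so the normalized fold change is about 2. The geometric mean is used because quantities are multiplicative, which keeps a single outlying reference gene from dominating the factor.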
The qpcR library is an extension to the R environment that assists in the modelling and analysis of quantitative real-time PCR data => http://www.dr-spiess.de/qpcR.html
With the qpcR library you can:
PowerNest - illuminating error in qPCR experiment design