Supervised analysis of gene expression data

1 Supervised analysis of gene expression data
Bing Zhang
Department of Biomedical Informatics, Vanderbilt University

2 Gene expression
- Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product.
- For a specific cell at a specific time, only a subset of the genes encoded in the genome is expressed.
- Transcriptional control is critical in gene expression regulation.
- Measuring mRNA expression levels can:
  - provide a good indicator of the corresponding protein expression level
  - provide insight into the mechanisms of transcriptional regulation
[Figure: graph courtesy of Wikipedia]

3 Candidate gene approach vs high-throughput approach
[Figure: Northern blot of chalcone synthase, protein kinase and actin at 0, 10m, 30m, 1h, 3h, 6h and 24h, next to a microarray time course at 10m, 30m, 1h, 3h, 6h and 24h]
- Advantages of high-throughput technologies
  - High throughput
  - Exploratory analysis
  - Relationships between genes or between samples
- Challenges in high-throughput technologies
  - Cost
  - Data analysis

4 High-throughput transcriptome profiling approaches
- Transcriptome: the set of all messenger RNA (mRNA) molecules, or "transcripts", produced in one cell or a population of cells.
- Hybridization-based approaches: fluorescently labeled cDNA is incubated with microarrays, and the hybridization signal is measured.
  - cDNA microarrays (printed arrays)
  - High-density oligo arrays (synthesized arrays)
- Sequencing-based approaches: the cDNA sequence is determined directly, and the count is measured.
  - Sanger sequencing of cDNA or EST libraries
  - Serial Analysis of Gene Expression (SAGE)
  - Massively Parallel Signature Sequencing (MPSS)
  - RNA-Seq

5 Microarray: two-color vs single-color
[Figure: two panels. Two-color arrays: insert amplification by PCR with vector-specific or gene-specific primers, printing, coupling, denaturing; target preparation from cells/tissue via total RNA and first-strand cDNA synthesis, Cy3- or Cy5-labelled cDNA, mixing, hybridization, ratio Cy5/Cy3. Single-color arrays: probe sets with perfect match and mismatch probes made by in situ synthesis by photolithography; target preparation from cells/tissue via polyA+ RNA, cDNA synthesis, double-stranded cDNA, in vitro transcription, biotin-labelled cRNA, hybridization, staining, ratio array 1/array 2.]
Figure 1 Schematic overview of probe array and target preparation for spotted cDNA microarrays and high-density oligonucleotide microarrays.
a, cDNA microarrays. Array preparation: inserts from cDNA collections or libraries (such as IMAGE libraries) are amplified using either vector-specific or gene-specific primers. PCR products are printed at specified sites on glass slides using high-precision arraying robots. Through the use of chemical linkers, selective covalent attachment of the coding strand to the glass surface can be achieved. Target preparation: RNA from ...
b, High-density oligonucleotide microarrays. Array preparation: sequences of short oligonucleotides (typically 25mers) are chosen from the mRNA reference sequence of each gene, often representing the most unique part of the transcript in the 5'-untranslated region. Light-directed, in situ oligonucleotide synthesis is used to generate high-density probe arrays containing over 300,000 individual elements. Target preparation: polyA+ RNA from different tissues or cell populations is used to ... intensities and ratios of mRNA abundance for the genes represented on the array.
Source: Schulze and Downward, Nature Cell Biol, 3:E190, 2001
Applied Bioinformatics, Spring 2011

6 Overall workflow of a microarray study
- Biological question
- Experiment design
- Microarray experiment
- Image analysis
- Pre-processing
- Data analysis
- Experimental verification
- Hypothesis

7 Data matrix
[Table: example expression data matrix. Columns are samples (six shown), rows are genes (one row per probe set). The cell values were garbled during extraction and are not recoverable.]

8 Three major goals of gene expression studies
- Class comparison (supervised analysis), e.g. disease biomarker discovery
  - Differential expression analysis
  - Input: gene expression data, class labels of the samples
  - Output: differentially expressed genes
- Class detection (unsupervised analysis), e.g. patient subgroup detection
  - Clustering analysis
  - Input: gene expression data
  - Output: groups of similar samples or genes
- Class prediction (supervised learning), e.g. disease diagnosis and prognosis
  - Machine learning techniques
  - Input: gene expression data, class labels of the samples (training data)
  - Output: prediction model

9 Data preprocessing I: missing value imputation
- Replace with zeros
  - Replace all missing values with 0
- Replace with row averages
  - Replace missing values with the mean of the available values in each row (gene)
- KNN imputation
  - Estimate missing values from the K nearest neighbors
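
The three imputation strategies on this slide can be sketched in plain Python. This is a minimal illustration, not the method of any particular tool: the function names, the use of `None` for a missing value, the root-mean-square distance over shared columns, and the toy matrix are all assumptions made for the example.

```python
import math

def impute_zero(matrix):
    """Replace every missing value (None) with 0."""
    return [[0.0 if v is None else v for v in row] for row in matrix]

def impute_row_mean(matrix):
    """Replace missing values with the mean of the observed values in the same row (gene)."""
    result = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        row_mean = sum(observed) / len(observed) if observed else 0.0
        result.append([row_mean if v is None else v for v in row])
    return result

def impute_knn(matrix, k=2):
    """Replace a missing value with the mean of that column in the k most similar rows.

    Similarity is measured on the columns observed in both rows (an assumed,
    simplified distance); rows that lack the needed column are skipped.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    result = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = [
                (distance(row, other), other[j])
                for r, other in enumerate(matrix)
                if r != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                result[i][j] = sum(nearest) / len(nearest)
    return result

# Hypothetical toy matrix: rows are genes, columns are samples.
example = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
print(impute_zero(example)[0])
print(impute_row_mean(example)[0])
print(impute_knn(example, k=2)[0])
```

In the toy matrix the first gene resembles the second and third, so KNN imputation borrows their values rather than the unrelated fourth gene's.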

10 Data preprocessing II: normalization
- Goal: to make arrays comparable
- Adjust the arrays using control or housekeeping genes that are expected to have the same intensity level across all samples
- Adjust using spike-in controls
- Multiply each array by a constant so that the mean (or median) intensity is the same for every array (global normalization)
- Match the percentiles of each array (quantile normalization)
[Figure: three panels titled "No normalization", "Global normalization" and "Quantile normalization"]
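
Global and quantile normalization can be sketched as follows. This is an illustrative sketch under stated assumptions: the matrix layout (rows are genes, columns are arrays), the function names and the toy values are hypothetical; intensities are assumed positive; and ties in quantile normalization are broken by row order, which is a simplification.

```python
from statistics import mean, median

def global_normalize(matrix, center=median):
    """Scale each array (column) by a constant so all arrays share the same center."""
    cols = [list(c) for c in zip(*matrix)]
    centers = [center(c) for c in cols]
    target = mean(centers)  # common center that every array is scaled to
    scaled = [[v * target / cen for v in c] for c, cen in zip(cols, centers)]
    return [list(row) for row in zip(*scaled)]

def quantile_normalize(matrix):
    """Give every array (column) the same distribution of values."""
    cols = [list(c) for c in zip(*matrix)]
    n = len(matrix)
    sorted_cols = [sorted(c) for c in cols]
    # Reference distribution: mean across arrays of the values at each rank.
    reference = [mean(sc[rank] for sc in sorted_cols) for rank in range(n)]
    normalized = []
    for c in cols:
        order = sorted(range(n), key=lambda i: c[i])
        new_col = [0.0] * n
        for rank, idx in enumerate(order):
            new_col[idx] = reference[rank]
        normalized.append(new_col)
    return [list(row) for row in zip(*normalized)]

# Hypothetical intensities: 3 genes (rows) measured on 3 arrays (columns).
arrays = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 6.0, 8.0],
]
print(global_normalize(arrays))
print(quantile_normalize(arrays))
```

After global normalization the arrays share a median but may still differ in spread; after quantile normalization every array contains exactly the same set of values, only assigned to different genes.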

11 Data preprocessing III: transformation
- Goal: to make the data more closely meet the assumptions of a statistical inference procedure
- Log transformation to improve normality
[Figure: two histograms, "Histogram of a" and "Histogram of log(a)"; x-axes a and log(a), y-axis Frequency]
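
As a small illustration of why the log transformation helps, the sketch below compares the skewness of raw and log2-transformed values. The intensities are made-up numbers chosen to double at each step, and the function names and the optional `offset` parameter are assumptions for the example.

```python
import math
from statistics import mean, pstdev

def log2_transform(values, offset=0.0):
    """Apply a log2 transformation; `offset` can be used to avoid log(0)."""
    return [math.log2(v + offset) for v in values]

def skewness(values):
    """Population skewness: 0 for a symmetric distribution, > 0 for a long right tail."""
    m = mean(values)
    s = pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

# Hypothetical raw intensities with a long right tail.
intensities = [50, 100, 200, 400, 800, 1600, 3200]
logged = log2_transform(intensities)

print(round(skewness(intensities), 2))  # clearly positive: right-skewed
print(round(abs(skewness(logged)), 2))  # close to 0: symmetric after log2
```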

12 Differential expression
[Table: the example expression data matrix (rows are genes, columns are samples) with the sample columns split into a Case group and a Control group. The cell values were garbled during extraction and are not recoverable.]

13 Fold change
- n-fold change
  - Arbitrarily selected fold change cut-offs
  - Usually 2-fold
- Pros
  - Intuitive
  - Simple and rapid
- Cons
  - Statistically inefficient
  - Magnitude does not necessarily indicate importance
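
A fold-change filter can be sketched as below. The sketch assumes the expression values are already log2-transformed, so a 2-fold change corresponds to an absolute difference of 1 between group means; the function names and the two example genes are hypothetical.

```python
import math
from statistics import mean

def log2_fold_change(case, control):
    """Difference of group means on the log2 scale (case minus control)."""
    return mean(case) - mean(control)

def passes_cutoff(case, control, fold=2.0):
    """True if the change is at least `fold` in either direction."""
    return abs(log2_fold_change(case, control)) >= math.log2(fold)

# Hypothetical log2 expression values, three replicates per group.
gene_a = ([9.0, 9.2, 9.1], [7.0, 7.1, 6.9])  # about a 4-fold increase
gene_b = ([7.3, 7.2, 7.4], [7.0, 7.1, 6.9])  # small change

for name, (case, control) in {"gene_a": gene_a, "gene_b": gene_b}.items():
    print(name, round(log2_fold_change(case, control), 2), passes_cutoff(case, control))
```

The filter looks only at the difference of means and ignores the variability within each group, which is why the slide calls it statistically inefficient.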

14 Statistical analysis: hypothesis testing
[Table: the example expression data matrix (rows are genes, columns are samples) with Case and Control groups. The cell values were garbled during extraction and are not recoverable.]
- A statistical hypothesis is an assumption about a population parameter, e.g. the group mean.
- Null hypothesis: $H_0: \mu_1 = \mu_2$
- Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$

15 Statistical analysis: comparing means of two groups
- Parametric method: Student's t-test
  - Assumes normal distribution of the data
- Non-parametric method: Mann-Whitney U test
  - Does not rely on data belonging to any particular distribution
  - Based on ranks of observations
[Figure: two example plots for GeneX. First panel: t-test p=0.06; U test p=0.1. Second panel: t-test p=0.32; U test p=0.1.]
- Student's t-test vs Mann-Whitney U test
  - Robustness: the U test is more robust to outliers.
  - Efficiency: when normality holds, the efficiency of the U test is about 0.95 compared to the t-test. For distributions sufficiently far from normal and for sufficiently large sample sizes, the U test can be considerably more efficient than the t-test.
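
The two tests can be sketched with the standard library alone. This is a minimal sketch, not a replacement for a statistics package: it computes the pooled-variance Student's t statistic (the p-value would come from a t distribution with n1 + n2 - 2 degrees of freedom, which is not computed here) and an exact two-sided Mann-Whitney p-value by enumerating all group assignments, which is only practical for small samples. Function names and data are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, variance

def student_t(a, b):
    """Pooled-variance Student's t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def u_statistic(a, b):
    """Number of (x, y) pairs with x > y; ties count as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

def mann_whitney_exact(a, b):
    """U statistic and exact two-sided p-value by enumerating all relabelings."""
    n1, n2 = len(a), len(b)
    pooled = list(a) + list(b)
    center = n1 * n2 / 2
    observed = abs(u_statistic(a, b) - center)
    extreme = total = 0
    for idx in combinations(range(n1 + n2), n1):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(n1 + n2) if i not in chosen]
        total += 1
        if abs(u_statistic(group_a, group_b) - center) >= observed - 1e-12:
            extreme += 1
    return u_statistic(a, b), extreme / total

# Hypothetical log2 expression values, three replicates per group.
case = [9.0, 9.2, 9.1]
control = [7.0, 7.1, 6.9]
case_with_outlier = [9.0, 9.2, 30.0]

print(student_t(case, control), mann_whitney_exact(case, control))
print(student_t(case_with_outlier, control), mann_whitney_exact(case_with_outlier, control))
```

With three replicates per group there are only 20 possible group assignments, so the smallest attainable two-sided exact p-value is 2/20 = 0.1, the value shown for the U test on this slide. The outlier leaves the rank-based result unchanged but sharply reduces the t statistic, which illustrates the robustness point.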

16 Statistical tests for different types of comparisons

GOAL \ DATA                              | Continuous/normal     | Rank                   | Nominal
Compare two unpaired groups              | Unpaired t-test       | Mann-Whitney test      | Fisher's exact test or chi-square test
Compare two paired groups                | Paired t-test         | Wilcoxon test          | McNemar's test
Compare three or more groups             | One-way ANOVA         | Kruskal-Wallis test    | Chi-square test
Association to quantitative phenotypes   | Pearson's correlation | Spearman's correlation | Contingency coefficients

17 Correction for multiple testing: why?
- Each row (gene) of the data matrix is a test.
- In an experiment with a 10,000-gene array in which the significance level is set at 0.05, about 10,000 x 0.05 = 500 genes would be expected to be called significant even if none is differentially expressed.
- Assuming independent tests, the probability of drawing the wrong conclusion in at least one of the n different tests is
  $P(\text{wrong}) = 1 - (1 - \alpha_s)^n = \alpha_g$
  where $\alpha_s$ is the significance level at the single-gene level and $\alpha_g$ is the global significance level.
[Table: the example expression data matrix, annotated with n, $\alpha_s$ and $\alpha_g$. The cell values were garbled during extraction and are not recoverable.]
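
The arithmetic on this slide can be checked with a few lines of Python. The function names are hypothetical, and the global error formula assumes the n tests are independent.

```python
def expected_false_positives(n_tests, alpha_s):
    """Expected number of false positives when no gene is truly differentially expressed."""
    return n_tests * alpha_s

def global_error_rate(n_tests, alpha_s):
    """alpha_g = 1 - (1 - alpha_s)^n: probability of at least one false positive."""
    return 1 - (1 - alpha_s) ** n_tests

print(expected_false_positives(10_000, 0.05))
for n in (1, 14, 100, 10_000):
    print(n, global_error_rate(n, 0.05))
```

Under these assumptions the chance of at least one false positive already exceeds one half with as few as 14 tests at a per-test level of 0.05.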

18 Correction for multiple testing: how?
- Control the family-wise error rate (FWER), the probability of at least one type I error in the entire set (family) of hypotheses tested.
  - e.g. standard Bonferroni correction: corrected p value = uncorrected p value x number of genes tested
- Control the false discovery rate (FDR), the expected proportion of false positives among the rejected hypotheses.
  - e.g. Benjamini and Hochberg correction:
    - Rank all genes according to their p value, smallest first.
    - Pick a desired FDR level, q (e.g. 5%).
    - Find the largest rank i for which $p_{(i)} \le \frac{i}{m} q$ and accept all genes ranked at or above it, where i is the rank of the gene in the sorted list and m is the total number of genes tested.
[Table: worked example with columns p, Bonferroni, Rank (i), q, (i/m)*q, significant?; the values were not preserved in extraction.]
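
Both corrections can be sketched in a few lines. The sketch uses the standard step-up form of the Benjamini and Hochberg procedure (reject every hypothesis up to the largest rank i with p_(i) <= (i/m) q); the function names and the list of p-values are made up for illustration.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: p times the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def benjamini_hochberg(pvalues, q=0.05):
    """Return one True/False flag per input p-value (True = significant at FDR level q)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            cutoff_rank = rank  # remember the largest rank that satisfies the bound
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Hypothetical raw p-values for 10 genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

bonf_significant = [p <= 0.05 for p in bonferroni(pvals)]
bh_significant = benjamini_hochberg(pvals, q=0.05)
print(sum(bonf_significant), sum(bh_significant))
```

On these made-up values Bonferroni keeps one gene while the FDR procedure keeps two, reflecting that FDR control is the less conservative of the two criteria.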

19 Resources
- Data sources
  - Gene Expression Omnibus (GEO)
  - ArrayExpress
- Microarray data analysis tools
  - Bioconductor
  - Expression Profiler

20 Summary
- Three major goals of gene expression studies
  - Class comparison
  - Class detection
  - Class prediction
- Gene expression data pre-processing steps
  - Missing data imputation
  - Normalization
  - Transformation
- Statistical tests for two-group comparative studies
  - Student's t-test
  - Mann-Whitney U test
- Multiple-test adjustment
  - Control the family-wise error rate (FWER)
  - Control the false discovery rate (FDR)

21 Exercise
- Data set: james_west_2005_hne_6h_60vs0.txt (or james_west_2005_hne_6h_60vs0_head100.txt)
  - probe sets (or the top 100 probe sets)
  - Two groups (HNE0 and HNE60, three replicates in each group)
  - No missing values; already normalized; already log transformed
- Use the t-test in Expression Profiler or Excel to identify genes that are differentially expressed between the two groups.
- Apply multiple-test adjustment to the raw p-values.
[Table: excerpt of the data set (rows are probe sets, columns are the six samples). The cell values were garbled during extraction and are not recoverable.]

Gene expression analysis. Ulf Leser and Karin Zimmermann

Gene expression analysis. Ulf Leser and Karin Zimmermann Gene expression analysis Ulf Leser and Karin Zimmermann Ulf Leser: Bioinformatics, Wintersemester 2010/2011 1 Last lecture What are microarrays? - Biomolecular devices measuring the transcriptome of a

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Gene Expression Analysis

Gene Expression Analysis Gene Expression Analysis Jie Peng Department of Statistics University of California, Davis May 2012 RNA expression technologies High-throughput technologies to measure the expression levels of thousands

More information

Analysis of gene expression data. Ulf Leser and Philippe Thomas

Analysis of gene expression data. Ulf Leser and Philippe Thomas Analysis of gene expression data Ulf Leser and Philippe Thomas This Lecture Protein synthesis Microarray Idea Technologies Applications Problems Quality control Normalization Analysis next week! Ulf Leser:

More information

Microarray Technology

Microarray Technology Microarrays And Functional Genomics CPSC265 Matt Hudson Microarray Technology Relatively young technology Usually used like a Northern blot can determine the amount of mrna for a particular gene Except

More information

Tutorial for proteome data analysis using the Perseus software platform

Tutorial for proteome data analysis using the Perseus software platform Tutorial for proteome data analysis using the Perseus software platform Laboratory of Mass Spectrometry, LNBio, CNPEM Tutorial version 1.0, January 2014. Note: This tutorial was written based on the information

More information

Measuring gene expression (Microarrays) Ulf Leser

Measuring gene expression (Microarrays) Ulf Leser Measuring gene expression (Microarrays) Ulf Leser This Lecture Gene expression Microarrays Idea Technologies Problems Quality control Normalization Analysis next week! 2 http://learn.genetics.utah.edu/content/molecules/transcribe/

More information

How many of you have checked out the web site on protein-dna interactions?

How many of you have checked out the web site on protein-dna interactions? How many of you have checked out the web site on protein-dna interactions? Example of an approximately 40,000 probe spotted oligo microarray with enlarged inset to show detail. Find and be ready to discuss

More information

Molecular Genetics: Challenges for Statistical Practice. J.K. Lindsey

Molecular Genetics: Challenges for Statistical Practice. J.K. Lindsey Molecular Genetics: Challenges for Statistical Practice J.K. Lindsey 1. What is a Microarray? 2. Design Questions 3. Modelling Questions 4. Longitudinal Data 5. Conclusions 1. What is a microarray? A microarray

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS)

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS) Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS) A typical RNA Seq experiment Library construction Protocol variations Fragmentation methods RNA: nebulization,

More information

Statistical issues in the analysis of microarray data

Statistical issues in the analysis of microarray data Statistical issues in the analysis of microarray data Daniel Gerhard Institute of Biostatistics Leibniz University of Hannover ESNATS Summerschool, Zermatt D. Gerhard (LUH) Analysis of microarray data

More information

Introduction to SAGEnhaft

Introduction to SAGEnhaft Introduction to SAGEnhaft Tim Beissbarth October 13, 2015 1 Overview Serial Analysis of Gene Expression (SAGE) is a gene expression profiling technique that estimates the abundance of thousands of gene

More information

Data Acquisition. DNA microarrays. The functional genomics pipeline. Experimental design affects outcome data analysis

Data Acquisition. DNA microarrays. The functional genomics pipeline. Experimental design affects outcome data analysis Data Acquisition DNA microarrays The functional genomics pipeline Experimental design affects outcome data analysis Data acquisition microarray processing Data preprocessing scaling/normalization/filtering

More information

From Reads to Differentially Expressed Genes. The statistics of differential gene expression analysis using RNA-seq data

From Reads to Differentially Expressed Genes. The statistics of differential gene expression analysis using RNA-seq data From Reads to Differentially Expressed Genes The statistics of differential gene expression analysis using RNA-seq data experimental design data collection modeling statistical testing biological heterogeneity

More information

Essentials of Real Time PCR. About Sequence Detection Chemistries

Essentials of Real Time PCR. About Sequence Detection Chemistries Essentials of Real Time PCR About Real-Time PCR Assays Real-time Polymerase Chain Reaction (PCR) is the ability to monitor the progress of the PCR as it occurs (i.e., in real time). Data is therefore collected

More information

Introduction To Real Time Quantitative PCR (qpcr)

Introduction To Real Time Quantitative PCR (qpcr) Introduction To Real Time Quantitative PCR (qpcr) SABiosciences, A QIAGEN Company www.sabiosciences.com The Seminar Topics The advantages of qpcr versus conventional PCR Work flow & applications Factors

More information

REAL TIME PCR USING SYBR GREEN

REAL TIME PCR USING SYBR GREEN REAL TIME PCR USING SYBR GREEN 1 THE PROBLEM NEED TO QUANTITATE DIFFERENCES IN mrna EXPRESSION SMALL AMOUNTS OF mrna LASER CAPTURE SMALL AMOUNTS OF TISSUE PRIMARY CELLS PRECIOUS REAGENTS 2 THE PROBLEM

More information

PreciseTM Whitepaper

PreciseTM Whitepaper Precise TM Whitepaper Introduction LIMITATIONS OF EXISTING RNA-SEQ METHODS Correctly designed gene expression studies require large numbers of samples, accurate results and low analysis costs. Analysis

More information

Analyzing Research Data Using Excel

Analyzing Research Data Using Excel Analyzing Research Data Using Excel Fraser Health Authority, 2012 The Fraser Health Authority ( FH ) authorizes the use, reproduction and/or modification of this publication for purposes other than commercial

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Gene Expression Assays

Gene Expression Assays APPLICATION NOTE TaqMan Gene Expression Assays A mpl i fic ationef ficienc yof TaqMan Gene Expression Assays Assays tested extensively for qpcr efficiency Key factors that affect efficiency Efficiency

More information

Recombinant DNA and Biotechnology

Recombinant DNA and Biotechnology Recombinant DNA and Biotechnology Chapter 18 Lecture Objectives What Is Recombinant DNA? How Are New Genes Inserted into Cells? What Sources of DNA Are Used in Cloning? What Other Tools Are Used to Study

More information

Real-Time PCR Vs. Traditional PCR

Real-Time PCR Vs. Traditional PCR Real-Time PCR Vs. Traditional PCR Description This tutorial will discuss the evolution of traditional PCR methods towards the use of Real-Time chemistry and instrumentation for accurate quantitation. Objectives

More information

Correlation of microarray and quantitative real-time PCR results. Elisa Wurmbach Mount Sinai School of Medicine New York

Correlation of microarray and quantitative real-time PCR results. Elisa Wurmbach Mount Sinai School of Medicine New York Correlation of microarray and quantitative real-time PCR results Elisa Wurmbach Mount Sinai School of Medicine New York Microarray techniques Oligo-array: Affymetrix, Codelink, spotted oligo-arrays (60-70mers)

More information

SPSS Explore procedure

SPSS Explore procedure SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

More information

Quantitative proteomics background

Quantitative proteomics background Proteomics data analysis seminar Quantitative proteomics and transcriptomics of anaerobic and aerobic yeast cultures reveals post transcriptional regulation of key cellular processes de Groot, M., Daran

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Frequently Asked Questions Next Generation Sequencing

Frequently Asked Questions Next Generation Sequencing Frequently Asked Questions Next Generation Sequencing Import These Frequently Asked Questions for Next Generation Sequencing are some of the more common questions our customers ask. Questions are divided

More information

Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant

Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant Statistical Analysis NBAF-B Metabolomics Masterclass Mark Viant 1. Introduction 2. Univariate analysis Overview of lecture 3. Unsupervised multivariate analysis Principal components analysis (PCA) Interpreting

More information

Quality Assessment of Exon and Gene Arrays

Quality Assessment of Exon and Gene Arrays Quality Assessment of Exon and Gene Arrays I. Introduction In this white paper we describe some quality assessment procedures that are computed from CEL files from Whole Transcript (WT) based arrays such

More information

Basic Analysis of Microarray Data

Basic Analysis of Microarray Data Basic Analysis of Microarray Data A User Guide and Tutorial Scott A. Ness, Ph.D. Co-Director, Keck-UNM Genomics Resource and Dept. of Molecular Genetics and Microbiology University of New Mexico HSC Tel.

More information

Statistics in Medicine Research Lecture Series CSMC Fall 2014

Statistics in Medicine Research Lecture Series CSMC Fall 2014 Catherine Bresee, MS Senior Biostatistician Biostatistics & Bioinformatics Research Institute Statistics in Medicine Research Lecture Series CSMC Fall 2014 Overview Review concept of statistical power

More information

New Technologies for Sensitive, Low-Input RNA-Seq. Clontech Laboratories, Inc.

New Technologies for Sensitive, Low-Input RNA-Seq. Clontech Laboratories, Inc. New Technologies for Sensitive, Low-Input RNA-Seq Clontech Laboratories, Inc. Outline Introduction Single-Cell-Capable mrna-seq Using SMART Technology SMARTer Ultra Low RNA Kit for the Fluidigm C 1 System

More information

ALLEN Mouse Brain Atlas

ALLEN Mouse Brain Atlas TECHNICAL WHITE PAPER: QUALITY CONTROL STANDARDS FOR HIGH-THROUGHPUT RNA IN SITU HYBRIDIZATION DATA GENERATION Consistent data quality and internal reproducibility are critical concerns for high-throughput

More information

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data The Illumina TopHat Alignment and Cufflinks Assembly and Differential Expression apps make RNA data analysis accessible to any user, regardless

More information

RT 2 Profiler PCR Array: Web-Based Data Analysis Tutorial

RT 2 Profiler PCR Array: Web-Based Data Analysis Tutorial RT 2 Profiler PCR Array: Web-Based Data Analysis Tutorial Samuel J. Rulli, Jr., Ph.D. qpcr-applications Scientist Samuel.Rulli@QIAGEN.com Pathway Focused Research from Sample Prep to Data Analysis! -2-

More information

FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem

FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem Elsa Bernard Laurent Jacob Julien Mairal Jean-Philippe Vert September 24, 2013 Abstract FlipFlop implements a fast method for de novo transcript

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

Analyzing microrna Data and Integrating mirna with Gene Expression Data in Partek Genomics Suite 6.6

Analyzing microrna Data and Integrating mirna with Gene Expression Data in Partek Genomics Suite 6.6 Analyzing microrna Data and Integrating mirna with Gene Expression Data in Partek Genomics Suite 6.6 Overview This tutorial outlines how microrna data can be analyzed within Partek Genomics Suite. Additionally,

More information
