NEXT GENERATION SEQUENCING Dr. R. Piazza
SANGER SEQUENCING + DNA
NEXT GENERATION SEQUENCING Flowcell
NEXT GENERATION SEQUENCING DNA library Genomic DNA
NEXT GENERATION SEQUENCING The 4 nucleotides, labeled with fluorochromes and blocked at the 3' end, are added simultaneously. Sequencing primer. Labeled and blocked nucleotides.
NEXT GENERATION SEQUENCING IMAGE ACQUISITION. REMOVAL OF THE 3' BLOCK. REMOVAL OF THE FLUOROPHORE.
HIGH-THROUGHPUT SEQUENCING
HiSeq2500: 2 FLOWCELLS, 8 Lanes/Flowcell, 250 * 10^6 Clusters/Lane, 2 x 125bp/Cluster (Paired-end)
Throughput = 2 * 8 * 250 * 10^6 * 2 * 125bp = 1 000 000 000 000 bp = 1 Tb (Terabase)!
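The throughput arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name and parameter names are chosen here for illustration, and the numbers are the ones quoted on the slide.

```python
def run_throughput_bp(flowcells, lanes_per_flowcell, clusters_per_lane,
                      reads_per_cluster, read_length_bp):
    """Total bases produced by one run, as the plain product of the slide's factors."""
    return (flowcells * lanes_per_flowcell * clusters_per_lane
            * reads_per_cluster * read_length_bp)


# Values from the slide: 2 flowcells, 8 lanes each, 250 * 10^6 clusters per lane,
# paired-end reads (2 reads per cluster) of 125 bp each.
hiseq_bp = run_throughput_bp(2, 8, 250 * 10**6, 2, 125)
hiseq_tb = hiseq_bp / 10**12  # 1 Tb = 10^12 bases

print(hiseq_bp, "bp =", hiseq_tb, "Tb")
```

Keeping every factor as an explicit argument makes it easy to see how the total changes with read length or with single-end runs (reads_per_cluster = 1).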
SANGER SEQ vs. NGS: THROUGHPUT, COST PER BASE
Allele #1: CAGCGACAGCAGCATTGGGAC
Allele #2: CAGCGACAGCGGCATTGGGAC
Coverage = 5 [Figure: NGS Reads #1 to #5 aligned over Allele #1 and Allele #2; each read carries either A or G at the variant position]
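As a minimal sketch of what "Coverage = 5" means, the snippet below counts the bases that a set of reads shows at the position where the two alleles differ. The two allele sequences are taken from the slide. The assignment of the five reads to the alleles (three of Allele #1, two of Allele #2) is an assumption made for illustration, since the figure layout is not recoverable from the text, and the function names are hypothetical.

```python
from collections import Counter

ALLELE_1 = "CAGCGACAGCAGCATTGGGAC"
ALLELE_2 = "CAGCGACAGCGGCATTGGGAC"


def variant_positions(seq_a, seq_b):
    """0-based positions at which two equally long sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def pileup(reads, position):
    """Count the bases observed at one position across all reads covering it."""
    return Counter(read[position] for read in reads if position < len(read))


# Hypothetical read set: full-length reads, 3 from Allele #1 and 2 from Allele #2.
reads = [ALLELE_1, ALLELE_2, ALLELE_1, ALLELE_2, ALLELE_1]

pos = variant_positions(ALLELE_1, ALLELE_2)[0]
counts = pileup(reads, pos)
coverage = sum(counts.values())

print("variant position:", pos)
print("coverage:", coverage, "base counts:", dict(counts))
```

With Sanger sequencing the two alleles are read as one mixed signal, whereas each NGS read contributes one discrete observation, which is why the per-position counts can be tallied this way.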
HIGH-THROUGHPUT SEQUENCING: APPLICATIONS
DNA: GENOMIC DNA SEQUENCING, RESEQUENCING, DE NOVO SEQUENCING, WHOLE-EXOME SEQUENCING, ChIP-Seq, ULTRADEEP SEQUENCING, METHYL-SEQ
RNA: mRNA SEQUENCING, TRANSCRIPTOME SEQUENCING (RNA-SEQ), TAG SEQUENCING (DITAG), MICRO-RNA STUDIES
WHOLE-GENOME, WHOLE-EXOME AND ULTRADEEP-SEQUENCING [Figure: COVERAGE for WHOLE-GENOME and for ULTRADEEP-SEQ]
ULTRADEEP SEQUENCING: WHEN? [Figure: ABL kinase domain with two positions marked M]
COVERAGE ULTRADEEP-SEQ
WHOLE-EXOME SEQUENCING
VARIANT CALLING: SINGLE NUCLEOTIDE POLYMORPHISM / VARIANT
[Figure: reads from a CASE SAMPLE and from a CONTROL SAMPLE aligned to the reference ...ACTGAATTGCTGATTGTCAAGTCTGCTAGCG...]
MUTATION, SEQ ERROR OR SNP?
VarScan 2 (http://massgenomics.org/varscan) Koboldt DC et al., Genome Res. 2012 Mar;22(3):568-76
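The question "MUTATION, SEQ ERROR OR SNP?" can be illustrated with a toy classifier that compares variant-supporting read counts in the case and control samples. This is a simplified sketch and not VarScan 2's implementation: the thresholds `min_vaf` and `alpha`, the function names and the category labels are all assumptions made for illustration. It uses a one-sided Fisher's exact test, written with the standard library only.

```python
from math import comb


def fisher_one_sided(case_var, case_ref, ctrl_var, ctrl_ref):
    """P-value for seeing at least `case_var` variant reads in the case sample
    if variant reads were spread at random between case and control
    (hypergeometric tail)."""
    n_case = case_var + case_ref
    n_var = case_var + ctrl_var
    total = n_case + ctrl_var + ctrl_ref
    upper = min(n_case, n_var)
    tail = sum(comb(n_var, k) * comb(total - n_var, n_case - k)
               for k in range(case_var, upper + 1))
    return tail / comb(total, n_case)


def classify_variant(case_var, case_ref, ctrl_var, ctrl_ref,
                     min_vaf=0.10, alpha=0.05):
    """Toy decision rule; min_vaf and alpha are illustrative thresholds."""
    case_depth = case_var + case_ref
    ctrl_depth = ctrl_var + ctrl_ref
    if case_depth == 0 or ctrl_depth == 0:
        raise ValueError("both samples need at least one read at the position")
    case_vaf = case_var / case_depth  # variant allele frequency in the case
    ctrl_vaf = ctrl_var / ctrl_depth  # variant allele frequency in the control
    if ctrl_vaf >= min_vaf:
        return "germline variant (SNP)"   # also present in the control
    if case_vaf < min_vaf:
        return "likely sequencing error"  # too rare in both samples
    p = fisher_one_sided(case_var, case_ref, ctrl_var, ctrl_ref)
    return "somatic mutation" if p < alpha else "uncertain"


print(classify_variant(10, 10, 0, 20))  # variant only in the case sample
print(classify_variant(10, 10, 9, 11))  # variant in both samples
print(classify_variant(1, 49, 0, 50))   # a single discordant read
```

The control sample is what separates the three outcomes: a variant seen in both samples is inherited, one seen only in the case at a convincing frequency is acquired, and one seen in a handful of reads is most likely noise.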
WHOLE-EXOME SEQUENCING GOES DIGITAL [Figure panels: CASE, CONTROL]
LOSS OF HETEROZYGOSITY / ALLELIC IMBALANCE [Figure panels: CASE, CONTROL; reads carrying alleles A and T]
WHOLE-EXOME SEQUENCING GOES DIGITAL: CEQer COMPARATIVE EXONIC QUANTIFICATION ANALYZER Piazza R. et al., PLoS One. 2013 Oct 4;8(10):e74825
Statistical module: Wilcoxon Signed-Rank test
Test statistic W:
$$W = \sum_{i=1}^{N_r} \operatorname{sgn}\left(x_i^{(\mathrm{case})} - x_i^{(\mathrm{control})}\right) R_i$$
As the sample size increases (N_r > 10), the distribution of W converges to a Gaussian, so a Z-score can be computed!
Estimating the error function of the normal distribution of W using the Abramowitz and Stegun approximation (equation 7.1.26):
$$\operatorname{erf}(x) \approx 1 - \left(a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + a_5 t^5\right) e^{-x^2}, \qquad t = \frac{1}{1 + p x}$$
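The statistical module described on this slide can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the published CEQer code: the Z-score uses the standard normal approximation z = W / sigma_W with sigma_W = sqrt(N_r (N_r + 1) (2 N_r + 1) / 6), zero differences are dropped, ties receive average ranks, and the function names are hypothetical. The constants are those of Abramowitz and Stegun equation 7.1.26.

```python
import math

# Abramowitz and Stegun 7.1.26 constants (maximum absolute error about 1.5e-7).
_P = 0.3275911
_A = (0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429)


def erf_as(x):
    """Approximate erf(x); the formula holds for x >= 0 and is extended by symmetry."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + _P * x)
    poly = sum(a * t ** (k + 1) for k, a in enumerate(_A))
    return sign * (1.0 - poly * math.exp(-x * x))


def wilcoxon_signed_rank(case, control):
    """Return (W, z, two-sided p) for paired case/control values."""
    diffs = [c - k for c, k in zip(case, control) if c != k]  # drop zero differences
    n_r = len(diffs)
    if n_r == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n_r), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n_r
    i = 0
    while i < n_r:
        j = i
        while j + 1 < n_r and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w = sum(math.copysign(r, d) for r, d in zip(ranks, diffs))
    sigma_w = math.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6.0)
    z = w / sigma_w
    p_two_sided = 1.0 - erf_as(abs(z) / math.sqrt(2.0))
    return w, z, p_two_sided


# Hypothetical paired coverage values for 12 exons (case consistently higher).
case_values = [float(v) for v in range(1, 13)]
control_values = [0.0] * 12
w, z, p = wilcoxon_signed_rank(case_values, control_values)
print(w, round(z, 3), round(p, 5))
```

The two-sided p-value follows from the normal CDF: 2 (1 - Phi(|z|)) equals 1 - erf(|z| / sqrt(2)), which is why only the error function needs to be approximated.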
CML-BC PATIENT: CML001BC [Figure: Log2 Ratio along Chr9; CDKN2A (p16)]
CML-BC PATIENT: CML004BC [Figure: Chr17; p53] http://www.ngsbicocca.org/html/ceqer.html
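The Log2 Ratio shown for these patients compares exon coverage in the case sample with the matched control. A minimal sketch of such a ratio is given below. The normalization by total coverage and the pseudocount are assumptions made for illustration and are not necessarily what CEQer does; the function name and the coverage values are hypothetical.

```python
import math


def log2_ratios(case_cov, control_cov, pseudocount=1.0):
    """Per-exon log2(case / control) after scaling the case to the control's total.

    The pseudocount avoids log(0) and division by zero for uncovered exons.
    """
    case_total = sum(case_cov)
    control_total = sum(control_cov)
    if case_total == 0:
        raise ValueError("case sample has no coverage")
    scale = control_total / case_total
    return [math.log2((c * scale + pseudocount) / (k + pseudocount))
            for c, k in zip(case_cov, control_cov)]


# Hypothetical coverage for 10 exons: the last one has lost half of its reads in the case.
control = [100.0] * 10
case = [100.0] * 9 + [50.0]
ratios = log2_ratios(case, control)
print([round(r, 2) for r in ratios])
```

A heterozygous deletion halves the coverage and gives a ratio close to -1, while a complete loss, as one would expect for a homozygous deletion, drives the ratio strongly negative.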
ANALYSIS OF ONCOGENIC FUSION PRODUCTS
ANALYSIS OF ONCOGENIC FUSION PRODUCTS: FRAGMENTATION?
RNA-seq: DRIVER FUSION TRANSCRIPTS IDENTIFICATION
Junction reads, Bridge reads (paired reads, 76bp + 76bp)
Piazza R. et al., Nucleic Acids Res. 2012 Sep;40(16):e123
ALIGNMENT TO HUMAN GENOME -> SAM / BAM
EXOME DATASET (CCDS / REFFLAT) -> EXOME BUILDER
ABNORMAL PAIRS SCANNER -> ABNORMAL PAIRS (e.g. BCR ex14 / ABL ex2) and HALF-MAPPED PAIRS (one mate mapped to the Genome, the other ???)
ABNORMAL PAIRS -> PUTATIVE TRANSLOCATIONS SET (PTS)
PREFILTERING ALGORITHM: Read Quality, Mapping Quality, Homology Filter, Threshold Filter, N Filter
OUTPUT: FILTERED HALF-MAPPED PAIRS, FILTERED PTS
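The scanning and prefiltering stages can be sketched on already annotated read pairs. This is a toy illustration and not the published tool's implementation: the `ReadPair` record, the `min_mapq` and `min_support` thresholds and the example pairs are all hypothetical, and only the mapping-quality and threshold filters are modelled.

```python
from collections import Counter
from typing import NamedTuple, Optional


class ReadPair(NamedTuple):
    name: str
    gene1: Optional[str]  # gene hit by mate 1, None if the mate is unmapped
    gene2: Optional[str]  # gene hit by mate 2, None if the mate is unmapped
    mapq: int             # mapping quality of the pair (simplified to one value)


def scan_pairs(pairs, min_mapq=30, min_support=2):
    """Return (putative translocations, half-mapped pair names)."""
    abnormal = Counter()
    half_mapped = []
    for pair in pairs:
        if pair.mapq < min_mapq:
            continue  # mapping quality filter
        if pair.gene1 is None and pair.gene2 is None:
            continue  # nothing mapped, nothing to learn
        if pair.gene1 is None or pair.gene2 is None:
            half_mapped.append(pair.name)  # candidate junction-spanning pair
        elif pair.gene1 != pair.gene2:
            abnormal[tuple(sorted((pair.gene1, pair.gene2)))] += 1
    # Threshold filter: keep gene pairs supported by enough abnormal pairs.
    pts = {genes: n for genes, n in abnormal.items() if n >= min_support}
    return pts, half_mapped


pairs = [
    ReadPair("r1", "BCR", "ABL1", 60),
    ReadPair("r2", "ABL1", "BCR", 60),
    ReadPair("r3", "BCR", "ABL1", 45),
    ReadPair("r4", "BCR", "BCR", 60),    # normal pair
    ReadPair("r5", "BCR", None, 60),     # half-mapped pair
    ReadPair("r6", "TP53", "MYC", 10),   # removed by the mapping quality filter
    ReadPair("r7", "TP53", "MYC", 60),   # single supporting pair, below threshold
]
filtered_pts, filtered_half_mapped = scan_pairs(pairs)
print(filtered_pts, filtered_half_mapped)
```

Requiring several independent abnormal pairs for the same gene pair is what removes most alignment artifacts before the more expensive junction search.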
FILTERED PTS -> JUNCTION FINDER (BCR: Ex12, Ex13, Ex14; ABL: Ex2, Ex3, Ex4) -> JUNCTIONS LIST (1, 2, 3, 4)
FILTERED HALF-MAPPED PAIRS -> ALIGNMENT ALGORITHM -> JUNCTION READ spanning the BCR Ex14 / ABL Ex2 JUNCTION
JUNCTION READ -> FRAME ALGORITHM, DIRECTION ALGORITHM, RECIPROCAL TRANSLOCATION ALGORITHM (5' BCR-ABL 3' and 5' ABL-BCR 3')
AML1-ETO t(8;21)
BCR-ABL1 p190 t(9;22)
BCR-ABL1 p210 e13a2 t(9;22)
BCR-ABL1 p210 e14a2 t(9;22)
CBFB-MYH11 inv(16)
CEP110-FGFR1 t(8;9)
EWSR1-ERG t(21;22)
MLL-MLLT1 t(11;19)
MLL-MLLT3 t(9;11)
MLLT10-PICALM t(10;11)
NCOA4-RET inv(10)
NPM-ALK t(2;5)
RNA-Seq: RNA-SEQ GOES DIGITAL
[Figure: READs mapped on EXONs for a HIGH EXPRESSION and a LOW EXPRESSION gene]
RPKM = READS PER KILOBASE PER MILLION MAPPED READS
TPM = TRANSCRIPTS PER MILLION
TOPHAT (http://tophat.cbcb.umd.edu/)
CUFFLINKS (http://cufflinks.cbcb.umd.edu/)
Trapnell C, et al. Nat. Biotechnol. 2010;28:511-515.
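The two expression units named on this slide can be computed from raw read counts and transcript lengths as sketched below. The function names and the example counts are hypothetical, and by default the total number of mapped reads is assumed to be the sum of the counts passed in.

```python
def rpkm(counts, lengths_bp, total_mapped_reads=None):
    """Reads per kilobase of transcript per million mapped reads."""
    if total_mapped_reads is None:
        total_mapped_reads = sum(counts)
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return [c * 1e9 / (length * total_mapped_reads)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]


# Hypothetical counts for three transcripts.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 1000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print([round(v) for v in rpkm_values])
print([round(v) for v in tpm_values])
```

The first two transcripts get the same value in both units even though one has twice the reads, because it is also twice as long. TPM values always sum to one million within a sample, which makes them easier to compare across samples than RPKM.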
HIGH-THROUGHPUT SEQUENCING: APPLICATIONS
DNA: GENOMIC DNA SEQUENCING, RESEQUENCING, DE NOVO SEQUENCING, WHOLE-EXOME SEQUENCING, ChIP-Seq, DEEP SEQUENCING, METHYL-SEQ
RNA: mRNA SEQUENCING, TRANSCRIPTOME SEQUENCING (RNA-SEQ), TAG SEQUENCING (DITAG), MICRO-RNA STUDIES
METHYL-SEQ
NEXT GENERATION SEQUENCING
WITH NGS INSTRUMENTS WE CAN SEQUENCE ENTIRE GENOMES
THE PROTOCOLS FOR MOST APPLICATIONS ARE STANDARDIZED
MANY ANALYSIS TOOLS ARE FREE AND OPEN-SOURCE
TODAY FILE FORMATS ARE ALSO WELL STANDARDIZED
ARE WE IN AN IDEAL SITUATION??
HIGH-THROUGHPUT SEQUENCING: TOMORROW
PATIENT: COMPLETE BLOOD COUNT, BLOOD CHEMISTRY TESTS, CULTURE TESTS, INSTRUMENTAL EXAMINATIONS, GENOME SEQUENCING