Bioinformatics. Dr. Marco Fondi. Lecture # 6. Degree Course in Biological Sciences, Academic Year 2012-2013




Lecture date: Tuesday, 30 October 2012

Genome sequencing and analysis: genomics


Next Generation Sequencing

Platforms: 454 (Roche); SOLiD (ABI); Solexa (Illumina); Pacific Biosciences.

Desktop instruments: Ion Torrent; 454 GS Junior; Illumina MiSeq.


http://commons.wikimedia.org/wiki/

http://www.nature.com/

3 Main Technologies. SOLiD

http://www.dkfz.de/gpcf/850.html

Credit: Illumina

http://www.dkfz.de/gpcf/850.html

http://www.illumina.com/technology/

http://www.dkfz.de/gpcf/849.html

http://www.flickr.com/photos/doe_jgi/

Rothberg JM, Leamon JH. The development and impact of 454 sequencing. Nature Biotechnology 26, 1117-1124 (2008). Published online 9 October 2008. doi:10.1038/nbt1485

Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biol. 2009; 10(3): R32. Published online 27 March 2009. doi:10.1186/gb-2009-10-3-r32

Metzker ML. Sequencing technologies: the next generation. Nature Reviews Genetics 11, 31-46 (January 2010). doi:10.1038/nrg2626

Storage

http://blogs.forbes.com/sciencebiz/2010/06/03/your-genome-is-

Genome Biol. 2010;11(5):207. Epub 2010 May 5. The case for cloud computing in genome informatics.

http://www.flickr.com/photos/esquimo_2ooo/

http://www.flickr.com/photos/jpf/152611490/

http://www.cloudera.com/what-is-hadoop/hadoop-overview/

FASTQ

@IL31_4368:1:1:996:8507/2
TCCCTTACCCCCAAGCTCCATACCCTCCTAATGCCCACACCTCTTACCTTAGGA
+
FFCEFFFEEFFFFFFFEFFEFFFEFCFC<EEFEFFFCEFF<;EEFF=FEE?FCE
@IL31_4368:1:1:996:21421/2
CAAAAACTTTCACTTTACCTGCCGGGTTTCCCAGTTTACATTCCACTGTTTGAC
+
>DBDDB,B9BAA4AAB7BB?7BBB=91;+*@;5<87+*=/*@@?9=73=.7)7*
@IL31_4368:1:1:997:10572/2
GATCTTCTGTGACTGGAAGAAAATGTGTTACATATTACATTTCTGTCCCCATTG
+
E?=EECE<EEEE98EEEEAEEBD??BE@AEAB><EEABCEEDEC<<EBDA=DEE
@IL31_4368:1:1:997:15684/2
CAGCCTCAGATTCAGCATTCTCAAATTCAGCTGCGGCTGAAACAGCAGCAGGAC
+
EEEEDEEE9EAEEDEEEEEEEEEECEEAAEEDEE<CD=D=*BCAC?;CB,<D@,
@IL31_4368:1:1:997:15249/2
AATGTTCTGAAACCTCTGAGAAAGCAAATATTTATTTTAATGAAAAATCCTTAT
+
EDEEC;EEE;EEE?EECE;7AEEEEEE07EECEA;D6D>+EE4E7EEE4;E=EA
@IL31_4368:1:1:997:6273/2
ACATTTACCAAGACCAAAGGAAACTTACCTTGCAAGAATTAGACAGTTCATTTG
+
EEAAFFFEEFEFCFAFFAFCCFFEFEF>EFFFFB?ABA@ECEE=<F@DE@DDF;
@IL31_4368:1:1:997:1657/2
CCCACCTCTCTCAATGTTTTCCATATGGCAGGGACTCAGCACAGGTGGATTAAT
(...)
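The FASTQ records above follow a four-line layout: a header starting with "@", the bases, a "+" separator, and one quality character per base. A minimal Python sketch of reading that layout follows. The names `FastqRecord`, `parse_fastq` and `phred_scores` are illustrative and not from the lecture, and the decoding assumes the Sanger convention (ASCII offset 33).

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    name: str      # header line without the leading "@"
    sequence: str  # base calls
    quality: str   # one quality character per base


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four non-empty lines."""
    rows = [line.strip() for line in lines if line.strip()]
    # A trailing group with fewer than four lines (a truncated record) is skipped.
    for i in range(0, len(rows) - 3, 4):
        header, sequence, separator, quality = rows[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at row {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters, assuming the Sanger ASCII offset of 33."""
    return [ord(char) - offset for char in quality]


# First record from the slide.
EXAMPLE = """\
@IL31_4368:1:1:996:8507/2
TCCCTTACCCCCAAGCTCCATACCCTCCTAATGCCCACACCTCTTACCTTAGGA
+
FFCEFFFEEFFFFFFFEFFEFFFEFCFC<EEFEFFFCEFF<;EEFF=FEE?FCE
"""

records = list(parse_fastq(EXAMPLE.splitlines()))
first = records[0]
print(first.name, len(first.sequence), phred_scores(first.quality)[:5])
```

The sketch reads everything into memory for simplicity; for real data a streaming parser from an established library would be the more usual choice.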

Solexa/Illumina Read Format

The syntax of the Solexa/Illumina read format is almost identical to the FASTQ format, but the qualities are scaled differently. Given a quality character $sq, the following Perl code gives the Phred quality $Q:

$Q = 10 * log(1 + 10 ** ((ord($sq) - 64) / 10.0)) / log(10);

http://maq.sourceforge.net/fastq.shtml
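The same conversion can be written in Python as a small sketch. The function name `solexa_char_to_phred` is an assumption made for illustration; the formula is the one quoted above, with the Solexa score taken as the character code minus 64.

```python
import math


def solexa_char_to_phred(sq: str, offset: int = 64) -> float:
    """Convert one Solexa/Illumina quality character to a Phred quality."""
    solexa_q = ord(sq) - offset  # Solexa-scaled score stored in the character
    # Q = 10 * log10(1 + 10^(sQ / 10)), as in the Perl line above.
    return 10 * math.log10(1 + 10 ** (solexa_q / 10.0))


for char in "@Jh":
    print(char, round(solexa_char_to_phred(char), 2))
```

For high scores the two scales nearly coincide (the character "h", Solexa score 40, gives a Phred quality very close to 40); they differ mainly at low quality.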

454 strategy

Genomics: the 4 A's (the 4 phases)
1. Assembly
2. Assignment
3. Annotation
4. Analysis (comparative genomics)

(Slides in this part are dated Friday, 17 December 2010.)

From genome to genome sequence: genome, sample, sequencing platform (454, Roche; ABI's SOLiD Method; Solexa, Illumina), reads, genome sequence.

Genome, reads, assembly: the reads are assembled into contigs A, B and C.
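As a rough illustration of the step from reads to contigs, here is a toy greedy overlap merge in Python. It is a teaching sketch and not the method used by any particular assembler: it assumes error-free reads on a single strand, the read strings are invented, and `overlap_length` and `greedy_assemble` are hypothetical names.

```python
from typing import List


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: List[str], min_overlap: int = 3) -> List[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap_length(a, b, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge: these are the contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Invented reads: three overlap in a chain, two overlap separately.
toy_reads = ["ATGGCGT", "GCGTACG", "TACGGAT", "CCTTAAC", "TAACCGA"]
toy_contigs = sorted(greedy_assemble(toy_reads))
print(toy_contigs)
```

Reads that share no sufficient overlap stay in separate contigs, which is why an assembly usually ends with several contigs rather than one sequence.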

The contigs are separated by gaps, and their order is not known: A - GAP - B - GAP - C, or B - GAP - A - GAP - C, or ...
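To see how quickly the number of candidate layouts grows, the possible contig orders can be enumerated with a short Python sketch. The function name `candidate_orders` is illustrative, and the sketch ignores contig orientation (strand), which would add further alternatives.

```python
from itertools import permutations
from typing import List, Sequence


def candidate_orders(contigs: Sequence[str]) -> List[str]:
    """List every possible left-to-right order of the contigs."""
    return [" - GAP - ".join(order) for order in permutations(contigs)]


orders = candidate_orders(["A", "B", "C"])
for layout in orders:
    print(layout)
```

Three contigs already give six orders, and n contigs give n! orders, so extra evidence (overlapping reads or a reference genome) is needed to choose among them.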

Gap closure 1: contigs A and B, and the reads. Is there an overlap (no / yes)? If yes, A and B are joined into an extended contig.
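The overlap test in this first gap closure strategy can be sketched in Python as follows. This is a simplified model under stated assumptions: exact matches only, a single read spanning the whole gap, and invented sequences. The names `suffix_prefix_overlap` and `close_gap_with_read` are hypothetical.

```python
from typing import List, Optional


def suffix_prefix_overlap(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def close_gap_with_read(contig_a: str, contig_b: str, reads: List[str],
                        min_overlap: int = 4) -> Optional[str]:
    """Join A and B through a read that overlaps the end of A and the start of B."""
    for read in reads:
        into_read = suffix_prefix_overlap(contig_a, read, min_overlap)
        out_of_read = suffix_prefix_overlap(read, contig_b, min_overlap)
        # Both overlaps must exist and must not run into each other inside the read.
        if into_read and out_of_read and into_read + out_of_read <= len(read):
            return contig_a + read[into_read:] + contig_b[out_of_read:]
    return None  # no read bridges the gap: it stays open


contig_a = "ATGGCGTACG"
contig_b = "TTAGCCATGA"
extended = close_gap_with_read(contig_a, contig_b, ["CCCCCCCC", "TACGGATTTAGC"])
print(extended)
```

In this toy case the bridging read supplies the three missing bases between the two contigs; when no read overlaps both ends the function returns None, which corresponds to the "no" branch.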

Gap closure 2, using a reference genome:
1. Contigs A and B are placed on the reference genome.
2. The reads are placed on the reference genome.
3. On the reference genome, A and B are joined into an extended contig.
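The three steps of the reference-based strategy can be mimicked with a toy Python sketch. It uses exact string search in place of real alignment, which is a strong simplification, and all sequences and the name `close_gap_with_reference` are invented for illustration.

```python
from typing import List, Optional


def close_gap_with_reference(reference: str, contig_a: str, contig_b: str,
                             reads: List[str]) -> Optional[str]:
    """Fill the gap between A and B with reads placed on a reference genome."""
    # Step 1: place contigs A and B on the reference (exact match only).
    pos_a = reference.find(contig_a)
    pos_b = reference.find(contig_b)
    if pos_a < 0 or pos_b < 0:
        return None
    gap_start, gap_end = pos_a + len(contig_a), pos_b
    if gap_start > gap_end:
        return None  # this sketch only handles B lying downstream of A
    # Step 2: place the reads on the reference and record the gap bases they cover.
    gap: List[Optional[str]] = [None] * (gap_end - gap_start)
    for read in reads:
        start = reference.find(read)
        if start < 0:
            continue  # read does not match the reference exactly
        for offset, base in enumerate(read):
            index = start + offset - gap_start
            if 0 <= index < len(gap):
                gap[index] = base
    # Step 3: join A and B only if every gap position is covered by a read.
    if any(base is None for base in gap):
        return None
    return contig_a + "".join(base for base in gap if base is not None) + contig_b


reference = "ATGGCGTACGGATTTAGCCATGA"
joined = close_gap_with_reference(reference, "ATGGCGTACG", "TTAGCCATGA",
                                  ["TACGGATTTAGC"])
print(joined)
```

The gap bases are taken from the reads rather than copied from the reference, so the extended contig still reflects the sequenced sample; the reference only provides the order and spacing of A and B.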

Draft genome: 1 2 3 4 5 6. Complete genome.

(Slides in this part are dated Thursday, 3 November 2011.)

Completely sequenced genome