Biencome Design and Genebank - Aet bio speetal


M2 BIM - Genomes, Genetics and Evolution
Anatomy and annotation of genomes
Tuesday 20 October, lecture from 9:00 to 12:00, room M2 BIM
Genetics, Genome and Evolution, 13 October 2010

Genome sizes: The C-value paradox

Definitions:
1) Genome: the entire DNA content of a cell
 - chromosomes
 - plasmids
 - mitochondrial DNA
 - chloroplast DNA
2) Gene: an informative DNA sequence composed of a transcribed region and a regulatory sequence
3) ORF (open reading frame): a DNA sequence between two STOP codons. It is presumed to be the sequence of a protein-coding gene
4) CDS (coding sequence): a DNA sequence between a START and a STOP codon
5) Intron: an RNA sequence spliced out of the precursor RNA (pre-mRNA)
   Exon: an RNA sequence retained in the mature RNA after splicing; in protein-coding genes, the exons carry the coding sequence
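The difference between definitions 3 (ORF) and 4 (CDS) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard genetic code stop codons (TAA, TAG, TGA), ATG as the only start codon, and a single forward reading frame. The function names `find_orfs` and `find_cds` are hypothetical and do not come from the lecture.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
START_CODON = "ATG"                  # assumption: ATG is the only start codon


def find_orfs(seq, frame=0):
    """Return (start, end) spans of stop-free codon runs between two in-frame stops.

    Coordinates are 0-based and end-exclusive; the flanking stops are excluded.
    """
    seq = seq.upper()
    orfs = []
    prev_stop_end = None  # index just after the previous in-frame stop codon
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if prev_stop_end is not None and i > prev_stop_end:
                orfs.append((prev_stop_end, i))
            prev_stop_end = i + 3
    return orfs


def find_cds(seq, frame=0):
    """Return (start, end) spans from the first in-frame ATG to the next in-frame stop.

    The span includes both the start codon and the stop codon.
    """
    seq = seq.upper()
    cds = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == START_CODON:
            start = i
        elif start is not None and codon in STOP_CODONS:
            cds.append((start, i + 3))
            start = None
    return cds


example = "TAAGCTATGAAACCCTGAGGG"
print(find_orfs(example))  # stop-to-stop span, includes the GCT codon before ATG
print(find_cds(example))   # start-to-stop span
```

On the example sequence the ORF is longer than the CDS because it also covers the codon lying between the upstream stop and the ATG. A full gene finder would repeat the scan over the three frames of each strand.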

Gene content: The G-value paradox

Number of chromosomes per haploid genome:
 - S. pombe: n = 3
 - Arabidopsis: n = 5
 - S. cerevisiae: n = 16
 - Human: n = 23
 - Tobacco: n = 36
 - Kiwi: n = 98
 - Fern: n > 500

Estimated gene number:
 - Homo sapiens: ~ 20,000
 - Drosophila melanogaster: ~ 13,000
 - Arabidopsis thaliana: ~ 25,000
 - Caenorhabditis elegans: ~ 19,000
 - Paramecium tetraurelia: ~ 40,000
 - Saccharomyces cerevisiae: ~ 6,000
 - Escherichia coli
[Figure: genome sizes (Mbp)]

Genome content: no correlation between complexity and
 - genome size
 - number of genes
 - number of chromosomes

I) Unique sequences
 - protein-encoding genes
 - RNA genes (RNase P, TLC1, ...)
II) Repeated sequences
 - transposable elements
 - satellite DNA
 - protein-encoding genes
 - RNA genes (tDNA, rDNA, etc.)
Repeated sequences make up 25-60% of vertebrate genomes, about 50% of the human genome, and up to 80% of the genome of plants or amphibians.

Gene duplications
Most eukaryotic genomes contain a high proportion of duplicated sequences.
Duplicated genes (S. c., A. t., C. e., D. m., H. s.): 43%, 65%, 49%, 40%, 50%

Fates of duplicated genes:
 - Pseudogenization (degeneration): most frequent fate; 278 in yeast (Lafontaine et al. 2004)
 - Neofunctionalization: gain of a new function
 - Conservation: gene dosage increase, genetic robustness
 - Complementation: specialization of the 2 copies

Homologs, orthologs (co-orthologs), paralogs (in-paralogs, out-paralogs)
[Figure: gene trees tracing ancestral genes A, B and C into species 1 and species 2 through speciation, duplication and loss events. Copies separated by a speciation are orthologs; copies from a duplication that predates the speciation are out-paralogs; copies from a duplication that follows the speciation are in-paralogs. Last panel: after loss of B1, A2 and C2, species 1 keeps A1 and C1 and species 2 keeps B2.1 and B2.2.]

Organization and structure of protein-coding genes in eukaryotes

[Figure: a long run of random letters in which fragments of the poem below are embedded; the fragments alone read:]

Les amoureux fervents et les savants austères
Aiment également, dans leur mûre saison,
Les chats puissants et doux, orgueil de la maison,
Qui comme eux sont frileux et comme eux sédentaires.
Amis de la science et de la volupté, /
Ch. Baudelaire, «Les chats»

Structure of eukaryotic coding genes:
Eukaryotic mRNAs are modified at their 5' and 3' ends
 - 5' cap
 - poly-A tail at the 3' end
Eukaryotic genes give rise to multiple protein products
 - alternative splicing
 - alternative promoters
 - alternative terminators
[Figure: gene with multiple promoters, alternatively spliced introns and multiple terminators]
Correlation between complexity and protein diversity? About [...] proteins in human

Each human chromosome contains tens of millions of base pairs

[Figure: several thousand bases of raw DNA sequence, beginning TCGCGCGTTTCGGTGATGACGGTGAAAACC...]

Chromosome organization: telomere (TTAGGG)n - subtelomere - centromere - subtelomere - telomere (TTAGGG)n

Nobel Prize in Medicine 2009: Blackburn, Greider and Szostak
Telomerase is a protein/RNA complex -> the RNA is the template, the protein is a reverse transcriptase
1) The RNA anneals to the leading strand
2) It forms the template to make more leading strand
3) It translocates 6 bp and repeats
4) Once there is enough unpaired leading strand, the lagging strand is replicated in the usual way -> this adds back the piece that got left off

How to sequence DNA?

Dideoxy sequencing method (Fred Sanger, Nobel 1980)

The S. cerevisiae genome sequence
"Life with 6000 genes"; Goffeau et al., Science
8 years, 120 labs, 633 people
Library construction => DNA extraction => manual sequencing (Sanger method)
Contributions:
 - ~ 100 European laboratories: 55%
 - Sanger Centre, Cambridge: 17%
 - Washington University, Saint Louis: 15%
 - Stanford University: 7%
 - McGill University, Montréal: 4%
 - RIKEN Institute, Japan: 2%

Comparative genomics
French laboratories: Genoscope, Genopole, Institut Pasteur
Species:
 - Saccharomyces cerevisiae
 - Candida glabrata
 - Zygosaccharomyces rouxii
 - Lachancea kluyveri (WashU seq center, M. Johnston)
 - Lachancea thermotolerans
 - Kluyveromyces lactis
 - Debaryomyces hansenii
 - Yarrowia lipolytica
Génolevures I: exploration of 13 species
Génolevures II (2004): automatic sequencing, complete genome of 4 species
Génolevures III (2009): complete genome of 3 species

New sequencing technologies
 - Applied Biosystems ABI 3730XL
 - Roche / 454 Genome Sequencer FLX
 - Illumina / Solexa Genetic Analyzer
 - Applied Biosystems SOLiD
What changes:
 - the quantity and type of data generated
 - the cost
 - the quality of the data (errors)
[Figures: 454 / Roche Genome Sequencer FLX]

[Figure: 454 / Roche Genome Sequencer FLX: 1 fragment -> 1 bead, 1 bead -> 1 read]
[Figures: Illumina / Solexa Genetic Analyzer 1G]

[Figures: Illumina / Solexa Genetic Analyzer 1G]
[Figures: Applied Biosystems SOLiD (Sequencing by Oligo Ligation and Detection)]

Summary:
 - Manual sequencing => 1 kb / reaction
 - Automatic sequencing => 100 kb / reaction
 - New technologies => 1 Gb / reaction?

How to find the genes?
[Figure: several thousand bases of raw DNA sequence, beginning TCGCGCGTTTCGGTGATGACGGTGAAAACC...]

Strategies to find genes:
 - Predictive methods: interpretation of the DNA sequence into genes according to rules
 - Comparative methods: interpretation of the DNA sequence into genes according to similarities with other sequences
 - Experimental methods: interpretation of the DNA sequence into genes according to experimental results (genetics, mutations, mapping; cDNA libraries; expression data on microarrays; RNA-seq)

Predictive methods: frames
 - Stop codons (in the appropriate genetic code): *
 - AUG codons (translation initiator)
[Figure: the six reading frames, 1> 2> 3> on the Watson strand and <1 <2 <3 on the Crick strand]
ORF (open reading frame): a DNA sequence between two STOP codons. It is presumed to be the sequence of a protein-coding gene
CDS (coding sequence): a DNA sequence between a START and a STOP codon

Predictive methods: CAI = measurement of the bias in codon usage (Sharp and Li, 1987)
 - Degeneracy of the genetic code => synonymous codons
 - Bias due to the different translational efficiencies of codons

Codon usage table (codon, amino acid, value):
TTT phe F 2.7    TCT ser S 2.3    TAT tyr Y 1.9    TGT cys C 0.8
TTC phe F 1.8    TCC ser S 1.4    TAC tyr Y 1.4    TGC cys C 0.5
TTA leu L 2.7    TCA ser S 1.9    TAA OCH *        TGA OPA *
TTG leu L 2.7    TCG ser S 0.9    TAG AMB *        TGG trp W 1.0
CTT leu L 1.2    CCT pro P 1.3    CAT his H 1.4    CGT arg R 0.6
CTC leu L 0.5    CCC pro P 0.7    CAC his H 0.8    CGC arg R 0.3
CTA leu L 1.4    CCA pro P 1.8    CAA gln Q 2.7    CGA arg R 0.3
CTG leu L 1.1    CCG pro P 0.5    CAG gln Q 1.2    CGG arg R 0.2
ATT ile I 3.0    ACT thr T 2.0    AAT asn N 3.6    AGT ser S 1.5
ATC ile I 1.7    ACC thr T 1.2    AAC asn N 2.5    AGC ser S 1.0
ATA ile I 1.8    ACA thr T 1.8    AAA lys K 4.3    AGA arg R 2.1
ATG met M 2.1    ACG thr T 0.8    AAG lys K 3.1    AGG arg R 1.0
GTT val V 2.2    GCT ala A 2.0    GAT asp D 3.8    GGT gly G 2.3
GTC val V 1.1    GCC ala A 1.2    GAC asp D 2.0    GGC gly G 1.0
GTA val V 1.2    GCA ala A 1.6    GAA glu E 4.6    GGA gly G 1.1
GTG val V 1.1    GCG ala A 0.6    GAG glu E 2.0    GGG gly G 0.6

Reference table of relative synonymous codon usage values (RSCU) from highly expressed genes:
[Table: RSCU values per amino acid, e.g. Phe (UUU, UUC), Ile (AUU, AUC, AUA)]

$\mathrm{RSCU} = \dfrac{\mathrm{freq}_{obs}}{\mathrm{freq}_{exp}}$
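The RSCU ratio above can be computed directly from a set of coding sequences. The sketch below is a minimal illustration, not the lecture's own code. It assumes the standard genetic code, in-frame sequences, and the usual convention that the expected frequency of a codon is the mean count over its synonymous family (equal usage of synonymous codons). The function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the sequences use the standard code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(coding_sequences):
    """Return {codon: RSCU} computed over in-frame coding sequences.

    RSCU = observed count / expected count, where the expected count assumes
    that all synonymous codons of an amino acid are used equally often.
    """
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*":
            continue  # stop codons are not scored
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy input: Met, then Phe encoded three times by TTT and once by TTC.
print(rscu(["ATGTTTTTTTTTTTC"]))
```

With equal usage every codon of a family would score 1; values above 1 mark preferred codons. A reference table like the one in the slide would be obtained by running such a count over highly expressed genes only.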

CAI = measurement of the bias in codon usage

$$\mathrm{CAI} = \frac{\mathrm{CAI}_{obs}}{\mathrm{CAI}_{max}}$$

with

$$\mathrm{CAI}_{obs} = \left(\prod_{k=1}^{L} \mathrm{RSCU}_{k}\right)^{1/L} \quad \text{and} \quad \mathrm{CAI}_{max} = \left(\prod_{k=1}^{L} \mathrm{RSCU}_{k\,max}\right)^{1/L}$$

where RSCU = relative synonymous codon usage and L is the number of codons in the gene.

Mirror effects
[Figure: six-frame map (1> 2> 3> / <1 <2 <3) around YPR080w (TEF1), translation elongation factor EF-1 alpha, with a disregarded ORF]

Comparative methods:
[Figure: six-frame map; YKR035wa, FTI1: RAD52 inhibitor, one homolog (Pichia sorbitophila); YKR035c, no homolog]

A segment of the S. cerevisiae genome
[Figure: frames map showing delta, tRNA-Ala, YKL121w, YKL120w, PMT1, YKL119c, VPH2, YKL117w, YKL116c, YKL114c, APN1]
 - Entirely included ORFs were disregarded
 - Partially overlapping ORFs were considered

Proportion of genes in the genome:
 - 75% of the S. cerevisiae genome
 - 1.5% of the human genome (40% including introns)
 - 0.05% of the genome of some plants
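The CAI ratio defined on this slide can be sketched as follows. This is a hedged illustration under several assumptions: the reference RSCU values are supplied as a plain dictionary, the standard genetic code applies, the geometric means are computed in log space, and codons without a positive reference value are simply skipped (a simplification; published implementations handle them differently). The function name `cai` and the toy reference values are hypothetical.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def cai(seq, reference_rscu):
    """Return CAI_obs / CAI_max for one in-frame coding sequence.

    reference_rscu maps codons to RSCU values measured on a reference set
    (for example highly expressed genes).
    """
    seq = seq.upper()

    # Highest reference RSCU within each synonymous family (RSCU_kmax).
    max_rscu = {}
    for codon, value in reference_rscu.items():
        aa = CODON_TABLE[codon]
        max_rscu[aa] = max(max_rscu.get(aa, 0.0), value)

    log_obs = 0.0
    log_max = 0.0
    length = 0  # L: number of codons actually scored
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*" or reference_rscu.get(codon, 0.0) <= 0.0:
            continue  # simplification: skip stops and codons with no reference
        log_obs += math.log(reference_rscu[codon])
        log_max += math.log(max_rscu[aa])
        length += 1

    if length == 0:
        raise ValueError("no scorable codons in sequence")
    # Ratio of the two geometric means.
    return math.exp((log_obs - log_max) / length)


reference = {"TTT": 1.5, "TTC": 0.5}  # toy reference values, for illustration only
print(cai("TTTTTT", reference))  # only the preferred codon is used
print(cai("TTTTTC", reference))  # one preferred and one rare codon
```

A gene built only from the preferred codon of each family reaches the maximum of 1, and every rarer codon pulls the score down, which is why a high CAI is used as a hint that an ORF is a real, well expressed gene.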

S. cerevisiae genome:
 - 12 megabases
 - 16 chromosomes (I to XVI) plus the mitochondrial genome (mt)
 - protein-coding genes
 - 140 rRNA genes
 - 274 tRNA genes
 - only 4% of intron-containing genes
 - 40% of the genes belong to families
 - ORFs occupy 72% of the genome!

Small ORFs
 - Basrai et al. (1997) Genome Research, 7, «Small ORFs: Beautiful needles in the haystack»
 - ORFs < 100 codons: 299 sORFs (Kastenmayer et al., Genome Research, 2006, "Functional genomics of genes with small open reading frames (sORFs) in S. cerevisiae")

Non-coding RNAs
 - Samanta et al., PNAS (2006), "Global identification of noncoding RNAs in S. cerevisiae"
 - HRA1 is antisense to DRS2
 - DRS2: Golgi-membrane-located transport protein; phenotype: involved in maturation of 18S rRNA?
 - HRA1 is responsible for the rRNA processing phenotype

A novel type of genetic elements: CUTs (cryptic unstable transcripts)
Two forms of the exosome in S. cerevisiae (exosome = 3'->5' exonucleases):
 - Cytoplasm: Rrp4, Rrp41, Rrp44 with Ski2, Ski3, Ski8: degradation of mRNAs (5'UTR ... 3'UTR AAAAAAAAAA)
 - Nucleus: Rrp4, Rrp41, Rrp44 with Rrp6, Rrp47: maturation/degradation of rRNAs, snRNAs, snoRNAs and tRNAs; ncRNA
Wyers et al. Cell,

A novel type of genetic elements: CUTs (cryptic unstable transcripts)
Neil et al., Nature 2009
Eukaryotic promoters are intrinsically bidirectional!
[Figure: SRG1 / SER3: repression of serine biosynthesis; CAT1: activation of serine catabolism]

CONCLUSIONS:
 - Sequencing genomes is no longer really limiting
 - Protein-coding genes and a few types of RNA genes are easy to annotate
 - The informative RNA fraction of genomes is probably greatly underestimated and very difficult to annotate
 - Finding ALL the genes of an organism remains a real challenge


More information

Title : Parallel DNA Synthesis : Two PCR product from one DNA template

Title : Parallel DNA Synthesis : Two PCR product from one DNA template Title : Parallel DNA Synthesis : Two PCR product from one DNA template Bhardwaj Vikash 1 and Sharma Kulbhushan 2 1 Email: vikashbhardwaj@ gmail.com 1 Current address: Government College Sector 14 Gurgaon,

More information

Supplementary Information. Binding region and interaction properties of sulfoquinovosylacylglycerol (SQAG) with human

Supplementary Information. Binding region and interaction properties of sulfoquinovosylacylglycerol (SQAG) with human Supplementary Information Binding region and interaction properties of sulfoquinovosylacylglycerol (SQAG) with human vascular endothelial growth factor 165 revealed by biosensor based assays Yoichi Takakusagi

More information

Protein Synthesis. Page 41 Page 44 Page 47 Page 42 Page 45 Page 48 Page 43 Page 46 Page 49. Page 41. DNA RNA Protein. Vocabulary

Protein Synthesis. Page 41 Page 44 Page 47 Page 42 Page 45 Page 48 Page 43 Page 46 Page 49. Page 41. DNA RNA Protein. Vocabulary Protein Synthesis Vocabulary Transcription Translation Translocation Chromosomal mutation Deoxyribonucleic acid Frame shift mutation Gene expression Mutation Point mutation Page 41 Page 41 Page 44 Page

More information

Inverse PCR and Sequencing of P-element, piggybac and Minos Insertion Sites in the Drosophila Gene Disruption Project

Inverse PCR and Sequencing of P-element, piggybac and Minos Insertion Sites in the Drosophila Gene Disruption Project Inverse PCR and Sequencing of P-element, piggybac and Minos Insertion Sites in the Drosophila Gene Disruption Project Protocol for recovery of sequences flanking insertions in the Drosophila Gene Disruption

More information

pcmv6-neo Vector Application Guide Contents

pcmv6-neo Vector Application Guide Contents pcmv6-neo Vector Application Guide Contents Package Contents and Storage Conditions... 2 Product Description... 2 Introduction... 2 Production and Quality Assurance... 2 Methods... 3 Other required reagents...

More information

The world of non-coding RNA. Espen Enerly

The world of non-coding RNA. Espen Enerly The world of non-coding RNA Espen Enerly ncrna in general Different groups Small RNAs Outline mirnas and sirnas Speculations Common for all ncrna Per def.: never translated Not spurious transcripts Always/often

More information

Genome and DNA Sequence Databases. BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009

Genome and DNA Sequence Databases. BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009 Genome and DNA Sequence Databases BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009 Admin Reading: Chapters 1 & 2 Notes available in PDF format on-line (see class calendar page): http://www.soe.ucsc.edu/classes/bme110/spring09/bme110-calendar.html

More information

Biological One-way Functions

Biological One-way Functions Biological One-way Functions Qinghai Gao, Xiaowen Zhang 2, Michael Anshel 3 gaoj@farmingdale.edu zhangx@mail.csi.cuny.edu csmma@cs.ccny.cuny.edu Dept. Security System, Farmingdale State College / SUNY,

More information

Module 6: Digital DNA

Module 6: Digital DNA Module 6: Digital DNA Representation and processing of digital information in the form of DNA is essential to life in all organisms, no matter how large or tiny. Computing tools and computational thinking

More information

Bio 102 Practice Problems Recombinant DNA and Biotechnology

Bio 102 Practice Problems Recombinant DNA and Biotechnology Bio 102 Practice Problems Recombinant DNA and Biotechnology Multiple choice: Unless otherwise directed, circle the one best answer: 1. Which of the following DNA sequences could be the recognition site

More information

Introduction to Bioinformatics (Master ChemoInformatique)

Introduction to Bioinformatics (Master ChemoInformatique) Introduction to Bioinformatics (Master ChemoInformatique) Roland Stote Institut de Génétique et de Biologie Moléculaire et Cellulaire Biocomputing Group 03.90.244.730 rstote@igbmc.fr Biological Function

More information

Genomes and SNPs in Malaria and Sickle Cell Anemia

Genomes and SNPs in Malaria and Sickle Cell Anemia Genomes and SNPs in Malaria and Sickle Cell Anemia Introduction to Genome Browsing with Ensembl Ensembl The vast amount of information in biological databases today demands a way of organising and accessing

More information

ANALYSIS OF GROWTH HORMONE IN TENCH (TINCA TINCA) ANALÝZA RŮSTOVÉHO HORMONU LÍNA OBECNÉHO (TINCA TINCA)

ANALYSIS OF GROWTH HORMONE IN TENCH (TINCA TINCA) ANALÝZA RŮSTOVÉHO HORMONU LÍNA OBECNÉHO (TINCA TINCA) ANALYSIS OF GROWTH HORMONE IN TENCH (TINCA TINCA) ANALÝZA RŮSTOVÉHO HORMONU LÍNA OBECNÉHO (TINCA TINCA) Zrůstová J., Bílek K., Baránek V., Knoll A. Ústav morfologie, fyziologie a genetiky zvířat, Agronomická

More information

Cloning, sequencing, and expression of H.a. YNRI and H.a. YNII, encoding nitrate and nitrite reductases in the yeast Hansenula anomala

Cloning, sequencing, and expression of H.a. YNRI and H.a. YNII, encoding nitrate and nitrite reductases in the yeast Hansenula anomala Cloning, sequencing, and expression of H.a. YNRI and H.a. YNII, encoding nitrate and nitrite reductases in the yeast Hansenula anomala -'Pablo García-Lugo 1t, Celedonio González l, Germán Perdomo l, Nélida

More information

All commonly-used expression vectors used in the Jia Lab contain the following multiple cloning site: BamHI EcoRI SmaI SalI XhoI_ NotI

All commonly-used expression vectors used in the Jia Lab contain the following multiple cloning site: BamHI EcoRI SmaI SalI XhoI_ NotI 2. Primer Design 2.1 Multiple Cloning Sites All commonly-used expression vectors used in the Jia Lab contain the following multiple cloning site: BamHI EcoRI SmaI SalI XhoI NotI XXX XXX GGA TCC CCG AAT

More information

Insulin mrna to Protein Kit

Insulin mrna to Protein Kit Insulin mrna to Protein Kit A 3DMD Paper BioInformatics and Mini-Toober Folding Activity Teacher Key and Teacher Notes www. Insulin mrna to Protein Kit Contents Becoming Familiar with the Data... 3 Identifying

More information

2. The number of different kinds of nucleotides present in any DNA molecule is A) four B) six C) two D) three

2. The number of different kinds of nucleotides present in any DNA molecule is A) four B) six C) two D) three Chem 121 Chapter 22. Nucleic Acids 1. Any given nucleotide in a nucleic acid contains A) two bases and a sugar. B) one sugar, two bases and one phosphate. C) two sugars and one phosphate. D) one sugar,

More information

ANALYSIS OF A CIRCULAR CODE MODEL

ANALYSIS OF A CIRCULAR CODE MODEL ANALYSIS OF A CIRCULAR CODE MODEL Jérôme Lacan and Chrstan J. Mchel * Laboratore d Informatque de Franche-Comté UNIVERSITE DE FRANCHE-COMTE IUT de Belfort-Montbélard 4 Place Tharradn - BP 747 5 Montbélard

More information

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources 1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools

More information

The nucleotide sequence of the gene for human protein C

The nucleotide sequence of the gene for human protein C Proc. Natl. Acad. Sci. USA Vol. 82, pp. 4673-4677, July 1985 Biochemistry The nucleotide sequence of the gene for human protein C (DNA sequence analysis/vitamin K-dependent proteins/blood coagulation)

More information

An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Gene Prediction

An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Gene Prediction An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Gene Prediction Introduction Gene: A sequence of nucleotides coding for protein Gene Prediction Problem: Determine the beginning and

More information

Gene and Chromosome Mutation Worksheet (reference pgs. 239-240 in Modern Biology textbook)

Gene and Chromosome Mutation Worksheet (reference pgs. 239-240 in Modern Biology textbook) Name Date Per Look at the diagrams, then answer the questions. Gene Mutations affect a single gene by changing its base sequence, resulting in an incorrect, or nonfunctional, protein being made. (a) A

More information

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!!

DNA Replication & Protein Synthesis. This isn t a baaaaaaaddd chapter!!! DNA Replication & Protein Synthesis This isn t a baaaaaaaddd chapter!!! The Discovery of DNA s Structure Watson and Crick s discovery of DNA s structure was based on almost fifty years of research by other

More information

Mutation of the SPSl-encoded protein kinase of Saccharomyces cerevisiae leads to defects in transcription and morphology during spore formation

Mutation of the SPSl-encoded protein kinase of Saccharomyces cerevisiae leads to defects in transcription and morphology during spore formation Mutation of the SPSl-encoded protein kinase of Saccharomyces cerevisiae leads to defects in transcription and morphology during spore formation Helena Friesen/ Rayna Lunz/* Steven Doyle,^ and Jacqueline

More information

Marine Biology DEC 2004; 146(1) : 53-64 http://dx.doi.org/10.1007/s00227-004-1423-6 Copyright 2004 Springer

Marine Biology DEC 2004; 146(1) : 53-64 http://dx.doi.org/10.1007/s00227-004-1423-6 Copyright 2004 Springer Marine Biology DEC 2004; 146(1) : 53-64 http://dx.doi.org/10.1007/s00227-004-1423-6 Copyright 2004 Springer Archimer http://www.ifremer.fr/docelec/ Archive Institutionnelle de l Ifremer The original publication

More information

How To Clone Into Pcdna 3.1/V5-His

How To Clone Into Pcdna 3.1/V5-His pcdna 3.1/V5-His A, B, and C Catalog no. V810-20 Rev. date: 09 November 2010 Manual part no. 28-0141 MAN0000645 User Manual ii Contents Contents and Storage... iv Methods... 1 Cloning into pcdna 3.1/V5-His

More information

expressed histone genes have intervening sequences and encode polyadenylylated mrnas

expressed histone genes have intervening sequences and encode polyadenylylated mrnas Proc. Natl. Acad. Sci. USA Vol. 82, pp. 2834-2838, May 1985 Genetics Structure of a human histone cdna: Evidence that basally expressed histone genes have intervening sequences and encode polyadenylylated

More information

Y-chromosome haplotype distribution in Han Chinese populations and modern human origin in East Asians

Y-chromosome haplotype distribution in Han Chinese populations and modern human origin in East Asians Vol. 44 No. 3 SCIENCE IN CHINA (Series C) June 2001 Y-chromosome haplotype distribution in Han Chinese populations and modern human origin in East Asians KE Yuehai ( `º) 1, SU Bing (3 Á) 1 3, XIAO Junhua

More information

Protein Synthesis Simulation

Protein Synthesis Simulation Protein Synthesis Simulation Name(s) Date Period Benchmark: SC.912.L.16.5 as AA: Explain the basic processes of transcription and translation, and how they result in the expression of genes. (Assessed

More information

http://hdl.handle.net/10197/2727

http://hdl.handle.net/10197/2727 Provided by the author(s) and University College Dublin Library in accordance with publisher policies. Please cite the published version when available. Title Performance of DNA data embedding algorithms

More information

Supplemental Data. Short Article. PPARγ Activation Primes Human Monocytes. into Alternative M2 Macrophages. with Anti-inflammatory Properties

Supplemental Data. Short Article. PPARγ Activation Primes Human Monocytes. into Alternative M2 Macrophages. with Anti-inflammatory Properties Cell Metabolism, Volume 6 Supplemental Data Short Article PPARγ Activation Primes Human Monocytes into Alternative M2 Macrophages with Anti-inflammatory Properties M. Amine Bouhlel, Bruno Derudas, Elena

More information

1 Mutation and Genetic Change

1 Mutation and Genetic Change CHAPTER 14 1 Mutation and Genetic Change SECTION Genes in Action KEY IDEAS As you read this section, keep these questions in mind: What is the origin of genetic differences among organisms? What kinds

More information

Sample Questions for Exam 3

Sample Questions for Exam 3 Sample Questions for Exam 3 1. All of the following occur during prometaphase of mitosis in animal cells except a. the centrioles move toward opposite poles. b. the nucleolus can no longer be seen. c.

More information

NimbleGen SeqCap EZ Library SR User s Guide Version 3.0

NimbleGen SeqCap EZ Library SR User s Guide Version 3.0 NimbleGen SeqCap EZ Library SR User s Guide Version 3.0 For life science research only. Not for use in diagnostic procedures. Copyright 2011 Roche NimbleGen, Inc. All Rights Reserved. Editions Version

More information

TITRATION OF raav (VG) USING QUANTITATIVE REAL TIME PCR

TITRATION OF raav (VG) USING QUANTITATIVE REAL TIME PCR Page 1 of 5 Materials DNase digestion buffer [13 mm Tris-Cl, ph7,5 / 5 mm MgCl2 / 0,12 mm CaCl2] RSS plasmid ptr-uf11 SV40pA Forward primer (10µM) AGC AAT AGC ATC ACA AAT TTC ACA A SV40pA Reverse Primer

More information

The making of The Genoma Music

The making of The Genoma Music 242 Summary Key words Resumen Palabras clave The making of The Genoma Music Aurora Sánchez Sousa 1, Fernando Baquero 1 and Cesar Nombela 2 1 Department of Microbiology, Ramón y Cajal Hospital, and 2 Department

More information

Basic Concepts of DNA, Proteins, Genes and Genomes

Basic Concepts of DNA, Proteins, Genes and Genomes Basic Concepts of DNA, Proteins, Genes and Genomes Kun-Mao Chao 1,2,3 1 Graduate Institute of Biomedical Electronics and Bioinformatics 2 Department of Computer Science and Information Engineering 3 Graduate

More information

Announcements. Chapter 15. Proteins: Function. Proteins: Function. Proteins: Structure. Peptide Bonds. Lab Next Week. Help Session: Monday 6pm LSS 277

Announcements. Chapter 15. Proteins: Function. Proteins: Function. Proteins: Structure. Peptide Bonds. Lab Next Week. Help Session: Monday 6pm LSS 277 Lab Next Week Announcements Help Session: Monday 6pm LSS 277 Office Hours Chapter 15 and Translation Proteins: Function Proteins: Function Enzymes Transport Structural Components Regulation Communication

More information

Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION. Professor Bharat Patel Office: Science 2, 2.36 Email: b.patel@griffith.edu.

Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION. Professor Bharat Patel Office: Science 2, 2.36 Email: b.patel@griffith.edu. Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION Professor Bharat Patel Office: Science 2, 2.36 Email: b.patel@griffith.edu.au What is Gene Expression & Gene Regulation? 1. Gene Expression

More information

DNA Sequencing of the eta Gene Coding for Staphylococcal Exfoliative Toxin Serotype A

DNA Sequencing of the eta Gene Coding for Staphylococcal Exfoliative Toxin Serotype A Journal of General Microbiology (1988), 134, 71 1-71 7. Printed in Great Britain 71 1 DNA Sequencing of the eta Gene Coding for Staphylococcal Exfoliative Toxin Serotype A By SUSUMU SAKURA, HTOSH SUZUK

More information

cdna sequence and expression pattern of the putative

cdna sequence and expression pattern of the putative Proc. Natl. Acad. Sci. USA Vol. 92, pp. 2091-2095, March 1995 Biochemistry cdna sequence and expression pattern of the putative pheromone carrier aphrodisin (lipocalin/hamster/vagina/bartholin's glands)

More information

Structure and Function of DNA

Structure and Function of DNA Structure and Function of DNA DNA and RNA Structure DNA and RNA are nucleic acids. They consist of chemical units called nucleotides. The nucleotides are joined by a sugar-phosphate backbone. The four

More information

Molecular Characterization of the Llamas (Lama glama) Casein Cluster Genes Transcripts (CSN1S1, CSN2, CSN1S2, CSN3) and Regulatory Regions

Molecular Characterization of the Llamas (Lama glama) Casein Cluster Genes Transcripts (CSN1S1, CSN2, CSN1S2, CSN3) and Regulatory Regions RESEARCH ARTICLE Molecular Characterization of the Llamas (Lama glama) Casein Cluster Genes Transcripts (CSN1S1, CSN2, CSN1S2, CSN3) and Regulatory Regions Alfredo Pauciullo 1,2 *, Georg Erhardt 2 1 Department

More information

Transcription and Translation of DNA

Transcription and Translation of DNA Transcription and Translation of DNA Genotype our genetic constitution ( makeup) is determined (controlled) by the sequence of bases in its genes Phenotype determined by the proteins synthesised when genes

More information

The sequence of bases on the mrna is a code that determines the sequence of amino acids in the polypeptide being synthesized:

The sequence of bases on the mrna is a code that determines the sequence of amino acids in the polypeptide being synthesized: Module 3F Protein Synthesis So far in this unit, we have examined: How genes are transmitted from one generation to the next Where genes are located What genes are made of How genes are replicated How

More information

CCR Biology - Chapter 8 Practice Test - Summer 2012

CCR Biology - Chapter 8 Practice Test - Summer 2012 Name: Class: Date: CCR Biology - Chapter 8 Practice Test - Summer 2012 Multiple Choice Identify the choice that best completes the statement or answers the question. 1. What did Hershey and Chase know

More information

From DNA to Protein. Proteins. Chapter 13. Prokaryotes and Eukaryotes. The Path From Genes to Proteins. All proteins consist of polypeptide chains

From DNA to Protein. Proteins. Chapter 13. Prokaryotes and Eukaryotes. The Path From Genes to Proteins. All proteins consist of polypeptide chains Proteins From DNA to Protein Chapter 13 All proteins consist of polypeptide chains A linear sequence of amino acids Each chain corresponds to the nucleotide base sequence of a gene The Path From Genes

More information

Protein Synthesis How Genes Become Constituent Molecules

Protein Synthesis How Genes Become Constituent Molecules Protein Synthesis Protein Synthesis How Genes Become Constituent Molecules Mendel and The Idea of Gene What is a Chromosome? A chromosome is a molecule of DNA 50% 50% 1. True 2. False True False Protein

More information

Distribution of the DNA transposon family, Pokey in the Daphnia pulex species complex

Distribution of the DNA transposon family, Pokey in the Daphnia pulex species complex Eagle and Crease Mobile DNA (2016) 7:11 DOI 10.1186/s13100-016-0067-7 RESEARCH Open Access Distribution of the DNA transposon family, Pokey in the Daphnia pulex species complex Shannon H. C. Eagle and

More information

DISSERTATIONES MEDICINAE UNIVERSITATIS TARTUENSIS 108

DISSERTATIONES MEDICINAE UNIVERSITATIS TARTUENSIS 108 DISSERTATIONES MEDICINAE UNIVERSITATIS TARTUENSIS 108 DISSERTATIONES MEDICINAE UNIVERSITATIS TARTUENSIS 108 THE INTERLEUKIN-10 FAMILY CYTOKINES GENE POLYMORPHISMS IN PLAQUE PSORIASIS KÜLLI KINGO TARTU

More information

Molecular chaperones involved in preprotein. targeting to plant organelles

Molecular chaperones involved in preprotein. targeting to plant organelles Molecular chaperones involved in preprotein targeting to plant organelles Dissertation der Fakultät für Biologie der Ludwig-Maximilians-Universität München vorgelegt von Christine Fellerer München 29.

More information

Introduction to Bioinformatics 3. DNA editing and contig assembly

Introduction to Bioinformatics 3. DNA editing and contig assembly Introduction to Bioinformatics 3. DNA editing and contig assembly Benjamin F. Matthews United States Department of Agriculture Soybean Genomics and Improvement Laboratory Beltsville, MD 20708 matthewb@ba.ars.usda.gov

More information

and revertant strains. The present paper demonstrates that the yeast gene for subunit II can also be translated to yield a polypeptide

and revertant strains. The present paper demonstrates that the yeast gene for subunit II can also be translated to yield a polypeptide Proc. Nati. Acad. Sci. USA Vol. 76, No. 12, pp. 6534-6538, December 1979 Genetics Five TGA "stop" codons occur within the translated sequence of the yeast mitochondrial gene for cytochrome c oxidase subunit

More information

The DNA-"Wave Biocomputer"

The DNA-Wave Biocomputer The DNA-"Wave Biocomputer" Peter P. Gariaev (Pjotr Garjajev)*, Boris I. Birshtein*, Alexander M. Iarochenko*, Peter J. Marcer**, George G. Tertishny*, Katherine A. Leonova*, Uwe Kaempf ***. * Institute

More information

Lecture Series 7. From DNA to Protein. Genotype to Phenotype. Reading Assignments. A. Genes and the Synthesis of Polypeptides

Lecture Series 7. From DNA to Protein. Genotype to Phenotype. Reading Assignments. A. Genes and the Synthesis of Polypeptides Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed

More information

Insulin Receptor Gene Mutations in Iranian Patients with Type II Diabetes Mellitus

Insulin Receptor Gene Mutations in Iranian Patients with Type II Diabetes Mellitus Iranian Biomedical Journal 13 (3): 161-168 (July 2009) Insulin Receptor Gene Mutations in Iranian Patients with Type II Diabetes Mellitus Bahram Kazemi 1*, Negar Seyed 1, Elham Moslemi 2, Mojgan Bandehpour

More information

Next Generation Sequencing

Next Generation Sequencing Next Generation Sequencing Technology and applications 10/1/2015 Jeroen Van Houdt - Genomics Core - KU Leuven - UZ Leuven 1 Landmarks in DNA sequencing 1953 Discovery of DNA double helix structure 1977

More information

EU Reference Laboratory for E. coli Department of Veterinary Public Health and Food Safety Unit of Foodborne Zoonoses Istituto Superiore di Sanità

EU Reference Laboratory for E. coli Department of Veterinary Public Health and Food Safety Unit of Foodborne Zoonoses Istituto Superiore di Sanità Identification and characterization of Verocytotoxin-producing Escherichia coli (VTEC) by Real Time PCR amplification of the main virulence genes and the genes associated with the serogroups mainly associated

More information

CHALLENGES IN THE HUMAN GENOME PROJECT

CHALLENGES IN THE HUMAN GENOME PROJECT REPRINT: originally published as: Robbins, R. J., 1992. Challenges in the human genome project. IEEE Engineering in Biology and Medicine, (March 1992):25 34. CHALLENGES IN THE HUMAN GENOME PROJECT PROGRESS

More information

BD BaculoGold Baculovirus Expression System Innovative Solutions for Proteomics

BD BaculoGold Baculovirus Expression System Innovative Solutions for Proteomics BD BaculoGold Baculovirus Expression System Innovative Solutions for Proteomics Table of Contents Innovative Solutions for Proteomics...........................................................................

More information

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism )

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Biology 1406 Exam 3 Notes Structure of DNA Ch. 10 Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Proteins

More information

Biopython Tutorial and Cookbook

Biopython Tutorial and Cookbook Biopython Tutorial and Cookbook Jeff Chang, Brad Chapman, Iddo Friedberg, Thomas Hamelryck, Michiel de Hoon, Peter Cock Last Update September 2008 Contents 1 Introduction 5 1.1 What is Biopython?.........................................

More information

RT-PCR: Two-Step Protocol

RT-PCR: Two-Step Protocol RT-PCR: Two-Step Protocol We will provide both one-step and two-step protocols for RT-PCR. We recommend the twostep protocol for this class. In the one-step protocol, the components of RT and PCR are mixed

More information

C YTOPLASMIC organelles proliferate by the growth and

C YTOPLASMIC organelles proliferate by the growth and Nuclear and Mitochondrial Inheritance in Yeast Depends on Novel Cytoplasmic Structures Defined by the MDM1 Protein Stephen J. McConnell and Michael P. Yaffe University of California, San Diego, Department

More information

GENE REGULATION. Teacher Packet

GENE REGULATION. Teacher Packet AP * BIOLOGY GENE REGULATION Teacher Packet AP* is a trademark of the College Entrance Examination Board. The College Entrance Examination Board was not involved in the production of this material. Pictures

More information