Basics on bioinformatics, Lecture 2. Nunzio D'Agostino

Contact: nunzio.dagostino@entecra.it; nunzio.dagostino@gmail.com

1 Basics on bioinformatics, Lecture 2. Nunzio D'Agostino

2 Database or databank?
Initially:
o Databank (UK)
o Database (USA)
Solution: the abbreviation "db"

3 Entity-Relationship (ER) modeling
The notation uses three main constructs:
o Data entities: an entity represents a set or collection of real-world objects that share the same properties. It is a person, place, object, event or concept about which data is to be maintained.
o Attributes: a named property or characteristic of an entity.
o Relationships: an association between the instances of one or more entity types.
By connectivity, relationships can be classified as one-to-one (1:1), one-to-many (1:N) or many-to-many (N:N).

4 Cardinality: 1:1, 1:N, N:M
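
As a minimal sketch of how the three cardinalities can be expressed in a relational database, the example below uses Python's built-in sqlite3 module. The table and column names (organism, reference_genome, sequence, keyword, sequence_keyword) and all inserted values are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- 1:1  an organism has at most one reference genome row,
--      because the foreign key is also the primary key
CREATE TABLE reference_genome (
    organism_id INTEGER PRIMARY KEY REFERENCES organism(organism_id),
    assembly    TEXT NOT NULL
);
-- 1:N  one organism, many sequences
CREATE TABLE sequence (
    sequence_id INTEGER PRIMARY KEY,
    organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
    residues    TEXT NOT NULL
);
-- N:M  sequences and keywords, resolved through a junction table
CREATE TABLE keyword (
    keyword_id INTEGER PRIMARY KEY,
    label      TEXT NOT NULL UNIQUE
);
CREATE TABLE sequence_keyword (
    sequence_id INTEGER NOT NULL REFERENCES sequence(sequence_id),
    keyword_id  INTEGER NOT NULL REFERENCES keyword(keyword_id),
    PRIMARY KEY (sequence_id, keyword_id)
);
""")

# Placeholder data, not real database records.
conn.execute("INSERT INTO organism VALUES (1, 'Pseudomonas fluorescens')")
conn.execute("INSERT INTO reference_genome VALUES (1, 'hypothetical-assembly-1')")
conn.executemany("INSERT INTO sequence VALUES (?, 1, ?)",
                 [(10, "ATGAATAAAG"), (11, "CCCAAACGCT")])
conn.executemany("INSERT INTO keyword VALUES (?, ?)",
                 [(100, "sigma factor"), (101, "complete cds")])
conn.executemany("INSERT INTO sequence_keyword VALUES (?, ?)",
                 [(10, 100), (10, 101), (11, 101)])

# 1:N in action: how many sequences belong to organism 1?
sequences_per_organism = conn.execute(
    "SELECT COUNT(*) FROM sequence WHERE organism_id = 1").fetchone()[0]

# N:M in action: which keywords are attached to sequence 10?
keywords_of_seq10 = [row[0] for row in conn.execute(
    "SELECT k.label FROM keyword k "
    "JOIN sequence_keyword sk ON sk.keyword_id = k.keyword_id "
    "WHERE sk.sequence_id = 10 ORDER BY k.label")]
```

The junction table is the usual way to store an N:M relationship, since a single foreign-key column can only point at one parent row.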

5 ER example

6 database: basic structure
Example table (fields partly missing):
Columns: Gi | Accession | Length | Cultivar | Dev. stage | Tissue | Sequence
CD | Turning stage of fruit ripening | Pericarp | GTACTCCTAAAC
BI | TA | days old | callus | CCACAACCACA
AJ | West Virginia | days post anthesis | fruit | CAAATTTA..
Databases are composed of tables of data. Tables hold logically related sets of data. A table is essentially the same thing as a spreadsheet: a set of rows and columns.

7 database: basic structure
Each table has several records (entries): a record stores all the information for a given individual. Records are the rows of a data table.

8 database: basic structure
Each record has several fields: a field is an individual piece of data, a single attribute of the record. Fields are the columns of a data table.

9 database: basic structure
Each record (row) has a unique identifier, the primary key. The primary key serves to identify the data stored in this record across all the tables in the database.
Databases are manipulated with a language called SQL (Structured Query Language). It is a "baby English" type of language: it uses real words, but it is rigid in terms of their order and placement.
Various database software: Oracle, MS Access, MySQL, etc.
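
The ideas of table, record, field, primary key and SQL query can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the lecture's own database: the table name est, the GI numbers, the accessions (XX000001 and so on) and the cultivar are made-up placeholders, and only the column names, tissues and short sequence fragments echo the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read fields by column name

# One table; each column is a field, and gi is the primary key.
conn.execute("""
    CREATE TABLE est (
        gi        INTEGER PRIMARY KEY,
        accession TEXT NOT NULL UNIQUE,
        length    INTEGER NOT NULL,
        cultivar  TEXT,
        dev_stage TEXT,
        tissue    TEXT,
        sequence  TEXT NOT NULL
    )
""")

# Each tuple is one record (row). GI numbers, accessions and the cultivar
# are hypothetical placeholders.
records = [
    (1, "XX000001", 12, None, "Turning stage of fruit ripening", "Pericarp", "GTACTCCTAAAC"),
    (2, "XX000002", 11, "cultivar A", None, "callus", "CCACAACCACA"),
    (3, "XX000003", 8, None, None, "fruit", "CAAATTTA"),
]
conn.executemany("INSERT INTO est VALUES (?, ?, ?, ?, ?, ?, ?)", records)

# SQL uses real words in a fixed order: SELECT ... FROM ... WHERE ...
by_key = conn.execute(
    "SELECT accession, tissue FROM est WHERE gi = ?", (2,)).fetchone()

fruit_accessions = [row["accession"] for row in conn.execute(
    "SELECT accession FROM est WHERE tissue = 'fruit' ORDER BY accession")]
```

Because gi is declared as the primary key, the database refuses a second record with the same gi, which is what makes the key a reliable identifier.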

10 Why biological databases?
o Make biological data available to scientists
  - Consolidate data (gather data from different sources)
  - Provide access to large datasets that cannot be published explicitly (genomes, ...)
o Make biological data available in computer-readable format
  - Make data accessible for automated analysis

11 Biological db
o They vary in size, quality, coverage and level of interest
o Many of the major ones are covered in the annual Database Issue of Nucleic Acids Research

12 Biological db

13 Biological db

14 What makes a good db?
o Comprehensiveness
o Accuracy
o Up-to-date content
o Good interface
o Batch search/download
o API (web services, DAS, etc.)

15 Must-have items when using a db
o Remember the server, the database, and the program version used
o Write down sequence identification numbers
o Databases are not like good wine (use up-to-date builds)
o Use local installs when it becomes necessary

16 Primary and derived data
o Primary databases: databases consisting of experimentally derived data, such as nucleotide sequences and three-dimensional structures.
o Secondary databases: databases consisting of data derived from the analysis or treatment of primary data.

17 Nucleotide sequence databases
GenBank
The three databases (GenBank, EMBL and DDBJ) are synchronized on a daily basis, and their accession numbers are consistent. There are no legal restrictions on the use of these databases. However, some sequences in the databases are patented.

18 GenBank sample record
LOCUS       AF   bp    DNA     linear   BCT 19-AUG-1999
DEFINITION  Pseudomonas fluorescens ECF sigma factor SigX (sigX) gene, complete cds.
ACCESSION   AF
VERSION     AF  GI:
KEYWORDS    .
SOURCE      Pseudomonas fluorescens.
  ORGANISM  Pseudomonas fluorescens
            Bacteria; Proteobacteria; gamma subdivision; Pseudomonadaceae; Pseudomonas.
REFERENCE   1  (bases 1 to 591)
  AUTHORS   Brinkman,F.S., Schoofs,G., Hancock,R.E. and De Mot,R.
  TITLE     Influence of a putative ECF sigma factor on expression of the major outer membrane protein, OprF, in Pseudomonas aeruginosa and Pseudomonas fluorescens
  JOURNAL   J. Bacteriol. 181 (16), (1999)
  MEDLINE
  PUBMED
REFERENCE   2  (bases 1 to 591)
  AUTHORS   De Mot,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (04-DEC-1998) F.A. Janssens Laboratory of Genetics, Applied Plant Sciences, K. Mercierlaan 92, Heverlee B-3001, Belgium
FEATURES             Location/Qualifiers
     source
                     /organism="Pseudomonas fluorescens"
                     /strain="M114"
                     /db_xref="taxon:294"
     gene
                     /gene="sigX"
     CDS
                     /gene="sigX"
                     /codon_start=1
                     /transl_table=11
                     /product="ECF sigma factor SigX"
                     /protein_id="AAD "
                     /db_xref="gi: "
                     /translation="MNKAQTLSTRYDPRELSDEELVARSHTELFHVTRAYEELMRRYQ
                     RTLFNVCARYLGNDRDADDVCQEVMLKVLYGLKNLEGKSKFKTWLYSITYNECITQYR
                     KERRKRRLMDALSLDPLEEASEEKALQPEEKGGLDRWLVYVNPIDRGILVLRFVAELE
                     FQEIADIMHMGLSATKMRYKRALDKLREKFAGETET"
BASE COUNT      157 a    133 c    170 g    131 t
ORIGIN
        1 atgaataaag cccaaacgct atccacgcgc tacgaccccc gcgagctctc tgatgaggag
       61 ttggtcgcgc gctcgcatac cgagcttttt cacgtaacgc gcgcctatga agaactgatg
      121 cggcgttacc agcgaacatt atttaacgtt tgtgcgagat atcttgggaa cgatcgcgac
      181 gcagacgatg tctgtcagga agtcatgttg aaggtgctgt atggcctgaa gaacctcgag
      241 gggaaatcga agttcaaaac gtggctctac agcatcacgt acaacgaatg tattacgcag
      301 tatcggaagg aacggcgaaa gcgtcgcttg atggacgcat tgagtcttga ccccctcgag
      361 gaagcgtccg aagaaaaggc gcttcaaccc gaggagaagg gcgggcttga tcgctggctg
      421 gtgtatgtga acccgattga ccgtggaatt ctggtgcttc gatttgtcgc agagctggaa
      481 tttcaggaga tcgcagacat catgcacatg ggtttgagtg cgacaaaaat gcgttacaaa
      541 cgtgctctag ataaattgcg tgagaaattt gcaggcgaga ctgaaactta g
Parts of the record labelled on the slide: header, title, taxonomy, citation, features, sequence.
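
To make the layout of the sequence section concrete, here is a small sketch that pulls the bases out of the ORIGIN block of a GenBank flat file and recomputes the base composition. The function names are hypothetical, and the embedded record is a shortened, made-up stand-in (its LOCUS line is a placeholder); only the first twenty bases are borrowed from the sample record.

```python
from collections import Counter


def read_origin_sequence(flatfile_text):
    """Return the sequence between the ORIGIN line and the '//' terminator."""
    bases = []
    in_origin = False
    for line in flatfile_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            break
        if in_origin:
            # A sequence line is a position number followed by blocks of bases.
            parts = line.split()
            bases.extend(parts[1:])
    return "".join(bases)


def base_count(sequence):
    """Count each base, ignoring case."""
    return Counter(sequence.lower())


# Shortened, hypothetical record used only to exercise the functions.
example = """\
LOCUS       HYPOTHETICAL              20 bp    DNA     linear   BCT
ORIGIN
        1 atgaataaag cccaaacgct
//
"""

sequence = read_origin_sequence(example)
counts = base_count(sequence)
```

Applied to a complete record, the counts should agree with the record's own BASE COUNT line, which gives a quick sanity check on a downloaded file.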

19 Protein sequence database
The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.
o The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. It consists of:
  - Swiss-Prot, which is manually annotated and reviewed
  - TrEMBL, which is automatically annotated and not reviewed
o The UniProt Reference Clusters (UniRef) are used to speed up sequence similarity searches.

20 UniProt entry

21 Protein Data Bank
The PDB archive contains information about experimentally determined structures of proteins, nucleic acids, and complex assemblies (X-ray, NMR, computationally predicted).
Mission: maintain a single archive of macromolecular structural data that is freely and openly available to the global community.
Figure: Number of Structures Available

22 PDB entry

23 Protein structure levels

24 The Gene Ontology (GO)
o GO goals
o The GO website

25 The Gene Ontology (GO)
GO is divided into 3 domains (levels of annotation):
o Molecular function: basic activities of a gene product at the molecular level
o Biological process: a set of molecular events with a defined beginning and end
o Cellular component: the parts of a cell or its extracellular environment

26 GO structure
The structure of GO can be described as a directed acyclic graph (DAG), where each GO term is a node and the relationships between the terms are arcs between the nodes.
Example from the figure: "nuclear chromosome" is_a "chromosome" and is part_of "nucleus"; "mitochondrial chromosome" is_a "chromosome" and is part_of "mitochondrion".
GO currently has 2 relationship types:
o is_a: an is_a child of a parent is a complete type of its parent, but can be discriminated in some way from the other children of the parent.
o part_of: a part_of child of a parent is always a constituent of the parent and, in combination with the parent's other constituents, makes up the parent.
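
A DAG of GO terms can be sketched as a list of (child, relationship, parent) edges plus a traversal that collects all ancestors of a term. The edges below assume the usual reading of the slide's example (nuclear and mitochondrial chromosomes, each is_a chromosome and part_of its organelle); the function names are hypothetical, and real GO data would use GO identifiers rather than bare term names.

```python
from collections import defaultdict

# (child, relationship, parent) triples; an assumed reading of the slide figure.
EDGES = [
    ("nuclear chromosome", "is_a", "chromosome"),
    ("nuclear chromosome", "part_of", "nucleus"),
    ("mitochondrial chromosome", "is_a", "chromosome"),
    ("mitochondrial chromosome", "part_of", "mitochondrion"),
]


def build_parent_index(edges):
    """Map each child term to its list of (relationship, parent) pairs."""
    parents = defaultdict(list)
    for child, relation, parent in edges:
        parents[child].append((relation, parent))
    return parents


def ancestors(term, parents, relations=("is_a", "part_of")):
    """Collect every term reachable by following the chosen relationships upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in parents.get(current, []):
            if relation in relations and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


index = build_parent_index(EDGES)
all_ancestors = ancestors("nuclear chromosome", index)
is_a_ancestors = ancestors("nuclear chromosome", index, relations=("is_a",))
```

A term can have more than one parent, which is why GO is a graph rather than a tree; restricting the traversal to is_a gives the pure "type of" hierarchy.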

27 Searching for papers

28 Querying GenBank
From the Entrez main page, search for the gene whose accession number is BC
o How many results do we get in the Gene db?
o What is the official name of the gene? Are there other possible names?
o On which DNA strand is it located?
o How many splice variants does it have?
o Which disease is the gene associated with?
o Is it involved in the apoptosis process?
o How long is the coding sequence of the first splice variant?

29 Querying GenBank NG_

30 Querying GenBank
What kind of molecule is it? Genomic DNA

31 Querying GenBank
Where is the promoter of the HBB gene located? Upstream of the nucleotide

32 Querying GenBank
o Indicate the number of exons
o Indicate the length of the second exon
o Indicate the number of introns
o Indicate the length of the first intron
Answers shown on the slide: 223 nts; 132 nts

33 Querying GenBank
o Indicate the location of the 5' UTR
o Indicate the length of the 5' UTR
o Indicate the location of the 3' UTR
o Indicate the length of the 3' UTR
Answers shown on the slide: 50 nts; 132 nts
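
The exon, intron and UTR questions all reduce to arithmetic on feature coordinates. GenBank coordinates are 1-based and inclusive, so a feature from start to end has length end - start + 1, and the intron between two exons runs from one past the end of the first exon to one before the start of the next. The sketch below uses a made-up plus-strand gene; the coordinates are hypothetical, not those of HBB, and it assumes the UTRs lie entirely within the first and last exon.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def introns_between(exons):
    """Return the (start, end) of each intron lying between consecutive exons."""
    ordered = sorted(exons)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


# Hypothetical plus-strand gene: three exons and a coding region (CDS).
exons = [(101, 200), (301, 450), (601, 700)]
cds_start, cds_end = 131, 650

exon_lengths = [span_length(s, e) for s, e in exons]
introns = introns_between(exons)
intron_lengths = [span_length(s, e) for s, e in introns]

# Assumption: the 5' UTR ends just before the CDS inside exon 1,
# and the 3' UTR starts just after the CDS inside the last exon.
five_prime_utr = (exons[0][0], cds_start - 1)
three_prime_utr = (cds_end + 1, exons[-1][1])
utr_lengths = (span_length(*five_prime_utr), span_length(*three_prime_utr))
```

Subtracting the end of one exon from the start of the next and then adding 1 overcounts an intron by two, which is an easy slip when working from a feature table by hand.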

34 Querying GenBank Indicate the nucleotide positions of the start codon = 70595,70596,

35 Querying GenBank
Download the sequence of the HBB gene in FASTA format

36 Querying GenBank

37 Querying GenBank

38 Querying GenBank
>gi : Homo sapiens beta globin region and hemoglobin, beta (HBB); and hemoglobin, delta (HBD); and hemoglobin, epsilon 1 (HBE1); and hemoglobin, gamma A (HBG1); and hemoglobin, gamma G (HBG2), RefSeqGene on chromosome 11
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
AGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAG
ACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGG
TGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG
CAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGAC
AACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACT
TCAGGGTGAGTCTATGGGACGCTTGATGTTTTCTTTCCCCTTCTTTTCTATGGTTAAGTTCATGTCATAG
GAAGGGGATAAGTAACAGGGTACAGTTTAGAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCT
CAGGATCGTTTTAGTTTCTTTTATTTGCTGTTCATAACAATTGTTTTCTTTTGTTTAATTCTTGCTTTCT
TTTTTTTTCTTCTCCGCAATTTTTACTATTATACTTAATGCCTTAACATTGTGTATAACAAAAGGAAATA
TCTCTGAGATACATTAAGTAACTTAAAAAAAAACTTTACACAGTCTGCCTAGTACATTACTATTTGGAAT
ATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTATTTTCTTTTATTTTTAATTGATACATAAT
CATTATACATATTTATGGGTTAAAGTGTAATGTTTTAATATGTGTACACATATTGACCAAATCAGGGTAA
TTTTGCATTTGTAATTTTAAAAAATGCTTTCTTCTTTTAATATACTTTTTTGTTTATCTTATTTCTAATA
CTTTCCCTAATCTCTTTCTTTCAGGGCAATAATGATACAATGTATCATGCCTCTTTGCACCATTCTAAAG
AATAACAGTGATAATTTCTGGGTTAAGGCAATAGCAATATCTCTGCATATAAATATTTCTGCATATAAAT
TGTAACTGATGTAAGAGGTTTCATATTGCTAATAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTTT
ATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTT
ATCTTCCTCCCACAGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCA
CCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCA
CTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACT
GGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
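
A FASTA file is a header line starting with ">" followed by one or more lines of sequence, so reading one takes only a few lines of code. The parser below is a minimal sketch with a hypothetical function name; the header text in the example is made up, and the forty bases are the start of the sequence shown on the slide.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical header; the bases are the first 40 nucleotides of the slide's sequence.
example = """\
>example HBB fragment (hypothetical header)
ACATTTGCTTCTGACACAAC
TGTGTTCACTAGCAACCTCA
"""

records = parse_fasta(example)
header, sequence = records[0]
```

Joining the sequence lines matters because the line width (70 characters on the slide) is only a display convention and carries no biological meaning.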

39 Querying GenBank

40 Querying GenBank: link to geneid

50 Querying PubMed
o How many articles did Nunzio D'Agostino publish?
  D'Agostino, Nunzio[Full Author Name] OR D Agostino, Nunzio[Full Author Name]
o How many of these are related to EST?
  D'Agostino, Nunzio[Full Author Name] AND EST[Title/Abstract]
o How many of these are in the journal BMC Genomics?
  D'Agostino, Nunzio[Full Author Name] OR D Agostino, Nunzio[Full Author Name] AND BMC Genomics[journal]
o How many articles in PubMed include the word RNA-Seq in the title?
  RNA-Seq[title]
o How many reviews containing the word "transcriptome" were published in 2008?
  transcriptome[title] AND review[Publication Type] AND 2008[publication date]
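
The same searches can be assembled programmatically. The sketch below only builds query strings and a request URL for the NCBI E-utilities ESearch endpoint; it does not contact the server. The helper names (tagged, any_of, all_of, esearch_url) are hypothetical. As far as I know PubMed evaluates Boolean operators from left to right, so the BMC Genomics query above already groups the two name variants first, but the sketch adds parentheses to make that grouping explicit.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tagged(term, field):
    """Attach a PubMed field tag, e.g. RNA-Seq[title]."""
    return f"{term}[{field}]"


def any_of(*terms):
    """Join terms with OR, wrapped in parentheses so the group stays together."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Join terms with AND."""
    return " AND ".join(terms)


def esearch_url(query, db="pubmed"):
    """Build (but do not send) an ESearch request URL for the query."""
    return ESEARCH + "?" + urlencode({"db": db, "term": query})


# The author is searched under both spellings used on the slides.
author = any_of(
    tagged("D'Agostino, Nunzio", "Full Author Name"),
    tagged("D Agostino, Nunzio", "Full Author Name"),
)
journal_query = all_of(author, tagged("BMC Genomics", "journal"))
review_query = all_of(
    tagged("transcriptome", "title"),
    tagged("review", "Publication Type"),
    tagged("2008", "publication date"),
)
journal_url = esearch_url(journal_query)
```

Building queries from small helpers keeps the field tags consistent and makes it harder to drop one of the name variants when a search is narrowed.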

GenBank, Entrez, & FASTA

GenBank, Entrez, & FASTA GenBank, Entrez, & FASTA Nucleotide Sequence Databases First generation GenBank is a representative example started as sort of a museum to preserve knowledge of a sequence from first discovery great repositories,

More information

Lecture Outline. Introduction to Databases. Introduction. Data Formats Sample databases How to text search databases. Shifra Ben-Dor Irit Orr

Lecture Outline. Introduction to Databases. Introduction. Data Formats Sample databases How to text search databases. Shifra Ben-Dor Irit Orr Introduction to Databases Shifra Ben-Dor Irit Orr Lecture Outline Introduction Data and Database types Database components Data Formats Sample databases How to text search databases What units of information

More information

Biological Sequence Data Formats

Biological Sequence Data Formats Biological Sequence Data Formats Here we present three standard formats in which biological sequence data (DNA, RNA and protein) can be stored and presented. Raw Sequence: Data without description. FASTA

More information

Genome and DNA Sequence Databases. BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009

Genome and DNA Sequence Databases. BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009 Genome and DNA Sequence Databases BME 110/BIOL 181 CompBio Tools Todd Lowe March 31, 2009 Admin Reading: Chapters 1 & 2 Notes available in PDF format on-line (see class calendar page): http://www.soe.ucsc.edu/classes/bme110/spring09/bme110-calendar.html

More information

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources 1 of 8 11/7/2004 11:00 AM National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools

More information

Bioinformatics Grid - Enabled Tools For Biologists.

Bioinformatics Grid - Enabled Tools For Biologists. Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis

More information

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the

More information

Biological Databases and Protein Sequence Analysis

Biological Databases and Protein Sequence Analysis Biological Databases and Protein Sequence Analysis Introduction M. Madan Babu, Center for Biotechnology, Anna University, Chennai 25, India Bioinformatics is the application of Information technology to

More information

SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE

SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE AP Biology Date SICKLE CELL ANEMIA & THE HEMOGLOBIN GENE TEACHER S GUIDE LEARNING OBJECTIVES Students will gain an appreciation of the physical effects of sickle cell anemia, its prevalence in the population,

More information

When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want

When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want 1 When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want to search other databases as well. There are very

More information

org.rn.eg.db December 16, 2015 org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank accession numbers.

org.rn.eg.db December 16, 2015 org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank accession numbers. org.rn.eg.db December 16, 2015 org.rn.egaccnum Map Entrez Gene identifiers to GenBank Accession Numbers org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank

More information

Bioinformatics Resources at a Glance

Bioinformatics Resources at a Glance Bioinformatics Resources at a Glance A Note about FASTA Format There are MANY free bioinformatics tools available online. Bioinformaticists have developed a standard format for nucleotide and protein sequences

More information

Module 1. Sequence Formats and Retrieval. Charles Steward

Module 1. Sequence Formats and Retrieval. Charles Steward The Open Door Workshop Module 1 Sequence Formats and Retrieval Charles Steward 1 Aims Acquaint you with different file formats and associated annotations. Introduce different nucleotide and protein databases.

More information

Sequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011

Sequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011 Sequence Formats and Sequence Database Searches Gloria Rendon SC11 Education June, 2011 Sequence A is the primary structure of a biological molecule. It is a chain of residues that form a precise linear

More information

Genomes and SNPs in Malaria and Sickle Cell Anemia

Genomes and SNPs in Malaria and Sickle Cell Anemia Genomes and SNPs in Malaria and Sickle Cell Anemia Introduction to Genome Browsing with Ensembl Ensembl The vast amount of information in biological databases today demands a way of organising and accessing

More information

A Tutorial in Genetic Sequence Classification Tools and Techniques

A Tutorial in Genetic Sequence Classification Tools and Techniques A Tutorial in Genetic Sequence Classification Tools and Techniques Jake Drew Data Mining CSE 8331 Southern Methodist University jakemdrew@gmail.com www.jakemdrew.com Sequence Characters IUPAC nucleotide

More information

Searching Nucleotide Databases

Searching Nucleotide Databases Searching Nucleotide Databases 1 When we search a nucleic acid databases, Mascot always performs a 6 frame translation on the fly. That is, 3 reading frames from the forward strand and 3 reading frames

More information

BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS

BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS BIO 3350: ELEMENTS OF BIOINFORMATICS PARTIALLY ONLINE SYLLABUS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title:

More information

ACAAGGGACTAGAGAAACCAAAA AGAAACCAAAACGAAAGGTGCAGAA AACGAAAGGTGCAGAAGGGGAAACAGATGCAGA CHAPTER 3

ACAAGGGACTAGAGAAACCAAAA AGAAACCAAAACGAAAGGTGCAGAA AACGAAAGGTGCAGAAGGGGAAACAGATGCAGA CHAPTER 3 ACAAGGGACTAGAGAAACCAAAA AGAAACCAAAACGAAAGGTGCAGAA AACGAAAGGTGCAGAAGGGGAAACAGATGCAGA CHAPTER 3 GAAGGGGAAACAGATGCAGAAAGCATC AGAAAGCATC ACAAGGGACTAGAGAAACCAAAACGAAAGGTGCAGAAGGGGAAACAGATGCAGAAAGCATC Introduction

More information

DNA Sequence formats

DNA Sequence formats DNA Sequence formats [Plain] [EMBL] [FASTA] [GCG] [GenBank] [IG] [IUPAC] [How Genomatix represents sequence annotation] Plain sequence format A sequence in plain format may contain only IUPAC characters

More information

GC3 Use cases for the Cloud

GC3 Use cases for the Cloud GC3: Grid Computing Competence Center GC3 Use cases for the Cloud Some real world examples suited for cloud systems Antonio Messina Trieste, 24.10.2013 Who am I System Architect

More information

DNA and the Cell. Version 2.3. English version. ELLS European Learning Laboratory for the Life Sciences

DNA and the Cell. Version 2.3. English version. ELLS European Learning Laboratory for the Life Sciences DNA and the Cell Anastasios Koutsos Alexandra Manaia Julia Willingale-Theune Version 2.3 English version ELLS European Learning Laboratory for the Life Sciences Anastasios Koutsos, Alexandra Manaia and

More information

The Steps. 1. Transcription. 2. Transferal. 3. Translation

The Steps. 1. Transcription. 2. Transferal. 3. Translation Protein Synthesis Protein synthesis is simply the "making of proteins." Although the term itself is easy to understand, the multiple steps that a cell in a plant or animal must go through are not. In order

More information

ID of alternative translational initiation events. Description of gene function Reference of NCBI database access and relative literatures

ID of alternative translational initiation events. Description of gene function Reference of NCBI database access and relative literatures Data resource: In this database, 650 alternatively translated variants assigned to a total of 300 genes are contained. These database records of alternative translational initiation have been collected

More information

FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem

FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem Elsa Bernard Laurent Jacob Julien Mairal Jean-Philippe Vert September 24, 2013 Abstract FlipFlop implements a fast method for de novo transcript

More information

GenBank: A Database of Genetic Sequence Data

GenBank: A Database of Genetic Sequence Data GenBank: A Database of Genetic Sequence Data Computer Science 105 Boston University David G. Sullivan, Ph.D. An Explosion of Scientific Data Scientists are generating ever increasing amounts of data. Relevant

More information

Transcription and Translation of DNA

Transcription and Translation of DNA Transcription and Translation of DNA Genotype our genetic constitution ( makeup) is determined (controlled) by the sequence of bases in its genes Phenotype determined by the proteins synthesised when genes

More information

Introduction to Bioinformatics 2. DNA Sequence Retrieval and comparison

Introduction to Bioinformatics 2. DNA Sequence Retrieval and comparison Introduction to Bioinformatics 2. DNA Sequence Retrieval and comparison Benjamin F. Matthews United States Department of Agriculture Soybean Genomics and Improvement Laboratory Beltsville, MD 20708 matthewb@ba.ars.usda.gov

More information

Protein Synthesis How Genes Become Constituent Molecules

Protein Synthesis How Genes Become Constituent Molecules Protein Synthesis Protein Synthesis How Genes Become Constituent Molecules Mendel and The Idea of Gene What is a Chromosome? A chromosome is a molecule of DNA 50% 50% 1. True 2. False True False Protein

More information

RJE Database Accessory Programs

RJE Database Accessory Programs RJE Database Accessory Programs Richard J. Edwards (2006) 1: Introduction...2 1.1: Version...2 1.2: Using this Manual...2 1.3: Getting Help...2 1.4: Availability and Local Installation...2 2: RJE_DBASE...3

More information

EMBL Identity & Access Management

EMBL Identity & Access Management EMBL Identity & Access Management Rupert Lück EMBL Heidelberg e IRG Workshop Zürich Apr 24th 2008 Outline EMBL Overview Identity & Access Management for EMBL IT Requirements & Strategy Project Goal and

More information

Name Date Period. 2. When a molecule of double-stranded DNA undergoes replication, it results in

Name Date Period. 2. When a molecule of double-stranded DNA undergoes replication, it results in DNA, RNA, Protein Synthesis Keystone 1. During the process shown above, the two strands of one DNA molecule are unwound. Then, DNA polymerases add complementary nucleotides to each strand which results

More information

The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28.

The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28. Tutorial Module 5 BioMart You will learn about BioMart, a joint project developed and maintained at EBI and OiCR www.biomart.org How to use BioMart to quickly obtain lists of gene information from Ensembl

More information

Gene Models & Bed format: What they represent.

Gene Models & Bed format: What they represent. GeneModels&Bedformat:Whattheyrepresent. Gene models are hypotheses about the structure of transcripts produced by a gene. Like all models, they may be correct, partly correct, or entirely wrong. Typically,

More information

Lecture Series 7. From DNA to Protein. Genotype to Phenotype. Reading Assignments. A. Genes and the Synthesis of Polypeptides

Lecture Series 7. From DNA to Protein. Genotype to Phenotype. Reading Assignments. A. Genes and the Synthesis of Polypeptides Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed

More information

Introduction to Genome Annotation

Introduction to Genome Annotation Introduction to Genome Annotation AGCGTGGTAGCGCGAGTTTGCGAGCTAGCTAGGCTCCGGATGCGA CCAGCTTTGATAGATGAATATAGTGTGCGCGACTAGCTGTGTGTT GAATATATAGTGTGTCTCTCGATATGTAGTCTGGATCTAGTGTTG GTGTAGATGGAGATCGCGTAGCGTGGTAGCGCGAGTTTGCGAGCT

More information

FINDING RELATION BETWEEN AGING AND

FINDING RELATION BETWEEN AGING AND FINDING RELATION BETWEEN AGING AND TELOMERE BY APRIORI AND DECISION TREE Jieun Sung 1, Youngshin Joo, and Taeseon Yoon 1 Department of National Science, Hankuk Academy of Foreign Studies, Yong-In, Republic

More information

Processing Genome Data using Scalable Database Technology. My Background

Processing Genome Data using Scalable Database Technology. My Background Johann Christoph Freytag, Ph.D. freytag@dbis.informatik.hu-berlin.de http://www.dbis.informatik.hu-berlin.de Stanford University, February 2004 PhD @ Harvard Univ. Visiting Scientist, Microsoft Res. (2002)

More information

THE GENBANK SEQUENCE DATABASE

THE GENBANK SEQUENCE DATABASE Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Second Edition Andreas D. Baxevanis, B.F. Francis Ouellette Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-38390-2 (Hardback);

More information

Web-Based Genomic Information Integration with Gene Ontology

Web-Based Genomic Information Integration with Gene Ontology Web-Based Genomic Information Integration with Gene Ontology Kai Xu 1 IMAGEN group, National ICT Australia, Sydney, Australia, kai.xu@nicta.com.au Abstract. Despite the dramatic growth of online genomic

More information

Final Project Report

Final Project Report CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

More information

A Primer of Genome Science THIRD

A Primer of Genome Science THIRD A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:

More information

Unit I: Introduction To Scientific Processes

Unit I: Introduction To Scientific Processes Unit I: Introduction To Scientific Processes This unit is an introduction to the scientific process. This unit consists of a laboratory exercise where students go through the QPOE2 process step by step

More information

Integration of data management and analysis for genome research

Integration of data management and analysis for genome research Integration of data management and analysis for genome research Volker Brendel Deparment of Zoology & Genetics and Department of Statistics Iowa State University 2112 Molecular Biology Building Ames, Iowa

More information

A Multiple DNA Sequence Translation Tool Incorporating Web Robot and Intelligent Recommendation Techniques

A Multiple DNA Sequence Translation Tool Incorporating Web Robot and Intelligent Recommendation Techniques Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 402 A Multiple DNA Sequence Translation Tool Incorporating Web

More information

Check Your Data Freedom: A Taxonomy to Assess Life Science Database Openness

Check Your Data Freedom: A Taxonomy to Assess Life Science Database Openness Check Your Data Freedom: A Taxonomy to Assess Life Science Database Openness Melanie Dulong de Rosnay Fellow, Science Commons and Berkman Center for Internet & Society at Harvard University This article

More information

M110.726 The Nucleus M110.727 The Cytoskeleton M340.703 Cell Structure and Dynamics

M110.726 The Nucleus M110.727 The Cytoskeleton M340.703 Cell Structure and Dynamics of Biochemistry and Molecular Biology 1. Master the knowledge base of current biochemistry, molecular biology, and cellular physiology Describe current knowledge in metabolic transformations conducted

More information

Algorithms in Computational Biology (236522) spring 2007 Lecture #1

Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:00-12:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office

Lecture Data Warehouse Systems

Eva Zangerle SS 2013 PART A: Architecture Chapter 1: Motivation and Definitions Motivation Goal: to build an operational general view on a company to support decisions in

Genomics Services

THE NEW YORK GENOME CENTER NYGC is an independent non-profit implementing advanced genomic research to improve diagnosis and treatment of serious diseases. capabilities. NEXT-GE

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of

1 Mutation and Genetic Change

CHAPTER 14 SECTION Genes in Action KEY IDEAS As you read this section, keep these questions in mind: What is the origin of genetic differences among organisms? What kinds

Teaching Bioinformatics to Undergraduates

http://www.med.nyu.edu/rcr/asm Stuart M. Brown Research Computing, NYU School of Medicine I. What is Bioinformatics? II. Challenges of teaching bioinformatics

The Ensembl Core databases and API

Useful links Installation instructions: http://www.ensembl.org/info/docs/api/api_installation.html Schema description: http://www.ensembl.org/info/docs/api/core/core_schema.html

CD-HIT User's Guide. Last updated: April 5, 2010. http://cd-hit.org http://bioinformatics.org/cd-hit/

Program developed by Weizhong Li's lab at UCSD http://weizhong-lab.ucsd.edu liwz@sdsc.edu 1. Introduction

Developing a Database for GenBank Information

By Nathan Mann B.S., University of Louisville, 2003 A Thesis Submitted to the Faculty of the University of Louisville Speed Scientific School As Partial Fulfillment

Tutorial. Reference Genome Tracks. Sample to Insight. November 27, 2015

CLC bio, a QIAGEN Company Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.clcbio.com support-clcbio@qiagen.com Reference

Exercise with Gene Ontology - Cytoscape - BiNGO

This practical has material extracted from http://www.cbs.dtu.dk/chipcourse/exercises/ex_go/goexercise11.php In this exercise we will analyze microarray

RNA & Protein Synthesis

Genes send messages to cellular machinery RNA Plays a major role in process Process has three phases (Genetic) Transcription (Genetic) Translation Protein Synthesis RNA Synthesis

Linear Sequence Analysis. 3-D Structure Analysis

What can you learn from a (single) protein sequence? Calculate its physical properties: molecular weight (MW), isoelectric point (pI), amino acid content, hydropathy (hydrophilic

Guide for Bioinformatics Project Module 3

Structure-Based Evidence and Multiple Sequence Alignment In this module we will revisit some topics we started to look at while performing our BLAST search and looking at the CDD database in the first

Next Generation Sequencing Data Visualization

GBrowse2 from GMOD Andreas Gisel Institute for Biomedical Technologies CNR Bari - Italy GMOD is the Generic Model Organism Database project GMOD is a collection

DNA Technology Mapping a plasmid digesting How do restriction enzymes work?

A first step in working with DNA is mapping the DNA molecule. One way to do this is to use restriction enzymes (restriction endonucleases) that are naturally found in bacteria

Efficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing

James D. Jackson Philip J. Hatcher Department of Computer Science Kingsbury Hall University of New Hampshire Durham,

Case Study Life Sciences Data

Centre for Integrative Systems Biology and Bioinformatics www.imperial.ac.uk/bioinfsupport Sarah Butcher s.butcher@imperial.ac.uk www.imperial.ac.uk/bioinfsupport Bio-data

The EcoCyc Curation Process

Ingrid M. Keseler SRI International 1 HOW OFTEN IS THE GOLDEN GATE BRIDGE PAINTED? Many misconceptions exist about how often the Bridge is painted. Some say once every seven

Core Bioinformatics. Degree Type Year Semester. 4313473 Bioinformàtica/Bioinformatics OB 0 1

2014/2015 Code: 42397 ECTS Credits: 12 Contact Name: Sònia Casillas Viladerrams Email: Sonia.Casillas@uab.cat

Information and Data Sharing Policy* Genomics:GTL Program

Appendix 1 Office of Biological and Environmental Research Office of Science Department of Energy Final Date: April 4, 2008 Introduction

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism )

Biology 1406 Exam 3 Notes Structure of DNA Ch. 10 Proteins

Basic Concepts of DNA, Proteins, Genes and Genomes

Kun-Mao Chao 1,2,3 1 Graduate Institute of Biomedical Electronics and Bioinformatics 2 Department of Computer Science and Information Engineering 3 Graduate

Lecture 11 Data storage and LIMS solutions. Stéphane LE CROM lecrom@biologie.ens.fr

Various steps of a DNA microarray experiment Experimental steps Data analysis Experimental design set up Chips on catalog

13.4 Gene Regulation and Expression

Lesson Objectives Describe gene regulation in prokaryotes. Explain how most eukaryotic genes are regulated. Relate gene regulation to development in multicellular organisms.

Control of Gene Expression

What is Gene Expression? Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. Figure

Global and Discovery Proteomics Lecture Agenda

Christine A. Jelinek, Ph.D. Johns Hopkins University School of Medicine Department of Pharmacology and Molecular Sciences Middle Atlantic Mass Spectrometry Laboratory Global

Genomic Data at the British Oceanographic Data Centre

A data management project for the NERC Marine and Freshwater Microbial Biodiversity Thematic Programme Gwen Moncoiffé British Oceanographic Data Centre,

CCR Biology - Chapter 9 Practice Test - Summer 2012

Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Genetic engineering is possible

Scientific databases. Biological data management

The term paper within the framework of the course Principles of Modern Database Systems by Aleksejs Kontijevskis PhD student The Linnaeus Centre for Bioinformatics

Applying data integration into reconstruction of gene networks from microarray data

PhD Thesis Proposal Dipartimento di Informatica e Scienze dell'Informazione Università degli Studi di Genova December

To be able to describe polypeptide synthesis including transcription and splicing

Thursday 8th March COPY LO: To be able to describe polypeptide synthesis including transcription and splicing Starter Explain the difference between transcription and translation BATS Describe and explain

Three data delivery cases for EMBL-EBI's Embassy. Guy Cochrane www.ebi.ac.uk

EMBL European Bioinformatics Institute Genes, genomes & variation European Nucleotide Archive 1000 Genomes Ensembl Ensembl Genomes

Human Genome and Human Genome Project. Louxin Zhang

A Primer to Genomics Cells are the fundamental working units of every living system. DNA is made of 4 nucleotide bases. The DNA sequence is the particular

What is a Gene? HC70AL Spring An Introduction to Bioinformatics -- Part I. What are the 4 Nucleotides By in DNA?

APPENDIX 2 - BIOINFORMATICS (PARTS I AND II) What is a Gene? HC70AL Spring 2004 An ordered sequence of nucleotides An Introduction to Bioinformatics -- Part I What are the 4 Nucleotides By in DNA? Brandon

Special report. Chronic Lymphocytic Leukemia (CLL) Genomic Biology 3020 April 20, 2006

Gene And Protein The gene that causes the mutation is CCND1 and the protein NP_444284 The mutation deals with the cell

Frequently Asked Questions Next Generation Sequencing

Import These Frequently Asked Questions for Next Generation Sequencing are some of the more common questions our customers ask. Questions are divided

B2 1 Cells, Tissues and Organs

5 minutes 5 marks Page of 7 Q. The diagram shows a bacterium. On the drawing, name the structures labelled A, B, C and D. (Total 4 marks) Q2. (a) The diagrams show cells containing

On-line supplement to manuscript Galaxy for collaborative analysis of ENCODE data: Making large-scale analyses biologist-friendly

DANIEL BLANKENBERG, JAMES TAYLOR, IAN SCHENCK, JIANBIN HE, YI ZHANG, MATTHEW

REGULATIONS FOR THE DEGREE OF BACHELOR OF SCIENCE IN BIOINFORMATICS (BSc[BioInf])

(See also General Regulations) BMS1 Admission to the Degree To be eligible for admission to the degree of Bachelor

Mitochondrial DNA Analysis

Lineage Markers Lineage markers are passed down from generation to generation without changing Except for rare mutation events They can help determine the lineage (family tree)

Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources

Chapter 1 - The Molecular Basis of Cancer 1. Inside Cancer http://www.insidecancer.org/ From the Dolan DNA Learning Center Cold

Name Class Date. KEY CONCEPT Mutations are changes in DNA that may or may not affect phenotype. frameshift mutation

Unit 7 Study Guide Section 8.7: Mutations VOCABULARY mutation point mutation frameshift mutation mutagen MAIN IDEA: Some mutations

Vad är bioinformatik och varför behöver vi det i vården? a bioinformatician's perspectives

Dirk.Repsilber@oru.se 2015-05-21 Functional Bioinformatics, Örebro University

Structure and Function of DNA

DNA and RNA Structure DNA and RNA are nucleic acids. They consist of chemical units called nucleotides. The nucleotides are joined by a sugar-phosphate backbone. The four

Databases and Information Management

Reading: Laudon & Laudon chapter 5 Additional Reading: Brien & Marakas chapter 3-4 COMP 5131 1 Outline Database Approach to Data Management Database Management Systems

Human Genome Organization: An Update. Genome Organization: An Update

Highlights of Human Genome Project Timetable Proposed in 1990 as 3 billion dollar joint venture between DOE and NIH with 15 year completion

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

The Illumina TopHat Alignment and Cufflinks Assembly and Differential Expression apps make RNA data analysis accessible to any user, regardless

An agent-based layered middleware as tool integration

Flavio Corradini Leonardo Mariani Emanuela Merelli University of L'Aquila University of Milano University of Camerino ITALY ITALY ITALY Helsinki FSE/ESEC

BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS

1. The Technology Strategy sets out six areas where technological developments are required to push the frontiers of knowledge

Distributed Data Mining in Discovery Net. Dr. Moustafa Ghanem Department of Computing Imperial College London

1. What is Discovery Net 2. Distributed Data Mining for Compute Intensive Tasks 3. Distributed

Module 3. Genome Browsing. Using Web Browsers to View Genome Annotation. Kerstin Howe Wellcome Trust Sanger Institute zfish-help@sanger.ac.uk

Introduction Genome browsing The Ensembl gene set Guided examples