Sequence information
Bengt Persson
Linköping University & Karolinska Institutet
(c) Bengt Persson

Sequence information (overview diagram)
  Sequence comparisons: pair-wise, multiple
  Database searches: SRS, Entrez
  Orthologue clusters
  Organelle localisation
  Membrane attachment
  Post-translational modifications
  Secondary structure
  Patterns and protein families: Prosite, InterPro, Pfam

Good web sites
  www.expasy.org
  www.ebi.ac.uk
  www.ncbi.nlm.nih.gov
  www.cbs.dtu.dk

Protein family databases

Protein families, nomenclature
  Super-family
  Family
  Sub-family

InterPro
  Prosite: Amos Bairoch, Genève
  Pfam: Erik Sonnhammer, KI and Sanger Institute, UK
  PRINTS: Terri Attwood, UCL, London, UK
  ProDom: Daniel Kahn, INRA, Toulouse, France
  SMART: Peer Bork, EMBL
  Swissprot+TrEMBL

InterPro entry
InterPro entry, cont.

InterPro -- protein matches

InterPro -- protein matches, graphical

Prosite
  Database of protein families and domains
  Release 16, September 1999
  1035 documentation entries
  1375 different patterns
  http://www.expasy.ch/prosite/
  Amos Bairoch, University of Geneva

ScanProsite
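
Prosite patterns are written in a compact notation: elements are separated by "-", x stands for any residue, [ST] means S or T, {P} means anything except P, and (n) or (n,m) gives a repeat count. As a rough illustration of how such a pattern can be scanned against a sequence, the sketch below translates the notation into a Python regular expression. The function names prosite_to_regex and find_sites and the test sequence are made up for this example; this is not how ScanProsite itself is implemented, and rarer syntax (for instance ">" inside square brackets) is not handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex (simplified)."""
    pattern = pattern.strip().rstrip(".")      # patterns usually end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):            # '<' anchors at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):              # '>' anchors at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":                        # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"     # excluded residues
        # A single letter or a [..] group is already valid regex syntax.
        repeat = ""
        if lo and hi:
            repeat = "{" + lo + "," + hi + "}"
        elif lo:
            repeat = "{" + lo + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

def find_sites(sequence: str, pattern: str):
    """Return (1-based start, matched text) for all hits, overlaps included."""
    # A lookahead lets the scan report matches that overlap each other.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# N-glycosylation site pattern applied to a made-up sequence.
print(prosite_to_regex("N-{P}-[ST]-{P}."))            # N[^P][ST][^P]
print(find_sites("MKNGSAANPSL", "N-{P}-[ST]-{P}."))   # [(3, 'NGSA')]
```

Short patterns like this one match by chance in many proteins, so a hit is a hint to follow up rather than proof of a modification site.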

Prosite, documentation entry

Example of Prosite patterns
  Post-translational modifications
  Domains
  DNA or RNA associated proteins
  Enzymes
  Electron transport proteins
  Other transport proteins
  Structural proteins
  Receptors
  Hormones and active peptides
  Toxins
  Inhibitors
  Protein secretion and chaperones
  Cytokines and growth factors
  Others

Pfam
  A collection of protein families and domains.
  Pfam contains multiple protein alignments and profile HMMs of these families.
  Pfam is a semi-automatic protein family database, which aims to be comprehensive as well as accurate.
  http://www.sanger.ac.uk/software/pfam/index.shtml
  http://www.cgr.ki.se/pfam

Hidden Markov Models (HMMs)
  Statistical profile method
  Enables database searches
  Enables multiple alignment creation
  (from Yvonne Kallberg)

Pfam

COG -- Clusters of Orthologous Groups

Functional groups of protein families

COG

Predictions of structure and post-translational modifications

Structure predictions
  Secondary structure
  Hydrophilicity
  Membrane-spanning regions
  Antigenicity
  Glycosylation
  Acetylation
  and much more...

Secondary structure predictions
  Chou & Fasman (CF)
  Garnier, Osguthorpe & Robson (GOR)
    http://pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
  neural networks (e.g. PHD)
    http://dodo.cpmc.columbia.edu/predictprotein/

Artificial Neural Networks (ANNs)
  Statistical method
  Pattern recognition, e.g. secondary structure predictions
  Figure: input layer, hidden layer, output layer (modified from Yvonne Kallberg)

The PredictProtein server
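
To make the input/hidden/output layer picture concrete, the sketch below pushes a sliding window of residues through a tiny feed-forward network, one window per residue, and returns three output values per position. Everything in it is an assumption for illustration only: the window of 13 residues, the 5 hidden units, the state names and the random, untrained weights. The numbers it produces are therefore meaningless as predictions, and this is not the PHD implementation; it only shows how data flows through the layers.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = ("helix", "strand", "coil")   # assumed three-state output

def encode_window(seq, center, window=13):
    """One-hot encode the residues around `center`; positions beyond the
    sequence ends (and unknown residues) are encoded as all zeros."""
    half = window // 2
    vec = []
    for pos in range(center - half, center + half + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(seq):
            idx = AMINO_ACIDS.find(seq[pos])
            if idx >= 0:
                one_hot[idx] = 1.0
        vec.extend(one_hot)
    return vec

def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def layer(inputs, weights, biases, activation):
    """One fully connected layer: weighted sums followed by an activation."""
    sums = [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]
    return activation(sums)

def random_weights(n_out, n_in, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def predict(seq, hidden_units=5, window=13, seed=0):
    """Return one list of three output values per residue (untrained network)."""
    rng = random.Random(seed)
    n_in = window * len(AMINO_ACIDS)
    w1, b1 = random_weights(hidden_units, n_in, rng), [0.0] * hidden_units
    w2, b2 = random_weights(len(STATES), hidden_units, rng), [0.0] * len(STATES)
    outputs = []
    for i in range(len(seq)):
        x = encode_window(seq, i, window)        # input layer
        h = layer(x, w1, b1, sigmoid)            # hidden layer
        outputs.append(layer(h, w2, b2, softmax))  # output layer
    return outputs

for residue, probs in zip("MKTAYIAK", predict("MKTAYIAK")):
    print(residue, [round(p, 3) for p in probs])
```

In a real predictor the weights would be fitted on proteins of known structure, and the state with the highest output would be reported for each residue.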

Default submission form

Hydrophilicity
  Kyte & Doolittle
  Hopp & Woods

Example of hydrophilicity and secondary structure plots

ProtScale
  A general tool for plotting sequence properties, e.g. hydrophilicity
  http://www.expasy.ch/cgi-bin/protscale.pl
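
A plot of this kind is essentially a sliding-window average of a per-residue scale. The sketch below computes such a profile with the Kyte & Doolittle values, where positive numbers mean hydrophobic, so hydrophilic stretches show up as minima. The function name hydropathy_profile and the default window of 9 residues are choices made for this example, not settings taken from the slides.

```python
# Kyte & Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=9, scale=KYTE_DOOLITTLE):
    """Return (1-based centre position, mean score) for every full window.

    Use an odd window so that each average has a well-defined centre residue.
    """
    seq = sequence.upper()
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    unknown = set(seq) - set(scale)
    if unknown:
        raise ValueError("residues not in scale: " + "".join(sorted(unknown)))
    scores = [scale[aa] for aa in seq]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile

# Made-up sequence: a hydrophobic stretch flanked by charged residues.
for position, score in hydropathy_profile("KKDELLVVIIALLGAVKKRD", window=9):
    print(position, round(score, 2))
```

Passing a different dictionary as scale, for example Hopp & Woods values, gives the corresponding profile with the same code. Longer windows smooth the curve and are the usual choice when looking for membrane-spanning segments.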

ProtScale, selection of property to plot

ProtScale, results

ProtScale, Graphic view

Membrane protein prediction, TMAP
  http://www.ifm.liu.se/bioinfo/

Membrane protein prediction, TMAP

TMAP, graphics output

Prediction servers at CBS
  www.cbs.dtu.dk/services/

SignalP

SignalP -- Results

SignalP -- Results, cont.

TargetP

TargetP -- Results

Phobius

Phobius, results

ExPASy site map

Protein identification and characterisation

Post-translational modifications

Primary structure analysis

Secondary structure prediction

Transmembrane regions & Sequence alignments