Webserver: bioinfo.bio.wzw.tum.de Mail: w.mewes@weihenstephan.de
About me
H. Werner Mewes, Chair of Bioinformatics, WZW
C.V.:
- Studied chemistry in Marburg
- University of Heidelberg (Faculty of Medicine, bioenergetics)
- EMBL Heidelberg (protein chemistry, Prof. Tsugita)
- MPI for Biochemistry, Martinsried (protein chemistry, Dr. Lottspeich)
- Since 1988 in bioinformatics: protein sequences, genome analysis
- Since 2001 Chair of Bioinformatics at WZW Weihenstephan
- Director of the Institute for Bioinformatics and Systems Biology at Helmholtz Zentrum München (German Research Center for Environmental Health)
Research areas: genome analysis, biological knowledge, systems biology
Publications/web site: mips.gsf.de // binfo.bio.wzw.tum.de
Rules for this lecture
- Do not redistribute the course material: it is subject to copyright (less mine than that of others). Private use within a closed group (the students of this lecture) is allowed.
- Please do not disturb your neighbours (or me). Some students may be interested in the subject.
Photo: KoAn (cc)
About this lecture
Goals:
- To understand that bioinformatics is a necessity for understanding biological processes
- To get an idea of how bioinformatics and experimental biology will become part of Systems Biology (and what that is)
- To understand that intuition is probably not an ideal way to solve complex problems
About this lecture
Bioinformatics for Biologists:
- Biology is the driving force, but bioinformatics is a powerful enabling technology and a science in itself
- Fewer (but not no) algorithms
- Mark the hot spots that you should understand and know
- Understand what bioinformatics can do for you
- Wider scope than previous lectures
- Different type of exercises (T. Rattei, P. Pagel)
Topics of the Lecture
Mind Map for this Lecture
THE GRAND CHALLENGE IS: TO DO SOMETHING HAVING AN IMPACT BEYOND THE NEXT DAY
Multifactorial diseases in a changing world
Figure labels: Climate & Ecosystems; Global Economy; Socioeconomics & Demographics; Behavior & Lifestyle?; Well-Being; Health; Diseases
Guess what this is?
The burden of diabetes
Weight watchers
Different Views to Human Health
From intuitive interpretation to desperation
DISEASES AND THEIR ORIGIN
A single cause?
Overexpressing a particular microRNA in some of a mouse's immune cells leads to leukemia (Couzin, Science, 2008).
Factors influencing diseases
- Genes: genetic variance (disposition)
- Regulation: transcription, translation (e.g. miRNA), splicing, post-translational modifications, signaling
- Epigenetics: histone modification
- Development
- Environmental factors: nutrients, toxins, drugs
- Behavior (exercise)
- Nutrition
- Infection
- Stress
- Aging
Genetic, cellular and molecular basis of disease
- Diseases are disorders of homeostasis
- To understand the molecular mechanisms involved in the etiology and progression of diseases, we need information from the molecule to the population
- Methods, technologies, and discoveries supported by bioinformatics
Photo: Yuri Tand
What will happen in the Life Sciences within 10 years?
The principle of the Black Swan (N.N. Taleb):
- As a matter of fact, we don't know what will happen (we never know the improbable)
- If we explain 99%, we will still not find the 1% causing dramatic changes: what makes the change?
- A single miRNA may cause dramatic effects
However, we can at least extrapolate:
- Massive amounts of data
- New technologies: deep sequencing, imaging, high-resolution and dynamic structures
- New approaches to biological problems
What do we want to know about biology?
Life as a process depends on:
- Laws (chemistry, physics)
- Principles (DNA, RNA, protein)
- Rules (cis-splicing)
- Exceptions (trans-splicing)
What can we observe?
- Phenotypes (any observable), condition- and time-dependent
- Genotypes
Given the molecular level as the level of action, how can we find the cause of macro-observable phenotypes?
WHAT IS BIOINFORMATICS GOOD FOR?
How to become?
Literature search results with citation counts:
- Basic local alignment search tool. SF Altschul, W Gish, W Miller, EW Myers, DJ Lipman. J. Mol. Biol. (1990) 215, 403-410. Cited by: 21699
- Gapped BLAST and PSI-BLAST: a new generation of protein database search prog... SF Altschul, TL Madden, AA Schaffer, J Zhang, Z... Nucleic Acids Research. Cited by: 25808
- [Citation] Basic local alignment tool. SF Altschul, W Gish, W Miller, EW Myers, DJ Lipman. J. Mol. Biol., 1990. Cited by: 3094
- Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignmen... CE Lawrence, SF Altschul, MS Boguski, JS Liu, AF... Science, 1993. Cited by: 1160
- Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. S Karlin, SF Altschul. Proc. Natl. Acad. Sci. USA, Vol. 87, pp. 2264-2268, March 1990. Cited by: 884
Information vs. Complexity
Bioinformatics classics
The secrets of bioinformatics
Taken from G. Booch, OO-Programming
A molecular interaction map: qualitative and quantitative modeling
Kohn et al.; Mol. Biol. Cell (2006)
What is Bioinformatics?
"Bioinformatics is a newly emerging interdisciplinary research area which may be defined as the interface between biological and computational sciences. Thus, the people working in this field in most cases either have a training in biology or computer science, and they learned about the other field by dealing with problems or using the tools of the other one." (http://bioinformatics.weizmann.ac.il/cards/bioinfo_intro.html)
I see it quite differently: bioinformatics is about concepts. We apply these concepts in a very general approach to biological questions. It does not matter what we use to answer the question: text mining, algorithms, huge storage space, differential equations. In most cases we use computer and network technologies. There is no general paradigm for a definition of bioinformatics (see the concept of systems biology later in the lecture). It is truly interdisciplinary.
Again: What is Bioinformatics?
- Bioinformatics is an integration of mathematical, statistical and computer methods to analyze biological, biochemical and biophysical data.
- Bioinformatics and computational biology involve the use of techniques including applied mathematics, informatics, statistics, computer science, chemistry and biochemistry to solve biological problems, usually on the molecular level.
- Research in computational biology often overlaps with systems biology.
Source: Wikipedia!
History and Outlook
- 1970s: collect data and store it on a computer instead of on cards; first pioneering efforts in protein comparison
- 1980s: DNA sequencing; collection of sequence data; first databases (PIR, GenBank, EMBL); algorithms for sequence comparison and sequence analysis
- 1990s: first complete genomes sequenced; genome analysis; explosion of information; annotation of sequences; functional and structural classification; protein domains; gene prediction; function prediction; probabilistic methods
- Computational Biology: apply algorithms for discovery
- Systems Biology: understanding biological networks
Keywords
Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution.
But: since 10,000 people are working in the field, things get complex!
Basic bioinformatics goals
Can we provide a general solution for a problem?
- To measure the similarity of two sequences
- To classify a protein by its function
- To assign a fold to a protein
Can we make discoveries in biology using bioinformatics?
- Are certain protein families unique to certain species?
- Does the conservation of pairs of proteins correlate with certain sets of species?
Note: if you have a really interesting question, bioinformatics can provide information but is hardly suited to winning you a Nobel Prize (without using other tricks). Example: prediction of protein-protein interactions.
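To make the first goal above (measuring the similarity of two sequences) concrete, here is a minimal sketch in Python. It computes the score of the best global alignment by dynamic programming (the Needleman-Wunsch scheme). The function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration, not values from this lecture; real tools use substitution matrices and more refined gap penalties.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    The scoring parameters are illustrative assumptions only.
    """
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Either align a[i-1] with b[j-1] (match or mismatch) ...
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # ... or open a gap in one of the two sequences.
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("GATTACA", "GATTACA"))  # 7: seven matches
print(global_alignment_score("ACGT", "AGT"))         # 1: three matches, one gap
```

Only two rows of the score table are kept in memory, so the sketch returns the score but not the alignment itself; recovering the alignment would require storing the full table and tracing back through it.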
Bioinformatics is an interdisciplinary adventure
- Bioinformatics is about biology. Life is the most important and complex object to study. (Experiments, Data, Databases)
- To give reliable answers to biological questions, bioinformatics needs computer science to solve difficult (hard) problems. (Data, Algorithms, Information)
- The genome era is data driven. Huge amounts of complex data have to be analyzed and interpreted. (Data & Knowledge, Algorithms, Interpretation)
- Next step: Systems Biology. Why is a cell more than a soup of molecules? Qualitative and quantitative models.
END OF CHAPTER ONE