Biomolecular Structure Modelling. DKFZ, Department of Molecular Biophysics, Michaela Knapp-Mohammady


Biomolecular Structure Modelling
I) Structure of proteins, basics
- Primary structure
- Secondary structure
- Tertiary structure
II) Protein modelling, tools and techniques
- Primary structure analysis
- Secondary structure prediction
- Tertiary structure analysis and modelling
- Protein simulation


Below is the complete gene, given as the complementary sequence:
GGATCCTGCC AGAGCCTCCT CCCACCTGGA GGGGTCCCAG CGTCCACCTT CCCTGCCCCA 60
GCCCCCCTCC TCGAGGTACT GGGAGGCTGG ATAAAGTCTT CGGCTGGGCC ACACCCCACC 120
CCAAATTCTC CCTGTCCCAC CCTAGTGCCC AGGCCACCCC GGCCTGCTCC CTTCCGCAAG 180
GCACCTCACC TTCTGTGCCC AGACCATTAG CCAACGCGGT GACCTTGACC CCGGCCCAGG 240
CCCTGCTAAT GAAGAGGAAA GCCCGTACGC ACTCGGCCTG ACCCACGGCG ACCCTCTGTG 300
ACCAATCATA CTACCAACCT CTTAAACAGA GCTCCACCGA CGCAATGCCC AGGCATAAAA 360
AGGCCAGGCC GAGAGACCGC CACCAGTCAC GGACCCTGGA CCCAGCGCAC CCGCACCATG 420
GCCGGCCCCA GCCTCGCTTG CTGTCTGCTC GGCCTCCTGG CGCTGACCTC CGCCTGCTAC 480
ATCCAGAACT GCCCCCTGGG AGGCAAGAGG GCCGCGCCGG ACCTCGACGT GCGCAAGGTG 540
AGTCCCCAGC CCTGGTCCCG CGGCGCTCCG GGGAGGGAGG GACCCGCAGC CACAGGGGCG 600
CGCCCCGCTC CGGCCTCGCC TGAGAACTCC AGGAGCTGAG CGGATTTTGA CGCCCCGCCC 660
TTGACCGCGG TCGAGGCCCC CACGGCGCCC CAGCGTCTCA GCCCCGCTGT CCCCGCCCGA 720
ACTCCGAACC CCGGACCCCA GCATCCTTGC CCGGCGCACC CCGGCCGGCC TCGCAGGGTC 780
CTCCGAGCGA GTCCCCAGCG CCGCCCCGCG TCCCGCTCAC CCCGCCCGTC CCCCGAGTGC 840
CTCCCCTGCG GCCCCGGGGG CAAAGGCCGC TGCTTCGGGC CCAATATCTG CTGCGCGGAA 900
GAGCTGGGCT GCTTCGTGGG CACCGCCGAA GCGCTGCGCT GCCAGGAGGA GAACTACCTG 960
CCGTCGCCCT GCCAGTCCGG CCAGAAGGCG TGCGGGAGCG GGGGCCGCTG CGCCTTGGGC 1020
CTCTGCTGCA GCCCGGGTGA GCGGGGCAAG GCGCTCCGGG GCCAGGGGGA GGCGGGCGGG 1080
GGTGCGGCCG GGATTCCCCT GACTCCACCT CTTCCTCCAG ACGGCTGCCA CGCCGACCCT 1140
GCCTGCGACG CGGAAGCCAC CTTCTCCCAG CGCTGAAACT TGATGGCTCC GAACACCCTC 1200
GAAGCGCGCC ACTCGCTTCC CCCATAGCCA CCCCAGAAAT GGTGAAAATA AAATAAAGCA 1260
GGTTTTTCTC CTCTACCTTG ACTCGTGTCT AAGTGCCAGA AATGGGACGG GGAGGGGGCA 1320
TTGTGGGACT GGAAGATC 1338
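Since the slide presents the gene as a complementary sequence, it may help to see how the opposite strand can be derived. The following is a minimal Python sketch, not part of the original slides; the function name `reverse_complement` is an illustrative choice, and only the first ten bases of the sequence above are used as input.

```python
# Watson-Crick pairing table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Whitespace (such as the 10-base block separators used on the
    slide) is removed before processing.
    """
    cleaned = "".join(seq.split()).upper()
    # Complement every base, then reverse to read 5' -> 3' on the other strand.
    return cleaned.translate(COMPLEMENT)[::-1]

# First 10-base block of the sequence shown on the slide.
first_block = "GGATCCTGCC"
print(reverse_complement(first_block))  # GGCAGGATCC
```

Applying the function twice returns the input sequence, which is a quick sanity check when working with longer stretches.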

The 20 amino acids differ only in their side chains (functional groups).

Different amino acids
Amino acids have different biochemical and physical properties that influence their relative replaceability in evolution.
[Figure: Venn diagram grouping the amino acids (one-letter codes) into the property classes aliphatic, tiny, small, hydrophobic, aromatic, charged, positive and polar; cysteine appears twice, as C S-S and C S-H.]

Two amino acids combine to form a dipeptide, releasing one molecule of water. This creates a so-called peptide bond between a C atom and an N atom.

Here the peptide bond is shown in close-up (blue = nitrogen, red = oxygen, black = carbon, grey = hydrogen, green = residue). The bonds coloured dark red lie in one plane and are quite rigid. The cause is the C=O double bond: resonance with it gives the C-N bond partial double-bond character. Elsewhere in the peptide, by contrast, there is free rotation.
Tripeptides form when three amino acids (or a dipeptide and an amino acid) react with one another with elimination of water (a process in which water is released is also called condensation). In general, peptides consisting of a few amino acids are called oligopeptides. Their counterpart is the polypeptides, which consist of many amino acids. Peptides composed of more than 100 amino acids are called proteins.
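The naming rules and the condensation bookkeeping above can be sketched in a few lines of Python. This is an illustration only: the slides give the 100-residue limit for proteins but no cutoff between "few" and "many", so `OLIGO_MAX = 10` is an assumed, hypothetical value, and the function names are invented for this example. A linear (non-cyclic) chain is assumed.

```python
OLIGO_MAX = 10      # ASSUMPTION: the slides do not define "few" amino acids
PROTEIN_MIN = 101   # from the slides: more than 100 amino acids = protein

def waters_released(n_residues: int) -> int:
    """Water molecules released when n amino acids condense into one linear chain."""
    if n_residues < 1:
        raise ValueError("need at least one amino acid")
    # Each peptide bond releases one water; a linear chain has n - 1 bonds.
    return n_residues - 1

def classify_peptide(n_residues: int) -> str:
    """Name a linear peptide by its length, following the slide's terminology."""
    if n_residues < 2:
        raise ValueError("a peptide needs at least two amino acids")
    if n_residues >= PROTEIN_MIN:
        return "protein"
    if n_residues <= OLIGO_MAX:
        return "oligopeptide"
    return "polypeptide"

for n in (2, 3, 50, 150):
    print(n, classify_peptide(n), waters_released(n))
```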

Secondary structure - α-helix
Properties of the α-helix: the structure repeats itself every 5.4 Å along the helix axis, i.e. the α-helix has a pitch of 5.4 Å. α-helices have 3.6 amino acid residues per turn, i.e. a helix 36 amino acids long forms 10 turns.
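The two numbers on this slide are enough for simple helix arithmetic. The sketch below (function names are illustrative, not from the slides) derives the rise per residue as pitch divided by residues per turn, and the length of an ideal helix along its axis.

```python
PITCH_ANGSTROM = 5.4      # helix pitch, from the slide
RESIDUES_PER_TURN = 3.6   # residues per turn, from the slide

def rise_per_residue() -> float:
    """Axial advance per residue in Angstrom (pitch / residues per turn)."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN

def helix_turns(n_residues: int) -> float:
    """Number of turns formed by an ideal alpha-helix of n residues."""
    return n_residues / RESIDUES_PER_TURN

def helix_length(n_residues: int) -> float:
    """Length along the helix axis in Angstrom."""
    return helix_turns(n_residues) * PITCH_ANGSTROM

# The slide's example: 36 residues form 10 turns.
print(round(helix_turns(36), 3), round(helix_length(36), 3), round(rise_per_residue(), 3))
```

For the 36-residue example this gives 10 turns, an axial length of 54 Å, and a rise of 1.5 Å per residue.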

Helix structures

Secondary structure - β-sheet
The β-sheet structure: in a β-sheet, two or more segments of polypeptide chain (strands) run alongside each other and are linked in a regular manner by hydrogen bonds between the main-chain C=O and N-H groups. Therefore all hydrogen bonds in a β-sheet are between different segments of polypeptide. This contrasts with the α-helix, where all hydrogen bonds involve the same element of secondary structure.

Secondary structure - β-sheet

Secondary structure - Reverse turns
A reverse turn is a region of the polypeptide with a hydrogen bond from one main-chain carbonyl oxygen to the main-chain N-H group three residues along the chain (i.e. O(i) to N(i+3)). Helical regions are excluded from this definition, and turns between β-strands form a special class of turn known as the β-hairpin.

Tertiary structure
Tertiary structure describes the packing of α-helices, β-sheets and random coils with respect to each other at the level of one whole polypeptide chain. The figure shows the tertiary structure of chain B of protein kinase C interacting protein.

Quaternary structure
Quaternary structure exists only if more than one polypeptide chain is present in a complex protein. It then describes the spatial organization of the chains. The figure shows the protein kinase C interacting protein.

Summary of I)
The wide variety of three-dimensional protein structures corresponds to the diversity of functions that proteins fulfil. Proteins fold in three dimensions. Protein structure is organized hierarchically, from the so-called primary structure to the quaternary structure. Higher-level structures are motifs and domains. The primary structure is the sequence of residues in the polypeptide chain.

II) Tasks of bioinformatics

How can protein structures be predicted?
Structure prediction methods are coarsely divided into three categories:
1. Comparative modelling: if the sequence to be modelled has a very similar homologue in the PDB (Brookhaven Protein Data Bank), the homologue may be used as a template, and a structural model is built on the basis of this template.
2. Fold recognition: in the absence of a significantly similar sequence with known structure, various methods grouped under the term "fold recognition" are applied.
3. Ab initio prediction: in contrast to the above methods, the goal of ab initio prediction is to build a model for a given sequence without using a template, e.g. by minimizing knowledge-based energy functions. A potential energy function (PEF) gives the potential energy for any protein conformation.
Secondary structure prediction

1. Protein structure database - PDB
Experimental methods for determining protein structure, namely X-ray crystallography and NMR spectroscopy, are essential. The Brookhaven Protein Data Bank (PDB) is the repository for those structures. Files include atom coordinates and are suited for visualization by graphical molecule viewers such as RasMol.
- Atom coordinates
- Sequences (NRL3D)

How are the secondary structures detected in a PDB file?
The figure shows the three main-chain torsion angles of a polypeptide: phi (φ), psi (ψ) and omega (ω). Omega is fixed because the peptide bond is planar.
[Figure labels: beta, alpha]
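A torsion angle such as phi, psi or omega is defined by four consecutive atoms, so it can be computed directly from the atom coordinates in a PDB file. The following is a minimal pure-Python sketch, not taken from the slides; the helper names are invented for this example and the coordinates are artificial test points rather than real backbone atoms. In the usual backbone definitions, phi uses C(i-1), N(i), CA(i), C(i); psi uses N(i), CA(i), C(i), N(i+1); and omega uses CA(i), C(i), N(i+1), CA(i+1).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Torsion angle in degrees defined by four points, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)   # normal of the plane through p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Artificial test geometry around the z axis (not real protein coordinates).
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
print(round(cis, 1), round(abs(trans), 1))
```

For a planar trans peptide bond, omega computed this way would be close to 180°, which is why only phi and psi vary substantially along the chain.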

Sequence Analysis on the Web

2. Sequence databases
SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. TrEMBL is a computer-annotated supplement of SWISS-PROT that contains all the translations of EMBL nucleotide sequence entries not yet integrated in SWISS-PROT. These databases are developed by the SWISS-PROT groups at SIB and at EBI.
- SWISS-PROT, Release 40 and updates up to 15-Nov-2001: 102164 entries
- TrEMBL (Nov. 2001): 557388 entries

Homology modelling
Quick and easy! Use the SWISS-MODEL server: HTTP://www.expasy.ch/swissmod/SWISS-MODEL.html
SWISS-MODEL is an automated protein modelling server running at GlaxoWellcome Experimental Research in Geneva, Switzerland.
Disclaimer: the result of any modelling procedure is NON-EXPERIMENTAL and MUST be considered with care. This is especially true since there is no human intervention during model building.
New 3D modelling server Geno3d: HTTP://geno3d-pbil.ibcp.fr/

TASK DESIGN: DomainSweep compares a protein sequence with a range of protein family databases. Its output comprises an overview of the different database search results as well as a graphical report on the location of family patterns found in the sequence.
PROBLEM: Determine the function of an uncharacterised protein sequence.

Protein analysis - Evaluation of protein domain databases
Each database has different strengths and weaknesses:
- PFAM, PRODOM: identification of members of highly divergent superfamilies, but less likely to give specific sub-family diagnoses, and quality is low
- PRINTS, BLOCKS: give specific sub-family diagnoses, but less coverage
- Pattern part of PROSITE: good detection of very short motifs, but least coverage and unreliable in the identification of highly divergent superfamilies

Fold classes: all alpha, all beta, alpha+beta

Fold class prediction - FoldClass
FoldClass (HUSAR) predicts protein fold classes and protein domains from sequence data. The predictions are generated by artificial neural networks (Reczko, M. and Bohr, H., Nucl. Ac. Res. 22: 3616-3619 (1994)). This program predicts:
- a specific overall fold class
- a super fold class with respect to secondary structure content and spatial distribution
- optionally, a profile of possible fold classes along the sequence

Fold class prediction - (Gen)Threader
Algorithm:
- A library of unique protein domain folds is derived from the PDB.
- The test sequence is optimally fitted to all folds (allowing insertions/deletions).
- The energy of each possible fit is calculated by summing interaction and solvation parameters.
- The lowest-energy fold is taken.
Unlike most threading methods, such as the original THREADER, GenTHREADER attempts to make inferences about possible evolutionary relationships.

- The number of analysis programs is huge. Which one should be used for what purpose?
- It is difficult to feed results from one program as input into the next program.
- Users need compact, presentable reports on analysis results.

3. Energy minimisation - Start
Calculate the potential energy for a given molecule (atom coordinates): the set of nuclear positions of all atoms = R

Energy minimisation - Method
We move the atoms of the molecule so as to reduce its potential energy. There are several routines to do this:
- Steepest descent
- Conjugate gradient
- and more
Unfortunately, no technique can guarantee finding the global energy minimum of a complex problem (although simulated annealing is a partial solution).
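Steepest descent, the first routine listed above, can be illustrated on a toy problem. The sketch below is an assumption-laden example, not a molecular force field: `energy` is an invented two-variable quadratic standing in for the potential energy as a function of the coordinates R, and the fixed step size and convergence tolerance are arbitrary illustrative choices.

```python
def energy(r):
    """Toy potential energy (ASSUMPTION: simple quadratic, minimum at (1, -2))."""
    x, y = r
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2

def gradient(r):
    """Analytic gradient of the toy energy with respect to the coordinates."""
    x, y = r
    return (2.0 * (x - 1.0), 4.0 * (y + 2.0))

def steepest_descent(r0, step=0.1, tol=1e-8, max_iter=10000):
    """Repeatedly move the coordinates against the gradient until it vanishes."""
    r = tuple(r0)
    for _ in range(max_iter):
        g = gradient(r)
        if sum(c * c for c in g) ** 0.5 < tol:
            break  # gradient is (numerically) zero: a stationary point
        r = tuple(ri - step * gi for ri, gi in zip(r, g))
    return r

r_min = steepest_descent((5.0, 5.0))
print(tuple(round(c, 4) for c in r_min), round(energy(r_min), 6))
```

Because each step only follows the local downhill direction, a procedure like this ends in the nearest minimum of the starting point, which is consistent with the caveat that the global minimum is not guaranteed.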

Modelling programs: WHATIF, INSIGHTII, GAUSSIAN, SCC-DFTB, .., GROMOS, DISCOVER, ..

Model viewers: Rasmol, Kinemage, Molden, Gaussview, Sybyl, MSViewer, Insight, WebLab, Swiss...
SWISS-3DIMAGE is an image database which strives to provide high-quality pictures of biological macromolecules with known three-dimensional structure. The database contains mostly images of experimentally elucidated structures, but also provides views of well-accepted theoretical protein models. The images are provided in several useful formats; both mono and stereo pictures are generally available.

Molecule simulation - Molecular dynamics
- The starting point for most simulations is the experimental crystal or NMR structure.
- This structure is energy minimized and solvated in a box of water.
- The system is heated (high-energy state).
- Equilibration and simulation run for 1 nanosecond; only short times are possible.
The detailed atomic motions are usually unimportant. What really matters are the "ensemble average" properties, i.e. what happens on average (MD is in fact chaotic, with sensitive dependence on initial conditions, like the weather!).
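The idea of propagating a system in small time steps and then looking at averages rather than individual motions can be shown on a deliberately tiny example. The sketch below is not a protein simulation: it integrates a single particle on a harmonic spring with the velocity Verlet scheme, which is an assumption here since the slides do not name an integrator, and all parameter values (mass, force constant, time step) are arbitrary illustrative choices in reduced units. The time average of the potential energy along the trajectory serves as a stand-in for an ensemble average.

```python
def force(x, k=1.0):
    """Force of a harmonic spring (ASSUMPTION: toy model, not a force field)."""
    return -k * x

def velocity_verlet(x0, v0, dt, n_steps, mass=1.0, k=1.0):
    """Integrate Newton's equations and return (mean potential energy, final total energy)."""
    x, v = x0, v0
    f = force(x, k)
    potential_sum = 0.0
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass    # half-step velocity update
        x = x + dt * v_half                 # full-step position update
        f = force(x, k)                     # force at the new position
        v = v_half + 0.5 * dt * f / mass    # complete the velocity update
        potential_sum += 0.5 * k * x * x    # accumulate for the time average
    total_energy = 0.5 * mass * v * v + 0.5 * k * x * x
    return potential_sum / n_steps, total_energy

avg_pot, e_final = velocity_verlet(x0=1.0, v0=0.0, dt=0.01, n_steps=20000)
print(round(avg_pot, 2), round(e_final, 2))
```

With these settings the total energy stays close to its initial value of 0.5, and the average potential energy comes out near half of that, even though the instantaneous position keeps oscillating.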

Molecular dynamics
Proteins are not the static structures that X-ray crystallography can suggest, but are continuously moving. This is a short simulation of crambin, calculated using the AMBER force field.

DNA is not static either. This simulation was calculated using AMBER and a continuum model for water.

MD simulation