Secondary structure assignment. Secondary structure assignment and prediction. Talk overview

1 Talk overview
Secondary structure assignment and prediction (May 2011, Eran Eyal)

Secondary structure assignment
Why predict secondary structures in proteins
Methods to predict secondary structures in proteins
Machine learning approaches
Detailed description of several specific programs (PHD)
Performance and evaluation

Automatic assignment of secondary structures to a set of protein coordinates
Assigning secondary structures to known three-dimensional structures is a relatively simple bioinformatics task. Given exact definitions of the secondary structures, all we need to do is determine which parts of the structure fall within each definition.
[Figure: α-helix]

2 Why assign secondary structures automatically and routinely?
Standardization
Easy visualization
Detection of structural motifs and improved sequence-structure searches
Structural alignment
Structural classification
[Figure: β-strand]

What basic structural information is used?
Hydrogen bond patterns (for example, θ > 120° and r_HO < 2.5 Å)
Backbone dihedral angles

3 DSSP algorithm
The Dictionary of Secondary Structure of Proteins (DSSP) by Kabsch and Sander makes its sheet and helix assignments solely on the basis of backbone-backbone hydrogen bonds. DSSP defines a hydrogen bond when the bond energy, computed from a Coulomb approximation of the hydrogen bond energy, is below -0.5 kcal/mol. The structural assignments are defined so that visually appealing, unbroken structures result. In case of overlaps, the alpha-helix is given first priority. The helix definition does not include the terminal residues that carry the initial and final hydrogen bonds of the helix. A minimal helix has two consecutive hydrogen bonds; single helix hydrogen bonds are left out and assigned as turns (state 'T'). Beta-sheet residues (state 'E') are defined as either having two hydrogen bonds in the sheet or being surrounded by two hydrogen bonds in the sheet. The minimal sheet consists of two residues in each partner segment.

STRIDE
The secondary STRuctural IDEntification method by Frishman and Argos uses an empirically derived hydrogen bond energy together with phi-psi torsion angle criteria to assign secondary structure. Torsion angles are given alpha-helix and beta-sheet propensities according to how close they are to the corresponding regions of the Ramachandran plot. The parameters are optimized to mirror visual assignments made by crystallographers for a set of proteins. By construction, the STRIDE assignments agree better with the expert assignments than DSSP does, at least for the data set used to optimize the free parameters.
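The DSSP hydrogen bond criterion can be sketched in a few lines. The -0.5 kcal/mol cutoff is the one quoted above; the partial charges (0.42 e and 0.20 e) and the factor 332 are the values usually cited from the Kabsch and Sander paper and are not given in the slides, so treat them as assumptions. The function names and the test coordinates are hypothetical.

```python
import math

# Constants of the Coulomb approximation (assumed from the Kabsch-Sander
# paper; the slides only state the -0.5 kcal/mol cutoff).
Q1 = 0.42       # partial charge on the C=O group, in units of e
Q2 = 0.20       # partial charge on the N-H group, in units of e
F = 332.0       # conversion factor, kcal * Angstrom / (mol * e^2)
CUTOFF = -0.5   # kcal/mol, threshold quoted on the slide


def hbond_energy(c, o, n, h):
    """Electrostatic energy between a backbone C=O and a backbone N-H.

    Each argument is an (x, y, z) coordinate in Angstrom.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c, o, n, h):
    """True when the energy is below the DSSP cutoff."""
    return hbond_energy(c, o, n, h) < CUTOFF


# Idealized linear geometry C=O ... H-N along the x axis (made-up coordinates).
near = dict(c=(0.0, 0.0, 0.0), o=(1.23, 0.0, 0.0),
            h=(3.13, 0.0, 0.0), n=(4.13, 0.0, 0.0))
# Same groups with the N-H moved roughly 8 Angstrom further away.
far = dict(c=(0.0, 0.0, 0.0), o=(1.23, 0.0, 0.0),
           h=(11.23, 0.0, 0.0), n=(12.23, 0.0, 0.0))

print(is_hbond(**near), is_hbond(**far))
```

Because the energy depends only on four interatomic distances, the same function can be applied to every pair of backbone C=O and N-H groups; the helix and sheet patterns are then read off the resulting hydrogen bond table.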

4 Like DSSP, STRIDE assigns the shortest alpha-helix ('H') if it contains at least two consecutive i to i+4 hydrogen bonds. In contrast to DSSP, helices are elongated to include one or both edge residues if these have acceptable phi-psi angles. Similarly, a short helix can be vetoed: hydrogen bond patterns may be ignored if the phi-psi angles are unfavorable. The sheet category does not distinguish between parallel and anti-parallel sheets. The minimal sheet ('E') is composed of two residues. The dihedral angles are incorporated into the final sheet assignment criterion in the same way as for the alpha-helix.

DEFINE
An algorithm by Richards and Kundrot that assigns secondary structures by matching Cα coordinates against a linear distance mask of the ideal secondary structures. First, strict matches are found; these are subsequently elongated and/or joined, allowing moderate irregularities or curvature. The algorithm locates the starts and ends of α- and 3₁₀-helices, beta-sheets, turns and loops. With these classifications the authors are able to assign 90-95% of all residues to at least one of the given secondary structure classes.

5 Secondary structure prediction
Prediction of tertiary structure from the amino acid sequence is still a very difficult task.
Prediction of more local structural properties is easier.
Prediction of secondary structures and solvent accessibility (SAS) is important and more feasible.
Prediction of secondary structures is a bridge between the linear information and the 3D structure.
Programs in this field often employ different types of machine learning approaches.

Sequence:  A-C-H-Y-T-T-E-K-R-G-G-S-G-T-K-K-R-E-A
Structure: H-H-H-H-H-H-H-H-O-O-O-O-O-S-S-S-S-S-S

6 The importance of, and the need for, predicting secondary structures in proteins
The information might give clues about the function of the protein and the existence of specific structural motifs.
It is an intermediate step toward constructing a complete 3D model from the sequence.
Sequence to 3D structure directly: many degrees of freedom, long search, prone to errors.
With predicted secondary structure: few degrees of freedom, fast search.
Secondary structure content also allows us to classify a protein into the basic structure types based on its sequence alone.

7 Generations in algorithm development
First generation (the Chou-Fasman method): uses statistics on the preferences of individual amino acids. Each amino acid has preferences for appearing in particular secondary structures. These can be determined by counting amino acids in the different secondary structures of known solved structures.
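The counting step behind first-generation methods can be illustrated as follows. This is a minimal sketch: the propensity is taken as the frequency of a residue type within one secondary structure state divided by its overall frequency, which is the usual form but is not spelled out in the slides. The sequence and state strings are made-up toy data, not real statistics.

```python
from collections import Counter


def propensities(sequence, states, target="H"):
    """Propensity of each residue type for the `target` state.

    P(aa) = (fraction of aa among residues in `target`)
            / (fraction of aa among all residues)
    Values above 1 mean the residue is over-represented in that state.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")
    total = Counter(sequence)
    in_state = Counter(aa for aa, s in zip(sequence, states) if s == target)
    n_state = sum(in_state.values())
    if n_state == 0:
        raise ValueError("no residue is in the target state")
    n_total = len(sequence)
    return {aa: (in_state[aa] / n_state) / (total[aa] / n_total)
            for aa in total}


# Toy data: H = helix, C = anything else (hypothetical example).
toy_sequence = "AAAAGGGG"
toy_states = "HHHCCCCH"
helix_propensity = propensities(toy_sequence, toy_states)
print(helix_propensity)
```

In a real setting the counts would come from a large set of solved structures with assigned states, and one table would be built per secondary structure class.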

8 Second generation: the improvements over the first generation were better statistics and statistical methods, and looking at a set of adjacent amino acids in the sequence (usually windows of amino acids) rather than at individual amino acids.
The new statistics determined the probability of an amino acid being in a particular secondary structure given that it is in the middle of a local sequence segment.
Other segments similar to the given segment might also assist in the prediction. Different methods tried to match the segments to other segments in the 3D database by sequence alignment and other methods.
The GOR method
[Figure: strand table and helix table]

9 General problems of methods in generations I and II
The overall prediction rate was rather low:
Overall prediction: 60%
β-strand prediction: ~35%
Predictions included small secondary structure elements, with no ability to integrate them into longer elements such as those found in protein structures.

Third generation: the improvement of the third-generation programs was mainly due to the incorporation of evolutionary information. This was done by looking at a multiple alignment of sequences similar to the sequence we wish to predict. Such information, presented as an MSA or in another form, includes plenty of information that cannot be obtained from a single sequence:
Which regions are more conserved
Which substitutions are allowed in each position
Information regarding interacting sites

Comparison of many sequences of a protein family helps to detect conserved regions.
Comparison of many sequences of a protein family helps to detect interactions in space.

SAARDFFRT--HAAGRFFTFT
SAARDFFRS--GTRAKFFTFT
TAARDFFRF-GKAA-KFFTFT
SAARRFFRTGDHAALDFFTFT
SAARRFFRWHGLAAIDFFTFT
AAARDFFRTGGHAAGRFFTFT
AAARDFFRSGGHAAGKFFTFT
AAARDFFRTGGHAAGKFFTFT
AAARRFFRTGAHAAGDFYTFS
AAARRFFRTGGHAAGDFFTFT
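Detecting conserved regions from the alignment shown on this slide can be sketched as below. The conservation score used here (the share of sequences carrying the most common residue in a column, with gaps not counted as residues) is one simple choice made for illustration; the slides do not prescribe a particular measure.

```python
from collections import Counter

# The ten aligned sequences from the slide.
MSA = [
    "SAARDFFRT--HAAGRFFTFT",
    "SAARDFFRS--GTRAKFFTFT",
    "TAARDFFRF-GKAA-KFFTFT",
    "SAARRFFRTGDHAALDFFTFT",
    "SAARRFFRWHGLAAIDFFTFT",
    "AAARDFFRTGGHAAGRFFTFT",
    "AAARDFFRSGGHAAGKFFTFT",
    "AAARDFFRTGGHAAGKFFTFT",
    "AAARRFFRTGAHAAGDFYTFS",
    "AAARRFFRTGGHAAGDFFTFT",
]


def column_conservation(msa):
    """Per-column share of sequences that carry the most common residue."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*msa):
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / len(column))
    return scores


conservation = column_conservation(MSA)
conserved_columns = [i for i, s in enumerate(conservation) if s == 1.0]
print(conserved_columns)
```

Columns with a score of 1.0 are invariant across the family, while columns such as the first one (split between S, A and T) show which substitutions are tolerated at that position.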

10 Information obtained from the MSA might help in the prediction. Because the fold of all members of the family is identical, every sequence can contribute to the structure prediction of any other sequence in the family.
The best MSA for this purpose is one that includes many sequences of the family which are not too close to one another.

Introduction to neural networks
Neurons are the basic components of the nervous system.
Every neuron receives information from several other neurons through its dendrites.
The information is processed, and the neuron makes a binary decision whether to transfer a signal to other neurons.
The information is transferred along the axon.

Computational tasks that the nervous system executes:
Representation of data
Holding data
Learning procedures
Decision making
Pattern recognition

11 Neural networks - properties
Systems composed of many simple processors that are connected and work in parallel.
The information may be obtained by a learning process and is stored in the connections between the processors.

The perceptron
The perceptron models the action of a single neuron. It can be used to classify only linearly separable cases.

Example: binary neuron
Inputs: S_i = 0, 1
Output: Θ(W_1 S_1 + W_2 S_2 - T)

AND gate: W_1 = ?, W_2 = ?, T = 1.5
OR gate: W_1 = ?, W_2 = ?, T = 0.5
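The binary neuron above can be tried out directly. In this sketch the weights W_1 = W_2 = 1 are one possible answer to the question marks on the slide (an assumption, since the slide leaves them open), combined with the thresholds 1.5 and 0.5. The small grid search is a hypothetical helper added to illustrate that XOR, which is not linearly separable, has no solution.

```python
from itertools import product


def perceptron(inputs, weights, threshold):
    """Binary neuron: step function applied to the weighted sum minus T."""
    total = sum(w * s for w, s in zip(weights, inputs))
    return 1 if total - threshold > 0 else 0


def truth_table(weights, threshold):
    """Outputs for the four input pairs (0,0), (0,1), (1,0), (1,1)."""
    return [perceptron(pair, weights, threshold)
            for pair in product((0, 1), repeat=2)]


def find_weights(target):
    """Search a coarse grid for (W1, W2, T) reproducing `target`, else None."""
    grid = [x / 2 for x in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
    for w1, w2, t in product(grid, repeat=3):
        if truth_table((w1, w2), t) == target:
            return (w1, w2, t)
    return None


AND = [0, 0, 0, 1]
OR = [0, 1, 1, 1]
XOR = [0, 1, 1, 0]

print(truth_table((1, 1), 1.5))  # AND gate with the slide's threshold
print(truth_table((1, 1), 0.5))  # OR gate with the slide's threshold
print(find_weights(XOR))         # no single neuron reproduces XOR
```

The XOR search fails for any grid, not just this one: the two single-input cases require each weight to exceed T while T is non-negative, which forces the sum of the weights above T as well.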

12 In practice, some differentiable function is usually used instead of the step function.
Networks of layers: input, internal representation, output
Networks with feedback
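As an illustration of replacing the step function with a differentiable one, the sketch below uses the logistic sigmoid. The slide does not name a specific function, so the choice of sigmoid is an assumption; it is a common one because its derivative has a simple closed form.

```python
import math


def step(x):
    """Step function of the binary neuron: not differentiable at 0."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    """Smooth replacement for the step function, with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid, expressed through the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-4.0, 0.0, 4.0):
    print(x, step(x), round(sigmoid(x), 3), round(sigmoid_derivative(x), 3))
```

A nonzero derivative everywhere is what lets the weights be adjusted gradually during training, which the step function does not allow.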

13 Training
Preparation of a large training set.
The neural network gets the input and random initial values for the parameters (weights).
The network tries to maximize the number of correctly predicted cases by changing the values of the parameters (weights).
To test the net we evaluate its performance on a collection of solved examples (test set).
The test set should be independent of the training set. The first interaction of the net with this set should happen during evaluation.
The test set should be large and representative. It is better to use a test set already used to evaluate other programs designed to solve a similar task.

14 PHD: a third-generation program that uses neural networks
PHD is the most popular secondary structure prediction program; although other programs reach the same accuracy, it is still widely used today.
The versions of this program implement and demonstrate the elements that are considered most important for prediction accuracy.
It demonstrates the use of machine learning approaches in this field.

Input: a sequence of amino acids. Using a database sequence search, similar sequences are found and an MSA is built.
The composition of this alignment is the input to the neural network, which is the core of the program.
Every position in the input sequence is expressed by 21 parameters: the percentage of each amino acid in that position, plus another character that indicates the start/end of the sequence.
In addition, the input for each position includes global information about the protein composition and the sequence distance between the predicted region and the start/end positions.
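The 21-parameter encoding of each alignment position can be sketched as follows. This is a simplified illustration, not PHD's actual implementation: the window half-width, the handling of gaps and the helper names are assumptions, and the global composition and distance-to-terminus inputs mentioned above are left out.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Vector used for window positions that fall beyond the sequence ends:
# 20 zero frequencies followed by the start/end flag set to 1.
SPACER = [0.0] * 20 + [1.0]


def column_profile(column):
    """20 amino acid frequencies for one alignment column, plus a 0 flag."""
    residues = [c for c in column if c in AMINO_ACIDS]  # gaps are ignored
    if not residues:
        return [0.0] * 21
    freqs = [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]
    return freqs + [0.0]


def encode_window(msa, center, half_width=2):
    """Concatenate the 21-value vectors of a window centred on `center`."""
    columns = list(zip(*msa))
    vector = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(columns):
            vector.extend(column_profile(columns[pos]))
        else:
            vector.extend(SPACER)
    return vector


# Tiny made-up alignment of three sequences, three columns each.
toy_msa = ["ACD", "ACE", "GCD"]
encoded = encode_window(toy_msa, center=0, half_width=1)
print(len(encoded))
```

Each window position contributes 21 numbers, so a window of three positions gives a 63-value input vector; the first 21 values here are the spacer because the window extends past the start of the sequence.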

15 Importance of variability in the input sequences
Good alignment!
The neural network includes several layers:
Input layer: sequence -> structure
Intermediate layer: structure -> structure
Output system: summation of several networks
Output: the secondary structure with the highest score is the final prediction for that position.

16

17 Comparison of secondary structure prediction tools
[Figure: assignment, reliability index and prediction for periplasmic binding protein 4mbp]

18 Reliability index - PHD

Combination of different prediction methods
Every method has errors, which can be classified into two general types:
1. Systematic errors
2. Non-systematic errors
Several methods can therefore be combined to increase the prediction accuracy.
The basic condition for a successful combination is that the source of error of each individual method is not only systematic.
Several new methods exploit this fact: they train several neural networks independently and predict based on the average prediction of all the networks.
Another method (Jpred) takes as input the results of several existing methods and predicts based on them.
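Averaging the outputs of several independently trained networks, as described above, can be sketched like this. The three-state alphabet (H, E and L for everything else), the score layout and the example numbers are assumptions made for illustration; no real network outputs are involved.

```python
STATES = ("H", "E", "L")  # helix, strand, other (the letter L is an assumption)


def average_outputs(network_outputs):
    """Average per-residue scores over several networks.

    `network_outputs` is a list with one entry per network; each entry is a
    list of (score_H, score_E, score_L) tuples, one tuple per residue.
    """
    n_networks = len(network_outputs)
    length = len(network_outputs[0])
    if any(len(net) != length for net in network_outputs):
        raise ValueError("all networks must score the same number of residues")
    return [
        tuple(sum(net[i][k] for net in network_outputs) / n_networks
              for k in range(len(STATES)))
        for i in range(length)
    ]


def predict(network_outputs):
    """Pick, for every residue, the state with the highest averaged score."""
    prediction = []
    for scores in average_outputs(network_outputs):
        best = max(range(len(STATES)), key=lambda k: scores[k])
        prediction.append(STATES[best])
    return "".join(prediction)


# Made-up scores for a two-residue sequence from three hypothetical networks.
net_a = [(0.90, 0.05, 0.05), (0.40, 0.50, 0.10)]
net_b = [(0.80, 0.10, 0.10), (0.60, 0.30, 0.10)]
net_c = [(0.70, 0.20, 0.10), (0.50, 0.20, 0.30)]

print(predict([net_a]))                # single network
print(predict([net_a, net_b, net_c]))  # averaged over three networks
```

In the toy data the first network alone calls the second residue a strand, while the average of the three favours helix: an error that is not shared by the other networks gets outvoted, which is exactly why the combination only helps when the errors are not purely systematic.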

19 Many web servers are available.
To understand some of the sequence signals that might be used, we can consider the basic biochemistry of secondary structures.
The α-helix, for example, has a periodicity of 3.6 amino acids. Helices on the protein surface are expected to show a signal at this periodicity in the positions occupied by hydrophilic and hydrophobic side chains.
Finding hydrophobic amino acids at positions i, i+3, i+7 and i+10, for example, is a strong indication of a helix.
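The periodicity argument can be checked with a little arithmetic: 3.6 residues per turn means each residue is rotated by 360 / 3.6 = 100 degrees around the helix axis. The sketch below projects residue offsets onto a helical wheel and measures how wide an arc they occupy; the function names and the arc-span measure are illustrative choices, not part of the slides.

```python
DEGREES_PER_RESIDUE = 100  # 360 degrees per turn / 3.6 residues per turn


def wheel_angles(offsets):
    """Angular position on the helical wheel for each residue offset."""
    return [(k * DEGREES_PER_RESIDUE) % 360 for k in offsets]


def arc_span(angles):
    """Width in degrees of the smallest arc that contains all the angles."""
    if len(angles) < 2:
        return 0
    ordered = sorted(angles)
    gaps = [(ordered[(i + 1) % len(ordered)] - ordered[i]) % 360
            for i in range(len(ordered))]
    return 360 - max(gaps)


helix_like = wheel_angles([0, 3, 7, 10])   # pattern quoted on the slide
alternating = wheel_angles([0, 2, 4, 6])   # every second residue

print(helix_like, arc_span(helix_like))
print(alternating, arc_span(alternating))
```

Positions i, i+3, i+7 and i+10 fall within a narrow arc, so hydrophobic residues there form one face of the helix, whereas an alternating pattern is spread over more than half of the wheel.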

20 [Figure: α-helix in myoglobin]
Similarly, in surface β-strands there is a preference for a zigzag pattern: for example, hydrophilic side chains at positions i, i+2, i+4, ... and hydrophobic side chains at positions i+1, i+3, i+5.
[Figure: β-strand of CD8]

Related topics
Prediction of secondary structures of membrane proteins
Prediction of solvent accessibility

[Table: average prediction accuracies (based on the 480 protein set) for 2-state solvent accessibility prediction. Columns: Rel. Acc. (%), PSIBLAST (%), HMMER2 (%), Combined [change] (%). Only a 25% row label survives; the values are missing.]


More information

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006 Hidden Markov Models in Bioinformatics By Máthé Zoltán Kőrösi Zoltán 2006 Outline Markov Chain HMM (Hidden Markov Model) Hidden Markov Models in Bioinformatics Gene Finding Gene Finding Model Viterbi algorithm

More information

MCAT Organic Chemistry - Problem Drill 23: Amino Acids, Peptides and Proteins

MCAT Organic Chemistry - Problem Drill 23: Amino Acids, Peptides and Proteins MCAT rganic Chemistry - Problem Drill 23: Amino Acids, Peptides and Proteins Question No. 1 of 10 Question 1. Which amino acid does not contain a chiral center? Question #01 (A) Serine (B) Proline (C)

More information

Part A: Amino Acids and Peptides (Is the peptide IAG the same as the peptide GAI?)

Part A: Amino Acids and Peptides (Is the peptide IAG the same as the peptide GAI?) ChemActivity 46 Amino Acids, Polypeptides and Proteins 1 ChemActivity 46 Part A: Amino Acids and Peptides (Is the peptide IAG the same as the peptide GAI?) Model 1: The 20 Amino Acids at Biological p See

More information

Chapter 5. The Structure and Function of Macromolecule s

Chapter 5. The Structure and Function of Macromolecule s Chapter 5 The Structure and Function of Macromolecule s Most Macromolecules are polymers: Polymer: (poly: many; mer: part) Large molecules consisting of many identical or similar subunits connected together.

More information

Computational Systems Biology. Lecture 2: Enzymes

Computational Systems Biology. Lecture 2: Enzymes Computational Systems Biology Lecture 2: Enzymes 1 Images from: David L. Nelson, Lehninger Principles of Biochemistry, IV Edition, Freeman ed. or under creative commons license (search for images at http://search.creativecommons.org/)

More information

Lecture Overview. Hydrogen Bonds. Special Properties of Water Molecules. Universal Solvent. ph Scale Illustrated. special properties of water

Lecture Overview. Hydrogen Bonds. Special Properties of Water Molecules. Universal Solvent. ph Scale Illustrated. special properties of water Lecture Overview special properties of water > water as a solvent > ph molecules of the cell > properties of carbon > carbohydrates > lipids > proteins > nucleic acids Hydrogen Bonds polarity of water

More information

Chapter 3: Biological Molecules. 1. Carbohydrates 2. Lipids 3. Proteins 4. Nucleic Acids

Chapter 3: Biological Molecules. 1. Carbohydrates 2. Lipids 3. Proteins 4. Nucleic Acids Chapter 3: Biological Molecules 1. Carbohydrates 2. Lipids 3. Proteins 4. Nucleic Acids Elements in Biological Molecules Biological macromolecules are made almost entirely of just 6 elements: Carbon (C)

More information

Structures of Proteins. Primary structure - amino acid sequence

Structures of Proteins. Primary structure - amino acid sequence Structures of Proteins Primary structure - amino acid sequence Secondary structure chain of covalently linked amino acids folds into regularly repeating structures. Secondary structure is the result of

More information

Gold (Genetic Optimization for Ligand Docking) G. Jones et al. 1996

Gold (Genetic Optimization for Ligand Docking) G. Jones et al. 1996 Gold (Genetic Optimization for Ligand Docking) G. Jones et al. 1996 LMU Institut für Informatik, LFE Bioinformatik, Cheminformatics, Structure based methods J. Apostolakis 1 Genetic algorithms Inspired

More information

K'NEX DNA Models. Developed by Dr. Gary Benson Department of Biomathematical Sciences Mount Sinai School of Medicine

K'NEX DNA Models. Developed by Dr. Gary Benson Department of Biomathematical Sciences Mount Sinai School of Medicine KNEX DNA Models Introduction Page 1 of 11 All photos by Kevin Kelliher. To download an Acrobat pdf version of this website Click here. K'NEX DNA Models Developed by Dr. Gary Benson Department of Biomathematical

More information

A disaccharide is formed when a dehydration reaction joins two monosaccharides. This covalent bond is called a glycosidic linkage.

A disaccharide is formed when a dehydration reaction joins two monosaccharides. This covalent bond is called a glycosidic linkage. CH 5 Structure & Function of Large Molecules: Macromolecules Molecules of Life All living things are made up of four classes of large biological molecules: carbohydrates, lipids, proteins, and nucleic

More information

INTRODUCTION TO PROTEIN STRUCTURE

INTRODUCTION TO PROTEIN STRUCTURE Name Class: Partner, if any: INTRODUCTION TO PROTEIN STRUCTURE PRIMARY STRUCTURE: 1. Write the complete structural formula of the tripeptide shown (frame 10). Circle and label the three sidechains which

More information

Steffen Lindert, René Staritzbichler, Nils Wötzel, Mert Karakaş, Phoebe L. Stewart, and Jens Meiler

Steffen Lindert, René Staritzbichler, Nils Wötzel, Mert Karakaş, Phoebe L. Stewart, and Jens Meiler Structure 17 Supplemental Data EM-Fold: De Novo Folding of α-helical Proteins Guided by Intermediate-Resolution Electron Microscopy Density Maps Steffen Lindert, René Staritzbichler, Nils Wötzel, Mert

More information

Non-Covalent Bonds (Weak Bond)

Non-Covalent Bonds (Weak Bond) Non-Covalent Bonds (Weak Bond) Weak bonds are those forces of attraction that, in biological situations, do not take a large amount of energy to break. For example, hydrogen bonds are broken by energies

More information

Nafith Abu Tarboush DDS, MSc, PhD natarboush@ju.edu.jo www.facebook.com/natarboush

Nafith Abu Tarboush DDS, MSc, PhD natarboush@ju.edu.jo www.facebook.com/natarboush Nafith Abu Tarboush DDS, MSc, PhD natarboush@ju.edu.jo www.facebook.com/natarboush α-keratins, bundles of α- helices Contain polypeptide chains organized approximately parallel along a single axis: Consist

More information

Protein Secondary Structure Prediction: Novel Methods and Software Architectures

Protein Secondary Structure Prediction: Novel Methods and Software Architectures Ph.D. in Electronic and Computer Engineering Dept. of Electrical and Electronic Engineering University of Cagliari Protein Secondary Structure Prediction: Novel Methods and Software Architectures Filippo

More information

BIOLOGICAL MEMBRANES: FUNCTIONS, STRUCTURES & TRANSPORT

BIOLOGICAL MEMBRANES: FUNCTIONS, STRUCTURES & TRANSPORT BIOLOGICAL MEMBRANES: FUNCTIONS, STRUCTURES & TRANSPORT UNIVERSITY OF PNG SCHOOL OF MEDICINE AND HEALTH SCIENCES DISCIPLINE OF BIOCHEMISTRY AND MOLECULAR BIOLOGY BMLS II / B Pharm II / BDS II VJ Temple

More information

FLUORESCENT PROTEINS - XFPs

FLUORESCENT PROTEINS - XFPs FLUORESCENT PROTEINS - XFPs Marcel Walser; PhD student ETH Zürich Taskforce Kommunikation Presentations Content - Originating organisms - GFP characteristics - Applications - Short protein structure review

More information

Biological molecules:

Biological molecules: Biological molecules: All are organic (based on carbon). Monomers vs. polymers: Monomers refer to the subunits that, when polymerized, make up a larger polymer. Monomers may function on their own in some

More information

A reduced model of short range interactions in polypeptide chains

A reduced model of short range interactions in polypeptide chains A reduced model of short range interactions in polypeptide chains Andrzej Kolinski a) Department of Chemistry, University of Warsaw, Pasteura 1, 0-093 Warsaw, Poland (and Department of Molecular Biology,

More information

How To Understand The Chemistry Of Organic Molecules

How To Understand The Chemistry Of Organic Molecules CHAPTER 3 THE CHEMISTRY OF ORGANIC MOLECULES 3.1 Organic Molecules The chemistry of carbon accounts for the diversity of organic molecules found in living things. Carbon has six electrons, four of which

More information

ECBDL 14: Evolu/onary Computa/on for Big Data and Big Learning Workshop July 13 th, 2014 Big Data Compe//on

ECBDL 14: Evolu/onary Computa/on for Big Data and Big Learning Workshop July 13 th, 2014 Big Data Compe//on ECBDL 14: Evolu/onary Computa/on for Big Data and Big Learning Workshop July 13 th, 2014 Big Data Compe//on Jaume Bacardit jaume.bacardit@ncl.ac.uk The Interdisciplinary Compu/ng and Complex BioSystems

More information

Protein annotation and modelling servers at University College London

Protein annotation and modelling servers at University College London Nucleic Acids Research Advance Access published May 27, 2010 Nucleic Acids Research, 2010, 1 6 doi:10.1093/nar/gkq427 Protein annotation and modelling servers at University College London D. W. A. Buchan*,

More information

Introduction to Protein Folding

Introduction to Protein Folding Introduction to Protein Folding Chapter 4 Proteins: Three Dimensional Structure and Function Conformation - three dimensional shape Native conformation - each protein folds into a single stable shape (physiological

More information

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism )

Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Biology 1406 Exam 3 Notes Structure of DNA Ch. 10 Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Proteins

More information

The Steps. 1. Transcription. 2. Transferal. 3. Translation

The Steps. 1. Transcription. 2. Transferal. 3. Translation Protein Synthesis Protein synthesis is simply the "making of proteins." Although the term itself is easy to understand, the multiple steps that a cell in a plant or animal must go through are not. In order

More information

Chapter 5: The Structure and Function of Large Biological Molecules

Chapter 5: The Structure and Function of Large Biological Molecules Name Period Concept 5.1 Macromolecules are polymers, built from monomers 1. The large molecules of all living things fall into just four main classes. Name them. 2. Circle the three classes that are called

More information

Chapter 3. Protein Structure and Function

Chapter 3. Protein Structure and Function Chapter 3 Protein Structure and Function Broad functional classes So Proteins have structure and function... Fine! -Why do we care to know more???? Understanding functional architechture gives us POWER

More information

Supplementary Figures S1 - S11

Supplementary Figures S1 - S11 1 Membrane Sculpting by F-BAR Domains Studied by Molecular Dynamics Simulations Hang Yu 1,2, Klaus Schulten 1,2,3, 1 Beckman Institute, University of Illinois, Urbana, Illinois, USA 2 Center of Biophysics

More information

Invariant residue-a residue that is always conserved. It is assumed that these residues are essential to the structure or function of the protein.

Invariant residue-a residue that is always conserved. It is assumed that these residues are essential to the structure or function of the protein. Chapter 6 The amino acid side chains have polar and nonpolar properties, and the relative hydrophobicity of the amino acid side chains is critical for the folding and stability of a protein. The more hydrophobic

More information

Activity 7.21 Transcription factors

Activity 7.21 Transcription factors Purpose To consolidate understanding of protein synthesis. To explain the role of transcription factors and hormones in switching genes on and off. Play the transcription initiation complex game Regulation

More information

Neural Networks and Support Vector Machines

Neural Networks and Support Vector Machines INF5390 - Kunstig intelligens Neural Networks and Support Vector Machines Roar Fjellheim INF5390-13 Neural Networks and SVM 1 Outline Neural networks Perceptrons Neural networks Support vector machines

More information

Introduction to Principal Components and FactorAnalysis

Introduction to Principal Components and FactorAnalysis Introduction to Principal Components and FactorAnalysis Multivariate Analysis often starts out with data involving a substantial number of correlated variables. Principal Component Analysis (PCA) is a

More information

Neural Network Design in Cloud Computing

Neural Network Design in Cloud Computing International Journal of Computer Trends and Technology- volume4issue2-2013 ABSTRACT: Neural Network Design in Cloud Computing B.Rajkumar #1,T.Gopikiran #2,S.Satyanarayana *3 #1,#2Department of Computer

More information

Seminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina stefano.martina@stud.unifi.it

Seminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina stefano.martina@stud.unifi.it Seminar Path planning using Voronoi diagrams and B-Splines Stefano Martina stefano.martina@stud.unifi.it 23 may 2016 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International

More information

Proteins the primary biological macromolecules of living organisms

Proteins the primary biological macromolecules of living organisms Proteins the primary biological macromolecules of living organisms Protein structure and folding Primary Secondary Tertiary Quaternary structure of proteins Structure of Proteins Protein molecules adopt

More information

Describe the process of parallelization as it relates to problem solving.

Describe the process of parallelization as it relates to problem solving. Level 2 (recommended for grades 6 9) Computer Science and Community Middle school/junior high school students begin using computational thinking as a problem-solving tool. They begin to appreciate the

More information

1.1.2. thebiotutor. AS Biology OCR. Unit F211: Cells, Exchange & Transport. Module 1.2 Cell Membranes. Notes & Questions.

1.1.2. thebiotutor. AS Biology OCR. Unit F211: Cells, Exchange & Transport. Module 1.2 Cell Membranes. Notes & Questions. thebiotutor AS Biology OCR Unit F211: Cells, Exchange & Transport Module 1.2 Cell Membranes Notes & Questions Andy Todd 1 Outline the roles of membranes within cells and at the surface of cells. The main

More information

IV. -Amino Acids: carboxyl and amino groups bonded to -Carbon. V. Polypeptides and Proteins

IV. -Amino Acids: carboxyl and amino groups bonded to -Carbon. V. Polypeptides and Proteins IV. -Amino Acids: carboxyl and amino groups bonded to -Carbon A. Acid/Base properties 1. carboxyl group is proton donor! weak acid 2. amino group is proton acceptor! weak base 3. At physiological ph: H

More information