PROTEIN STRUCTURE PREDICTION USING SUPPORT VECTOR MACHINE
|
|
|
- Andrew Welch
- 10 years ago
- Views:
Transcription
1 PROTEIN STRUCTURE PREDICTION USING SUPPORT VECTOR MACHINE Anil Kumar Mandle 1, Pranita Jain 2, and Shailendra Kumar Shrivastava 3 1 Research Scholar, Information Technology Department Samrat Ashok Technological InstituteVidisha, (M. P.) INDIA [email protected] 2 Astt.Prof. Information Technology Department Samrat Ashok Technological Institute Vidisha, (M. P.) INDIA [email protected] 3 HOD, Information Technology Department Samrat Ashok Technological Institute Vidisha, (M. P.) INDIA [email protected] ABSTRACT Support Vector Machine (SVM) is used for predict the protein structural. Bioinformatics method use to protein structure prediction mostly depends on the amino acid sequence. In this paper, work predicted of 1- D, 2-D, and 3-D protein structure prediction. Protein structure prediction is one of the most important problems in modern computation biology. Support Vector Machine haves shown strong generalization ability protein structure prediction. Binary classification techniques of Support Vector Machine are implemented and RBF kernel function is used in SVM. This Radial Basic Function (RBF) of SVM produces better accuracy in terms of classification and the learning results. KEYWORDS Bioinformatics, Support Vector Machine, protein folding, protein structure prediction. 1. INTRODUCTION A protein is a polymeric macromolecule made of amino acid building blocks arranged in a linear chain and joined together by peptide bonds. The primary structure is typically represented by a sequence of letters over a 20-letter alphabet associated with the 20 naturally occurring amino acids. Proteins are the main building blocks and functional molecules of the cell, taking up almost 20% of a eukaryotic cell s weight, the largest contribution after water (70%). Protein structure prediction is one of the most important problems in modern computational biology. It is therefore becoming increasingly important to predict protein structure from its amino acid sequence, using insight obtained from already known structures.the secondary structure is specified by a sequence classifying each amino acid into the corresponding secondary structure element (e.g., alpha, beta, or gamma). DOI : /ijsc
2 Fig 1: Protein Sequence-Structure-function. Proteins are probably the most important class of biochemical molecules, although of course lipids and carbohydrates are also essential for life. Proteins are the basis for the major structural components of animal and human tissue. Extensive biochemical experiments [5], [6], [9], [10] have shown that a protein s function is determined by its structure. Experimental approaches such as X-ray crystallography [11], [12] and nuclear magnetic resonance (NMR) spectroscopy [13], [14] are the main techniques for determining protein structures. Since the determination of the first two protein structures (myoglobin and hemoglobin) using X-ray crystallography [5], [6], the number of proteins with solved structures has increased rapidly. Currently, there are about proteins with empirically known structures deposited in the Protein Data Bank (PDB) [15]. 2. PROTEIN STRUCTURE 2.1 Primary Structure Fig 2: Structure of Amino Acid The primary structure refers to amino acid sequence is called primary structure. The primary structure is held together by covalent or peptide bonds, which are made during the process of protein biosynthesis or translation. The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mrna, which is read by the ribosome in a process called translation. The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often however, it is read directly from the sequence of the gene using the genetic code. Posttranslational modifications such as disulfide formation, phosphorylations and glycosylations are usually also considered a part of the primary structure, and cannot be read from the gene [16]. 2.2 Secondary Structure Alpha helix may be considered the default state for secondary structure. Although the potential energy is not as low as for beta sheet, H-bond formation is intra-strand, so there is an entropic advantage over beta sheet, where H-bonds must form from strand to strand, with strand segments that may be quite distant in the polypeptide sequence. Two main types of secondary structure, the alpha helix and the beta strand, were suggested in 1951 by Linus Pauling and coworkers. These secondary structures are defined by patterns of hydrogen bonds between the main-chain peptide groups. They have a regular geometry, being constrained to specific values of the dihedral angles 68
3 ψ and φ on the Ramachandran plot. Both the alpha helix and the beta-sheet represent a way of saturating all the hydrogen bond donors and acceptors in the peptide backbone [17]. 2.3 Tertiary structure The folding is driven by the non-specific hydrophobic interactions (the burial of hydrophobic residues from water), but the structure is stable only when the parts of a protein domain are locked into place by specific tertiary interactions, such as salt bridges, hydrogen bonds, and the tight packing of side chains and disulfide bonds. The disulfide bonds are extremely rare in cytosolic proteins, since the cytosol is generally a reducing environment [18]. 2.4 Quaternary structure Protein quaternary structure can be determined using a variety of experimental techniques that require a sample of protein in a variety of experimental conditions. The experiments often provide an estimate of the mass of the native protein and, together with knowledge of the masses and/or stoichiometry of the subunits, allow the quaternary structure to be predicted with a given accuracy. It is not always possible to obtain a precise determination of the subunit composition for a variety of reasons. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with a prefix of "homo-" (e.g. a homotetramer) and those made up of different subunits are referred to with a prefix of "hetero-" (e.g. a heterotetramer, such as the two alpha and two beta chains of hemoglobin) [19]. Fig 3: Four levels of protein structure. 69
4 3. METHOD AND MATERIAL 3.1. Database of homology-derived structure (HSSP) Content of database More than 300 files were produced, one for each PDB protein from the fall 1989 release of PDB with release 12 of EMBL/Swissprot (12305 sequences). This corresponds to derived structures for 3512 proteins or protein fragments; 1854 of these are homologous over a length of at least 80 residues. Some of these proteins are very similar to their PDB cousin, differing by as little as one residue out of several hundred Size of database The increase in total information content in HSSP over PDB is as difficult to quantify as the increase in information when a homologous protein is solved by crystallography. A rough conservative estimate can be made as follows. The average number of aligned sequences is 103 per PDB entry. Of the 3512 aligned sequences (counting each protein exactly once) 1831 are more than 50% different (sequence identity) from any PDB cousin; after filtering out short fragments and potential unexpected positives by requiring an alignment length of at least 80 residues [29] Limited database Any empirical investigation is limited by the size of the database. Deviations from the principles observed here are possible as more and perhaps new classes of protein structures become known Dictionaries Secondary Structure Protein (DSSP) The DSSP classifies residues into eight different secondary structure classes: H (o-helix), G ( helix), I (21-helix), E (strand), B (isolated n-bridge), T (turn), S (bend), and- (rest). In this study, these eight classes are reduced into three regular classes based on the following Table 1. There are other ways of class reduction as well but the one applied in this study is considered to be more effective [21]. DSSP Class helix α-helix π-helix 8-state symbol G H I 3-state symbol H Class Name Helix Β-strand E E Sheet Isolated β-bridge Bend Turn Rest(connection region) B S T - C Loop Table: 1 8-To-3 state reduction method in secondary structure 70
5 The RS 126 data set is proposed by Rost & Sander and according to their definition, it is nonhomologous set Data Coding Feature extraction is a form of pre-processing in which the original variables are transformed into new inputs for classification. This initial process is important in protein structure prediction as the primary sequences of the data are presented as single letter code. It is therefore important to transform them into numbers. Different procedures can be adopted for this purpose, however, for the purpose of the present study, orthogonal coding will be used to convert the letters into numbers. Input and coding system Fig 4: Input and output coding for protein secondary structure prediction Fig: 3 give a network structure for a general classifier. The primary sequences are used as inputs to the network. To determine these inputs, a similar coding scheme as used by Holley and Karplus [28] has been adopted. To read the inputs into the network, a network encodes a moving window through the primary sequences. 4. STRUCTURE PREDICTION 4.1. Structure Prediction 1-D Protein Sequence: Input 1D MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPMVLSEGEWQLVLHVWAKV CCCCCHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHCCCCCHHHEEEEEEHHHHH Protein Structure: Output 1D 4.2. Structure Prediction 2-D Example depicts a predicted 2-D contact map with an 8 Angstrom cutoff. The protein sequence is aligned along the sides of the contact map both horizontally and vertically showing fig.4. 71
6 4.3. Structure Prediction 3-D Fig 5: Two-dimensional protein structure prediction. Fig 6: Three-dimensional protein structure prediction There are 20 different amino acids that can occur in proteins. Their names are abbreviated in a three letter code or a one letter code. The amino acids and their letter codes are given in Table. Glycine Gly G Tyrosine Try Y Alanine Ala A Methionine Mer M Serine ser S Tryptophan Trp T Threonine Thr T Asparagine Asn A Cysteine Cys C Glutamine Gln G Valine Val V Histidine His H Isoleucine Ile I Aspartic Acid Asp A Leucine Leu L Glutamic Acid Glu G Proline Pro P Lysine Lys L Phenylalanine Phe P Arginine Arg A Table: 2 Amino acids 72
7 5. INTRODUCTION SUPPORT VECTOR MACHINE Support Vector Machine is supervised Machine Learning technique. The existence of SVM is shown in figure 6. Computer Vision is the broad area whereas Machine Learning is one of the application domains of Artificial Intelligence along with pattern recognition, Robotics, Natural Language Processing [18]. Supervised learning, Un-supervised learning, Semi-supervised learning and reinforcement learning are various types of Machine Learning. 5.1 Support Vector Machine Fig 7: Existence of Support Vector Machine In this section, we give a brief review of SVM classification. SVM is a novel-learning machine first developed by Vapnik[16]. We consider a binary classification task with input variables X i (i 1,...,l) = having corresponding labels Y i = {-1,+1}. SVM finds the hyperplane to separate these two classes with a maximum margin. This is equivalent to solving the following optimization problem: Min ½ w T w... (1) Subject to: y i (w.x i +b)...(2) In Fig. 8, is a sample linearly separable case, soli d points and circle points represent two kinds of sample e separately. H is the separating hyperplane. H1 and H2 are two hyperplane through the closest points (the Support Vectors, SVs). The margin is the perpendicular distance between the separating hyperplane H1 and H2. 73
8 Fig 8: Optimal separation hyperplane To allow some training errors for generalization, slack variables ξ i and penalty parameter C are introduced. The optimization problem is re-formulated as: l Min ½ w T.w + C ξ i... (3) i=1 y i (w.x i +b) 1- ξ i Subject to:... (4) i = 1...l; ξ i 0 l The purpose of C ξ i is to control the number of i=1 Misclassified samples. The users choose parameter C so that a large C corresponds to assigning a higher penal ty to errors [20]. By introducing Lagrange multiples α i for the constraints in the (3) (4), the problem can be transformed into its dual form l l Min ½ y i y j α i α j (x i. x) - α i... (5) α i=1 j=1 Decision function as: l Subject to: y i α i, 0 α i C, i=1...,l...(6) i=1 f(x) = sign (w.x + b) l = sign ( α i y i (x i.x) + b)...(7) i=1 For nonlinear case, we map the input space into high dimension feature space by a nonlinear mapping. However, here we only need to select a kernel function and regularization parameter C to train the SVM. Our substantial tests show that the RBF (radial basis function) kernel, defined as, K (x i, x j ) = exp (-ϒ x i x j 2 )...(8) 74
9 With a suitable choice of kernel RBF (Radial Basis Function) the data can become separable in feature space despite being non-separable in the original input space. 6. RESEARCH AND METHODOLOGY The proposed method used to RBF (Radial Basis Function) of SVM. Protein structure prediction is the prediction of the three-dimensional structure of a protein from its amino acid sequence that is, the prediction of its secondary, and tertiary from its primary structure. Structure prediction is fundamentally different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by bioinformatics and theoretical chemistry; it is highly important in medicine (for example, in drug design) and biotechnology (for example, in the design of novel enzymes). Bioinformatics method to used protein secondary structure prediction mostly depends on the information available in amino acid sequence. SVM represents a new approach to supervised pattern classification which has been successfully applied to a wide range of pattern recognition problems, including object recognition, speaker identification, gene function prediction with microarray expression profile, etc. It is a good method of protein structure prediction which is based on the theory of SVM [26] Protein Structure Prediction Based SVM Protein structure prediction has performed by machine learning techniques such as support vector machines (SVM s).using SVM to construct classifiers to distinguish parallel and antipararallel beta sheets. Sequences are encoded as psiblast profiles. With seven-cross validation carried on a non-homologous protein dataset, the obtain result shows that this two categories are separable by sequence profile [27]. β-turns play an important role in protein structures not only because of their sheer abundance, which is estimated to be approximately 25% of all protein residues, but also because of their significance in high order structures of proteins. Hua-Sheng Chiu introduces a new method of β-turn prediction based SVM that uses a two-stage classification scheme and an integrated framework for input features. The experimental results demonstrate that it achieves substantial improvements over Beta turn, the current best method [28]. 7. RESULT ANALYSIS Protein Name Accuracy rate = Correct Predicted Instance *100 No. of Instance Protein ID Accuracy Structure Prediction (1D) Accuracy Structure Prediction (2D) Accuracy Structure Prediction (3D) Average Accuracy Prediction Execute Time BACTERIOPHYTOCHROME 1ztu m5 Sec RIBONUCLEASE 2bir Sec GLOBIN 2w Sec MYOGLOBIN 101m Sec Table: 3 Protein Prediction 1D, 2D, and 3D Accuracy 75
10 Fig 9: Prediction Accuracy of 1-D, 2-D, and 3-D Structure Average Accuracy rate = [1D+2D+3D] /3 * m 2bir 2w31 1ztu 8. CONCLUSIONS Fig 10: Protein Prediction 1-D, 2-D, and 3-D Average Accuracy Support Vector Machine is learning system that uses a high dimensional feature space, trained with a learning algorithm from optimization theory. Since SVM has many advantageous features including effective avoidance of over-fitting, the ability to manage large feature spaces, and information condensing of the given data, it has been gradually applied to pattern classification problem in biology. As a part of future work, accuracy rate need to be tested by increasing datasets. This paper implements RBF kernel function of SVM. By application of other kernel functions such as Linear Function, Polynomial Kernel function, Sigmoidal function, this accuracy of Protein Structure Prediction can be further increased. 76
11 9. REFERENCES International Journal on Soft Computing ( IJSC ) Vol.3, No.1, February 2012 [1] Rost, B. and Sander, C. "Improved prediction of protein secondary structure by use of sequence profile and neural networks.", Proc Natl Acad Sci U S A 90, pp , 1993 [2] Blaise Gassend et al. Secondary Structure Prediction of All-Helical Proteins Using HiddenMarkov Support Vector Machines. Technical Report MIT-CSAIL-TR , MIT, October [3] Spritz: a server for the prediction of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Research, 2006, Vol. 34. [4] Yann Guermeur, Gianluca Pollastri et al.combining protein secondary structure prediction models with ensemble methods of optimal complexity. Neurocomputing, 2004(56): [5] Jayavardhaha Rame G.L. etal, Disulphide Bridge Prediction using Fuzzy Support Vector Machines,Intelligent Sensing and Information Processing, pp [6] Long-Hui Wang, Juan Liu. Predicting Protein Secondary Structure by a Support Vector Machine Based on a New Coding Scheme, Genome Informatics, 2004, 15(2): pp [7] K. A. Dill, Dominant forces in protein folding, Biochemistry, vol. 31, pp , [8] R. A. Laskowski, J. D. Watson, and J. M. Thornton, From protein structure to biochemical function?, J. Struct. Funct. Genomics, vol. 4, pp , [9] A. Travers, DNA conformation and protein binding, Ann. Rev. Biochem., vol. 58, pp , [10] P. J. Bjorkman and P. Parham. Structure, function and diversity of class I major histocompatibility complex molecules, Ann. Rev. Biochem., vol. 59, pp , [11] L. Bragg, The Development of X-Ray Analysis, London, U.K.: G. Bell, [12] T. L. Blundell and L. H. Johnson, Protein Crystallography. New York: Academic, [13] K. Wuthrich, NMR of Proteins and Nucleic Acids. New York: Wiley, [14] E. N. Baldwin, I. T. Weber, R. S. Charles, J. Xuan, E. Appella, M. Yamada, K. Matsushima, B. F. P. Edwards, G. M. Clore, A. M. Gronenborn, and A. Wlodawar, Crystal structure of interleukin 8: Symbiosis of NMR and crystallography, Proc. Nat. Acad. Sci., vol. 88, pp , [15] H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne, The protein data bank, Nucl. Acids Res., vol. 28, pp , [16] C Cortes and V Vapnik. Support-vector networks. Machine Learning, 1995, 20(3): [17] Osuna E., Freund R., Girosi F.: Support vector machines: Training and applications, Massachusetts Institute of Technology, AI Memo No [18] Sujun Hua and Zhirong Sun. A Novel Method of Protein Secondary Structure Prediction with High Segment Overlap Measure: Support Vector Machine Approach,J. Mol. Biol.(2001)308, [19] Jian Guo, Hu Chen, Zhirong Sun. A Novel Method for Protein Secondary Structure Prediction Using Dual-Layer SVM and Profiles.PROTEINS: Structure, Function, and Bioinformatics, 54: (2004). 77
12 [20] Shing-Hwang Doong,Chi-yuan Yeh.A Hybrid Method for Protein Secondary Structure Prediction. Computer Symposium,Dec.12-17,2004,Taipei,Taiwan [21] Ian H. Witten, Eibe Frank, Data Mining-Practical Machine Learning Tools and Techniques, Morgan Kaufmann Publishers, Second Edition, 2005, pp [22] Sujun Hua and Zhirong Sun. A Novel Method of Protein Secondary Structure Prediction with High Segment Overlap Measure: Support Vector Machine Approach.J. Mol. Biol. (2001)308, [23] Anjum Reyaz-Ahmed and Yan-Qing Zhang Protein Secondary Structure Prediction Using Genetic Neural Support Vector Machines [24] Lipontseng Cecilia Tsilo Prediction secondary structure prediction using neural network and SVM.33 [25] Jianlin Cheng, Allison N. Tegge, Member, IEEE, and Pierre Baldi, Senior Member,IEEE REVIEWS IN BIOMEDICAL ENGINEERING,VOL.1,2008. Machine Learning Method for Protein Structure Prediction. [26] Olav Zimmermann and Ulrich H.E. Hansmann. Support Vector Machines for Prediction of Dihedral Angle Regions.Bioinformatics Advance Access. September 27, 2006 [27] Longhui Wang, Olav Zimmermann, and Ulrich H.E. Hansmann. Prediction of Parallel and Antiparallel Beta Sheets Based on Sequence Profiles Using Support Vector Machines, Neumann Institute for Computing Workshop, [28] Hua-Sheng Chiu, Hsin-Nan Lin, Allan Lo, Ting-Yi Sung et al, A Two-stage Classifier for Protein β- turn Prediction Using Support Vector Machines IEEE International Conference on Granular Computing,2006. [29] Chris Sander and Reinhard Schneider Database of homology derived protein structures and the structural meaning of sequence alignment. Proteins, 9, 56-69, (1991). Authors Anil Kumar Mandle: Has completed his B.E. in Information Technology from Guru Ghasidas University Bilaspur (C.G.) in the year 2007 and is pursuing M.Tech from SATI Vidisha (M.P.) Pranita Jain: Astt.Prof. Information Technology Department Samrat Ashok Technological Institute Vidisha, (M. P.) INDIA Prof. Shailendra Kumar Srivastava: HOD, Information Technology Department Samrat Ashok Technological Institute Vidisha, (M. P.) INDIA 78
Pipe Cleaner Proteins. Essential question: How does the structure of proteins relate to their function in the cell?
Pipe Cleaner Proteins GPS: SB1 Students will analyze the nature of the relationships between structures and functions in living cells. Essential question: How does the structure of proteins relate to their
Advanced Medicinal & Pharmaceutical Chemistry CHEM 5412 Dept. of Chemistry, TAMUK
Advanced Medicinal & Pharmaceutical Chemistry CHEM 5412 Dept. of Chemistry, TAMUK Dai Lu, Ph.D. [email protected] Tel: 361-221-0745 Office: RCOP, Room 307 Drug Discovery and Development Drug Molecules Medicinal
IV. -Amino Acids: carboxyl and amino groups bonded to -Carbon. V. Polypeptides and Proteins
IV. -Amino Acids: carboxyl and amino groups bonded to -Carbon A. Acid/Base properties 1. carboxyl group is proton donor! weak acid 2. amino group is proton acceptor! weak base 3. At physiological ph: H
Part A: Amino Acids and Peptides (Is the peptide IAG the same as the peptide GAI?)
ChemActivity 46 Amino Acids, Polypeptides and Proteins 1 ChemActivity 46 Part A: Amino Acids and Peptides (Is the peptide IAG the same as the peptide GAI?) Model 1: The 20 Amino Acids at Biological p See
Peptide bonds: resonance structure. Properties of proteins: Peptide bonds and side chains. Dihedral angles. Peptide bond. Protein physics, Lecture 5
Protein physics, Lecture 5 Peptide bonds: resonance structure Properties of proteins: Peptide bonds and side chains Proteins are linear polymers However, the peptide binds and side chains restrict conformational
Amino Acids, Peptides, Proteins
Amino Acids, Peptides, Proteins Functions of proteins: Enzymes Transport and Storage Motion, muscle contraction Hormones Mechanical support Immune protection (Antibodies) Generate and transmit nerve impulses
Shu-Ping Lin, Ph.D. E-mail: [email protected]
Amino Acids & Proteins Shu-Ping Lin, Ph.D. Institute te of Biomedical Engineering ing E-mail: [email protected] Website: http://web.nchu.edu.tw/pweb/users/splin/ edu tw/pweb/users/splin/ Date: 10.13.2010
Amino Acids. Amino acids are the building blocks of proteins. All AA s have the same basic structure: Side Chain. Alpha Carbon. Carboxyl. Group.
Protein Structure Amino Acids Amino acids are the building blocks of proteins. All AA s have the same basic structure: Side Chain Alpha Carbon Amino Group Carboxyl Group Amino Acid Properties There are
The peptide bond is rigid and planar
Level Description Bonds Primary Sequence of amino acids in proteins Covalent (peptide bonds) Secondary Structural motifs in proteins: α- helix and β-sheet Hydrogen bonds (between NH and CO groups in backbone)
Recap. Lecture 2. Protein conformation. Proteins. 8 types of protein function 10/21/10. Proteins.. > 50% dry weight of a cell
Lecture 2 Protein conformation ecap Proteins.. > 50% dry weight of a cell ell s building blocks and molecular tools. More important than genes A large variety of functions http://www.tcd.ie/biochemistry/courses/jf_lectures.php
Built from 20 kinds of amino acids
Built from 20 kinds of amino acids Each Protein has a three dimensional structure. Majority of proteins are compact. Highly convoluted molecules. Proteins are folded polypeptides. There are four levels
(c) How would your answers to problem (a) change if the molecular weight of the protein was 100,000 Dalton?
Problem 1. (12 points total, 4 points each) The molecular weight of an unspecified protein, at physiological conditions, is 70,000 Dalton, as determined by sedimentation equilibrium measurements and by
A. A peptide with 12 amino acids has the following amino acid composition: 2 Met, 1 Tyr, 1 Trp, 2 Glu, 1 Lys, 1 Arg, 1 Thr, 1 Asn, 1 Ile, 1 Cys
Questions- Proteins & Enzymes A. A peptide with 12 amino acids has the following amino acid composition: 2 Met, 1 Tyr, 1 Trp, 2 Glu, 1 Lys, 1 Arg, 1 Thr, 1 Asn, 1 Ile, 1 Cys Reaction of the intact peptide
CSC 2427: Algorithms for Molecular Biology Spring 2006. Lecture 16 March 10
CSC 2427: Algorithms for Molecular Biology Spring 2006 Lecture 16 March 10 Lecturer: Michael Brudno Scribe: Jim Huang 16.1 Overview of proteins Proteins are long chains of amino acids (AA) which are produced
The Organic Chemistry of Amino Acids, Peptides, and Proteins
Essential rganic Chemistry Chapter 16 The rganic Chemistry of Amino Acids, Peptides, and Proteins Amino Acids a-amino carboxylic acids. The building blocks from which proteins are made. H 2 N C 2 H Note:
Protein Physics. A. V. Finkelstein & O. B. Ptitsyn LECTURE 1
Protein Physics A. V. Finkelstein & O. B. Ptitsyn LECTURE 1 PROTEINS Functions in a Cell MOLECULAR MACHINES BUILDING BLOCKS of a CELL ARMS of a CELL ENZYMES - enzymatic catalysis of biochemical reactions
Concluding lesson. Student manual. What kind of protein are you? (Basic)
Concluding lesson Student manual What kind of protein are you? (Basic) Part 1 The hereditary material of an organism is stored in a coded way on the DNA. This code consists of four different nucleotides:
Structure and properties of proteins. Vladimíra Kvasnicová
Structure and properties of proteins Vladimíra Kvasnicová Chemical nature of proteins biopolymers of amino acids macromolecules (M r > 10 000) Classification of proteins 1) by localization in an organism
BOC334 (Proteomics) Practical 1. Calculating the charge of proteins
BC334 (Proteomics) Practical 1 Calculating the charge of proteins Aliphatic amino acids (VAGLIP) N H 2 H Glycine, Gly, G no charge Hydrophobicity = 0.67 MW 57Da pk a CH = 2.35 pk a NH 2 = 9.6 pi=5.97 CH
DBDB : a Disulfide Bridge DataBase for the predictive analysis of cysteine residues involved in disulfide bridges
DBDB : a Disulfide Bridge DataBase for the predictive analysis of cysteine residues involved in disulfide bridges Emmanuel Jaspard Gilles Hunault Jean-Michel Richer Laboratoire PMS UMR A 9, Université
Amino Acids and Proteins
Amino Acids and Proteins Proteins are composed of amino acids. There are 20 amino acids commonly found in proteins. All have: N2 C α R COO Amino acids at neutral p are dipolar ions (zwitterions) because
H H N - C - C 2 R. Three possible forms (not counting R group) depending on ph
Amino acids - 0 common amino acids there are others found naturally but much less frequently - Common structure for amino acid - C, -N, and functional groups all attached to the alpha carbon N - C - C
Paper: 6 Chemistry 2.130 University I Chemistry: Models Page: 2 of 7. 4. Which of the following weak acids would make the best buffer at ph = 5.0?
Paper: 6 Chemistry 2.130 University I Chemistry: Models Page: 2 of 7 4. Which of the following weak acids would make the best buffer at ph = 5.0? A) Acetic acid (Ka = 1.74 x 10-5 ) B) H 2 PO - 4 (Ka =
Covalent bonds are the strongest chemical bonds contributing to the protein structure A peptide bond is formed between with of the following?
MCAT Question Covalent bonds are the strongest chemical bonds contributing to the protein structure A peptide bond is formed between with of the following? A. Carboxylic group and amino group B. Two carboxylic
Sickle cell anemia: Altered beta chain Single AA change (#6 Glu to Val) Consequence: Protein polymerizes Change in RBC shape ---> phenotypes
Protein Structure Polypeptide: Protein: Therefore: Example: Single chain of amino acids 1 or more polypeptide chains All polypeptides are proteins Some proteins contain >1 polypeptide Hemoglobin (O 2 binding
Lecture 19: Proteins, Primary Struture
CPS260/BGT204.1 Algorithms in Computational Biology November 04, 2003 Lecture 19: Proteins, Primary Struture Lecturer: Pankaj K. Agarwal Scribe: Qiuhua Liu 19.1 The Building Blocks of Protein [1] Proteins
PRACTICE TEST QUESTIONS
PART A: MULTIPLE CHOICE QUESTIONS PRACTICE TEST QUESTIONS DNA & PROTEIN SYNTHESIS B 1. One of the functions of DNA is to A. secrete vacuoles. B. make copies of itself. C. join amino acids to each other.
MCAT Organic Chemistry - Problem Drill 23: Amino Acids, Peptides and Proteins
MCAT rganic Chemistry - Problem Drill 23: Amino Acids, Peptides and Proteins Question No. 1 of 10 Question 1. Which amino acid does not contain a chiral center? Question #01 (A) Serine (B) Proline (C)
Disulfide Bonds at the Hair Salon
Disulfide Bonds at the Hair Salon Three Alpha Helices Stabilized By Disulfide Bonds! In order for hair to grow 6 inches in one year, 9 1/2 turns of α helix must be produced every second!!! In some proteins,
Ch18_PT MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Ch18_PT MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) All of the following can be classified as biomolecules except A) lipids. B) proteins. C)
This class deals with the fundamental structural features of proteins, which one can understand from the structure of amino acids, and how they are
This class deals with the fundamental structural features of proteins, which one can understand from the structure of amino acids, and how they are put together. 1 A more detailed view of a single protein
Chapter 26 Biomolecules: Amino Acids, Peptides, and Proteins
John E. McMurry www.cengage.com/chemistry/mcmurry Chapter 26 Biomolecules: Amino Acids, Peptides, and Proteins Proteins Amides from Amino Acids Amino acids contain a basic amino group and an acidic carboxyl
2007 7.013 Problem Set 1 KEY
2007 7.013 Problem Set 1 KEY Due before 5 PM on FRIDAY, February 16, 2007. Turn answers in to the box outside of 68-120. PLEASE WRITE YOUR ANSWERS ON THIS PRINTOUT. 1. Where in a eukaryotic cell do you
INTRODUCTION TO PROTEIN STRUCTURE
Name Class: Partner, if any: INTRODUCTION TO PROTEIN STRUCTURE PRIMARY STRUCTURE: 1. Write the complete structural formula of the tripeptide shown (frame 10). Circle and label the three sidechains which
Peptide Bonds: Structure
Peptide Bonds: Structure Peptide primary structure The amino acid sequence, from - to C-terminus, determines the primary structure of a peptide or protein. The amino acids are linked through amide or peptide
NO CALCULATORS OR CELL PHONES ALLOWED
Biol 205 Exam 1 TEST FORM A Spring 2008 NAME Fill out both sides of the Scantron Sheet. On Side 2 be sure to indicate that you have TEST FORM A The answers to Part I should be placed on the SCANTRON SHEET.
Proteins. Proteins. Amino Acids. Most diverse and most important molecule in. Functions: Functions (cont d)
Proteins Proteins Most diverse and most important molecule in living i organisms Functions: 1. Structural (keratin in hair, collagen in ligaments) 2. Storage (casein in mother s milk) 3. Transport (HAEMOGLOBIN!)
Amino Acids, Proteins, and Enzymes. Primary and Secondary Structure Tertiary and Quaternary Structure Protein Hydrolysis and Denaturation
Amino Acids, Proteins, and Enzymes Primary and Secondary Structure Tertiary and Quaternary Structure Protein Hydrolysis and Denaturation 1 Primary Structure of Proteins H 3 N The particular sequence of
The peptide bond Peptides and proteins are linear polymers of amino acids. The amino acids are
Introduction to Protein Structure Proteins are large heteropolymers usually comprised of 50 2500 monomer units, although larger proteins are observed 7. The monomer units of proteins are amino acids. The
Guidelines for Writing a Scientific Paper
Guidelines for Writing a Scientific Paper Writing an effective scientific paper is not easy. A good rule of thumb is to write as if your paper will be read by a person who knows about the field in general
A disaccharide is formed when a dehydration reaction joins two monosaccharides. This covalent bond is called a glycosidic linkage.
CH 5 Structure & Function of Large Molecules: Macromolecules Molecules of Life All living things are made up of four classes of large biological molecules: carbohydrates, lipids, proteins, and nucleic
Chapter 5. The Structure and Function of Macromolecule s
Chapter 5 The Structure and Function of Macromolecule s Most Macromolecules are polymers: Polymer: (poly: many; mer: part) Large molecules consisting of many identical or similar subunits connected together.
Refinement of a pdb-structure and Convert
Refinement of a pdb-structure and Convert A. Search for a pdb with the closest sequence to your protein of interest. B. Choose the most suitable entry (or several entries). C. Convert and resolve errors
From Sequence to Structure
1 From Sequence to Structure The genomics revolution is providing gene sequences in exponentially increasing numbers. onverting this sequence information into functional information for the gene products
Carbohydrates, proteins and lipids
Carbohydrates, proteins and lipids Chapter 3 MACROMOLECULES Macromolecules: polymers with molecular weights >1,000 Functional groups THE FOUR MACROMOLECULES IN LIFE Molecules in living organisms: proteins,
From DNA to Protein. Proteins. Chapter 13. Prokaryotes and Eukaryotes. The Path From Genes to Proteins. All proteins consist of polypeptide chains
Proteins From DNA to Protein Chapter 13 All proteins consist of polypeptide chains A linear sequence of amino acids Each chain corresponds to the nucleotide base sequence of a gene The Path From Genes
Invariant residue-a residue that is always conserved. It is assumed that these residues are essential to the structure or function of the protein.
Chapter 6 The amino acid side chains have polar and nonpolar properties, and the relative hydrophobicity of the amino acid side chains is critical for the folding and stability of a protein. The more hydrophobic
Proteins the primary biological macromolecules of living organisms
Proteins the primary biological macromolecules of living organisms Protein structure and folding Primary Secondary Tertiary Quaternary structure of proteins Structure of Proteins Protein molecules adopt
Introduction to Chemical Biology
Professor Stuart Conway Introduction to Chemical Biology University of xford Introduction to Chemical Biology ecommended books: Professor Stuart Conway Department of Chemistry, Chemistry esearch Laboratory,
18.2 Protein Structure and Function: An Overview
18.2 Protein Structure and Function: An Overview Protein: A large biological molecule made of many amino acids linked together through peptide bonds. Alpha-amino acid: Compound with an amino group bonded
Structure of proteins
Structure of proteins Primary structure: is amino acids sequence or the covalent structure (50-2500) amino acids M.Wt. of amino acid=110 Dalton (56 110=5610 Dalton). Single chain or more than one polypeptide
Biochemistry - I. Prof. S. Dasgupta Department of Chemistry Indian Institute of Technology, Kharagpur Lecture-11 Enzyme Mechanisms II
Biochemistry - I Prof. S. Dasgupta Department of Chemistry Indian Institute of Technology, Kharagpur Lecture-11 Enzyme Mechanisms II In the last class we studied the enzyme mechanisms of ribonuclease A
AMINO ACIDS & PEPTIDE BONDS STRUCTURE, CLASSIFICATION & METABOLISM
AMINO ACIDS & PEPTIDE BONDS STRUCTURE, CLASSIFICATION & METABOLISM OBJECTIVES At the end of this session the student should be able to, recognize the structures of the protein amino acid and state their
Acidic amino acids: Those whose side chains can carry a negative charge at certain ph values. Typically aspartic acid, glutamic acid.
A Acidic amino acids: Those whose side chains can carry a negative charge at certain ph values. Typically aspartic acid, glutamic acid. Active site: Usually applied to catalytic site of an enzyme or where
Myoglobin and Hemoglobin
Myoglobin and Hemoglobin Myoglobin and hemoglobin are hemeproteins whose physiological importance is principally related to their ability to bind molecular oxygen. Myoglobin (Mb) The oxygen storage protein
Structure Determination
5 Structure Determination Most of the protein structures described and discussed in this book have been determined either by X-ray crystallography or by nuclear magnetic resonance (NMR) spectroscopy. Although
http://faculty.sau.edu.sa/h.alshehri
http://faculty.sau.edu.sa/h.alshehri Definition: Proteins are macromolecules with a backbone formed by polymerization of amino acids. Proteins carry out a number of functions in living organisms: - They
THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS
THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS O.U. Sezerman 1, R. Islamaj 2, E. Alpaydin 2 1 Laborotory of Computational Biology, Sabancı University, Istanbul, Turkey. 2 Computer Engineering
CHAPTER 29 AMINO ACIDS, POLYPEPTIDES, AND PROTEINS SOLUTIONS TO REVIEW QUESTIONS
APTER 29 AMI AIDS, PLYPEPTIDES, AD PRTEIS SLUTIS T REVIEW QUESTIS 1. The designation, α, means that the amine group in common amino acids is connected to the carbon immediately adjacent to the carboxylic
Combinatorial Biochemistry and Phage Display
Combinatorial Biochemistry and Phage Display Prof. Valery A. Petrenko Director - Valery Petrenko Instructors Galina Kouzmitcheva and I-Hsuan Chen Auburn 2006, Spring semester COMBINATORIAL BIOCHEMISTRY
Ionization of amino acids
Amino Acids 20 common amino acids there are others found naturally but much less frequently Common structure for amino acid COOH, -NH 2, H and R functional groups all attached to the a carbon Ionization
Chemistry 110. Bettelheim, Brown, Campbell & Farrell. Introduction to General, Organic and Biochemistry Chapter 22 Proteins
hemistry 110 Bettelheim, Brown, ampbell & Farrell Ninth Edition Introduction to General, rganic and Biochemistry hapter 22 Proteins Step-growth polyamide (polypeptide) polymers or oligomers of L-α-aminoacids.
Disaccharides consist of two monosaccharide monomers covalently linked by a glycosidic bond. They function in sugar transport.
1. The fundamental life processes of plants and animals depend on a variety of chemical reactions that occur in specialized areas of the organism s cells. As a basis for understanding this concept: 1.
RNA and Protein Synthesis
Name lass Date RN and Protein Synthesis Information and Heredity Q: How does information fl ow from DN to RN to direct the synthesis of proteins? 13.1 What is RN? WHT I KNOW SMPLE NSWER: RN is a nucleic
Papers listed: Cell2. This weeks papers. Chapt 4. Protein structure and function
Papers listed: Cell2 During the semester I will speak of information from several papers. For many of them you will not be required to read these papers, however, you can do so for the fun of it (and it
RNA & Protein Synthesis
RNA & Protein Synthesis Genes send messages to cellular machinery RNA Plays a major role in process Process has three phases (Genetic) Transcription (Genetic) Translation Protein Synthesis RNA Synthesis
Ms. Campbell Protein Synthesis Practice Questions Regents L.E.
Name Student # Ms. Campbell Protein Synthesis Practice Questions Regents L.E. 1. A sequence of three nitrogenous bases in a messenger-rna molecule is known as a 1) codon 2) gene 3) polypeptide 4) nucleotide
Helices From Readily in Biological Structures
The α Helix and the β Sheet Are Common Folding Patterns Although the overall conformation each protein is unique, there are only two different folding patterns are present in all proteins, which are α
Previously published in Biophysical Society On-line Textbook PROTEINS CHAPTER 1. PROTEIN STRUCTURE. Section 1. Primary structure, secondary motifs,
Previously published in Biophysical Society On-line Textbook PROTEINS CHAPTER 1. PROTEIN STRUCTURE Section 1. Primary structure, secondary motifs, tertiary architecture, and quaternary organization Jannette
Hydrogen Bonds The electrostatic nature of hydrogen bonds
Hydrogen Bonds Hydrogen bonds have played an incredibly important role in the history of structural biology. Both the structure of DNA and of protein a-helices and b-sheets were predicted based largely
Structures of Proteins. Primary structure - amino acid sequence
Structures of Proteins Primary structure - amino acid sequence Secondary structure chain of covalently linked amino acids folds into regularly repeating structures. Secondary structure is the result of
Chapter 3: Biological Molecules. 1. Carbohydrates 2. Lipids 3. Proteins 4. Nucleic Acids
Chapter 3: Biological Molecules 1. Carbohydrates 2. Lipids 3. Proteins 4. Nucleic Acids Elements in Biological Molecules Biological macromolecules are made almost entirely of just 6 elements: Carbon (C)
Chapter 2: Biochemistry Problems
hapter 2: Biochemistry Problems Biochemistry Problems If you were a biochemist, you would study chemical substances and vital processes that occur in living organisms. You might study macromolecules such
Sidechain Torsional Potentials and Motion of Amino Acids in Proteins:
Proc. Nat. Acad. Sci. USA Vol. 72, No. 6, pp. 2002-2006, June 1975 Sidechain Torsional Potentials and Motion of Amino Acids in Proteins: Bovine Pancreatic Trypsin Inhibitor (nuclear magnetic resonance
Lecture 13-14 Conformation of proteins Conformation of a protein three-dimensional structure native state. native condition
Lecture 13-14 Conformation of proteins Conformation of a protein refers to the three-dimensional structure in its native state. There are many different possible conformations for a molecule as large as
Chapter 12 - Proteins
Roles of Biomolecules Carbohydrates Lipids Proteins 1) Catalytic 2) Transport 3) Regulatory 4) Structural 5) Contractile 6) Protective 7) Storage Nucleic Acids 12.1 -Amino Acids Chapter 12 - Proteins Amino
FINDING RELATION BETWEEN AGING AND
FINDING RELATION BETWEEN AGING AND TELOMERE BY APRIORI AND DECISION TREE Jieun Sung 1, Youngshin Joo, and Taeseon Yoon 1 Department of National Science, Hankuk Academy of Foreign Studies, Yong-In, Republic
Molecular Facts and Figures
Nucleic Acids Molecular Facts and Figures DNA/RNA bases: DNA and RNA are composed of four bases each. In DNA the four are Adenine (A), Thymidine (T), Cytosine (C), and Guanine (G). In RNA the four are
THE CHEMICAL SYNTHESIS OF PEPTIDES
TE EMIAL SYTESIS F PEPTIDES Peptides are the long molecular chains that make up proteins. Synthetic peptides are used either as drugs (as they are biologically active) or in the diagnosis of disease. Peptides
Transcription and Translation of DNA
Transcription and Translation of DNA Genotype our genetic constitution ( makeup) is determined (controlled) by the sequence of bases in its genes Phenotype determined by the proteins synthesised when genes
In addition to being shorter than a single bond, the double bonds in ethylene don t twist the way single bonds do. In other words, the other atoms
In addition to being shorter than a single bond, the double bonds in ethylene don t twist the way single bonds do. In other words, the other atoms attached to the carbons (hydrogens in this case) can no
Basic Concepts of DNA, Proteins, Genes and Genomes
Basic Concepts of DNA, Proteins, Genes and Genomes Kun-Mao Chao 1,2,3 1 Graduate Institute of Biomedical Electronics and Bioinformatics 2 Department of Computer Science and Information Engineering 3 Graduate
Consensus alignment server for reliable comparative modeling with distant templates
W50 W54 Nucleic Acids Research, 2004, Vol. 32, Web Server issue DOI: 10.1093/nar/gkh456 Consensus alignment server for reliable comparative modeling with distant templates Jahnavi C. Prasad 1, Sandor Vajda
Nafith Abu Tarboush DDS, MSc, PhD [email protected] www.facebook.com/natarboush
Nafith Abu Tarboush DDS, MSc, PhD [email protected] www.facebook.com/natarboush α-keratins, bundles of α- helices Contain polypeptide chains organized approximately parallel along a single axis: Consist
Biological Molecules
Biological Molecules I won t lie. This is probably the most boring topic you have ever done in any science. It s pretty much as simple as this: learn the material deal with it. Enjoy don t say I didn t
Molecular Genetics. RNA, Transcription, & Protein Synthesis
Molecular Genetics RNA, Transcription, & Protein Synthesis Section 1 RNA AND TRANSCRIPTION Objectives Describe the primary functions of RNA Identify how RNA differs from DNA Describe the structure and
Introduction to the Protein Folding Problem
Lecture Notes - 1 7.24/7.88J/5.48J The Protein Folding Problem Student Review: Side chains of the L amino acids and their pk's L/D difference Planarity of the peptide Bond Lecture Overview: Introduction
Translation Study Guide
Translation Study Guide This study guide is a written version of the material you have seen presented in the replication unit. In translation, the cell uses the genetic information contained in mrna to
FTIR Analysis of Protein Structure
FTIR Analysis of Protein Structure Warren Gallagher A. Introduction to protein structure The first structures of proteins at an atomic resolution were determined in the late 1950 s. 1 From that time to
PHYSIOLOGY AND MAINTENANCE Vol. II - On The Determination of Enzyme Structure, Function, and Mechanism - Glumoff T.
ON THE DETERMINATION OF ENZYME STRUCTURE, FUNCTION, AND MECHANISM University of Oulu, Finland Keywords: enzymes, protein structure, X-ray crystallography, bioinformatics Contents 1. Introduction 2. Structure
Support Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France [email protected] Massimiliano
Conformational Properties of Polypeptide Chains
Conformational Properties of Polypeptide Chains Levels of Organization Primary structure Amino acid sequence of the protein Secondary structure H bonds in the peptide chain backbone α helix and β sheets
Structure and Function of DNA
Structure and Function of DNA DNA and RNA Structure DNA and RNA are nucleic acids. They consist of chemical units called nucleotides. The nucleotides are joined by a sugar-phosphate backbone. The four
The chemistry of insulin
FREDERICK S ANGER The chemistry of insulin Nobel Lecture, December 11, 1958 It is great pleasure and privilege for me to give an account of my work on protein structure and I am deeply sensitive of the
Control of Gene Expression
Control of Gene Expression What is Gene Expression? Gene expression is the process by which informa9on from a gene is used in the synthesis of a func9onal gene product. What is Gene Expression? Figure
Lecture Series 7. From DNA to Protein. Genotype to Phenotype. Reading Assignments. A. Genes and the Synthesis of Polypeptides
Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed
