Computational methods for increasing the stability of type IV pilin protein from Pseudomonas aeruginosa




John Loft. Bioengineering 488 Computational Protein Design (Winter 2015), University of Washington, Box 355013, Seattle, Washington 98195-5013, USA. 2/15/2015

Abstract

Pseudomonas aeruginosa is a multidrug-resistant bacterium known to form biofilms in the lungs, urinary tract, and kidneys. In some studies, it has been responsible for over one fourth of hospital-acquired infections in intensive care units (Vincent 1995). It binds to host targets through a mechanism that depends on type IV pilin, an antigenic adhesion protein. Engineered mutants of the type IV pilin protein with varying sequences and chain lengths may act as potential vaccines against P. aeruginosa while retaining stable epitopes at increased temperatures. This can increase the shelf life of the vaccines and ease storage requirements for distribution. Using the programs Modeller, FoldIt, and Chimera, a variety of mutants were generated through homology modeling of the crystal structure of the globular domain of type IV pilin, obtained from the Protein Data Bank. Five redesigns of the globular domain were initially created. The relaxation of these mutants in water at 310 K for 5 nanoseconds was then simulated using a molecular mechanics kernel called in lucem molecular mechanics (ilmm), developed by the Daggett Research Group (Beck, et al., 2000-2013). Effective mutations were categorized using multiple data outputs from ilmm, including the residence time of amino acids in pertinent secondary structures, the α-carbon root-mean-square deviation (Cα RMSD) through time, and the contact time of interacting amino acids. This information was applied to a final redesign of 1DZO in an effort to optimize the stability further.

Introduction

The conglomeration of bacteria into biofilms occurs through cell-cell interactions mediated by lipoprotein complexes. Research into disrupting these interactions has become a large part of modern biomedicine. Here we explore computational methods used to design a prospective vaccine for P. aeruginosa by modifying a truncated PAK pilin protein (PDB ID: 1DZO) in both sequence and length. Our consideration focused only on the globular domain of pilin, using residues Gly25 through Arg142, with amino acid numbering beginning at 25 because the fimbrillar portion of the protein is not represented in the 1DZO structure. A patent for multiple antigenic sequences in pilin proteins (patent # US 5612036 A) has been filed that covers residues 129 through 142 of the 1DZO protein. The goal of our simulations was not to discover antigenic sequences, but to create vaccine variants with higher thermostability without disrupting antigenicity.

A handful of proteins were engineered in Chimera and FoldIt, including an automated FoldIt design (AFD), a manual FoldIt design (MFD), an intuitive Chimera design (ID), a fragment of the wild-type (FWT), and a mutated fragment of the wild-type (FMD). Four of these five mutants exhibited smaller Cα RMSDs than the wild-type (WT). Some exhibited only trivial gains in thermostability, and speculation is offered on why some mutations and mutation techniques were more advantageous than others.

Methods

Homology Modeling

Our first step was to obtain a known sequence that encodes the pilin protein from GenBank, an NIH genetic sequence database. The sequence we used was entitled "type 4 fimbrial precursor PilA [Pseudomonas aeruginosa PAO1]" (GenBank ID: AAG07913.1). We obtained a FASTA file for this sequence and ran a BLAST search on it to locate experimentally determined structures in the PDB, selecting the model 1DZO, determined by X-ray diffraction at a resolution of 1.63 Å.

After finding the first overlapping sequence of 1DZO with the GenBank sequence, we truncated the GenBank sequence to eliminate the fimbrillar section of the protein. A homology model matching the GenBank sequence to the secondary structures of 1DZO was then generated using perl scripts from the Modeller bioinformatics package, with the computational aid of the Stampede supercomputer, based at the University of Texas at Austin. These operations calculated a minimized-energy structure of the GenBank pilin protein. The model was then validated by aligning the two structures in Chimera and calculating the RMSD between them. Homology modeling can be a key tool for predicting structures from unconserved sequences, and this exercise proved practical for future research on mutated strains of P. aeruginosa. For the mutated models in our subsequent research, however, we used only the 1DZO sequence.

Computational Design of the Pilin Adhesion Protein

In the next component of the study, FoldIt was used to generate an automated design (AFD). The automated design froze the epitope and allowed all other residues to mutate in a manner that reduced the total Rosetta energy score. Fifty-five mutations were made in the AFD, lowering its sequence identity with the original 1DZO to 61.67%. Fine-grain energy minimization was conducted afterwards.

A manual design (MFD) was also created in FoldIt. Its methodology was similar to that of the AFD, except that in addition to the epitope, all cysteines, glycines, and prolines were not permitted to mutate. FoldIt's automatic mutation feature made the changes, and by sheer coincidence, fifty-five mutations were again made. The AFD and MFD sequences were then examined in Chimera and found to share 60% sequence identity, confirming that they were not the same protein.
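The sequence-identity comparisons above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation Chimera performs: it assumes the two sequences are already aligned to equal length, and it divides by the number of gap-free columns, which is only one of several conventions in use. The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns in which both sequences match.

    Assumes seq_a and seq_b are pre-aligned (same length). The choice of
    denominator (gap-free columns) is an assumption; other tools may
    divide by the length of the shorter or longer sequence instead.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy ten-residue sequences (hypothetical, not taken from 1DZO).
    wild_type = "GSTAKLMNPQ"
    redesign = "GSTEKLRNPQ"
    print(f"{percent_identity(wild_type, redesign):.2f}% identity")
```

Because the denominator convention changes the result, an identity value is only comparable to another value computed the same way.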

Additionally, a fragment of the WT protein (FWT) was created in FoldIt simply by deleting the first five residues in the 1DZO structure. This fragment was copied, and a mutated fragment (FMD) based on it was designed, containing a total of ten mutations. Seven of these mutations were made in the alpha helix in an attempt to increase hydrogen bonding between coils, and three were made in loop structures, with the mutation selection again based on FoldIt's minimum Rosetta energy scoring function.

A fifth design was created exclusively in Chimera. Only two mutations were made, Ala86Thr and Ile115Thr. Mutation Ala86Thr was made because alanine is hydrophobic and residue 86 is located on the outside of the structure. Modification to threonine, which has a polar uncharged side chain, reduces hydrophobic interactions while increasing the chance for hydrogen bonding between beta-sheets. This change may result in a lower free energy score by increasing the entropy of the protein, as threonine can take on more possible configurations than alanine. Mutation Ile115Thr was also made to increase beta-sheet hydrogen bonding and increase the solvent-accessible surface area of the protein. These alterations were facilitated by the rotamer selection feature in Chimera, using the Dunbrack library. In future studies, the dynameomics library should be used instead.

Molecular Dynamics

The five redesigns of the truncated pilin protein were then simulated in a molecular dynamics kernel. In lucem molecular mechanics (ilmm) prepared the PDB files, making a number of assumptions that are described in the referenced literature. One important faulty assumption the ilmm kernel made was that no disulfide bond occurred between residues 104 and 117. This error was rectified by manually specifying a disulfide bond. The proteins were simulated for 5 ns at 310 K, with multiple cycles of steepest descent minimization. With all parameters specified, each simulation on the Stampede supercomputer took several hours. The results of these simulations were then integrated to produce a final redesign.

Results

Our homology model created with Modeller displayed an excellent RMSD of 0.368 Å, with a shared sequence identity of only 56.52%. Of the five initial designs plus the final design created with Chimera and FoldIt, all designs except the MFD exhibited a smaller final Cα RMSD than the WT protein. The AFD produced the best results, with a final Cα RMSD reduction of 0.81 Å. The ID and FMD produced reductions of 0.54 Å and 0.57 Å, respectively. The final design, however, reduced the Cα RMSD by only 0.18 Å.

The DSSP output from the simulations showed residue numbers shifted by -5 relative to the actual residue numbers of the displayed secondary structures. It is unknown why this occurred. From the WT DSSP output, it was noticed that a single amino acid, Pro21, has trouble adopting an alpha-helix structure. It was also noted that intermediate residues in the third beta-sheet could be mutated to adopt the beta-sheet conformation more strictly. Some of the DSSP outputs from mutants indicated that a strengthened 3/10 helix might also be created in one of the mid-chain loop structures. Alarmingly, the DSSP output for the MFD and the final design showed that the 3/10 helix located in the epitope was largely absent.

Discussion

It was discouraging to see that the AFD displayed the largest reduction in Cα RMSD. Although the AFD had the most hydrophilic and charged substitutions, with many long lysines exposed to the solvent, the fact that no other mutant displayed competitive thermostability suggests that greater intuition should have been employed in the initial protein engineering. More modifications to the hydrophobic core and the solvent-accessible beta-sheets during the intuitive design phase may have resulted in a greater change in the thermostability of the ID.

Further discouraging results occur in the DSSP data, with the MFD and final design lacking a consistent 3/10 helix in the epitope region, suggesting a loss of antigenicity. It is suspected that in the MFD, this loss is due simply to poor mutations. In the final design, however, an error was made in not explicitly specifying the disulfide bond that occurs in the epitope. This oversight is the likely cause of the flawed DSSP data, and the simulation should be rerun to examine whether the final design can maintain antigenicity. While the final design may not claim the highest thermostability, it does exhibit smaller perturbations in Cα RMSD through time. This indicates that its total energy as a function of conformation may have comparatively fewer local minima, and therefore the protein might be more rigid in its range of motion, despite being only slightly more thermostable than the WT.

Conclusion

In these trials, the AFD outperformed all other designs in thermostability. If it can be shown that this mutant can fold into the proper structure from an unfolded state and maintain antigenicity, the AFD would act as a superior vaccine. As a first exploration into protein engineering, this study encountered many pitfalls, and there is little doubt that more advantageous mutants can be created. Nonetheless, the methods demonstrated here show where caution should be taken in protein engineering and molecular dynamics simulations, and provide indispensable tools for further research.
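The Discussion distinguishes the final Cα RMSD of a trajectory from how much the Cα RMSD fluctuates through time. One simple way to put numbers on that distinction is sketched below in Python. The use of the population standard deviation as the measure of "perturbation", the optional equilibration cutoff, and the two short RMSD series are all assumptions made for illustration; none of them come from the ilmm output.

```python
from statistics import mean, pstdev


def rmsd_summary(rmsd_series, skip=0):
    """Summarize a Cα RMSD time series (values in Å).

    skip: number of leading frames to discard as equilibration
          (an assumed convenience, not part of the original analysis).
    Returns the final value, the mean, and the spread (population
    standard deviation) of the retained frames.
    """
    window = list(rmsd_series)[skip:]
    if len(window) < 2:
        raise ValueError("need at least two frames after skipping")
    return {
        "final": window[-1],
        "mean": mean(window),
        "spread": pstdev(window),
    }


if __name__ == "__main__":
    # Made-up series for illustration only (not simulation data).
    steady = [2.0, 2.1, 2.0, 2.1]
    jumpy = [1.0, 3.0, 1.0, 3.0]
    for label, series in (("steady", steady), ("jumpy", jumpy)):
        s = rmsd_summary(series)
        print(f"{label}: final={s['final']:.2f} Å, spread={s['spread']:.2f} Å")
```

A trajectory can end at a higher RMSD than another and still show the smaller spread, which is the situation described for the final design relative to the other mutants.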

References

1.) Vincent, J.-L. (1995). The Prevalence of Nosocomial Infection in Intensive Care Units in Europe. JAMA, 274(8), 639. doi:10.1001/jama.1995.03530080055041
2.) Beck D.A.C., McCully M.E., Alonso D.O.V., Daggett V. (2000-2012) in lucem Molecular Mechanics (ilmm). University of Washington, Seattle.
3.) Crosslinked polypeptide vaccine with cysteine groups and carriers. (1997, March 18). Retrieved from http://www.google.com/patents/us5612036

Appendix

Tables

Protein Type    Final Cα RMSD (Å)
WT              3.03961
FINAL           2.85542
ID              2.49824
MFD             3.8353
AFD             2.23056
FWT             2.65751
FMD             2.47399

Table 1. The final Cα RMSDs of each simulation relative to its starting structure.

Mutation    Reasoning
Pro21Arg    Decrease kink in alpha helix
Ser35Glu    Increase beta bridge stability
Val54Arg    Create 3/10 helix
Ala55Ser    Create 3/10 helix
Ala56Lys    Create 3/10 helix
Tyr63Arg    Increase beta sheet stability
Ala65Phe    Increase beta sheet stability
Ile94Val    Increase beta sheet stability

Table 2. Justification for each mutation in the final design.
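The Cα RMSD reductions quoted in the Results follow from Table 1 by subtracting each design's final Cα RMSD from the wild-type value. A minimal Python sketch of that arithmetic is given below; the dictionary simply transcribes Table 1, while the function name and the rounding to two decimals are choices made here for illustration.

```python
# Final Cα RMSD values (Å), transcribed from Table 1.
FINAL_CA_RMSD = {
    "WT": 3.03961,
    "FINAL": 2.85542,
    "ID": 2.49824,
    "MFD": 3.8353,
    "AFD": 2.23056,
    "FWT": 2.65751,
    "FMD": 2.47399,
}


def reductions_vs_reference(table, reference="WT", digits=2):
    """Return reference RMSD minus each design's RMSD, rounded.

    Positive values mean the design ended closer to its starting
    structure than the reference did; negative values mean it drifted
    further.
    """
    ref = table[reference]
    return {
        name: round(ref - value, digits)
        for name, value in table.items()
        if name != reference
    }


if __name__ == "__main__":
    reductions = reductions_vs_reference(FINAL_CA_RMSD)
    # List designs from largest to smallest reduction.
    for name, delta in sorted(reductions.items(), key=lambda kv: -kv[1]):
        print(f"{name:6s} {delta:+.2f} Å")
```

Ranking by this difference places the AFD first and the MFD last, the only design with a negative value.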

Figure Legends

Figure 1. The wild-type structure of 1DZO.
Figure 2. The patented residues of pilin protein colored red in 1DZO.
Figure 3. The intuitively designed mutant with altered residues shown in orange and residues within 4.0 Å of altered residues colored in purple.
Figure 4. The automated FoldIt design colored with the same paradigm as Figure 3.
Figure 5. The fragment of the wild-type colored with the same paradigm as Figure 3.
Figure 6. The final design with mutated fragments colored in red.
Figures 7-9. The Cα RMSD of the ID & WT through time, the Cα RMSD of the MFD & WT through time, and the Cα RMSD of the AFD & WT through time.
Figures 10-12. The Cα RMSD of the FWT & WT through time, the Cα RMSD of the FMD & WT through time, and the Cα RMSD of the final design & WT through time.
Figure 13. WT secondary structures through time.
Figure 14. Secondary structures of the Chimera intuitive design through time.
Figure 15. Secondary structures of the manual FoldIt design through time.
Figure 16. Secondary structures of the automated FoldIt design through time.
Figure 17. Secondary structures of the WT fragment through time.
Figure 18. Secondary structures of the manually designed fragment through time.
Figure 19. Secondary structures of the final design through time.

Figures

[Figures 7-9. Plots of Cα RMSD (Å) versus time (ps), 0 to 4950 ps: WT RMSD compared with the intuitive design, the manual FoldIt design, and the automated FoldIt design.]

[Figures 10-12. Plots of Cα RMSD (Å) versus time (ps), 0 to 4950 ps: WT RMSD compared with the WT fragment, the manually designed fragment, and the final design.]