Identifying and modelling key water molecules

Aim

Water molecules can play key roles in mediating protein-ligand interactions. In this use case we illustrate how to make the most of the vast amounts of structural data available from the PDB and the CSD when trying to identify key hydration sites. We also show how to model water molecules in protein-ligand docking.

Introduction

It has long been recognised that water molecules can play a key role in protein-ligand recognition [1]. In the protein-ligand docking package GOLD [2], water molecules can be allowed to spin and to toggle on and off [3]. Toggling a water molecule on introduces an entropic penalty to the scoring function, which needs to be offset by the hydrogen bonds the water forms to the protein and the ligand. If these hydrogen bonds do not offset the entropic penalty, the water molecule will be selected against (turned off) during the genetic algorithm run.

However, the treatment of water molecules in GOLD was designed on the assumption that the modeller would know the positions of any key water molecules. This use case shows how one can identify such potential hydration sites by making use of the structural data available in the PDB [4] and the CSD [5]. It also illustrates a new feature of GOLD 5.0: the ability to allow key water molecules to translate during the docking.

Method

In this use case we use a structure of neuraminidase in complex with zanamivir (1a4g) [6]. Accessing the structure through Relibase+ [7] immediately reveals that this particular complex has four water molecules mediating interactions between the protein and the ligand (figure 1).
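To make the toggling rule from the Introduction concrete, the following is a minimal sketch of the decision it describes: a water stays switched on only if its hydrogen-bond contributions outweigh the entropic penalty. The class name, function name and all numbers are hypothetical illustrations; this is not GOLD's scoring function or its parameters.

```python
from dataclasses import dataclass


@dataclass
class WaterState:
    """Hypothetical bookkeeping for one water molecule in one candidate pose."""
    hbond_scores: list[float]  # favourable hydrogen-bond terms (larger = better)
    entropic_penalty: float    # cost of switching the water on (positive)


def keep_water_on(water: WaterState) -> bool:
    """Return True if the hydrogen-bond gain outweighs the entropic penalty."""
    return sum(water.hbond_scores) > water.entropic_penalty


# Illustrative numbers only; they are assumptions, not GOLD parameters.
bridging = WaterState(hbond_scores=[1.0, 0.9, 0.8], entropic_penalty=2.0)
loosely_bound = WaterState(hbond_scores=[0.7], entropic_penalty=2.0)

print(keep_water_on(bridging))       # True
print(keep_water_on(loosely_bound))  # False
```

In a real docking run this comparison is implicit in the fitness score rather than an explicit test, so the sketch only conveys the balance of terms.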

Figure 1 Relibase+ has pre-calculated information on water molecules mediating interactions between protein and ligand molecules. In this neuraminidase structure there are four water molecules mediating interactions between the protein and the ligand.

Furthermore, by identifying similar binding sites and superimposing them, one can find out which water molecules are conserved. Such a search in Relibase+ reveals that there are four structures with a sequence identity >95% to chain A of 1a4g (1a4q, 1nsb, 1nsc and 1nsd).

Figure 2 Binding site superimposition analysis in Relibase+. Note that the conserved waters between the reference structure and the superimposed hits are calculated on the fly (bottom right hand column).

In this instance we find that the water molecules at the bottom of the cavity (HOH689 and HOH711) are conserved in all structures. However, four protein structures might not be enough to make an informed decision on which waters to include in a docking experiment. We could make the sequence similarity cut-off less stringent, but the protein-ligand structures in Relibase+ (derived from the PDB) are not the only source of structural data. The CSD now contains over half a million small molecule crystal structures, many of which are hydrates. Information on the propensities of water probes around functional groups is available from IsoStar [8], a knowledge base of intermolecular interactions derived from the CSD. Further, the program SuperStar [9] is capable of combining IsoStar propensity maps in order to calculate hotspots in protein binding sites.
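The conserved-water analysis described above can be sketched as a simple distance test on superimposed structures. This is an illustration of the idea, not the Relibase+ algorithm: the function name, the 1.5 Å cut-off and the coordinates are all assumptions made for the example, and the structures are taken to be already superimposed.

```python
import math

Point = tuple[float, float, float]


def conserved_waters(reference: dict[str, Point],
                     others: list[list[Point]],
                     cutoff: float = 1.5) -> list[str]:
    """Return names of reference waters that have a partner in every other structure.

    A partner is any water oxygen within `cutoff` angstroms of the reference
    water after superimposition. The cut-off value is an assumption.
    """
    conserved = []
    for name, ref_xyz in reference.items():
        if all(any(math.dist(ref_xyz, xyz) <= cutoff for xyz in waters)
               for waters in others):
            conserved.append(name)
    return conserved


# Invented coordinates (angstroms) for three reference waters and two
# superimposed structures; they do not come from any PDB entry.
reference = {"W1": (0.0, 0.0, 0.0), "W2": (5.0, 0.0, 0.0), "W3": (0.0, 6.0, 0.0)}
others = [
    [(0.3, 0.1, 0.0), (5.2, 0.4, 0.1)],
    [(0.2, -0.4, 0.3), (5.1, -0.2, 0.0), (0.0, 8.5, 0.0)],
]

print(conserved_waters(reference, others))  # ['W1', 'W2']
```

Requiring a match in every structure is the strictest choice; with more superimposed structures one might instead report the fraction in which each water is found.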

Figure 3 SuperStar water hotspots calculated for neuraminidase structure 1a4g (purple spheres). The native ligand is displayed in cyan. Water molecules from an apo structure of neuraminidase (1nsb) are shown as red spheres.

The water probe hotspots calculated by SuperStar for 1a4g show good agreement with water molecules from the apo structure 1nsb. The SuperStar water hotspot at the bottom of the cavity (figure 3) corresponds to the conserved water HOH711. The SuperStar water hotspot at the edge of the cavity is displaced in all holo structures by carboxylate groups of the ligands.

Having identified potential key water molecules, we set up four docking experiments. The first experiment did not include any water molecules. The second experiment included the native HOH711 water molecule, which was allowed to spin and toggle on and off. The third experiment included two water molecules positioned at the SuperStar calculated water hotspots; these waters were again allowed to spin and toggle on and off. The fourth experiment also used the two SuperStar water hotspots, but here the water molecules were allowed to translate up to 1 Å from their original positions as well as spin and toggle on and off. All docking experiments used default settings and the ChemScore scoring function.
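The 1 Å translation limit used in the fourth experiment can be pictured as keeping each water inside a sphere centred on its starting position. The sketch below shows one simple way to enforce such a limit, by pulling an out-of-range position back onto the sphere surface. How GOLD enforces the limit internally is not described here, so the function name and the clamping strategy are assumptions.

```python
import math


def clamp_translation(origin, proposed, max_shift=1.0):
    """Limit a proposed water position to within max_shift angstroms of origin.

    Positions inside the sphere are returned unchanged; positions outside are
    moved back along the displacement vector onto the sphere surface.
    """
    shift = [p - o for p, o in zip(proposed, origin)]
    length = math.sqrt(sum(s * s for s in shift))
    if length <= max_shift:
        return tuple(proposed)
    scale = max_shift / length
    return tuple(o + s * scale for o, s in zip(origin, shift))


# Invented coordinates (angstroms) for illustration.
origin = (0.0, 0.0, 0.0)
moved = clamp_translation(origin, (3.0, 4.0, 0.0))  # 5 A away, so it is clamped
print(round(math.dist(origin, moved), 6))           # 1.0
print(clamp_translation(origin, (0.3, 0.0, 0.4)))   # (0.3, 0.0, 0.4), unchanged
```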

Results

When the docking was run without any water molecules the correct pose was not obtained (figure 4).

Figure 4 Docking the native ligand into the 1a4g structure does not yield the correct pose when water HOH711 is absent.

However, when the native water was included and allowed to spin and toggle on and off, the correct pose was obtained (figure 5).

Figure 5 Docking the native ligand whilst allowing the native water molecule HOH711 to spin and toggle on and off resulted in the correct pose.

When using the SuperStar water hotspots without translation, the correct pose was not obtained (figure 6). This can be explained by the SuperStar hotspot corresponding to HOH711 being more buried than the native water molecule. This is not surprising, as the SuperStar hotspot was calculated without the ligand present in the binding site. As such, the water hotspot was optimised towards the protein carboxylate groups.

Figure 6 Docking the native ligand whilst allowing the SuperStar calculated water molecules to spin and toggle on and off. The correct ligand pose was not obtained.

However, when the SuperStar water molecules were allowed to translate during the docking, the correct pose was obtained (figure 7). It is worth noting that the second SuperStar water was always (correctly) toggled off.

Figure 7 Docking the native ligand whilst allowing the SuperStar calculated water molecules to spin, toggle on and off and translate resulted in the correct ligand pose. Note that the second water was correctly toggled off.
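The results above speak of the "correct pose" without stating how it was judged. A common convention in docking validation, assumed here and not taken from this use case, is that a pose counts as correct when its heavy-atom RMSD to the crystallographic ligand is at most 2 Å. The sketch below implements that check for atoms listed in matching order, ignoring molecular symmetry; the function names and coordinates are hypothetical.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("pose and reference must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose, reference))
    return math.sqrt(squared / len(pose))


def is_correct_pose(pose, reference, threshold=2.0):
    """Assumed criterion: RMSD to the reference of at most `threshold` angstroms."""
    return rmsd(pose, reference) <= threshold


# Invented three-atom example (angstroms), not real ligand coordinates.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.5, 0.0)]

print(rmsd(docked, crystal))             # 0.5
print(is_correct_pose(docked, crystal))  # True
```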

Conclusions

There is a vast amount of structural data available, both in terms of protein-ligand complexes and small molecule crystal structures. Using Relibase+ and SuperStar one can make the most of this data when trying to identify key hydration sites. In protein-ligand docking, water molecules can make the difference between success and failure. Further, subtle variations in the orientation and position of the waters can have large effects. GOLD's flexible treatment of water molecules allows modellers to customise the behaviour of individual water molecules; waters in GOLD 5 can be allowed to spin, translate and toggle on and off during the genetic algorithm run.

References

1. J. E. Ladbury. Chem. & Biol., 1996, 3, 973-980
2. G. Jones, P. Willett and R. C. Glen. J. Mol. Biol., 1995, 245, 43-53
3. M. L. Verdonk, G. Chessari, J. C. Cole, M. J. Hartshorn, C. W. Murray, J. W. M. Nissink, R. D. Taylor and R. Taylor. J. Med. Chem., 2005, 48, 6504-6515
4. H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov and P. E. Bourne. Nucleic Acids Res., 2000, 28, 235-242
5. F. H. Allen. Acta Cryst., 2002, B58, 380-388
6. N. R. Taylor et al. J. Med. Chem., 1998, 41, 798-807
7. M. Hendlich, A. Bergner, J. Günther and G. Klebe. J. Mol. Biol., 2003, 326, 607-620
8. I. J. Bruno, J. C. Cole, J. P. M. Lommerse, R. S. Rowland, R. Taylor and M. L. Verdonk. J. Comput.-Aided Mol. Des., 1997, 11, 525-537
9. M. L. Verdonk, J. C. Cole and R. Taylor. J. Mol. Biol., 1999, 289, 1093-1108

Products

CSD - the world's only comprehensive, fully curated database of crystal structures, containing over 500,000 entries
Relibase+ - an essential tool for searching, exploring and comparing all protein-ligand data from public and in-house data sources
IsoStar - a knowledge base of intermolecular interactions which provides easy appreciation of the geometry, strength and stability of interactions
SuperStar - a tool for investigating interaction sites in proteins, making it easy to generate pharmacophores using experimental data
GOLD - an accurate and reliable protein-ligand docking program
Hermes - CCDC's life science visualiser, used by GOLD, GoldMine, Relibase+ and SuperStar

For further information please contact Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK. Tel: +44 1223 336408, Fax: +44 1223 336033, Email: admin@ccdc.cam.ac.uk