Scoring Functions and Docking. Keith Davies Treweren Consultants Ltd 26 October 2005

Similar documents
Molecular Docking. - Computational prediction of the structure of receptor-ligand complexes. Receptor: Protein Ligand: Protein or Small Molecule

Consensus Scoring to Improve the Predictive Power of in-silico Screening for Drug Design

Christian Lemmen HYDE: Scoring for Lead Optimization

Gold (Genetic Optimization for Ligand Docking) G. Jones et al. 1996

Multiobjective Robust Design Optimization of a docked ligand

Assessing Checking the the reliability of protein-ligand structures

1 The water molecule and hydrogen bonds in water

Protein Studies Using CAChe

Refinement of a pdb-structure and Convert

Cheminformatics and its Role in the Modern Drug Discovery Process

file:///c /Documents%20and%20Settings/terry/Desktop/DOCK%20website/terry/Old%20Versions/dock4.0_faq.txt

Integrating Medicinal Chemistry and Computational Chemistry: The Molecular Forecaster Approach

STRUCTURE-GUIDED, FRAGMENT-BASED LEAD GENERATION FOR ONCOLOGY TARGETS

Hands-on exercises on solvent models & electrostatics EMBnet - Molecular Modeling Course 2005

Ligand-Based Design Workflow. Paul Hawkins and Geo Skillman OpenEye Scientific Software

Use the Force! Noncovalent Molecular Forces

Data Visualization in Cheminformatics. Simon Xi Computational Sciences CoE Pfizer Cambridge

Cheminformatics and Pharmacophore Modeling, Together at Last

The strength of the interaction

How To Understand Protein Ligand Interaction

Original article: A SIMPLE CLICK BY CLICK PROTOCOL TO PERFORM DOCKING: AUTODOCK 4.2 MADE EASY FOR NON-BIOINFORMATICIANS

Molecular Visualization. Introduction

How To Understand The Chemistry Of A 2D Structure

Topic 2: Energy in Biological Systems

How To Understand Protein-Protein Interaction And Inhibitors

Introduction, Noncovalent Bonds, and Properties of Water

Hydrogen Bonds The electrostatic nature of hydrogen bonds

THE CAMBRIDGE CRYSTALLOGRAPHIC DATA CENTRE (CCDC)

Lecture 15: Enzymes & Kinetics Mechanisms

Molecular Dynamics Simulations

Peptide bonds: resonance structure. Properties of proteins: Peptide bonds and side chains. Dihedral angles. Peptide bond. Protein physics, Lecture 5

Fingerprint-Based Virtual Screening Using Multiple Bioactive Reference Structures

How to create and interpret the predictive analysis of a compound

AP CHEMISTRY 2007 SCORING GUIDELINES. Question 6

Chemical Bonds. Chemical Bonds. The Nature of Molecules. Energy and Metabolism < < Covalent bonds form when atoms share 2 or more valence electrons.

Physicochemical Properties of Drugs

SAnDReS Tutorial 01 Prof. Dr. Walter F. de Azevedo Jr.

DOCKING AND SCORING IN VIRTUAL SCREENING FOR DRUG DISCOVERY: METHODS AND APPLICATIONS

1 Peptide bond rotation

Consideration of Molecular Weight during Compound Selection in Virtual Target-Based Database Screening

Structure of proteins

Helices From Readily in Biological Structures

Molecular Spectroscopy

QSAR. The following lecture has drawn many examples from the online lectures by H. Kubinyi

Protein Preparation Guide

Amino Acids. Amino acids are the building blocks of proteins. All AA s have the same basic structure: Side Chain. Alpha Carbon. Carboxyl. Group.

HYDRONMR and Fast-HYDRONMR

Intelligent Heuristic Construction with Active Learning

In Silico Drug Design Studies of Bioactive Cannabinoid and [60]Fullerene Derivatives

Combinatorial Chemistry and solid phase synthesis seminar and laboratory course

Structural Bioinformatics (C3210) Experimental Methods for Macromolecular Structure Determination

Bonding & Molecular Shape Ron Robertson

Nucleophilic Substitution and Elimination

Use of Predictive ADME in Library Profiling and Lead Optimization

Combinatorial Biochemistry and Phage Display

Brønsted-Lowry Acids and Bases

Steffen Lindert, René Staritzbichler, Nils Wötzel, Mert Karakaş, Phoebe L. Stewart, and Jens Meiler

bioavailability active transport blood-brain barrier transport absorption volume of distribution drug binding to plasma proteins

CHEMISTRY STANDARDS BASED RUBRIC ATOMIC STRUCTURE AND BONDING

Lead generation and lead optimisation:

The Synthesis of trans-dichlorobis(ethylenediamine)cobalt(iii) Chloride

Activation of Carbon-Hydrogen Bonds at Transition Metal Centers!

EXPERIMENT 1: Survival Organic Chemistry: Molecular Models

Using AutoDock with AutoDockTools: A Tutorial

Determining the Structure of an Organic Compound

Language: English Lecturer: Gianni de Fabritiis. Teaching staff: Language: English Lecturer: Jordi Villà i Freixa

Paper 1 (7404/1): Inorganic and Physical Chemistry Mark scheme

INTERMOLECULAR FORCES

Ionization of amino acids

2012 HORIBA Scientific. All rights reserved HORIBA Scientific. All rights reserved.

MULTIPLE CHOICE QUESTIONS

Phase diagram of water. Note: for H 2 O melting point decreases with increasing pressure, for CO 2 melting point increases with increasing pressure.

Chapter 4 Lecture Notes

Analysis of structural dynamics by H/D-exchange coupled to mass spectrometry HDX-MS

DISTANCE DEGREE PROGRAM CURRICULUM NOTE:

Lecture 19: Proteins, Primary Struture

RESONANCE, USING CURVED ARROWS AND ACID-BASE REACTIONS

CSC 2427: Algorithms for Molecular Biology Spring Lecture 16 March 10

Chapter 8: Energy and Metabolism

Chemistry INDIVIDUAL PROGRAM INFORMATION Macomb1 ( )

Structure Tools and Visualization

Molecular Formula Determination

Enzymes Enzyme Mechanism

C o m p u te r M o d e lin g o f M o le c u la r E le c tro n ic S tru c tu re

Peptide Bonds: Structure

California State Polytechnic University, Pomona. Exam Points 1. Nomenclature (1) 30

agucacaaacgcu agugcuaguuua uaugcagucuua

Advanced Medicinal & Pharmaceutical Chemistry CHEM 5412 Dept. of Chemistry, TAMUK

H H N - C - C 2 R. Three possible forms (not counting R group) depending on ph

Transcription:

Scoring Functions and Docking Keith Davies Treweren Consultants Ltd 26 October 2005

Overview Applications Docking Algorithms Scoring Functions Results Demonstration

Docking Applications Drug Design Lead Generation Lead Optimisation Library Design Compound Purchases Academic Predicting Crystal Structure Complexes

Lead Generation Alternative to Experimental HighThroughput Screening Issues Availability of Crystal Structures False Positives Often Difficult to Demonstrate Cost Benefit

Lead Optimisation Ranking Derivatives of Known Active Molecules Subject to Systematic Failures Not Sufficiently Discriminating for Many Optimisation Series Changing Chemical Series is really Lead Generation

Library Design Aim to Eliminates Molecules which not Credible Receptor Shape is Important Can be Lead Optimisation when only Substituents on an Active core vary

Compound Purchases Millions of Catalogue Molecules Diverse Approximately 0.5 Million Drug-like Issues False Negatives Delivery Timescales Cost Benefit is Demonstratable

Drug-like Filtering Properties Substructures Unstable Reactive Undesirable (eg toxic) Centres 2:9 Mass 150:800 XSA 20:240 Rot-Bonds <=10 Conformers <=1000000

Classifying Docking Algorithms Ligand Conformations Rigid Fixed Sample ~ 100 Flexible Constraints Residues and Pockets Pharmacophores Charges and Tautomers Water Molecules

Conformations Small Molecules are NOT Rigid Literature on Conformational Generation Use of Constraints Speed Enhancements Often 000 s Representative Conformations Suitable Problem for Parallel Computing

Constraints: Binding Site Identified by Crystal Structure Activity associated with another site Different binding mode at same site Confirmed by Mutagensis Studies Suggested by Software Size of cavity/cleft Binding Potential

Binding Site Identification Place Protein in 3-D Grid Remove Grid Points Inside Protein Remove Grid Points Inside 8 Å Sphere Positioned on the Grid Outside the Protein Binding Site Minimum of 3 Grid Points Minimum of 3 Residues

Constraints: Pharmacophores Centre Types H-bond Donor H-bond Acceptor Acid Base Positive Charge Negative Charge Aromatic Ring Lipophile Lewis Base 2 User Definable Options 3 or 4 Centre Types User Definable Bins Occurrence Frequency Output to file

Centre Positions

Charges and Tautomers Docking Doesn t Need Hydrogen Positions Charges Important for Energy Functions Sometimes Alternative Charge Models Rarely Multiple Same Charges Within Site Tautomer State Sometimes Too Many Alternatives Charges and Tautomeric State Must Complement Ligand

Water Molecules Essential Mediates binding for all Ligands Optional Presence required by some Ligands Inhibits binding of other Ligands Solvation & Desolvation Free Energy Critical for Scoring Function

Structure-based Virtual Screening CAN-DDO Project Suppliers Catalogues Property & Substructure Filtering Conformer Generation 3.5 Billion Molecules 12 Proteins 1.7 Millon PCs Combinatorial Libraries Dock & Score Saved Hits Protein Active Site Generation of Queries Pharmaco- Phores

2D -> 3D Coordinates Pharm list by permuting centres Centre & distance constraints Generate next conformer Match Pharmacophore? Yes No Saved Solution Dock by Fitting to Pharm

2D -> 3D Coordinates Pharm list by permuting centres Centre & distance constraints Generate next conformer VdW or CPK Contacts? No Match Pharmacophore? Yes Yes No Skip intervening conformers Deduce bond to relieve distance constraint Saved Solution Refine & Score Dock by Fitting to Pharm

Limitations of Current Methods Rigid Protein Side-Chains Binding Relevance for Biological Activity Scoring Functions

Terms in Scoring Functions Time Required Level of Approximation Hydrophobic Electrostatics H-bonding VdW Contacts Entropy (De)Solvation

Classifying Scoring Functions Knowledge-based (Atom pairs in contact) DrugScore, PMF Energy GOLD, DOCK, LigandFit, MOE Energy + Parameterised Solvation ChemScore (Glide, THINK) Free Energy Perturbation

Knowledge-Based Functions Score = r<cutoff A ij (r) Potentials of Mean Force (PMF) J Med Chem (1999) 42 p791-804 DrugScore J Med Chem (2005) 48 p6296-6303 Less Confused by Crystal Structure Precision

Energy Lennard Jones ij ( A / r ij 12 B/r ij 6 ) Torsion Term Σ ( 1- cos2 ) Conjugated Σ ( 1+ cos3 ) Non-conjugated Electrostatics ij q i q j / r ij

Enhanced ChemScore G = G 0 + G hbond *N hbond + G lipo *N lipo + G bad *N bad + G rot *N rot +E where G 0 G hbond G lipo G bad G rot are constants (-5.48;-3.34; -0.117; 0.058; 2.56) N hbond is the number of interactions (using geometric criteria) N lipo is the number of lipohilic contacts (cf PMF, DrugScore) N bad is the number of lipohilic-hydrophilic contacts (extension) N rot is the number of frozen rotatable bonds in the ligand E is the VdW interaction energy and ligand torsional energy (extension)

Free Energy Perturbation G sol = - G sol (ligand) - G sol (protein) + G gas + G sol (complex) Error Prone due to Subtraction of Large Numbers Solvent Accessible Surface Area (SASA) approximation (cf ChemScore) J Med Chem (2004) 47 p3065-74

General Observations Single Electrostatic and Tautomer models have Systematic Failures Energy Inadequate for Ranking Series of Diverse Molecules Free Energy Perturbation Methods Slow

Excuses for Inaccuracies Rigid Side-Chains Estimation of Solvation Effects Precision of Force Fields Biologically Non-relevant Binding Kinetics vs Thermodynamics

Ostriches Ignoring Fundamental Theory PMF DrugScore Omitting Geometry Refinement J Med Chem (2004) 47.12 p3032-47 Rigid and Semi-Rigid Ligands Dock LigandFit Biased Validations J Med Chem (2005)

Performance THINK 1.03 (used for CAN-DDO) 42,000,000,000 molecules 126,000 years 900 molecules per CPU day (excluding redundancy) THINK 1.30 (current release) Optimised with assistance from Intel Up to 100 times faster More centres useful for larger sites Refinement of docked geometry About 500,000 molecules per 2GHz CPU day

Validation and Results Reproduce Ligand-Protein Crystal Structures RMS Deviation of non-h Atoms Docking Score Dock Actives Used for Developing Scoring Functions Prediction Enrichment over Random Percentage of False Positives

Selection Criteria Possible Kinases Cancer Targets X-ray Crystal Structures in PDB Resolution (1.9-2.8) All Atoms (1IAN C only) Ligand Flexibility <=10 Rotatable Bonds (excludes 1GAG, 1IR3, 2FGI, 5TMP, 1LCK) 21 Structures Processed

PDB ID 2KI5 Score -52.2 (-39.8) RMS 5.88 (5.32) Notes 3 A S 1QHI -78.4 (-73.5) 1.03 (0.98) 3 A 1STC -105.7 0.59 2 1E8Z 1AQ1 1AGW -86.8-87.0-60.0 (-53.9) 1.77 0.89 3.45 (0.65) 2 2 2 1FGI 1FVV 1FVT -83.8 (-69.4) -81.3 (-73.3) -69.8 (-58.5) 0.56 (0.52) 0.61 (0.56) 1.44 (0.72) 2 3 3 A All Site Points S Single Bond Increment C Conjugated Bond Increment 1 Two Centre Fit 3 Three Centre Fit W Water Site Point

PDB ID Score RMS Notes 1DI8-55.2 (-52.9) 0.98 (0.70) 2 W 1DI9-39.5 1.97 2 W C 1YDR -60.1 (-43.9) 3.00 (2.33) 2 1YDS -55.5 (-48.1) 2.80 (0.76) 2 W 1YDT -36.8 2.86 2 1PME -28.4 1.30 3 W*2 C 4ERK -17.9 (-11.8) 7.83 (3.33) 2 1IEP -90.6 (-84.8) 0.73 (0.69) 3 1FPU -47.1 0.75 3 C 1DM2-54.0 0.61 3 1CKP -43.2 (-43.0) 1.99 (1.10) 2 C 1QCF -58.8 1.33 2

Find-a-Drug Cancer Results PDB Code 1FLT 821P 1C1Y 1FGI 1E7U Protein VEGFr RAS RAF FGFr-1 PI3K Number Tested 3 48 3 44 47 Number Active 3 4 1 6 5

Find-a-Drug Project Processed 270+ protein targets 60+ billion molecules (40-500 million per query) Validation using NCI Cancer Data 20% true positive >10x enrichment

Download THINK software www.treweren.com Virtual Screening Keith.Davies@treweren.com