Multiobjective Robust Design Optimization of a docked ligand
Carlo Poloni, Università di Trieste
Danilo Di Stefano, ESTECO srl

Design Process
[Figure: the design analysis model and the disciplines feeding it: Dynamic Analysis, Logistics & Field Support, Cost Analysis, Mechanical Design, Aerodynamics, Stress Analysis, Risk/Life Management, Manufacturing]
Decomposed design process
[Figure: DESIGN ANALYSIS MODEL, decomposed into SUB-PROBLEM ANALYSIS MODEL and SINGLE TASK ANALYSIS]
The design analysis can be decomposed into similar structures, and each sub-structure can be approached in a similar way. Using design automation procedures (modefrontier), the whole design process or specific sub-problems can be analysed systematically by means of:
- Design of Experiment
- Optimization algorithms
- Decision Making procedures
Design of Experiment: prepare and execute a given number of experiments in order to maximise knowledge acquisition.
Optimisation algorithms: iterate the analysis process to maximise given performances while satisfying constraints.
Decision Making: capture the decision maker's preferences by comparing and judging a subset of configurations.

The automation of the design process means:
- Formulate a logic analysis process
- Execute simulations or experiments efficiently
- IT infrastructure (Intranet)
- Logic and Optimisation Algorithms
- Archive sensible data in an organized and easily accessible way (through a web browser)
- Take rational decisions on the best compromise between cost and performance
Handling uncertainties
When facing creative design, uncertainties in the performance prediction must be taken into account:
- Noise due to unpredictable environment
- Noise due to geometrical tolerances
- Numerical inaccuracies
Different tools are available in modefrontier to face these issues:
- statistical tools for DFSS
- RSM methodology
- MORDO techniques

Uncertainty Quantification Comparisons
Uncertainty Quantification methodologies estimate the mean and variance of a response (e.g., the output of a CFD solver) when the input is uncertain (stochastic).
Example: stochastic input X, uncertain response Y = f(X).
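The idea of estimating the mean and variance of Y = f(X) can be illustrated with plain Monte Carlo sampling. This is only a minimal sketch: the quadratic `response` is a hypothetical stand-in for an expensive solver, the Gaussian input distribution is an assumption, and none of this represents the tools in modefrontier.

```python
import random
import statistics


def monte_carlo_moments(f, mu, sigma, n_samples=20000, seed=0):
    """Estimate mean and standard deviation of Y = f(X), X ~ Normal(mu, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ys = [f(rng.gauss(mu, sigma)) for _ in range(n_samples)]
    return statistics.fmean(ys), statistics.stdev(ys)


def response(x):
    # Hypothetical response, standing in for an expensive solver run.
    return x * x


# For this toy response the exact values are known:
# mean = mu^2 + sigma^2, variance = 4 mu^2 sigma^2 + 2 sigma^4.
mean_y, std_y = monte_carlo_moments(response, mu=1.0, sigma=0.1)
```

Monte Carlo needs many solver calls to reduce the sampling error, which is the motivation for comparing it with cheaper quadrature-style methods.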
Uncertainty Quantification Comparisons
[Figure: deterministic response vs. stochastic response]
MORDO searches for the optima of the mean and standard deviation of a stochastic response rather than the optima of the deterministic response (the output from the solver).

Polynomial Chaos Expansion Error Comparison
For the former example, reaching 1% error requires 8 points for the mean and 12 points for the standard deviation.
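Why a handful of points can be enough is easiest to see with Gauss-Hermite collocation, a quadrature rule commonly used to compute polynomial chaos coefficients. The sketch below assumes a single Gaussian input and uses the standard 3-point rule, which is exact for polynomial integrands up to degree 5. The function name and the toy response are hypothetical; the example differs from the one on the slide and is not modefrontier's implementation.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable:
# nodes 0 and +/- sqrt(3), weights 2/3 and 1/6 (normalised to sum to 1).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def collocation_moments(f, mu, sigma):
    """Mean and standard deviation of f(X), X ~ Normal(mu, sigma), from 3 solver calls."""
    ys = [f(mu + sigma * z) for z in NODES]
    mean = sum(w * y for w, y in zip(WEIGHTS, ys))
    second_moment = sum(w * y * y for w, y in zip(WEIGHTS, ys))
    variance = max(second_moment - mean * mean, 0.0)  # guard against round-off
    return mean, math.sqrt(variance)


# Hypothetical quadratic response: both moments are recovered exactly here,
# because f and f^2 are polynomials of degree 2 and 4.
mean_y, std_y = collocation_moments(lambda x: x * x, mu=1.0, sigma=0.1)
```

For responses that are not low-order polynomials, more nodes are needed to reach a given error, and the standard deviation typically needs more than the mean.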
One example with VI-Car_Real_Time
Vehicle dynamics point of view
Model: open wheel car
Software: VI-Car_Real_Time
Targets:
- Increase AVG speed
- Minimize the steer oscillations
Standard Parameters (10):
- Setup of the car:
  - front/rear ride_heights (2)
  - front/rear springs_stiffnesses (2)
  - front/rear antirollbar_stiffnesses (2)
  - front/rear dampers scaling factors (2)
- Limited Slip Differential setting:
  - C0 coefficient, Preload (1)
  - C1 coefficient (1)
Stochastic Parameters (3):
- Driver behaviour: preview_time (1)
- Tires: front/rear lateral grip factor (2)

The Workflow
Optimization set-up
- Scheduler: NSGA-II
- DOE: 24 designs
- Number of generations: 10
- Number of concurrent designs: 28
- Number of MORDO samples: 28
- 28*28*10 = 7840 design evaluations
- About 4 minutes per analysis on an AMD Opteron 2.4 GHz
- MORDO optimization time: 22 h

Optimization results: MORDO
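The evaluation count above can be checked with a few lines of arithmetic. The sketch below also derives an idealised wall-clock time under the assumption that all 28 concurrent designs run fully in parallel with no overhead. That assumption and the function name are mine, not stated in the source.

```python
def mordo_budget(concurrent_designs, mordo_samples, generations,
                 minutes_per_analysis, parallel_slots):
    """Return (evaluations, serial hours, ideal parallel hours)."""
    evaluations = concurrent_designs * mordo_samples * generations
    serial_hours = evaluations * minutes_per_analysis / 60.0
    # Assumption: perfect parallel scaling, no scheduling or I/O overhead.
    ideal_parallel_hours = serial_hours / parallel_slots
    return evaluations, serial_hours, ideal_parallel_hours


evaluations, serial_h, parallel_h = mordo_budget(
    concurrent_designs=28, mordo_samples=28, generations=10,
    minutes_per_analysis=4.0, parallel_slots=28)
```

Under these assumptions the ideal parallel time comes out somewhat below the reported 22 h, which is plausible given that the per-analysis time is quoted only approximately and real runs carry overhead.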
Optimization results: best design statistical properties

Conclusion
Optimizing means improving performance, even when uncertainties are dominant.
Is all this applicable to drug design?
modefrontier applied to protein ligand docking

Protein ligand docking: definition
Main features of a docking methodology:
1. Molecular representation: the Autodock molecular representation is adopted
2. Search method: modefrontier schedulers are used
3. Scoring of different poses: a Pareto frontier of poses is found
Protein ligand docking: Autodock v4
It uses a grid molecular representation. It pre-calculates one energy grid map for each atom type, plus an electrostatic map, so the energies only need to be looked up at runtime. The ligand is flexible; the protein is rigid.
Morris, G.M., et al. (1998) "Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function", J. Computational Chemistry, 19: 1639-1662.

Autodock Empirical Scoring Function
The Autodock scoring function is a weighted sum of different contributions:
- Short-range weak van der Waals attractive forces
- Long-range electrostatic forces
- Hydrogen bonds
- Conformational entropy
- Desolvation measure
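The structure of a weighted-sum scoring function can be sketched as follows. The class and function names are hypothetical, and the weights are placeholders, not the calibrated Autodock coefficients. The sketch only shows how the five contributions listed above combine into one score.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ScoreTerms:
    """One value per contribution; also reused to hold the weights."""
    van_der_waals: float
    electrostatic: float
    hydrogen_bond: float
    conformational_entropy: float
    desolvation: float


def weighted_score(terms, weights):
    """Weighted sum of the energy contributions."""
    return sum(w * t for w, t in zip(astuple(weights), astuple(terms)))


# Placeholder weights and term values, for illustration only.
weights = ScoreTerms(0.5, 0.5, 0.5, 0.5, 0.5)
terms = ScoreTerms(-2.0, -1.0, -0.5, 1.0, 0.5)
score = weighted_score(terms, weights)
```

Because the weights are fixed in advance, a single score hides how the individual contributions trade off against each other.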
Protein ligand docking: Autodock as an energy server
The output of Autodock makes it possible to distinguish the ligand internal energy from the intermolecular energy, which in turn makes a multi-objective docking analysis possible.
- Objective 1: minimization of the ligand internal energy
- Objective 2: minimization of the intermolecular energy
In this way, we used Autodock as an "energy server" to score different poses and a Multi-Objective Genetic Algorithm to search for docked conformations.

Protein ligand docking: bound docking tests
At left: PDB structure of the complex 1MEH. At right: details of bonded and non-bonded interactions of the ligand in the binding site.
The total number of degrees of freedom of the system is: 3 (translational) + 4 (orientational) + 8 (torsional) = 15.
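The 15 degrees of freedom can be pictured as a flat vector of design variables handed to the optimizer. In the sketch below the class name is hypothetical, and storing the 4 orientational values as a quaternion is an assumption; the source only gives the counts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LigandPose:
    translation: Tuple[float, float, float]         # 3 translational variables
    orientation: Tuple[float, float, float, float]  # 4 orientational (assumed quaternion)
    torsions: Tuple[float, ...]                     # one angle per rotatable bond

    def to_vector(self) -> List[float]:
        """Flatten the pose into the variable vector seen by the optimizer."""
        return [*self.translation, *self.orientation, *self.torsions]


# Hypothetical starting pose with 8 torsions, matching the count given for 1MEH.
pose = LigandPose(
    translation=(0.0, 0.0, 0.0),
    orientation=(1.0, 0.0, 0.0, 0.0),
    torsions=(0.0,) * 8,
)
n_dof = len(pose.to_vector())
```

Each candidate vector would then be scored on the two objectives separately, rather than on their sum.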
Protein ligand docking: modefrontier settings
MOGA-II parameter settings for 1MEH:
- Population: 75
- Initial population: random
- Number of generations: 500
- Probability of crossover: 0.8
- Probability of selection: 0.05
- Probability of mutation: 0.02

Protein ligand docking: Pareto frontier of docked solutions
Convergence after 7000 energy evaluations. All the solutions have an RMSD (Root Mean Squared Deviation) from the PDB structure of less than 1.5 Å and cover a sufficiently wide range of energies (they are not "the same" solution).
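A Pareto frontier for two minimized objectives can be extracted with a simple dominance check. The sketch below is generic and is not the MOGA-II algorithm; the energy pairs are made-up numbers standing for (ligand internal energy, intermolecular energy).

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (internal, intermolecular) energies of five poses.
poses = [(-1.0, -9.0), (-2.0, -8.0), (-1.5, -7.0), (-3.0, -5.0), (-0.5, -6.0)]
front = pareto_front(poses)
# front == [(-1.0, -9.0), (-2.0, -8.0), (-3.0, -5.0)]
```

The surviving poses differ in how they balance the two energies, which is why the frontier is not a set of copies of one solution.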
Protein ligand docking: bound docking tests
At left: PDB structure of the complex 1PMN. At right: details of bonded and non-bonded interactions of the ligand in the binding site.
The total number of degrees of freedom of the system is: 3 (translational) + 4 (orientational) + 7 (torsional) = 14.

Protein ligand docking: Pareto frontier of docked solutions
Convergence after 7000 energy evaluations. All the solutions have an RMSD (Root Mean Squared Deviation) from the PDB structure of less than 1.5 Å and cover a sufficiently wide range of energies (they are not "the same" solution).
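The RMSD criterion can be made concrete with a short function. This sketch assumes that the docked and reference ligands list the same atoms in the same order and that both are already in the receptor frame, so no superposition is performed. The coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally ordered atom lists (in Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom ligand, docked pose shifted by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
within_threshold = rmsd(docked, reference) < 1.5
```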
Inclusion of protein side-chain flexibility
The aim is to allow for limited protein flexibility at side-chain level by exploiting a multi-objective and robust approach to docking.
The variables are a selection of binding-site torsions (side chains), and the objective is the minimization of the mean value of the docked conformation energies, with a constraint on the variance (to prevent over-optimized solutions).

Why use Robust Design in docking?
Robust Design could be useful to introduce a limited flexibility of the binding site.
Torsions of selected side chains are considered as noisy variables of the docking process.
Side chains are selected from residues actually interacting with the ligand.
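The robust formulation (minimize the mean energy subject to a bound on the variance, with torsions as noisy variables) can be sketched as below. Plain Gaussian sampling is used here as a simple substitute for the sampling scheme of the actual study, and `toy_energy`, the noise level, and the variance limit are all hypothetical.

```python
import random
import statistics


def robust_statistics(energy, nominal_torsions, noise_deg, n_samples, seed=0):
    """Mean and variance of the energy when each torsion is perturbed by noise."""
    rng = random.Random(seed)
    energies = []
    for _ in range(n_samples):
        perturbed = [t + rng.gauss(0.0, noise_deg) for t in nominal_torsions]
        energies.append(energy(perturbed))
    return statistics.fmean(energies), statistics.pvariance(energies)


def is_feasible(variance, max_variance):
    """Constraint on the variance, used to reject over-optimized solutions."""
    return variance <= max_variance


def toy_energy(torsions):
    # Hypothetical smooth energy surface, standing in for a docking run.
    return sum((t / 10.0) ** 2 for t in torsions)


mean_e, var_e = robust_statistics(toy_energy, [0.0, 0.0], noise_deg=5.0, n_samples=200)
accepted = is_feasible(var_e, max_variance=10.0)
```

A candidate side-chain conformation is then ranked by `mean_e` only if it passes the variance constraint.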
A test case: phosphocholine complex (2MCP.PDB)
Main interactions involve eight residues of the binding site.

2MCP: selected side-chain torsions (each defined by a pair of atoms, written chain/residue/atom)
- H/ARG 52/CG - H/ARG 52/CB
- H/ARG 52/CD - H/ARG 52/CG
- H/ARG 52/NE - H/ARG 52/CD
- H/TYR 33/OH - H/TYR 33/CZ
- H/TYR 33/CG - H/TYR 33/CB
- H/TRP 107/CG - H/TRP 107/CB
- H/ASN 101/CG - H/ASN 101/CB
- L/ASP 97/CG - L/ASP 97/CB
- L/HIS 98/CG - L/HIS 98/CB
- L/SER 99/OG - L/SER 99/CB
- L/TYR 100/CG - L/TYR 100/CB
- L/TYR 100/OH - L/TYR 100/CZ
modefrontier workflow: single objective
A single-objective robust optimization is performed. The objective is to maximize the global Autodock fitness value.
M.O.R.D.O. module of modefrontier: Latin Hypercube with Polynomial chaos collocation.

modefrontier workflow: multiobjective
A multi-objective robust optimization is performed. The objectives are to maximize the intramolecular and intermolecular contributions to the Autodock fitness value.
M.O.R.D.O. module of modefrontier: Latin Hypercube with Polynomial chaos collocation.
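Latin Hypercube sampling, named above, can be sketched in a few lines: each dimension is cut into as many equal strata as there are samples, and every stratum is used exactly once. This is a generic textbook version on the unit cube, not the M.O.R.D.O. implementation. The function name is hypothetical, and mapping the samples to torsion ranges is left out.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum per dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each stratum [k/n, (k+1)/n) ...
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # ... then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical use: 8 noise samples over 3 noisy torsions (unit-cube coordinates).
samples = latin_hypercube(n_samples=8, n_dims=3)
```

Compared with independent random sampling, this spreads a small number of samples evenly along every noisy variable.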
Robust docked system (single objective)
The best-docked ligand for the most robust receptor side-chain conformation (at left) and the X-ray structure (at right) are reported. Note the slightly different binding-site shape.

Robust docked system (single objective)
These five torsions seem to be more effective in driving the robustness of the solution (TRP H107, HIS L98, TYR L100, ARG H52).
Robust docked system (multi-objective)
Sampling of the global objective space, with Pareto solutions marked. A trade-off is detected between the two objectives.

SOM of the Pareto frontier
The robustness of the intermolecular interactions is mainly driven by torsions L/ASP 97/CG - L/ASP 97/CB and L/TYR 100/OH - L/TYR 100/CZ. That of the intramolecular interactions is mainly driven by torsions H/ARG 52/CG - H/ARG 52/CB and H/ARG 52/CD - H/ARG 52/CG. Torsions H/ASN 101/CG - H/ASN 101/CB and H/ARG 52/NE - H/ARG 52/CD influence both the intra- and intermolecular interactions.
Robustness-driving interactions
[Figure: single-objective and multi-objective panels; legend: intermolecular, intramolecular, both intra- and intermolecular]

Conclusions
The multi-objective approach proves to be meaningful. It could be used to tune the coefficients of a generic weighted-sum scoring function in a single-objective analysis.
The analysis of the shape of the Pareto frontier could reveal a predominance of one or more of the objectives. A corresponding change in the relative coefficients could then be made, and a single-objective approach could yield more accurate results (thus overcoming one of the drawbacks of empirical scoring functions).
A robust-design-based approach to docking seems to allow for an effective, limited inclusion of receptor flexibility.