Assessing the reliability of protein-ligand structures

Aim

A common task in structure-based drug design is the validation of protein-ligand structures. This process needs to be quick, visual and supported by numeric data. Here we present a simple workflow for rapidly detecting suspect and/or interesting features of protein-ligand complexes.

Introduction

Visual inspection and assessment of protein-ligand complexes is a common task for people involved in structure-based drug design. However, such visual analysis can be aided by tools that automatically detect dubious structural features. Further, easy access to structural knowledge bases can provide rapid validation of interesting features observed in the structures.

In this case study three factor Xa structures from the PDB (1nfy [1], 1ksn [2] and 1xka [3]) are used to illustrate a simple workflow for validating protein-ligand structures. Factor Xa is a serine protease that plays a key role in the blood coagulation cascade. Accumulation of factor Xa can lead to thrombosis, and as such it is a well-researched drug target.

Figure 1 - Ligands used in this study. In orange the inhibitor FXV673 (1ksn). In magenta the inhibitor RPR200095 (1nfy). In green the inhibitor FX-2212A (1xka).
Although three structures from the PDB were used in this study, the same workflow could easily be applied to structures derived from in-house crystallography programs or to ligand poses produced by docking studies.

Method

Dubious water molecules in the crystal structures were identified using the water module in Relibase+ [4]. Key water molecules forming part of hydrogen bond networks between the protein and the ligand were likewise identified using the water module in Relibase+ [5]. Crystallographic packing effects, arising from symmetry-related molecules, were analysed using Relibase+ [6]. The geometries of the ligands were analysed using Mogul [7]. Key interactions between the proteins and the ligands were analysed using IsoStar [8] and 3D protein-ligand interaction searches in Relibase+. Both Mogul and IsoStar form part of the CSD system and are accessible through the Hermes visualiser. All protein-ligand figures were generated using Hermes.

As Relibase+ is comprehensive with respect to the PDB there was, in this case, no need to load the structures into Relibase+ prior to the analysis. However, in-house data, when available, can easily be imported, stored and maintained in Relibase+.

Results

The structure 1ksn shows the inhibitor FXV673 bound to factor Xa. In this case Relibase+ flags water 11 as dubious based upon its low B-factor (27.73), octahedral coordination and short bonds (<0.9 Å). This indicates a high likelihood that this particular water is in fact a metal ion. This could have a large impact on the results of any molecular mechanics or molecular dynamics calculations carried out using this model, as important charges would be missing. Equivalent dubious water molecules are present in the 1nfy and 1xka structures as well.
                              1ksn            1nfy            1xka
Water ID                      11              24              680
Low B-factor (average)        23.70 (35.12)   15.76 (26.76)   2.00 (20.77)
Octahedral coordination RMS   0.16            0.15            0.17
Short bonds                   <0.9 Å          <0.9 Å          <0.9 Å

Table 1 - Dubious water molecules highlighted by Relibase+.

Relibase+ was also used to identify key waters mediating hydrogen bonds between the proteins and the ligands. This revealed one of the key waters being displaced by the phenyl-chloride group of the inhibitor RPR200095 (1nfy).
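The B-factor criterion in Table 1 can be illustrated with a few lines of Python. This is only a sketch: the numeric values are copied from Table 1, but the 0.7 ratio cut-off and the helper names `bfactor_ratio` and `is_suspect` are assumptions introduced here for illustration and are not the criteria actually implemented in Relibase+.

```python
# Illustrative sketch only: the cut-off and the flagging rule below are
# assumptions, not the criteria implemented in Relibase+.

# Values copied from Table 1: water B-factor and the average B-factor
# quoted in brackets for each PDB entry.
TABLE_1 = {
    "1ksn": {"water_id": 11, "b_water": 23.70, "b_average": 35.12},
    "1nfy": {"water_id": 24, "b_water": 15.76, "b_average": 26.76},
    "1xka": {"water_id": 680, "b_water": 2.00, "b_average": 20.77},
}

RATIO_CUTOFF = 0.7  # hypothetical threshold chosen for this example


def bfactor_ratio(b_water: float, b_average: float) -> float:
    """Return the water B-factor as a fraction of the quoted average."""
    if b_average <= 0:
        raise ValueError("average B-factor must be positive")
    return b_water / b_average


def is_suspect(entry: dict, cutoff: float = RATIO_CUTOFF) -> bool:
    """Flag a water whose B-factor is low relative to the average."""
    return bfactor_ratio(entry["b_water"], entry["b_average"]) < cutoff


if __name__ == "__main__":
    for pdb_id, entry in TABLE_1.items():
        ratio = bfactor_ratio(entry["b_water"], entry["b_average"])
        print(f"{pdb_id} water {entry['water_id']}: "
              f"ratio {ratio:.2f}, suspect={is_suspect(entry)}")
```

A low ratio on its own is weak evidence. In the case study it is the combination with near-octahedral coordination and short contacts that points towards a metal ion.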
Figure 2 - The phenyl-chloride group of the inhibitor RPR200095 (1nfy) displacing a key water observed in the 1ksn structure. Note also the amidino-carboxylate interaction formed by the inhibitor FXV673 (1ksn) and ASP189.

An interaction worthy of comment is that formed between the carboxylate group of ASP189 and the amidino group of the inhibitors FXV673 (1ksn) and FX-2212A (1xka). In these interactions the closest distance between the interacting oxygen and nitrogen atoms is around 2.6 Å and the angle between the planes of the functional groups is approximately 60°. This type of interaction is easily validated using IsoStar.

Figure 3 - IsoStar plots for the amidino-carboxylate interaction. Interestingly, the CSD data (left) show a distinctly more planar type of interaction than the PDB data (right).
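The two geometric descriptors quoted above, the closest N-O distance and the angle between the planes of the two functional groups, can be computed directly from atomic coordinates. The sketch below is a minimal pure-Python illustration, not the procedure used by IsoStar or Relibase+. It assumes each plane is defined by three atoms (for example C, O, O of the carboxylate and C, N, N of the amidino group), and the coordinates in the usage section are made-up placeholders rather than values from 1ksn or 1xka.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_normal(p1: Vec, p2: Vec, p3: Vec) -> Vec:
    """Unit normal of the plane through three atom positions."""
    n = _cross(_sub(p2, p1), _sub(p3, p1))
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("the three atoms are collinear")
    return (n[0] / length, n[1] / length, n[2] / length)


def interplanar_angle(group_a: Sequence[Vec], group_b: Sequence[Vec]) -> float:
    """Angle in degrees (0 to 90) between the planes of two atom triples."""
    na = plane_normal(*group_a)
    nb = plane_normal(*group_b)
    # The direction of a plane normal is arbitrary, so use |cos|.
    cos_angle = min(1.0, abs(sum(x * y for x, y in zip(na, nb))))
    return math.degrees(math.acos(cos_angle))


def closest_distance(oxygens: Sequence[Vec], nitrogens: Sequence[Vec]) -> float:
    """Shortest O...N distance between two sets of atoms."""
    return min(math.dist(o, n) for o in oxygens for n in nitrogens)


if __name__ == "__main__":
    # Hypothetical coordinates (in Å), for illustration only.
    carboxylate = ((0.0, 0.0, 0.0), (1.1, 0.6, 0.0), (-1.1, 0.6, 0.0))
    amidino = ((0.0, 4.5, 1.0), (1.1, 3.6, 0.4), (-1.1, 3.9, 1.6))
    print(f"interplanar angle: {interplanar_angle(carboxylate, amidino):.1f} deg")
    print(f"closest N-O: {closest_distance(carboxylate[1:], amidino[1:]):.2f} Å")
```

Folding the angle into the 0 to 90 degree range means that two coplanar groups give 0 and two perpendicular groups give 90, regardless of the order in which the atoms are listed.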
In order to get more specific detail on the distance and angular distribution of this interaction, a 3D substructure search was set up using Relibase+.

Figure 4 - 3D search setup using Relibase+, retrieving the N-O distance and the interplanar angle of the amidino-carboxylate interaction.

This revealed that the mode of the interaction distance was 2.8 Å and the mode of the angle between the planes was 60°, validating that the interaction observed in 1ksn and 1xka was normal.

Figure 5 - Histogram generated by Relibase+ showing the distribution of the angle of the amidino-carboxylate interaction.

Another observation that is easy to make using Relibase+ is that the binding mode of the inhibitor FX-2212A (1xka) is potentially influenced by packing effects arising from symmetry related
molecules. This type of information is particularly important to consider when deciding which protein model(s) to use in docking studies.

Figure 6 - In the 1xka structure (capped sticks) symmetry-related molecules impinge on the binding site, causing a side-chain movement in TYR99.

The geometries of the inhibitors were assessed using Mogul. The inhibitor RPR200095 (1nfy) has a six-membered ring in a half-chair conformation. Intuition might have led us to think of this as an unreasonable conformation. However, Mogul tells us that this conformation is not unusual for this particular ring structure (albeit with the warning that it found only 9 hits in the CSD).

Two features stand out from the Mogul analysis of the inhibitor FXV673 (1ksn). First, all but 3 of the carbon-carbon bond distances are too long, by 0.03 to 0.08 Å. Second, the torsion connecting the aromatic ring in the S1 pocket does not lie in a minimum.
Figure 7 - Histogram generated by Mogul demonstrating that the highlighted torsion does not lie in a minimum.

The inhibitor FX-2212A (1xka) also has several bonds that are too long, by 0.04 to 0.1 Å. More importantly, however, Mogul highlights three torsions that fall far away from their minima. The bonds affected all lie in the region of the binding site that is affected by the packing of symmetry-related molecules, supporting the suspicion that the binding mode of this ligand is influenced by crystallographic packing effects.

Figure 8 - The three torsions highlighted (ball and stick) are flagged as unusual by Mogul; these torsions are all in close proximity to symmetry-related molecules.
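The idea behind the torsion check, comparing an observed torsion with a distribution of torsions from similar fragments, can be sketched as follows. This is not Mogul's algorithm. The 20° bin width, the 5% threshold, the folding of torsions into the 0 to 180° range and the reference values in the usage section are all assumptions made for illustration.

```python
from typing import Sequence


def torsion_bin(angle_deg: float, bin_width: float = 20.0) -> int:
    """Histogram bin index for a torsion folded into the 0-180 degree range."""
    # Wrap into [-180, 180) and take the absolute value.
    folded = abs(((angle_deg + 180.0) % 360.0) - 180.0)
    last_bin = int(180.0 // bin_width) - 1
    # A torsion of exactly 180 degrees belongs to the last bin.
    return min(int(folded // bin_width), last_bin)


def local_fraction(query_deg: float,
                   reference_degs: Sequence[float],
                   bin_width: float = 20.0) -> float:
    """Fraction of reference torsions falling in the same bin as the query."""
    if not reference_degs:
        raise ValueError("reference distribution is empty")
    target = torsion_bin(query_deg, bin_width)
    hits = sum(1 for t in reference_degs if torsion_bin(t, bin_width) == target)
    return hits / len(reference_degs)


def is_unusual(query_deg: float,
               reference_degs: Sequence[float],
               min_fraction: float = 0.05,
               bin_width: float = 20.0) -> bool:
    """Flag a torsion that lies in a sparsely populated region."""
    return local_fraction(query_deg, reference_degs, bin_width) < min_fraction


if __name__ == "__main__":
    # Hypothetical reference torsions (degrees), clustered near 0 and 180.
    reference = [0.0, 5.0, -5.0, 10.0, 3.0, 175.0, 180.0, -178.0, 170.0, -172.0]
    for query in (2.0, 90.0):
        print(query, local_fraction(query, reference), is_unusual(query, reference))
```

As the case study notes for the half-chair ring, any such flag should be read together with the number of hits behind the distribution: a sparse reference set makes the verdict less reliable.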
Conclusions

In summary, a simple workflow for validating protein-ligand complexes has been described. Dubious water molecules, key water molecules and crystallographic packing effects are all easily identified using Relibase+. Validation of key protein-ligand interactions is quickly achieved using IsoStar, and more in-depth analysis can be carried out using 3D searches in Relibase+. Finally, ligand geometries are easily assessed using Mogul. Further, all these tools are tightly integrated into the Hermes visualiser, making it easy to cross-reference any numeric data produced by the different programs with the 3D structures of the protein-ligand complexes.

References

1. S. Maignan, J.-P. Guilloteau, Y. M. Choi-Sledeski, M. R. Becker, W. R. Ewing, H. W. Pauls, A. P. Spada and V. Mikol, J. Med. Chem., 46, 2003, 685-690
2. K. R. Guertin et al., Bioorg. Med. Chem. Lett., 12, 2002, 1671-1674
3. K. Kamata, H. Kawamoto, T. Honma, T. Iwama and S.-H. Kim, Proc. Natl. Acad. Sci. USA, 95, 1998, 6630-6635
4. M. Hendlich, A. Bergner, J. Günther and G. Klebe, J. Mol. Biol., 326, 2003, 607-620
5. J. Günther, A. Bergner, M. Hendlich and G. Klebe, J. Mol. Biol., 326, 2003, 621-636
6. A. Bergner, J. Günther, M. Hendlich, G. Klebe and M. Verdonk, Biopolymers (Nucleic Acid Sci.), 61, 2002, 99-110
7. I. J. Bruno, J. C. Cole, M. Kessler, Jie Luo, W. D. S. Motherwell, L. H. Purkis, B. R. Smith, R. Taylor, R. I. Cooper, S. E. Harris and A. G. Orpen, J. Chem. Inf. Comput. Sci., 44, 2004, 2133-2144
8. I. J. Bruno, J. C. Cole, J. P. M. Lommerse, R. S. Rowland, R. Taylor and M. L. Verdonk, J. Comput.-Aided Mol. Des., 11, 1997, 525-537
Products

Relibase+ - a powerful tool for accessing and analysing protein-ligand data from public and in-house data sources.
CSD - the world's only comprehensive, fully curated database of crystal structures, containing over 500,000 entries.
IsoStar - a knowledge base of intermolecular interactions which provides easy appreciation of the geometry, strength and stability of interactions.
Mogul - a knowledge base of CSD-derived molecular geometries providing a rapid method of validating computational models and newly refined structures.

For further information please contact Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK. Tel: +44 1223 336408, Fax: +44 1223 336033, Email: admin@ccdc.cam.ac.uk