Phase determination methods in macromolecular X-ray crystallography

Importance of protein structure determination:
Proteins are the machinery of life and are essential for many functions in the human body. The primary structure of a protein is a chain built from 20 standard amino acids, which adopts a particular fold (structure) that enables its function. To understand the function, it is therefore necessary to know the protein fold, which is usually described as the protein structure. Protein structures are used extensively in drug design and in determining structure-function relationships. The most common way to determine a structure is X-ray crystallography. With this method it is possible to obtain everything from the overall fold down to the molecular and atomic details of the protein.

Introduction of X-ray crystallography method:
The year 2014 was declared the International Year of Crystallography, marking the centenary of the field. The 1914 Nobel Prize was awarded to Max von Laue for X-ray diffraction experiments on crystals carried out in 1912, and the 1915 prize went to William and Lawrence Bragg. Within this vast field, macromolecular structural biology and biophysics took a large leap forward with a series of notable structures determined over 54 years, alongside the structure of DNA (Richardson & Richardson, 2014). This led to a demand for high-intensity synchrotron X-ray sources in many countries, and for new generations of even more intense sources such as X-ray Free Electron Lasers (XFEL). However, the structures of many highly sought-after proteins, such as the oncoproteins c-Myc and p53 and many membrane proteins that are drug targets, remain unknown, particularly in their complexes. Here, I describe the stepwise procedure of macromolecular structure determination and the details of phasing X-ray diffraction data.

The steps involved in protein structure determination using X-ray crystallography are:
1. Crystallization of the desired protein
2. X-ray diffraction and data collection
3. From the diffraction pattern to the electron density
4. X-ray structure quality assessment
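Step 3 rests on a Fourier synthesis. As a reminder in standard textbook notation (the symbols V for the unit-cell volume, h, k, l for the reflection indices and α for the phase are not used in the original text and are introduced here only for illustration), the electron density at fractional coordinates (x, y, z) can be written as:

```latex
\rho(x,y,z) = \frac{1}{V} \sum_{h}\sum_{k}\sum_{l}
  \left| F(hkl) \right| \, e^{\,i\alpha(hkl)} \, e^{-2\pi i (hx + ky + lz)}
```

Both the amplitude |F(hkl)| and the phase α(hkl) of every reflection enter this sum.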
Figure 1. Stepwise procedure for determining the three-dimensional crystal structure of a protein. In the first step, high-quality protein crystals are produced and, using a highly intense X-ray beam, the intensities diffracted by the crystals are measured. Electron density maps are calculated from the complete data sets. Finally, three-dimensional atomic models of the macromolecule are built after several rounds of quality assessment and refinement. The figures are from my own work on solving the structure of the PP2Aa protein (Anandapadamanaban M et al., Master Thesis, unpublished work).

Phase Problem:
The X-rays diffracted by the crystal are recorded on a detector, which measures only the intensity of each reflection. The amplitude can be obtained from the intensity, but the relative phase is lost. This inability to record the phases is called the phase problem. To solve the structure, an electron density map must be calculated from the diffraction data, and for this the phases must also be derived. Currently, initial phases are obtained by one of several methods, chosen according to its advantages for the specific case. The main methods are:
1. Isomorphous replacement
2. Molecular replacement
3. Density modification
4. Direct methods

1. Isomorphous Replacement:
In the first step, heavy atoms such as mercury, which have many electrons, are soaked into native crystals. The heavy atoms bind to the macromolecules, producing what is called a derivative crystal. The method assumes that the native and derivative crystals are isomorphous, that is, that they share the same packing and conformation. Under this assumption, X-ray diffraction data are collected from both crystals. The next step is to find the heavy-atom positions using the Patterson function. From these positions, the relative phase angles of the reflections in the native diffraction data can be calculated (Fig 2).
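The phase calculation in the last step can be illustrated with a short numerical sketch. It assumes a single derivative, nonzero amplitudes, and a heavy-atom structure factor FH that has already been calculated from the located heavy-atom positions; the function name `sir_phase_candidates` and all numbers are hypothetical and are not taken from the original work. Treating the structure factors as vectors with FPH = FP + FH, the cosine rule gives two candidate protein phases for each reflection:

```python
import cmath
import math


def sir_phase_candidates(fp_amp, fph_amp, fh):
    """Return the two candidate native phases (radians) for one reflection.

    fp_amp  -- measured native amplitude |FP| (assumed nonzero)
    fph_amp -- measured derivative amplitude |FPH|
    fh      -- heavy-atom structure factor FH as a complex number (nonzero)
    """
    fh_amp = abs(fh)
    alpha_h = cmath.phase(fh)
    # Cosine rule on the triangle FPH = FP + FH:
    # |FPH|^2 = |FP|^2 + |FH|^2 + 2 |FP| |FH| cos(alpha_P - alpha_H)
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    # Clamp to [-1, 1]: measurement error can push the ratio slightly outside.
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.acos(cos_delta)
    return alpha_h + delta, alpha_h - delta


# Hypothetical reflection: build FPH from a known FP so the result can be checked.
true_fp = cmath.rect(100.0, math.radians(60.0))  # |FP| = 100, phase 60 degrees
fh = cmath.rect(20.0, math.radians(10.0))        # |FH| = 20, phase 10 degrees
fph = true_fp + fh

candidates = sir_phase_candidates(abs(true_fp), abs(fph), fh)
print([round(math.degrees(a), 1) for a in candidates])
```

For this made-up reflection, one candidate recovers the true phase of 60 degrees and the other is -40 degrees. The amplitudes alone cannot distinguish the two, which is why a second derivative or some other source of phase information is generally needed to resolve the ambiguity.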
Figure 2. Vector diagram showing the structure factors of the native protein (FP), the heavy atom (FH) and the heavy-atom derivative (FPH), and the phase angle of the protein (αP). Adapted and modified from the Macromolecular Crystallography book (Sanderson and Skelly, 2007).

FPH = FP + FH

The steps involved in isomorphous replacement are the preparation of a heavy-atom derivative of the protein crystal and the collection of X-ray diffraction data from the native and derivative crystals. The heavy atoms are then located using the Patterson function (Figure 3), and their parameters are refined to calculate the phases of the native crystal. With these phases, an electron density map is calculated.

Figure 3. Schematic representation of isomorphous difference data (panel labels: difference of diffraction patterns; light atoms cancel out and heavy atoms remain). The derivative crystal contains the light atoms of the protein plus the heavy atoms, and its diffraction data are collected alongside those of the native crystal. Subtracting one data set from the other leaves the heavy-atom contribution, from which the heavy-atom phases are calculated.

2. Molecular replacement (MR):
If the structure of a homologous protein is known, it can be used to find the unknown protein structure. As a rule of thumb, the method can be used to find the relative phases if the two proteins share at least about 30% sequence identity (Rossmann, 2001). This is based on the fact that similar protein folds give similar X-ray scattering. Since the phase
was determined for a known protein, it can be used as the initial phase estimate for the unknown protein. Molecular replacement rotates and translates the search model within the unit cell or asymmetric unit until it finds the solution for the unknown structure. Rotation and translation together amount to a six-dimensional search, which is usually carried out with Patterson-based rotation and translation functions (Navaza, Panepucci, & Martin, 1998). The shape of the protein is initially guessed from the search model, and the Patterson map supplies the intramolecular vectors, which are used to orient the model, and the intermolecular vectors, which are used to position it. This is the best method when the protein structure is already known and the structure of a mutated variant or a protein-ligand complex is wanted.

The steps involved in MR are as follows. A suitable search model is identified. Once the model has been chosen, the rotation and translation functions are applied to orient and position the molecule (Fig 4). The structure is then refined using refinement programs and map calculations.

Figure of merit, m: This value indicates how well the phase angle is determined. It is the length of the mean phase vector on the unit phase circle and ranges from 0 (phase undetermined) to 1 (phase known exactly). It serves as a correction and evaluation factor (Zwart, 2005).

Figure 4. Rotation and translation functions applied to a probe built from the known structure (panel labels: Rotation, Translation, Probe A, Probe B). R and T denote rotation and translation, respectively.

3. Density Modification:
In this method the phases are improved (Terwilliger, 2003). The starting point is a set of preliminary phases, for example those from the map obtained by isomorphous replacement. This initial map usually contains errors caused by imperfections in the phases. The crucial steps in density modification are flattening the solvent region and exploiting non-crystallographic symmetry (NCS) (Argos & Rossmann, 1974), which together modify the map
into a more realistic and interpretable one. Phases calculated from the modified map are then combined with the initial phases, and a new set of maps is calculated from the improved phases.

4. Direct Methods:
Solvent modification (Tong & Rossmann, 1995) is a powerful tool for improving crystallographic phases when the data sets extend only to low resolution. It works with two inputs: the initial experimental phases and a likelihood function of the electron density map for the available phases. However, the phases it yields are improved only relative to the starting phases.

Histogram matching:
This is another phase-improvement method (Zhang & Main, 1990). The distribution of electron density values depends on the resolution and the temperature factor, irrespective of the structure, so the histogram of a map can be matched to the expected distribution.

References:
Argos, P., & Rossmann, M. G. (1974). Determining heavy-atom positions using non-crystallographic symmetry.
Navaza, J., Panepucci, E. H., & Martin, C. (1998). On the use of strong Patterson function signals in many-body molecular replacement. Acta Crystallographica. Section D, Biological Crystallography, 54(Pt 5), 817–821.
Richardson, J. S., & Richardson, D. C. (2014). Biophysical Highlights from 54 Years of Macromolecular Crystallography. Biophysical Journal, 106(3), 510–525. doi:10.1016/j.bpj.2014.01.001
Rossmann, M. G. (2001). Molecular replacement - historical background. Acta Crystallographica. Section D, Biological Crystallography, 57(10), 1360–1366. doi:10.1107/s0907444901009386
Sanderson and Skelly (2007). Macromolecular Crystallography: conventional and high-throughput methods.
Terwilliger, T. C. (2003). Improving macromolecular atomic models at moderate resolution by automated iterative model building, statistical density modification and refinement. Acta Crystallographica. Section D, Biological Crystallography, 59(7), 1174–1182. doi:10.1107/s0907444903009922
Tong, L., & Rossmann, M. G. (1995). Reciprocal-space molecular-replacement
averaging. Acta Crystallographica. Section D, Biological Crystallography.
Zhang, K. Y. J., & Main, P. (1990). Histogram matching as a new density modification technique for phase refinement and extension of protein molecules. Acta Crystallographica. Section A, Foundations of Crystallography, 46(1), 41–46. doi:10.1107/s0108767389009311
Zwart, P. H. (2005). Anomalous signal indicators in protein crystallography. Acta Crystallographica. Section D, Biological Crystallography, 61(11), 1437–1448. doi:10.1107/s0907444905023589