A Three-Dimensional Correlation Method for Registration of Medical Images in Radiology

Michalakis F. Georgiou 1, Joachim H. Nagel 2, George N. Sfakianakis 3

1,3 Department of Radiology, University of Miami / Jackson Memorial Hospital, Miami, FL 33136, USA
1 mgeorgio@mednet.med.miami.edu
3 gsfakian@mednet.med.miami.edu
2 Institute of Biomedical Engineering, University of Stuttgart, Stuttgart 70174, Germany
2 jn@bmt.uni-stuttgart.de

ABSTRACT

The availability of methods to register multi-modality images in order to fuse them and to correlate their information is increasingly becoming an important requirement for various diagnostic and therapeutic procedures. A variety of image registration methods have been developed, but they remain limited to specific clinical applications. Assuming rigid-body transformation, two images can be registered if their differences are calculated in terms of translation, rotation and scaling. This paper describes the development and testing of a new correlation-based approach for three-dimensional image registration. First, the scaling factors introduced by the imaging devices are calculated and compensated for. Then, the two images are made translation invariant by computing their three-dimensional Fourier magnitude spectra. Subsequently, a spherical coordinate transformation is performed and the three-dimensional rotation is computed using a novel approach referred to as polar shells. The method of polar shells maps the three angles of rotation into one rotation and two translations of a two-dimensional function and then proceeds to calculate them using appropriate transformations based on the Fourier invariance properties. A basic assumption in the method is that the three-dimensional rotation is constrained to one large and two relatively small angles. This assumption is generally satisfied in normal clinical settings.
The new three-dimensional image registration method was tested with simulations using computer-generated phantom data as well as actual clinical data. Performance analysis and accuracy evaluation of the method using computer simulations yielded errors in the sub-pixel range.

INTRODUCTION

Image registration and fusion of medical data involves the comparison, combination or correlation of information present in two different images of the same patient acquired under different conditions, such as from different imaging modalities or from the same modality but at different times. In recent years, image fusion has emerged as a powerful tool for both diagnostic and therapeutic purposes. The diagnostic potential of imaging studies can be greatly enhanced by combining the accuracy and high resolution of anatomical localization by Computed Tomography (CT), Magnetic Resonance Imaging (MRI), X-ray and Ultrasound with the complementary functional and metabolic activity information extracted from Single Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET). Image fusion is also applied in radiation therapy planning and treatment, in image-guided surgery, and for compensation of motion during image acquisition. The difficulty in registering or matching images is due to the non-uniform scale factors of the imaging devices and misalignments caused by different viewing angles and acquisition planes. Registration of two images involves the calculation of a transformation function which compensates for the differences in translations, rotations, and scaling factors. The matching between the two images refers to the correspondence of pixel coordinates from one image to pixel coordinates of the other (reference) image. The problem of calculating the registration variables (translations, rotations, and scaling factors) is not trivial, mainly because they are coupled together and cannot be computed simultaneously.
Appropriate transformations may be needed in order to un-couple the registration variables and to calculate them independently.
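To make the rigid-body model concrete, the following sketch (Python/NumPy; the axis conventions, Z-Y-X angle order, and function names are illustrative assumptions, not the paper's implementation) applies a composed rotation, a uniform scale, and a translation to a set of 3D points:

```python
import numpy as np

def rigid_transform(points, angles_deg, translation, scale=1.0):
    """Apply a rigid-body transform (rotations about Z, Y, X, then a
    uniform scale and a translation) to an (N, 3) array of 3D points.
    Angle order and axis conventions are assumptions for illustration."""
    az, ay, ax = np.deg2rad(angles_deg)
    Rz = np.array([[np.cos(az), -np.sin(az), 0],
                   [np.sin(az),  np.cos(az), 0],
                   [0, 0, 1]])
    Ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                   [0, 1, 0],
                   [-np.sin(ay), 0, np.cos(ay)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax),  np.cos(ax)]])
    R = Rz @ Ry @ Rx
    return scale * points @ R.T + np.asarray(translation)

# A 90-degree axial (Z) rotation maps the point (1, 0, 0) to (0, 1, 0).
pts = np.array([[1.0, 0.0, 0.0]])
out = rigid_transform(pts, (90.0, 0.0, 0.0), (0.0, 0.0, 0.0))
```

Registration then amounts to estimating the six (plus scale) parameters of such a transform from the image data alone.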
Because of the early and continued need for medical image registration, investigators developed a number of different approaches, which vary considerably in the assumptions made, the actual implementation, or the clinical application [1]. In clinical practice, visual re-positioning and re-alignment of the images by the physician is the most common method for correlating images from multiple modalities or from a single modality acquired at different times. This process usually involves matching the images based on structures or specific regions that can be identified in all the modalities involved. This method is subjective, requires a great deal of the expert's time, its reproducibility may not be very high, and its accuracy depends on the degree of expertise of the physician. The accuracy of the visual method improves when the registration process is aided by computer software which permits manual shifting, re-orienting and zooming of the images until they are aligned; still, however, the approach is labor intensive, time consuming and lacks reproducibility. Another set of methods uses external markers (fiducials) such as stereotactic frames and special fixation devices. These markers appear in the images and provide reference points that can be unambiguously identified in the various modalities involved. Even though methods using external markers have been applied successfully, they suffer from the disadvantage of inconvenience and discomfort to the patient. They are impractical, sometimes cumbersome and cannot be applied retrospectively. A popular set of methods uses surfaces, volumes or scattered points in order to calculate the lower moments of inertia and to derive the principal axes and center of mass of an image.
The registration variables are calculated by considering the differences between the angles and positions of the centroids of the two images (rotation and shift) and the relative magnitude of the principal axes (scaling). Even though these methods are direct and can be applied retrospectively, they suffer from sensitivity to incomplete scan coverage and to asymmetries or deformations between the images. Another set of methods that has attained widespread use is the surface or volume fitting methods. A surface model (the "head") is extracted from one image and related to a set of points (the "hat") from the other image. The fitting of the hat onto the head takes place by minimizing a non-linear function expressing the distance between the two surfaces at each point. These methods are three-dimensional and have been used clinically with reported success. However, they may require user interaction in the selection of points, which can be labor intensive, and fitting methods may also suffer from convergence problems or long execution times. A class of registration methods obtains the registration variables based on the calculation of the degree of similarity or correlation between two images. These methods are applied using the original pixel intensities of the images or, as is more often the case with multi-modality images, some kind of transformation into another domain precedes the cross-correlation. Correlation-based image registration methods have provided robust and reliable performance, as described in the literature. From the clinical applications standpoint, most of the developed methods for image registration concentrated on the brain rather than the body, mainly because the former is held in a rigid volume (the skull) but the latter is not. The thorax, abdomen, pelvis and the organs inside them can vary in shape and even in size depending on the condition of the patient (positioning, respiration, cardiac pumping, food and digestion, bowel content, etc.).
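The principal-axes idea mentioned above can be sketched briefly (Python/NumPy; a minimal illustration, not any particular published implementation): the center of mass is the intensity-weighted centroid, and the principal axes are the eigenvectors of the second-moment (inertia-like) matrix.

```python
import numpy as np

def principal_axes(vol):
    """Intensity-weighted center of mass and principal axes of a volume,
    from the eigenvectors of its 3x3 second-moment matrix."""
    coords = np.indices(vol.shape).reshape(3, -1).astype(float)
    w = vol.ravel().astype(float)
    w = w / w.sum()
    com = coords @ w                        # weighted centroid
    centered = coords - com[:, None]
    cov = (centered * w) @ centered.T       # 3x3 second-moment matrix
    evals, evecs = np.linalg.eigh(cov)      # eigenvalues ascending
    return com, evals, evecs

# Toy volume: a block elongated along axis 0.
vol = np.zeros((16, 16, 16))
vol[4:12, 7:9, 7:9] = 1.0
com, evals, evecs = principal_axes(vol)
# The longest principal axis (largest eigenvalue) aligns with axis 0.
```

Aligning the centroids and the corresponding eigenvectors of two such decompositions yields the shift and rotation estimates, which is why the approach degrades when the two scans do not cover the same anatomy.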
METHOD

A group of correlation-based approaches, of particular interest to this work, utilizes the invariance properties of the Fourier Transform in order to accomplish un-coupling, independent treatment and computation of the registration variables [2,3]. Cideciyan et al. [3] developed a triple invariant descriptor correlation method. First, the magnitude spectra of the images are calculated in order to make them translation invariant. The resulting spectra are transformed into log-polar coordinates and subsequently their cross-correlation is calculated, yielding the scaling and rotation information. At this point one image is scaled and rotated with respect to the other and a new cross-correlation function is calculated with the original image to obtain the shifts. This method solved the two-dimensional registration problem and was applied to high resolution retinal images. A modified version of the method was applied for compensation of patient motion in nuclear medicine renal studies [4].
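A simplified sketch of the rotation-recovery part of such a 2D scheme is shown below (Python with NumPy/SciPy). It uses a plain polar (rather than log-polar) resampling, so scaling is ignored; the test image, sampling densities and names are illustrative assumptions. Rotating an image rotates its Fourier magnitude spectrum by the same angle, so in polar coordinates the rotation becomes a circular shift along the angle axis, recoverable by 1D cross-correlation.

```python
import numpy as np
from scipy import ndimage

def polar_magnitude(img, n_theta=180, n_r=48):
    """Resample the centered 2D Fourier magnitude spectrum of `img`
    into polar coordinates (angle x radius)."""
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = mag.shape[0] / 2.0, mag.shape[1] / 2.0
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    radii = np.linspace(3.0, min(cy, cx) - 3.0, n_r)
    tt, rr = np.meshgrid(thetas, radii, indexing="ij")
    rows, cols = cy + rr * np.sin(tt), cx + rr * np.cos(tt)
    return ndimage.map_coordinates(mag, [rows, cols], order=1)

# A synthetic image and a copy rotated by 30 degrees.
img = np.zeros((128, 128))
img[60:68, 20:108] = 1.0                     # elongated bar
rot = ndimage.rotate(img, 30, reshape=False, order=1)

# The rotation appears as a circular shift of the angular profile.
pa = polar_magnitude(img).sum(axis=1)
pb = polar_magnitude(rot).sum(axis=1)
pa = pa - pa.mean()
pb = pb - pb.mean()
corr = np.fft.ifft(np.fft.fft(pb) * np.conj(np.fft.fft(pa))).real
angle = int(np.argmax(corr))                 # degrees (1 sample = 1 degree)
```

Because the magnitude spectrum has 180-degree point symmetry, the correlation peak identifies the rotation only up to that ambiguity, which the full method resolves at the final shift-correlation stage.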
When 2D registration methods are applied to 3D data, such as in tomography, it is assumed that the images to be registered are co-planar. For this assumption to be valid, special patient positioning would be required, and this is generally not the case in most clinical protocols. In 3D modalities, such as CT, MRI, SPECT and PET, the data is volumetric and even a small variation in patient positioning can result in large changes in the reconstructed slices. Registration of 3D data sets with 2D methods is therefore considered inadequate. The volume needs to be considered as a whole in the registration process, and the data can subsequently be re-sliced in the various axes or planes of interest. Assuming that the scale factors are known from the acquisition devices, the main difference between the 2D and 3D image registration problems is the larger number of degrees of freedom that must be computed: 2 translations and 1 rotation in the former versus 3 translations and 3 rotations in the latter. The real difficulty in the 3D case lies in determining the 3 angles of rotation, because they map into non-linear functions and cannot be computed in a straightforward fashion. The work described in this paper deals with the implementation of a novel three-dimensional cross-correlation based image registration method. The new method accomplishes the un-coupling of all the registration variables involved. A basic assumption in the method is rigid-body transformation (i.e. no warping). The scale factors are assumed known from the image acquisition devices (e.g. CT and SPECT scanners). The data is therefore adjusted through appropriate interpolations so that the voxels become isotropic (i.e. of equal size in all three dimensions) in both volumes to be registered. In the next step, a three-dimensional Fourier transformation is performed using a 3D Fast Fourier Transform (FFT) algorithm.
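The isotropic-voxel interpolation step can be sketched as follows (Python with scipy.ndimage; the voxel sizes and target spacing are hypothetical values for illustration):

```python
import numpy as np
from scipy import ndimage

# Hypothetical anisotropic grid: z-spacing twice the in-plane spacing.
voxel = np.array([3.56, 3.56, 7.12])   # (x, y, z) spacing in mm
target = 3.56                           # desired isotropic spacing in mm

vol = np.zeros((64, 64, 32), dtype=np.float32)
vol[30:34, 30:34, 14:18] = 1.0          # toy "hot spot"

# Zoom factors that bring every axis to the target spacing.
factors = voxel / target                # -> (1.0, 1.0, 2.0)
iso = ndimage.zoom(vol, factors, order=1)   # linear interpolation
```

After this step each voxel represents the same physical extent along all three axes, so rotations computed in voxel units correspond directly to physical angles.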
By obtaining the Fourier magnitude spectra, the 3D translation effect is eliminated and, according to the translation property of the Fourier Transform [5], the 3D rotation remains as the only difference between the two volumes. Appropriate windowing and zero-padding are performed prior to the Fourier Transformation in order to avoid aliasing effects. In the next step, the two Fourier magnitude spectra are transformed from rectangular to spherical coordinates in order to pursue the determination of the three rotations using the method of polar shells, as follows: Points on a sphere in 3D space (spatial frequency domain) are extracted. With the center of the sphere at the origin of the coordinate system, all points on a sphere of radius r share the same magnitude of spatial frequency. In an extension of the method, a range of frequencies is used by considering a shell with limited thickness instead of the surface of a sphere. In this case, for each angle in the 3D space, the values are integrated along the corresponding radius vector from the inner to the outer surface of the shell. Thus, the registration is based on a frequency range and not just a single frequency, making the procedure more reliable.

Figure 1: The top half of the sphere forms a polar shell.

The top portion of the sphere (Figure 1) is used to calculate a 2D projection onto a plane parallel to the axial plane (the "equator"). This projection is sampled appropriately for both volumes, thus creating two 2D images containing the rotational information. At this point, the 3D rotations are mapped onto the projections of the polar shells as follows: the rotation around the Z-axis (the axial rotation) appears as a 2D rotation, while the rotations around the X-axis and Y-axis (coronal and sagittal, respectively) appear as a 2D translation. Thus, the 3D rotation problem is reduced to the 2D registration problem solved previously [2,3,4], and the 3 angles of rotation can be obtained.
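The translation invariance that this step relies on is easy to demonstrate (Python/NumPy). The sketch below uses a circular (wrap-around) shift, the idealized case; for real data the paper applies windowing and zero-padding to approximate it:

```python
import numpy as np

rng = np.random.default_rng(1)
vol = rng.random((32, 32, 32))

# Apply an arbitrary circular 3D shift to a copy of the volume.
shifted = np.roll(vol, shift=(5, -3, 7), axis=(0, 1, 2))

mag_a = np.abs(np.fft.fftn(vol))
mag_b = np.abs(np.fft.fftn(shifted))

# The shift moves entirely into the phase; the magnitude spectra match.
print(np.allclose(mag_a, mag_b))  # True
```

With translation removed, any remaining difference between the two magnitude spectra is due to the 3D rotation alone, which is what the polar-shell projection then isolates.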
The inherent difficulty of the method is dealing with the projection of the curvature of the sphere, because it maps the coronal and sagittal rotations as non-linear translations on the shell and introduces an error. Under the assumption, however, that the 3 angles are constrained to one large (axial < 50°) and two relatively small angles (coronal and sagittal < 30°), the implementation is made possible. This assumption is valid for body imaging in a clinical setting where the patient's position is restricted on an imaging table. With some considerations, this assumption may be applicable for brain imaging as well. A critical factor in the method of polar shells is the successful choice of the radius r of the sphere. Since the method operates on the Fourier magnitude spectra of the two volumes, the radius r determines which frequency domain characteristics will be present in both polar shells when the cross-correlation is performed. A very small radius yields polar shells that contain information close to the DC component that is not useful for the registration. A relatively large radius produces polar shells with high frequencies that contain a lot of noise. The appropriate choice of the radius can best be obtained using simulation experiments with the particular type of data that will be used in the registration method. Once the 3D rotations are found, they are applied to the original space-domain volumes; thus the two volumes are corrected for their rotational difference. At this point, the two volumes differ by a 3D translation only. A 3D cross-correlation is performed (using a frequency domain implementation) between the rotation-corrected volumes. The maximum of the cross-correlation yields the 3D translational difference. The two volumes are subsequently corrected for their translational difference. At this point, the two volumes are fully registered and the algorithm terminates.

RESULTS AND DISCUSSION

The new registration method was tested extensively using simulations with an anthropomorphic mathematical phantom called MCAT [6]. An MCAT volume was created in a 64x64x64 format with isotropic voxels of size 6.25mm in each dimension.
Fifty misaligned versions of this volume were generated by applying arbitrary combinations of 3D translations and 3D rotations. The combined translations were maintained below 9.50cm while the rotations were within the limits of an axial angle below 50° and coronal and sagittal angles below 30°. Following the perturbations, the 3D magnitude spectrum was calculated for each volume in order to separate translation from rotation effects. Polar shells were extracted in the frequency domain from the magnitude spectra of the reference (no translation or rotation) and the misaligned volumes. The radius of the sphere for the polar shells was 0.25 cycles/cm, with a thickness of 0.1 cycles/cm, so the actual range of the polar shell was 0.2 to 0.3 cycles/cm, corresponding to 10 discrete samples. The 2D registration algorithm was applied to the 2D projections of the polar shells to yield independently the axial rotation (2D rotation) and the coronal and sagittal rotations (maximum of the cross-correlation; Figure 2). These computed 3D angles were applied to the misaligned volumes in the space domain, thus correcting them for their rotational difference. Another set of 3D cross-correlations was applied to the rotation-corrected volumes in order to calculate their translational differences. The computed translations were compensated for and the volumes became aligned with the reference volume. The absolute mean error in translation, expressed as the combined effect of the displacement in each individual axis, was 0.99mm. The absolute mean error for the rotations was 2.56° for the axial, 1.89° for the coronal and 1.88° for the sagittal, respectively.

Figure 2: Left: The polar shells of the two volumes (rotations by 20° axially, 10° coronally and 10° sagittally). Right: The computed 2D linear cross-correlation yields a maximum (arrow) corresponding to the sagittal and coronal rotations.
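The translation-recovery stage used above (finding the 3D shift from the peak of a frequency-domain cross-correlation) can be sketched as follows (Python/NumPy; a minimal circular-shift illustration, with names chosen for clarity rather than taken from the paper):

```python
import numpy as np

def correlate_shift(ref, mov):
    """Recover the circular 3D shift that maps `ref` onto `mov` from the
    peak of their cross-correlation, evaluated via the FFT."""
    corr = np.fft.ifftn(np.fft.fftn(mov) * np.conj(np.fft.fftn(ref))).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Convert peak indices to signed shifts (large indices wrap negative).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, corr.shape))

rng = np.random.default_rng(2)
ref = rng.random((32, 32, 32))
mov = np.roll(ref, shift=(-4, 6, 2), axis=(0, 1, 2))
shift = correlate_shift(ref, mov)
print(shift)  # (-4, 6, 2)
```

Computing the correlation in the frequency domain keeps this step O(N log N) even for full volumes, which is why the method performs it after the rotational correction rather than searching translations and rotations jointly.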
The method was also applied in a limited fashion to actual clinical data. As an intra-modality application, the method was tested with serial studies of the same patient acquired at different times. The scans were tumor localization SPECT studies (Sestamibi Technetium-99m). The results were evaluated visually and appeared to be realistic. Application of the method to multi-modality data faces the same issues as other correlation-based approaches do; i.e. straightforward application using pixel intensities is generally not sufficient. The images have to be made similar through appropriate pre-processing techniques, such as extraction of boundaries, segmentation, common features, etc. The new method is currently being tested in patient multi-modality studies of SPECT and CT (Figure 3).

Figure 3: Left: A slice of a patient's CT scan showing lymphomas (arrows) in the liver. Right: The corresponding SPECT slice of a Gallium scan is registered and overlaid with the CT, showing high activity in the areas of the CT lesions.

CONCLUSION

Registration and fusion of intra- and multi-modality radiologic images is becoming increasingly important for diagnostic and therapeutic procedures. Methods have been developed but they remain limited to specific modalities or clinical applications. Previous correlation methods accomplished the un-coupling of the registration variables (translation, rotation, and scaling) using invariance properties of the Fourier Transform, but they have been limited to two-dimensional implementations. The new method that has been developed and described in this paper is correlation based, three-dimensional, and accomplishes un-coupling and independent calculation of the registration variables. The challenging issue of finding the 3D rotational difference between two volumes is addressed with the novel approach of polar shells. Extensive simulation studies with mathematical phantoms yielded combined translation and rotation errors in the sub-pixel range.
The method has so far had limited application to clinical data, but it is currently being tested with patient multi-modality studies.

REFERENCES

1. Van den Elsen PA, Pol ED, Viergever MA: Medical Image Matching - A Review with Classification. IEEE Engineering in Medicine and Biology, 12:26-39, 1993.
2. Apicella A, Kippenhan JS, Nagel JH: Fast multi-modality image matching. SPIE 1092:252-263, 1989.
3. Cideciyan AV, Jacobson SG, Kemp CM, Knighton RW, Nagel JH: Registration of high resolution images of the retina. SPIE 1652:310-322, 1992.
4. Georgiou MF, Nagel JH, Cideciyan AV, Sfakianakis GN: Compensation of patient motion in nuclear medicine renal studies by fast correlation image registration. Proceedings 15th Annual International Conference IEEE EMBS, San Diego, California, October 28-31, pp 107-107, 1993.
5. Gonzalez RC, Woods RE: Digital Image Processing. Addison-Wesley Publishing Company, USA, September 1993.
6. LaCroix KJ, Tsui BMW: An evaluation of the effect of nonuniform attenuation compensation on defect detection for Tc-99m myocardial SPECT images. The Journal of Nuclear Medicine, 38(5):19P (abstract), 1997.