Data mining in brain imaging
Statistical Methods in Medical Research 2000; 9:

Vasileios Megalooikonomou, James Ford, Li Shen, Fillia Makedon
Department of Computer Science, Dartmouth Experimental Visualization Laboratory, Dartmouth College, Hanover, New Hampshire, USA
and Andrew Saykin
Brain Imaging Laboratory, Departments of Psychiatry and Radiology, Dartmouth Medical School, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire, USA

Address for correspondence: V Megalooikonomou, Department of Computer and Information Sciences, Temple University, 314 Wachman Hall, Philadelphia, PA 19122, USA. [email protected]
© Arnold

Data mining in brain imaging is proving to be an effective methodology for disease prognosis and prevention. This, together with the rapid accumulation of massive heterogeneous data sets, motivates the need for efficient methods that filter, clarify, assess, correlate and cluster brain-related information. Here, we present data mining methods that have been or could be employed in the analysis of brain images. These methods address two types of brain imaging data: structural and functional. We introduce statistical methods that aid the discovery of interesting associations and patterns between brain images and other clinical data. We consider several applications of these methods, such as the analysis of task-activation, lesion-deficit, and structure morphological variability; the development of probabilistic atlases; and tumour analysis. We include examples of applications to real brain data. Several data mining issues, such as that of method validation or verification, are also discussed.

1 Introduction

Data mining in brain imaging is an emerging field of high importance for providing prognosis, treatment, and a deeper understanding of how the brain functions. The field of data mining addresses the question of how best to use the available data to discover new knowledge and improve the process of decision making. The discovery of associations between human brain structures and functions (i.e. human brain mapping) has been recognized as the main goal of the Human Brain Project [1], which is a high-priority project funded by several government initiatives. Mining problems can be grouped in three categories [2]: identifying classifications, finding sequential patterns, and discovering associations. Although data mining is a powerful knowledge discovery technique, there are constraints on the way it can be applied: it is application-dependent, different applications usually require different mining techniques, and data must be of a certain size and format [3]. In this paper we survey current mining methods, give a critical review of the main computational obstacles that limit our ability to perform automatic data mining on brain imaging, and propose some solutions.

There are various problems in mining of brain images that need to be addressed. The first problem is that most fundamental mining algorithms (rule-based learning systems, neural networks, decision trees, Bayesian networks, logistic regressions, and so on), which have been used with great success in medicine, assume that data sets contain only simple numeric and symbolic entries. It is important, therefore, to consider how to preprocess brain images (multidimensional arrays of data) so that we can transform them to data representations which are amenable to data mining techniques. A second problem is that, although there are algorithms for classifying images, there is a lack of effective algorithms for learning from images directly [4]. Again, this implies the use of methods that transform images to a format conducive to learning algorithms. Most early medical analysis ignored the image or raw sensor portions of the medical record or summarized them in a very simplified form (e.g. 'normal' or 'abnormal'). A third problem in mining brain images is the heterogeneity of the brain imaging data: different modalities, formats and resolutions prevent a common analysis and require integration. Integrating data from different studies often means integrating different formats, which, in turn, implies imposing several assumptions on the data representation. Many studies today, especially those using functional imaging, focus only on a specific clinical question and deal only with a small set of subjects, mainly due to the high cost of acquiring image data.

Other difficulties inherent in mining of associations in brain images are that: (1) due to inter-subject variation and noise, a large number of subjects have to be studied; (2) functions may correspond to more than one location and can relocate in the presence of structure abnormalities; (3) brain lesions and other abnormalities have a complex spatial distribution, typically covering multiple brain structures; and (4) normal brain function can be affected in varying ways due to the complexity of the functional organization of the human brain.

These obstacles aside, there have been recent technological advances that make available enormous amounts of data. Imaging studies of the human brain at active medical institutions today routinely accumulate more than 5 terabytes of clinical data per year.
Data in this domain usually consist of three-dimensional (3-D) images from different medical imaging modalities that capture structural (e.g. MRI,† CT,‡ histology§) and functional/physiological (e.g. PET,‖ fMRI,¶ SPECT††) information about the human brain. On the other hand, there is now a wide availability of noninvasive methods for assessing macroscopic brain structure, particularly magnetic resonance (MR) techniques that complement clinical functional assessment [5]. Also, there is continued development of improved functional imaging techniques and normalization methods. Greater computer capabilities are leading to the creation of large databases of structure/function information, the efficiency of which depends on interoperable multimedia data representation that is easy to search. This trend is reflected in the work of Fox et al. [6–9], Evans et al. [10,11], the QBISM database by Arya et al. [12], the BRAID database by Letovsky et al. [13,14], the BrainMap database of neuroimaging data [15], and other neuroimaging databases [16,17].

† Magnetic resonance imaging: shows soft-tissue structural information.
‡ Computed tomography: shows hard-tissue structural information.
§ Histology images are acquired by physically slicing and photographing tissue.
‖ Positron emission tomography: shows physiological activity.
¶ Functional magnetic resonance imaging: shows physiological activity.
†† Single photon emission computed tomography: shows physiological activity.

The problem of multidimensional data (e.g. brain images) can be solved with newer mining methods which are applied directly to the images in order to capture most of their information content. As was mentioned, mining is heavily dependent on statistical methods for discovering associations and classifications among disparate types of data. We exploit this fact and consider methods that combine the information from image and behavioural data about the brain and present methods for developing probabilistic brain atlases. The results of these methods are new representations of the information content of brain images and statistical maps.

The rest of this paper is organized as follows: in Section 2 we present the preprocessing phase of data mining in brain imaging, including the segmentation and spatial normalization of images. Although smoothing and reduction of noise can be considered to be part of preprocessing, here we include them as part of the mining methods in Section 3. In Section 3 we present mining methods that have been used in structural and functional imaging. These methods are useful for: (a) the efficient discovery of associations between structures and functions; (b) the classification of structural information, including both normal structures and abnormalities such as tumours; and (c) recently, the discovery of associations between gene expressions, morphology, and function. In Section 4 we present important issues in mining of brain images that are common to structural and functional brain data. We also consider the problem of verification of mining methods. This review concludes with a discussion in Section 5.

2 Data preprocessing

After image data is collected for each subject and before mining is performed, the data has to pass through a preprocessing phase. This phase identifies and normalizes the brain objects to be stored in a database and mined later. In anatomically related studies, after an anatomical (i.e. structural) image is collected for each subject, each lesion, area, or structure of interest is delineated (segmented) as a region of interest (ROI) on each slice automatically, semi-automatically, or manually.
Extensive work has been done on automating image segmentation. The first segmentation methods were solely intensity based [22]. However, since many structures were not distinguishable on the basis of signal intensity alone, prior spatial information was incorporated either in the form of intensity gradients [23] or a prior spatial probability distribution over signal intensity [18,23,24], or by registering the image to a segmented atlas using a spatial probability map for each voxel [28]. Using the slice data, each ROI is reconstructed in three dimensions.

Then, in both functional and anatomical images, normalization or image registration has to be performed to remove morphological and acquisition variability, so that image data are comparable across subjects. This process maps homologous anatomical regions to the same location in a stereotaxic space, such as the Talairach anatomical atlas [29]. Several linear and nonlinear spatial transformations have been developed to bring the 3-D atlas and the subject's 3-D image into register, i.e. spatial coincidence [27,30–32]. As an example, the effect of registration of an MR image to the Talairach atlas using a nonlinear method based on a 3-D elastically deformable model [32] is presented in Figure 1.

In addition to 3-D image data, modern brain image databases usually contain a set of generic anatomic atlases of the human brain that model the exact shapes and positions of anatomical structures. A raw MR or fMR image does not identify the structure to which each voxel (3-D volume element) belongs, but an anatomical atlas can supply this information (to within the accuracy of the registration methods) when overlaid on the image (see Figure 1). A variety of brain structure maps have been derived, at several spatial scales, from 3-D tomographic images [33], anatomic specimens [29,34,35], and a variety of histologic preparations that reveal regional cytoarchitecture [36] and molecular content. Other brain maps have concentrated on function [37,38] or neuronal connectivity and circuitry [39,40]. Here, for clarity of presentation, we concentrate on the widely used Talairach atlas.

Figure 1 A slice of (a) original MR image, (b) atlas, and (c) atlas image overlaid on the deformed MR image. Picture from Megalooikonomou et al. [97]

3 Data mining

We first present methods for discovering associations between brain functions and structures. Traditionally, two approaches have been employed for functional brain mapping. The first approach seeks associations between lesioned structures and concomitant neurological or neuropsychological deficits, for example, between trauma lesions and deficits like left visual field deficit. The second approach measures brain activation in subjects as they are asked to perform certain tasks. We present methods for both approaches. The problem of efficiently finding similar ROIs in brain image databases is also addressed. We present methods for extracting knowledge about the morphological variability of brain structures and about abnormalities such as tumours. Methods that can be potentially applied to both structural and functional imaging are also presented.

3.1 Functional imaging of the brain

Functional brain imaging uses technologies such as PET, SPECT or fMRI with specifically designed experiments to identify activated regions of the brain under different conditions. The cumulative nature of PET greatly limits its spatial and temporal resolution.
fMRI has much better temporal resolution, and a spatial resolution on the order of 2–4 mm that is limited by the characteristics of the underlying vascular structure (Ogawa et al. [41] present an in-depth review of fMRI). Recently, diffusion tensor imaging (DTI) has emerged as a new application of MR technology to the problem of human brain mapping. DTI calculates 3-D diffusion tensor maps by measuring water proton mobility in diffusion-weighted MR images [42]. DTI can be used to indirectly image nerve fibre bundles, especially those in white matter areas that are connecting links between various (grey matter) brain areas and have so far been invisible to other imaging techniques.

The data from functional imaging scans are typically in the form of measurements at thousands of voxels. In PET, a voxel measurement reflects the amount of activity in a brain region. In fMRI, a voxel's time series of measurements is meaningless by itself, but becomes useful when compared to another series of measurements at the same place under different conditions. This is possible after registration and motion correction. Since different subjects may have different strategies for accomplishing the same task, it can be useful to average all subjects in a multi-subject study to see the common areas of activation [45]. Averaging multiple subjects also increases the statistical power of the analysis, and is usually necessary for fMRI with its low signal-to-noise ratio.

Smoothing is usually then performed by applying a spatial smoothing filter to reduce the effect of motion that was not completely removed, and other unwanted noise [47]. The typical choice is a low-pass Gaussian filter in the spatial domain, which smooths high-frequency variation in the data. Several researchers have suggested that the use of Gaussian filters for denoising in the spatial domain introduces unwanted biases, and that the optimal filter width is a function of the size of activation foci [50,54], which cannot generally be determined a priori except possibly by reference to underlying neuroanatomy. The low-pass filter approach will also blur and displace activated areas and remove the areas of least activation, reducing spatial resolution.
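The dependence of Gaussian smoothing on the size of the activation focus can be illustrated with a minimal sketch. It is a one-dimensional stand-in for a spatial filter, written with the Python standard library only; the function names, the edge handling (replicating border values), the kernel truncation at three standard deviations and the toy row of voxel values are all assumptions made for illustration, not part of any method cited in the text.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1-D Gaussian kernel; sigma is given in voxels."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumed truncation at 3 sigma
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve a 1-D signal with a Gaussian kernel, replicating edge values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp the index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

# Hypothetical row of voxel values: a narrow, strong focus at index 8
# and a broad, weaker focus covering indices 22 to 31.
row = [0.0] * 40
row[8] = 4.0
for i in range(22, 32):
    row[i] = 1.0

narrow = smooth(row, sigma=1.0)  # small kernel: the narrow focus still dominates
wide = smooth(row, sigma=4.0)    # large kernel: the broad focus now dominates
```

In this toy example the ranking of the two foci changes with the kernel width, which is one way to see why a single filter width cannot suit activations of all sizes.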
There have been suggestions for addressing these limitations of Gaussian filtering by focusing on signal restoration instead of noise removal, looking at multiple scales to overcome the problem of scale-specificity, or using approaches that do not impose a priori models on the data. The scale-space approach [50,55,56] builds a multi-resolution representation of functional data. Voxels are organized in clusters (called 'blobs' [50,55]) that may be of different shapes and sizes at different scales. Blobs themselves are hierarchically organized, with blob trees dividing blobs at scales higher in the tree (lower resolutions) into smaller blobs at lower scales, down to individual voxels at scale 0. Each scale $s > 0$ corresponds to the convolution of the original n-dimensional signal with a Gaussian kernel of width $s$ [50]. The multifiltering approach [57] is similar in that analysis is done on smoothed and unsmoothed data (although only one level of smoothing is used) for the purpose of picking up large regions of relatively weak activations.

Other analysis techniques operate directly on the data, for example in the wavelet domain, and do not explicitly smooth the data. In this case, the fact that background noise is distributed, whereas signals are spatially localized, can aid in useful extraction of the signals. The complex-denoising noise reduction process can also increase the signal-to-noise ratio by thresholding discrete wavelet transform coefficients [61].

Temporal smoothing is also often done in fMRI, and the signal series at each voxel is usually convolved with some heuristically chosen approximation of the haemodynamic response function (HRF), e.g. a Poisson function, the Gamma function, a linear combination of the Gamma function and its temporal derivatives, or the Gaussian function [62]. This approach identifies activated voxels, although particular analysis methods can take many forms. There has been recent interest in improving HRFs [62,63].
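The convolution with an approximate HRF mentioned above can be sketched as follows. This is a minimal illustration only: the Gamma-density shape and scale values, the 1 s sampling interval, the 30 s kernel length and the block-design stimulus are hypothetical choices, not parameters taken from the cited work.

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Gamma-density approximation of the HRF at time t in seconds.

    The shape and scale values are illustrative assumptions.
    """
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (scale ** shape * math.gamma(shape))

def convolve_with_hrf(stimulus, tr=1.0, hrf_length=30.0):
    """Discrete causal convolution of a stimulus time course with the sampled HRF."""
    n_taps = int(hrf_length / tr)
    kernel = [gamma_hrf(k * tr) * tr for k in range(n_taps)]
    predicted = []
    for i in range(len(stimulus)):
        acc = 0.0
        for k in range(min(i + 1, n_taps)):
            acc += kernel[k] * stimulus[i - k]
        predicted.append(acc)
    return predicted

# Hypothetical block design: the task is "on" during scans 10 to 19 of 60.
stimulus = [1.0 if 10 <= i < 20 else 0.0 for i in range(60)]
predicted = convolve_with_hrf(stimulus)

# Index of the scan at which the predicted response is largest.
peak_scan = max(range(len(predicted)), key=lambda i: predicted[i])
```

The predicted response is a delayed, smoothed version of the stimulus block, and it is this kind of regressor, rather than the raw block, that is compared with each voxel's measured time series.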
Statistical parametric mapping (SPM)

One of the most common analysis approaches currently in use, called statistical parametric mapping (SPM) [64,65], analyses each voxel's changes independently of the others and builds a map of statistic values for each voxel (see Figure 2). The significance of each voxel can be ascertained statistically with a Student's t-test, an F-test, a correlation coefficient, or any other univariate statistical parametric test [43] (more details about the use of the t-test in SPM are in appendix A1). The result of the t-test is a t-value that, when indexed with the number of degrees of freedom, gives a probability value indicating how likely it is that this difference could occur by chance. The significance threshold (alpha value) is typically chosen to be 0.05 or less, indicating a 5% or smaller chance of a false positive decision of significance. The t-value can also be expressed as a Z-score, indicating how many standard deviations the t-value represents on a Gaussian distribution. Z-scores and alpha values are the most popular means for reporting the significance of this difference.

The t-test is mathematically equivalent to a one-way analysis of variance (ANOVA), and both can be expressed in the general linear model for regression analysis [64]

$$Y(t) = x(t)\,\beta + \varepsilon(t)$$

at each voxel, where $\beta$ denotes the regression parameters and $\varepsilon(t)$ the error term. The general linear model finds (and tests significance of) linear regressions between two data sets X and Y for each member of X (including dummy members added to represent experimental conditions). Methods based on general linear models have more variables (voxels) than samples (e.g. the number of examinations) and thus are severely underconstrained. By applying a uniform threshold of statistical significance it is possible to determine which voxels are likely to have had significant changes between conditions, and with what likelihood, assuming a null hypothesis of no changes between conditions.
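The voxel-by-voxel testing described above can be sketched in a few lines. This is a simplified illustration rather than the SPM software itself: it uses a pooled-variance two-sample Student's t statistic, the scan values are invented, and the critical value 2.447 (two-sided, alpha = 0.05, 6 degrees of freedom) is hard-coded from standard t tables because the Python standard library has no t distribution.

```python
import math

def two_sample_t(a, b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean_a - mean_b) / se, df

def t_map(task_scans, rest_scans):
    """One t statistic per voxel, each voxel analysed independently of the others."""
    n_voxels = len(task_scans[0])
    stats = []
    for v in range(n_voxels):
        task = [scan[v] for scan in task_scans]
        rest = [scan[v] for scan in rest_scans]
        t, _ = two_sample_t(task, rest)
        stats.append(t)
    return stats

# Hypothetical data: four scans per condition, three voxels per scan.
task_scans = [[10.1, 5.0, 7.9], [10.3, 5.2, 8.1], [9.9, 4.9, 8.0], [10.2, 5.1, 8.2]]
rest_scans = [[9.0, 5.1, 8.0], [9.2, 4.9, 8.1], [8.9, 5.0, 7.9], [9.1, 5.2, 8.2]]

T_CRITICAL = 2.447  # two-sided alpha = 0.05 with 4 + 4 - 2 = 6 degrees of freedom
stat_map = t_map(task_scans, rest_scans)
active = [abs(t) > T_CRITICAL for t in stat_map]  # uniform threshold over all voxels
```

Only the first voxel differs between the two conditions in this toy data set, so it is the only one that passes the threshold. With thousands of voxels, applying the same uncorrected threshold everywhere leads directly to the multiple comparison problem.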
Data from individuals can be combined into groups, in which case statistical inferences can be drawn either about the significance of differences between subjects, or the likelihood of consistent activation in the population from which the subjects were drawn [66]. Some early studies have examined the possibility of doing post-analysis on statistical maps, allowing comparisons of activations between independently analysed subjects [55,67]. One problem here is that when using voxel data, comparison of different scans may not align activations that are slightly misregistered. Recently, the spatial extent, i.e. the area of activation, has been used to detect significant regions of activation. If a spatial extent of activation is reported, however, it is sometimes only given as a voxel count above a threshold, which has been found to be a very unstable indicator of activation across trials [68].

Figure 2 Statistical parametric map. These displays of a statistical parametric map (SPM), produced using the SPM99 software package, show two views (one overlaid on a surface map, the other an axial slice) of 3-D brain activations for a simple motor task (courtesy Brain Imaging Lab, Dartmouth-Hitchcock Medical Center).

Another issue of concern in functional imaging is the consistency of observations across subjects. It is common for some activations during a study to be similar across subjects (and thus significantly correlated with the task under study), and for others to be idiosyncratic. There have been reports in the literature of difficulty in reproducing voxel-level significance maps in fMRI [68,69]. In the case of Tegeler [69], even the 2% most significant voxels were found to vary considerably across runs, subjects, and analysis techniques. Variability in activation maps is a concern; however, reproducibility of fMRI activations at a regional level has been found to be good in general across sites, subjects, and techniques [70], and comparisons of signal changes rather than significance values may address problems in voxel-level comparisons [68].

A third issue in functional imaging is the choice of analysis methods for generating activation data. Several reports have substantiated the difference in activations observed using different analysis techniques [69,71,72]. These differences may be a concern in a large database system. However, if results are tagged according to the analysis technique and parameters used to generate them, multiple analysis methods may actually be beneficial in supporting a pluralistic strategy for analysis [72].

A fourth issue in functional imaging is the necessity of limiting assumptions in analysis.
For example, the SPM model assumes a uniform distribution of noise with covariance between voxels estimated by a Gaussian distribution that decays with increasing distance, and unfortunately not all data conform well to this noise model (noise from unmodelled biological variability such as venous activity may not, for example). It is difficult to assess the extent to which conditions deviate from the model in practice [52], but tests with synthetic data have demonstrated large biases in the false positive rate [73] as a result of low-frequency physiological fluctuations and a variation of signal-to-noise ratio with imaging rate. Techniques for correcting long-term changes in mean signal intensity can improve the sensitivity of statistical analysis in this particular case.

Other methods

An analysis method similar in concept to statistical parametric mapping is correlation of observed signal changes with experimental blocks. This method can be applied at the voxel level, with a resulting activation map much like an SPM (but not necessarily statistical in nature). Other variations analyse groups of voxels with an overall mean change in signal [80], incorporate prior information [81] or wavelet analysis [82] into an SPM-type framework, use expectation maximization to estimate labelling parameters in a Markov random field model [83], or combine tests for significance with detection of high intensity signals [84].

Another approach used in the analysis of brain activations is to analyse relations between voxels, by using, for example, a six-dimensional correlation map correlating every voxel with each other voxel in order to detect correlated changes [85], or by using structural equation modelling to relate functional neuroimaging signals to underlying neurobiological activities [86]. Similarly, one can use an empirical assessment of the relation between data in random pairs of conditions in place of statistical tests of significance [87], although this is computationally intensive. Other approaches are based on the generation of cross-correlation image maps [88], Fourier-analysis-based time-series regression models [89], principal component analysis [90], partial least squares [76], independent component analysis [91], or structural equations to model functional connectivity among ROIs [92]. Spatial-lattice models are often applied to image analysis; these techniques are often based on Markov random fields, with inference techniques based on various modifications of likelihood-maximization procedures [93].

Another alternative is to divide brain images into meshes, treat functions as classes and meshes as attributes, and find rules like 'A and B ⇒ positive', which means that if mesh A and mesh B are active, then some function is positive/on [94]; thus the problem can be reduced to supervised inductive learning. The usual inductive learning algorithms such as C4.5 do not work for this problem, however, because (1) there are strong correlations between attributes and (2) there are usually too many attributes and too few samples (say 100). However, nonparametric regression can be applied to solve these problems. One algorithm for the discovery of rules from brain images consists of the following two steps: (1) a nonparametric regression [96] is applied on the training data set. The results are linear formulas of the form $y = p_1 x_1 + \dots + p_n x_n + p_{n+1}$, where $y$ is a dependent variable (i.e. function), and $x_1, \dots, x_n$ are Boolean independent variables (i.e. grids, the meshes above).
(2) Rules are extracted from the linear formula $y$ ($y$ is normalized to [0,1]) by converting it to a Boolean function $y'$ using an approximation: if $y \ge 0.5$, then $y' = 1$; otherwise, $y' = 0$. Since a naive implementation of the second step always takes exponential time and becomes unrealistic in practice, a better algorithm is to generate terms from low order to high order while applying a pruning strategy. Experiments on artificial data showed [94] that: (1) when there are no correlations between adjacent attributes, the accuracies are almost the same as the accuracies of C4.5, and (2) when there are strong correlations between adjacent attributes, the algorithm works better than C4.5 in terms of the accuracy of the result.

3.2 Structural imaging of the brain

Lesion-deficit analysis

In principle, lesion-deficit data could be analysed using methods similar to those for activation studies (e.g. SPM). We now present several additional statistical methods to determine structure–function relations through the study of lesions and associated deficits. After the preprocessing, i.e. the segmentation of lesions and registration of the binary images to a common standard, the binary images consist of normal and abnormal (lesioned) voxels. This type of structural image data, combined with the behavioural variables, forms the data for each subject. Mining methods for the discovery of the structure–function associations from this data can operate on a resolution range from the spatially distinct structures of an anatomical atlas (atlas-based analysis) to the voxel level (voxel-based analysis) [97].
Atlas-based analysis. In the case where anatomical structures represent functional units, the atlas-based analysis is more sensitive than voxel-based analysis since the atlas provides significant prior knowledge. The first step in the atlas-based analysis is to calculate, for each structure $s_i$ and subject $p_j$, the fraction of lesioned volume, $f_{s_i,p_j}$, which is defined as the volume of the lesioned part of $s_i$ divided by the volume of $s_i$. These fractions form the continuous structural variables. Here, we present methods for both categorical and continuous structural variables [13,14,97]. In the case of categorical variables, the lesioned fraction determines if a structure is lesioned (abnormal) or not. For example, a patient might be treated as having a lesion in a structure if the intersection of all his/her lesions with the structure is at least one voxel. To eliminate thresholding effects, the atlas structures can also be analysed as continuous variables, considering for each one the fraction that is lesioned.

When the search for the model that explains the data can be directed through specific hypotheses or prior knowledge, the situation is easier. Hypotheses can be formed after using explorative visualization or other methods, and can be tested using statistical analysis. An example where visualization helps to reduce the search space for a model is shown in Figure 3. If there is little preconception about the relationships between the variables, all the possibilities may have to be explored. This exploratory, or data mining, analysis is presented below. There are two analysis approaches one can follow: bivariate (pairwise) and multivariate.

Bivariate analysis. Let F be the number of functional and S be the number of structural variables respectively.
In the case of categorical structural variables, F × S two-way contingency tables are constructed and for each the Fisher exact test [98] is computed. The associations between structures and deficits are sorted in order of the p-values returned from the exact tests, and the ones with the lowest p-values are reported. For the same type of analysis with continuous structural variables one can use the Mann–Whitney test and logistic regression analysis (the Mann–Whitney statistic is appropriate because the distributions of the fractions of lesioned volumes are not Gaussian). In exploratory analysis of either categorical or continuous structural variables, computing a statistic for many pairwise tests leads to the multiple comparison problem, i.e. the situation where a certain undesirably high number of the tests are expected to be positive by chance (see Section 4.1 for a more complete treatment of the multiple comparison problem).

Figure 3 Visualization helps reduce the search space for the model that explains the data. Sum of lesions for the ADHD+ and the ADHD− groups of patients (four slices of the Talairach atlas, Tal-113, Tal-116, Tal-119 and Tal-124, are shown for each group). The right putamen and left thalamus are highlighted. Picture from Megalooikonomou et al. [181]

Multivariate analysis. Multivariate analysis may find complex multivariate associations not found by multiple uses of bivariate statistics. For example, consider a deficit that is associated with two structures and appears only when both of them are lesioned. Multivariate analysis is free of the multiple comparison problem, since it evaluates an entire model with one statistic. One multivariate extension of the chi-square test for categorical variables is log-linear analysis; logistic regression is another multivariate method that can be used to relate the log-odds of having a particular deficit to the fraction of lesioned structures. Stepwise logistic regression has also been used [97], where the algorithm for discovering the model that explains the interactions starts with no associations and a greedy approach is applied to add (or delete) associations based on their relative strength.

Voxel-based analysis. Atlas-based analysis results are only as good as the atlas that is being used.
Instead of imposing any high level structure on the image data, one can analyse them on a voxel-by-voxel basis. Voxels are typically labelled as either normal or abnormal, and thus structural variables are in this case categorical. Given that the number of voxels that are considered is typically on the order of 10 7, a like number (i.e ) of Fisher exact tests have to be performed for each of the functional variables that are examined. This procedure can be seen as clustering the voxels by functional association. 97 The calculation of the contingency table and the Fisher exact test is computationally intensive, and the multiple-comparison problem is also severe due to the large number of tests that must be performed. However, in this case it can be attacked with clustering analysis since false positives will not tend to cluster. Voxel-based regression analysis can also be used to determine whether voxels in a certain region are associated with a functional variable. One can construct a regression equation that relates lesions in a sphere of a given radius and centre to a deficit, and the causal brain region in which lesions are most strongly associated with that deficit can then be identified. 13,97 Let l be a lesion, o a sphere, vðrþ the volume of a region r, and iðr 1, r 2 Þ the intersection of two regions r 1 and r 2. Then the identification of the causal region is done by calculating the optimal centre and radius for the logistic regression equation: log itðdþ ¼ log ðodds d Þ¼af s þ b
where $\mathrm{odds}_d = p_d / (1 - p_d)$, $p_d$ is the probability of having a certain deficit $d$, $f_s = v(i(l, o)) / v(o)$ is the fraction of the sphere that is lesioned, $a$ = (log odds of $d$)/(lesioned fraction of sphere volume), and $b$ is the prior log odds of deficit $d$. Given the centre $(x, y, z)$ and the radius $r$ of the sphere, one can find values for the parameters $a$ and $b$ such that the sum of squares of residuals is minimized. The goal is to optimize the sphere parameters $(x, y, z, r)$ to obtain the best fit of the data to a regression line. The solution is the sphere that best discriminates between lesions that are and are not associated with deficit $d$. This nonlinear optimization procedure is computationally intensive, and cannot describe multifocal functional associations.

Results from mining lesion-deficit associations. In this section we present results from the mining process in BRAID. 13,14 The Brain Image Database includes images and clinical information from over 700 subjects from two different studies: the Cardiovascular Health Study (CHS) 99 and the Frontal Lobe Injury in Children (FLIC) study. 100 Visualization applied prior to the analysis procedure can help direct the analysis by choosing certain structures of the anatomical atlas to examine further using the statistical tests. Figure 3 shows the sum of lesions over all subjects that did and did not develop ADHD (Attention-Deficit Hyperactivity Disorder), i.e. ADHD+ and ADHD−, respectively. Based on these images and on previous research implicating a frontal lobe-basal ganglia-thalamic pathway, the right putamen and the left thalamus (highlighted in Figure 3)† were chosen for further analysis using the Fisher exact test for categorical and the Mann-Whitney test for continuous structural variables. The p-values in Table 1 confirm a strong association between lesions in the two structures and development of ADHD.
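The categorical analysis described above (one two-way contingency table per structure-deficit pair, a Fisher exact test on each, and a list sorted by p-value) can be sketched in a few lines. This is a minimal illustration and not the BRAID implementation: the function name, the pair labels and the counts are hypothetical, and the two-sided p-value is taken as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows = structure lesioned / not lesioned,
    columns = deficit present / deficit absent.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum over all tables with the same margins that are no more likely
    # than the observed one (integer arithmetic, so the comparison is exact).
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, col1)


# Hypothetical counts for two structure-deficit pairs (not data from the paper).
tables = {
    ("structure A", "deficit X"): (8, 2, 1, 5),
    ("structure B", "deficit X"): (3, 1, 1, 3),
}

# Sort the associations by p-value, smallest first.
ranked = sorted((fisher_exact_two_sided(*t), pair) for pair, t in tables.items())
for p_value, pair in ranked:
    print(pair, round(p_value, 4))
```

Using exact integer weights avoids floating-point ties when deciding which tables count as "at least as extreme"; for large counts a library routine would be the more practical choice.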
Running an exploratory analysis on the CHS data set (300 subjects) using the chi-square test to evaluate two-way contingency tables for all pairwise combinations of atlas structures (90) and functional variables (14) returns a list (sorted by p-value) of structure-function associations. 13,97 The five most significant associations are presented in Table 2. Highly significant lesion-deficit associations detected by BRAID, such as visual field deficit and lesions in contralateral orbital or cuneate gyrus, are also consistent with current clinical knowledge. 101 The incorrect association between the left hippocampus and a right visual field deficit is due to registration error, since the hippocampus is next to the optic radiations, which are very well known to be correlated with a visual field deficit. Preliminary stepwise logistic regression analysis using continuous structural variables from the FLIC data set shows similar results for the development of ADHD. This method identifies the left SupCerebellarA (which is lateral to the left putamen area) as a strong predictor.

† Due to the compromised connections between the frontal lobe and these two structures, it is believed that the frontal lobe is not able to exert its normal oversight function to suppress impulsive urges and behaviours. A common behavioural pattern in patients with ADHD is impulsivity and lack of self-control.

Results from a preliminary voxel-based analysis for the ADHD variable of the FLIC data set are presented in Figure 4(a). Each voxel represents the p-value for the association between the voxel being lesioned and the development of ADHD. A 3-D reconstruction is shown in Figure 4(b). These results
are consistent with those of the atlas-based analysis. Figure 5 shows one representative slice (119) of the Talairach atlas for the voxel-based regression analysis for ADHD. These results are consistent with all the previous ones for ADHD.

Figure 4 Voxel-based analysis for development of ADHD. (a) Six slices of the Talairach atlas and a colour bar that shows the correspondence between p-values and colour values. (b) Visualization of the voxel-based analysis p-value volume for development of ADHD. The higher the intensity, the lower the p-value. Picture from Megalooikonomou et al. 97
Table 1 Visualization-directed mining. Statistical analysis of selected Talairach atlas structures for association with ADHD (FLIC data set) 97

Structure | Fisher's exact p-value | Mann-Whitney p-value
R putamen | |
L thalamus | |

Table 2 Exploratory analysis. The five most significant structure-function associations given by the chi-square analysis on the CHS data set 97

Structure | Function | Chi-square p-value | S-Bonf. Correct. p-value
R globus pallid. | R hemiparesis | |
L hippocampus | R visual defect | |
R gyri angular | L pronat. drift | |
R gyri orbital | L visual defect | |
R gyri cuneus | L visual defect | |

Structure morphology analysis

Several methods have been applied in extracting knowledge about the morphological variability of brain structures. Study of the location, size, surface area, volume, and shape of specific brain regions is critical for discovering normal brain organization, for defining anatomically-driven search areas for brain activity in functional imaging (PET, fMRI) scans, and for investigating pathological changes in the case of diseases affecting these structures. Some of the same voxel-based analysis techniques described in relation to functional studies have been applied to anatomy as well; in general, voxel-based morphometry identifies changes in gray matter on a voxel-by-voxel basis. 186 This method is used to study the different composition of brain tissue after macroscopic shape differences are discounted using spatial normalization. Another common approach is to warp (deform) an individual's brain to an anatomical template (e.g. the Talairach atlas 29 ) and gather details about the warping that are used in the analysis. 106,107 A deformation function $d(u, v)$, defined at each point $(u, v)$ of the atlas structure $S$ of interest, measures the enlargement or shrinkage associated with the transformation from an infinitesimal region around a point in the atlas space to its corresponding infinitesimal region in the subject space.
In this method a comparison of two different brains or, more generally, two populations is achieved by comparing the corresponding deformation fields: regions with statistically significant differences are regions of morphological differences between the two populations. Results from applying this methodology 106 to a study of the corpus callosum for a small group of elderly subjects are shown in Figure 6. More details on the use of a deformation function in the analysis of morphological variability can be found in Appendix A2. Surface-based mesh modelling is a similar approach. 108,109 After minimal registration a parametric mesh is stretched over the surface contour of a structure or ROI
(see Figure 7). It is then compared to an average parametric mesh that is formed by calculating the mean and variation between corresponding points on the mesh. Finally, displacement vectors are generated for each individual structure. A local profile of change in structures in certain conditions can be provided through colour-coded topographic maps (see Figure 8). This method first aligns each brain volume using distance scaling to control for head size differences, allowing for inter-individual and group comparisons.

Figure 5 The optimal regression sphere (c) that best discriminates the two groups, i.e. between lesions that are (a) and are not (b) associated with the development of ADHD. Picture from Megalooikonomou et al. 97

Figure 6 Morphological variability of the corpus callosum between women and men for a group of elderly subjects. The posterior part (in white) was found to be significantly larger in women than in men (a). The average shape of the corpus callosum for (b) men and (c) women in a study by Davatzikos et al. 106 Pictures from Davatzikos et al. 106

A strategy for creating a population-based brain atlas using
volumetric warps is shown in Figure 9. The application of these methods in several studies has already revealed differences in the shape and size of certain structures related to gender (e.g. corpus callosum 106,110 ), in disorders such as schizophrenia, 107,111,112 in normal aging, and in Alzheimer's disease. 113 Probabilistic atlas approaches have been used for studying both normal and abnormal brains. 114 Another approach attempts to identify and register landmark configurations (defined as point sets that correspond biologically across images). 115 Image deformation algorithms designed to accomplish these goals are useful for identifying and measuring variations in structure, although they are not designed for tasks like finding tumours or activations. The Procrustes distance is one of the core tools of image deformation algorithms, and is calculated for two landmark configurations with the same landmarks by minimizing the sum of distances between corresponding landmark points while rotating around the normalized centroid of each. Finally, point-wise t-tests, ANOVAs, and partial correlations 116 as well as eigenvector and related analysis 115 have been used in computational neuroanatomy to study group differences in morphology and its associations with cognitive variables.

Figure 7 Extracting meshes (a) to create a cortical surface database, to search for differences where the deformation is regarded as an observation from a random vector field. Variability is calculated based on 3D displacement maps, which locally encode the amount of deformation required (b) to drive each subject's gyral pattern into exact correspondence with the average cortex for the group. Pictures from Thompson et al. 184

Other related work is the study of human anatomy, 120,121 which presents the most difficult challenges to the understanding of typicality and variability. While biological shapes are highly structured, they are not rigid.
Miller's group has been using Grenander's deformable anatomical templates for the representation of typicality and variability. For this, complex anatomical templates (human and macaque brains) are annotated with coordinate systems defined within them. High-dimensional vector fields applied to these coordinate systems carry a template, with all of its geometry, into the target. This allows for understanding modulo individual variation.

Morphological analysis of tree-like structures. Another tool for the analysis of brain structure and function is the morphological characterization of neurological brain structures. Tree-like structures, such as nerve-fibre tracing in
Figure 8 Three-dimensional visualizations of structural variability, asymmetry and group-specific differences. (a) Anatomical variability of the cerebral cortex in male schizophrenia patients and controls. Variability is shown on an average surface representation of the cortex derived from schizophrenia (left) and normal control (right) populations. Individual variations in brain structure in frontal association areas are greater in schizophrenia. Variability is calculated based on 3D displacement maps, which locally encode the amount of deformation required to drive each subject's gyral pattern into exact correspondence with the average cortex for the group. Picture from Narr et al. 182 (b) Ventricle variability maps for Alzheimer's disease. Pictures from Thompson et al. 183

Figure 9 Creating a population-based brain atlas to quantify local structural variations. A family of high-dimensional volumetric warps relating a new 3D MRI scan to each normal scan in a brain image database is calculated (I-II, above). The resulting warps encode the distribution in stereotaxic space of anatomic points that correspond across a normal population (III), and their dispersion is used to determine the likelihood (IV) of local regions of the new subject's anatomy being in their actual configuration. Colour-coded topographic maps highlight regional patterns of deformity in the anatomy of the new subject. Abnormal structural patterns are quantified locally, and mapped in three dimensions. Pictures from Thompson et al. 185
DTI, MR angiography or confocal microscopy, are registered (after segmentation and skeletonization) with standard structural and functional volumes. In addition, morphological analysis of these structures using various path analysis tools is performed. Morphological descriptors such as Sholl analysis, 122 moment analysis, and fractal dimension analysis are used to support content-based retrieval operations in 3D cell-centred neuronal databases. 126,127 Recently, visual data mining techniques combined with computational neural modelling have provided a very effective means to detect morphological influences on neuronal function.

Brain tumour analysis and classification. Classification is an important problem in data mining. Classifiers are useful for building taxonomies of images and subsequently performing image context-based searches. 129 Methods for finding similar tumour shapes in structural images 130 can also be used for brain tumours. Korn et al. use concepts from mathematical morphology, namely the pattern spectrum of a shape, to map each shape to a point in n-dimensional space. Starting from a natural similarity function (the 'maximum morphological distance'), they first prove a lower bound for it and then demonstrate how to search efficiently for nearest neighbours in large collections of tumour-like shapes using R-trees 131 and the Feature index (Findex) approach. 132 The technique was applied to realistic tumour shapes generated using an established tumour-growth model 133 and the results were very encouraging (see Figure 10). Fractal features and texture analysis have also been used for the quantitative description and recognition of brain tumours in 3-D MR images.

Combined structural and functional imaging of the brain

Structural and functional imaging are often combined. It is common to restrict activation studies to a certain area of interest that corresponds to an anatomical structure.
Here, we present a new area of research where both structural and functional images have to be mined together, and methods that can be potentially applied to both.

Gene expression, morphology and function

Discovering patterns of gene expression and their complex interaction with brain morphology and function is a fundamental goal in recent molecular biology and neurobiology studies. In situ hybridization and MRI have provided very high resolution images of gene expression in animal models. In addition, gene expression brain atlases for the mouse and the rat have started to appear. After the registration of anatomic and gene expression images across modalities and subjects through more involved methods than those presented in Section 2, spatial statistics methods have to be applied to find associations between anatomic, genetic, and nonimage variables such as behavioural measures, response to drugs, or onset of disease. The main challenge in finding associations among patterns of gene expression and phenotype is the synthesis of temporal information, spatial information, and static data. Similar work has been done in the analysis of functional images (as described earlier) where changes in signal intensity occur in response to
processing different kinds of stimuli. However, the fact that multiple genes can be expressed in the same brain location, and that the time sequence of gene expression may also be important, makes the problem even more challenging.

Figure 10 Query tumour images (left column) and their nearest neighbours, with respect to morphological distance. Picture from Korn et al. 130

Bayesian networks

Multivariate analysis methods like log-linear regression and logistic regression provide relatively simple methods for generating candidate models, usually relying on modifications of greedy search and making assumptions about cell frequencies or total number of samples that may not hold for rare cases. A more promising approach generates models called Bayesian networks 142 that consist of graphical structures along with statistical independence models. This method scores each model M, and returns the most probable model that could have generated the data D at hand (i.e. the multivariate multinomial distribution that generated D). 143,144 Briefly, a Bayesian network is a directed acyclic graph in which nodes represent variables of interest, such as structures or functions, and edges represent associations among these variables. Each node has a conditional-probability table that quantifies the strength of the associations between that node and its parents. Given the prior probabilities for the root nodes and conditional probabilities for other nodes, we can derive all joint probabilities 145 over these variables. An approach for generating a Bayesian network from data is described in Appendix A3. Recently the Minimum Description Length (MDL) principle has been applied to Bayesian network learning. 146,147 The principle states that the best model of a collection of data is the one that minimizes the sum of the encoding lengths of the data and the model itself.
148 The MDL metric is defined to measure the total description length DL of a network structure G, which is the sum of description lengths of each node. 147,149 The description length of each node is defined from two components, the network description length and the data description length. The first is the description length for encoding the network structure, which measures the simplicity of the network. The second is the description length for encoding the data, which measures the accuracy of the network.
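To make the two components concrete, the following sketch scores a small network with an MDL-style metric. The exact encoding used in the cited work is not reproduced here; the sketch assumes a common form in which the network description length of a node is (log2 N)/2 bits per free parameter of its conditional-probability table, and the data description length is N times the empirical conditional entropy of the node given its parents. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def node_description_length(data, child, parents, arity):
    """MDL-style score of one node: network part + data part (in bits).

    data    : list of dicts mapping variable name -> observed state
    arity   : dict mapping variable name -> number of states
    Assumed encoding (not taken from the paper): (log2 N)/2 bits per free
    parameter, plus N * H(child | parents) for the data.
    """
    n = len(data)
    parent_configs = math.prod(arity[p] for p in parents)  # 1 if no parents
    free_params = (arity[child] - 1) * parent_configs
    network_dl = 0.5 * math.log2(n) * free_params

    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marginal = Counter(tuple(r[p] for p in parents) for r in data)
    data_dl = -sum(c * math.log2(c / marginal[pa]) for (pa, _), c in joint.items())
    return network_dl + data_dl


def total_description_length(data, structure, arity):
    """Sum of node description lengths; structure maps node -> list of parents."""
    return sum(node_description_length(data, v, ps, arity)
               for v, ps in structure.items())


# Toy data: structure state s and function state f always agree.
data = [{"s": 0, "f": 0}] * 4 + [{"s": 1, "f": 1}] * 4
arity = {"s": 2, "f": 2}

with_edge = {"s": [], "f": ["s"]}
without_edge = {"s": [], "f": []}
print(total_description_length(data, with_edge, arity))     # shorter description
print(total_description_length(data, without_edge, arity))  # longer description
```

In this toy case the extra parameters needed for the edge cost less than the bits saved in encoding the data, so the network with the edge has the smaller total description length.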
Behavioural imaging

Another approach for modelling structure-function relationships is to transform neuropsychological test scores that assess cognitive functions to a 3-D spatial representation of the predicted sites of regional dysfunction. Gur et al. 150 presented such an algorithm for display and analysis of neuropsychological test scores that produces regional values from standardized (z-transformed) neuropsychological test scores using the formula:

$$B_j = \frac{\sum_i W(i, j)\, S_i}{\sum_i W(i, j)}$$

where $B_j$ is the index of behavioural functioning for a given region, $W(i, j)$ is the weight assigned to the $j$th brain region for the $i$th behavioural score, and $S_i$ is the test score. The method was demonstrated on a sample of hemi-parkinson patients 151 and later used to examine the sensitivity of cognitive test scores to lesions in specific ROIs, inter-expert agreement, and intra-expert reliability. 152 The method can be used to relate cognitive test scores to the results of structural and functional imaging, and has great potential for integrative data mining. Turkheimer et al. 187 also quantitatively examined the relationship between neuropsychological test scores and lesion locations on structural neuroimaging.

4 Important issues in mining of brain images

4.1 The multiple comparisons problem

Using an exploratory analysis and computing a statistic for many tests (as in the case of pairwise tests) leads to the multiple comparisons problem, i.e. the situation where an undesirably high number of the tests are expected to be positive by chance. A standard Bonferroni correction 98,153 suggests that one divide the significance threshold by the number of independent tests performed. This typically overestimates the number of independent tests performed, since test results are often correlated for neighbouring structures (activations or lesions often extend over neighbouring structures), and leads to loss of sensitivity.
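The Bonferroni threshold, together with the sequential variant discussed in this section, can be illustrated with a short sketch. It assumes that the sequential correction takes the usual step-down (Holm-type) form, in which the sorted p-values are compared with alpha/m, alpha/(m-1), and so on until the first failure; the function names and the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is below alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down (Holm-type) variant, assumed here as the sequential correction.

    The smallest p-value is compared with alpha/m, the next with alpha/(m-1),
    and so on; testing stops at the first p-value that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical p-values from four pairwise tests.
p_values = [0.001, 0.015, 0.02, 0.2]
print(bonferroni(p_values))             # only the smallest p-value survives
print(sequential_bonferroni(p_values))  # the sequential variant is less conservative
```

With these four values the plain correction rejects one hypothesis while the sequential variant rejects three, which illustrates the gain in sensitivity at the same nominal error level.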
A heuristic modification of the Bonferroni correction, the sequential Bonferroni correction, 98 can be used to get less pessimistic results. To do this, one sequentially increases the value of the significance threshold as hypotheses are evaluated. In task-activation studies, increasing the threshold for statistical significance increases the number of false positive activations that are detected. The Bonferroni correction only applies in the case where a null hypothesis is to be rejected (i.e. any lesions or activations at all are treated as unexpected). The correction is not necessary in the case where a hypothesized region of lesion/activation or a structure is chosen, 154 since in this case the null hypothesis of no changes is necessarily relaxed. If the cross-correlation between adjoining voxels is considered, a higher threshold can be used more safely. Single voxels that have signal changes of low significance may be the result of noise, but several voxels together that have a correlated change have a higher likelihood of representing a true lesion or activation. 155 One heuristic alternative to the Bonferroni correction is cluster filtering; in this method, clusters smaller than a certain size (number of voxels) are simply
discounted. Taking advantage of this kind of approach increases sensitivity, but it also adds risk of error at the cluster level (rather than only at the voxel level). 84 The overall error rate can still be controlled, so the net effect is to reduce errors.

4.2 Clustering voxels

Clustering is the process of finding, in a contiguous spatial region, voxels with similar significance in a voxel-based activation or lesion-deficit study. Clustering can be done after independent significance tests are performed by grouping adjacent significant voxels, or clusters can be calculated directly from the data by detecting correlated changes. 85 In the latter case, clustering can be viewed as a method for generating hypotheses that statistical testing can evaluate. 159 Clustering has been used to differentiate functional activations from other activity in the brain using statistical methods and neural networks. 166,167 Clustering has been claimed as a higher quality analysis tool than correlation analysis because of its ability to detect unanticipated differences in response, such as differing levels of activation 168 or similarities in the time-course of fMRI signal changes and stimuli. 163 The cluster filtering approach mentioned earlier depends on having an estimate of the likelihood of each cluster size, which depends on the noise distribution. Images of the noise distribution can be obtained by, for example, subtracting the results of one condition from a repetition of that condition. Using simulated images derived with the same spatial correlation as these images, one can estimate the probability of observing clusters above a given size and thus the probability of each cluster in the original data. 164 Then, clusters below a desired probability threshold can be discarded as too uncertain.
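The simplest form of cluster filtering, grouping adjacent significant voxels and discarding clusters below a size cut-off, can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, voxels are assumed to be given as integer (x, y, z) coordinates that already passed a voxel-level threshold, adjacency is assumed to mean shared faces (6-connectivity), and the size cut-off is a free parameter rather than one derived from a noise model.

```python
from collections import deque


def cluster_filter(significant, min_size):
    """Group face-adjacent significant voxels and keep clusters of at least min_size.

    significant : iterable of (x, y, z) integer coordinates of voxels that
                  passed the voxel-level significance threshold.
    Returns a list of clusters, each a set of voxel coordinates.
    """
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    unvisited = set(significant)
    kept = []
    while unvisited:
        seed = unvisited.pop()
        cluster = {seed}
        queue = deque([seed])
        # Breadth-first search over face-adjacent significant voxels.
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in neighbours:
                nb = (x + dx, y + dy, z + dz)
                if nb in unvisited:
                    unvisited.remove(nb)
                    cluster.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_size:
            kept.append(cluster)
    return kept


# Three connected voxels and one isolated voxel (hypothetical coordinates).
voxels = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (5, 5, 5)}
clusters = cluster_filter(voxels, min_size=2)
print(clusters)  # the isolated voxel is discounted
```

In a probability-based version of the filter, the fixed `min_size` would be replaced by a cut-off derived from the estimated distribution of cluster sizes under noise.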
4.3 Verification of mining and power considerations

In previous sections we discussed methods for finding associations between tasks and activations, or between lesions and deficits. However, the evaluation of the discovered knowledge for the structure-function analysis methods is not usually addressed. Several researchers have studied the correspondence of sample size to power for statistical tests such as the chi-square and Fisher exact tests of independence, 169 and compared the relative power of different statistical tests of independence. In addition, simulations have studied the power of chi-square analysis in sample spaces of much higher dimensionality, as one would expect to find in many epidemiological studies. However, no closed-form power analyses exist that can account for the simultaneous effects of image noise and registration error, in addition to the characteristics of the statistical methods being employed.

One can use a simulator 177 to not only test the scalability of mining methods, but also evaluate different methods as a function of the number of samples needed, the strength and complexity of associations, the spatial distribution of ROIs, and the registration method used. A simulator can generate a large number of artificial subjects and construct a probabilistic model of lesion-deficit or task-activation associations. One can then model the error of a given registration method, apply it to the image data, perform mining, and compare the generated associations with those detected by the mining methods. The number of subjects required to recover the known associations reflects the statistical power of the particular combination of image-processing and statistical methods being evaluated.
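The idea of estimating power by simulation can be illustrated with a deliberately small sketch. It is not the simulator cited above: it models a single structure and a single deficit, ignores lesion geometry and registration error, and simply counts how often a Fisher exact test recovers a known association for a given number of subjects. All names and parameter values are hypothetical.

```python
import random
from math import comb


def fisher_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        return comb(row1, x) * comb(row2, col1 - x)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


def detection_rate(n_subjects, p_deficit_if_normal, p_deficit_if_abnormal,
                   prior=0.5, threshold=0.001, trials=200, seed=0):
    """Fraction of simulated studies in which the known association is detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = b = c = d = 0
        for _ in range(n_subjects):
            abnormal = rng.random() < prior  # structure lesioned?
            p = p_deficit_if_abnormal if abnormal else p_deficit_if_normal
            deficit = rng.random() < p       # deficit present?
            if abnormal and deficit:
                a += 1
            elif abnormal:
                b += 1
            elif deficit:
                c += 1
            else:
                d += 1
        if fisher_p(a, b, c, d) <= threshold:
            hits += 1
    return hits / trials


# A deterministic association versus a barely detectable one, 40 subjects each.
print(detection_rate(40, 0.0, 1.0))
print(detection_rate(40, 0.49, 0.51))
```

Repeating the call over a range of `n_subjects` gives an empirical power curve; a fuller simulator would add further structures, interactions between them, and a model of registration error.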
Using a simulator

As a case study, we show results from the evaluation of the Fisher exact test for the detection of lesion-deficit associations. 177 The results quantify the sensitivity and accuracy of the mining method as a function of the number of subjects in the sample, the strength and complexity of the associations, and the errors that arise due to imperfect registration. Comparing the results of simulated analysis to known associations allows one to quantify the performance of a mining method. For this study, the simulation parameters for the distributions were obtained from data collected as part of the Frontal Lobe Injury in Childhood (FLIC) 100 lesion-deficit study. Simulated lesions were generated using distributions for the number, size, and location of lesions. Because misregistration introduces noise in the form of false-negative and false-positive associations, this source of error was modelled by assuming that it follows a 3-D nonstationary Gaussian distribution. Registration error was estimated by measuring the error on distinct anatomical landmarks on a number of subjects and then interpolating the error in the rest of the brain.

The lesion-deficit-association model, with its conditional-probability tables and prior probabilities, describes the relationships between structures and functions. In the case where structure and function variables are categorical (normal vs abnormal), these associations can be modelled using Bayesian networks (BNs) 145 as covered in the section on Bayesian networks. To examine the effect of the strength of the lesion-deficit associations on the ability of the mining methods to detect them, Table 3 presents three cases corresponding to strong, moderate, and weak associations. Thus, a strong association between a structure $s_i$ and a function $f_j$ is denoted by conditional probabilities $p(f_j = A \mid s_i = N) = 0$, $p(f_j = A \mid s_i = A) = 1$, $p(f_j = N \mid s_i = N) = 1$ and $p(f_j = N \mid s_i = A) = 0$, where A means abnormal and N normal.
Moderate and weak associations were defined similarly. Nondeterministic disjunctive interactions between more than one structure and a function were modelled using a noisy-or model. 142 The prior probability of structure abnormality for each structure $s_i$, in each subject $p_j$, was calculated from $f_{s_i, p_j}$: the fraction of the volume of $s_i$ that overlapped with lesions for $p_j$. The conditional probability $p(s_i \mid f_{s_i, p_j})$ is expected to be a sigmoid function, although a step function with an appropriate threshold is used for simplicity. Each structure with at least 1% of its volume overlapping with lesions was labelled as abnormal for that subject. For each pair of simulated subject and structure, the prior-probability distribution was sampled and a binary vector for the structures was generated.

Table 3 Three cases of BNs considered in a simulator 177

Case | Association | Conditional probabilities for functions
1 | Strong | 0/1
2 | Moderate | 0.25/
3 | Weak | 0.49/0.51

By instantiating the states of all structure variables of the BN, the
conditional probability for each function variable was determined by table lookup. This probability was then used to generate the binary vector for the function variables, and Fisher's exact test of independence was applied to each structure-function pair.

Results from the evaluation of a mining system

In this section, we describe how a lesion-deficit simulator can be used to determine the number of subjects needed to discover the simulated lesion-deficit associations represented by a Bayesian network, as a function of the strengths of associations, the number of associations, the degree of the network (i.e. the number of structures related to a particular function), and the prior probabilities for structural abnormalities. A Bayesian network with sufficient complexity was used 177 to demonstrate the use of a simulator in reaching meaningful results regarding the performance of the Fisher exact test and the effects of misregistration. Since the performance of any method for detecting associations depends on the characteristics of the conditional-probability tables, three cases (see Table 3) were examined to study this effect. The prior probability of abnormality for each structure was set to 0.5 to allow testing the behaviour of the Fisher exact test for the optimal value of the prior probability. To generate the conditional-probability table for those function variables that were related to more than one structure, a noisy-or model was used. A threshold of 0.001 was used for the p-value, since this gives a good trade-off between the number of simulated associations and the number of false positives detected. Figure 11 demonstrates the dramatic effects of the different conditional-probability distributions on the power of lesion-deficit analysis. As expected, more samples are required to detect weaker associations.
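The noisy-or construction of a conditional-probability table for a function variable with several parent structures can be sketched as follows. This is a generic noisy-or, not code from the cited simulator: each abnormal parent independently produces the deficit with its own link probability, the optional leak term covers causes outside the model, and the link value 0.75 is a hypothetical choice made only for illustration.

```python
from itertools import product


def noisy_or(link_probs, parent_abnormal, leak=0.0):
    """P(function abnormal | parent states) under a noisy-or model.

    link_probs      : probability that each parent alone causes the deficit
    parent_abnormal : booleans, True where the parent structure is abnormal
    leak            : probability of the deficit with all parents normal
    """
    p_function_normal = 1.0 - leak
    for p, abnormal in zip(link_probs, parent_abnormal):
        if abnormal:
            # Each abnormal parent independently fails to cause the deficit
            # with probability (1 - p).
            p_function_normal *= 1.0 - p
    return 1.0 - p_function_normal


def noisy_or_cpt(link_probs, leak=0.0):
    """Full conditional-probability table over all parent configurations."""
    return {
        states: noisy_or(link_probs, states, leak)
        for states in product([False, True], repeat=len(link_probs))
    }


# Two parent structures with a hypothetical link probability of 0.75 each.
cpt = noisy_or_cpt([0.75, 0.75])
for states, p in cpt.items():
    print(states, p)
```

The table grows as 2^k in the number k of parent structures, but it is fully specified by k link probabilities, which keeps networks of higher degree easy to parameterise.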
The degree of the associations of the Bayesian network was found to have a much greater effect on the performance of the Fisher exact test than the total number of associations. This result implies that, for functions that are associated with many structures, identification of structure-function associations is difficult and requires a larger sample size. Figure 12(a) shows the performance of the Fisher exact test for three networks of 20, 40, and 80 edges and of the same degree (4) for the moderate case (i.e. case 2) of the conditional-probability tables. Figure 12(b) shows the effect of increasing the degree of the network (the number of structures affecting a particular function) while fixing the total number of edges using the moderate case of the conditional-probability tables.

Figure 11(b) demonstrates the performance of the Fisher exact test for the three cases of Bayesian network conditional probabilities (see Table 3) when the prior probability of a given structure being abnormal is obtained from the simulated data set. The number of edges that could actually be discovered is 55 (80%), since there were 14 edges from structures that did not intersect any lesions. Comparing this figure with Figure 11(a), in which uniform prior probabilities were used, more subjects are required to recover all associations when data-derived prior probabilities are used instead of uniform prior probabilities, as expected. Also as expected, the number of subjects needed is inversely proportional to the smallest prior probability. The detection of false-positive associations is due to the existence of associations among neighbouring structures due to lesions that intersect more than one structure. Additional false positives can be observed in cases where associations occur between
Figure 11 Evaluating a mining method. Performance of the Fisher exact test (p-value threshold 0.001) for (a) uniform (0.5) prior probabilities and (b) data-derived prior probabilities of structure abnormality, for the three strengths of lesion-deficit associations from Table 3 that correspond to strong (case 1), moderate (case 2) and weak (case 3) associations. The difference between the total number of associations detected and the number of true associations detected is the number of false-positive associations detected for each case. The horizontal line in (a) represents the total number of simulated edges (69) and in (b) represents the total number of simulated edges that can be detected (55). Graphs from Megalooikonomou et al. 177
Figure 12 Evaluating a mining method. (a) Performance of the Fisher exact test (p < 0.001) for BNs with degree 4 with 20, 40 and 80 edges. (b) Performance of the Fisher exact test (p < 0.001) for BNs with 48 edges, and with degree 4, 6, and 8. Graphs from Megalooikonomou et al. 177
behavioural variables. On average the specific registration method used reduces the number of associations discovered by 13% for the same number of subjects when compared with perfect registration.

5 Concluding remarks

In this review we have presented data mining methods that have been or could be used for knowledge discovery from brain images of different modalities along with other clinical data. We have focused on the problems of: (1) finding associations between structures and functions through task-activation and lesion-deficit studies, (2) studying the morphological variability of brain structures and finding associations with certain conditions, (3) classifying shapes of brain structures, including tree-like structures such as nerve fibres and abnormalities such as tumours, and searching for similarity, and (4) finding associations between gene expressions, morphology, and function. We have presented results of applying mining methods to epidemiological data that demonstrate detection of several clinically meaningful associations in different studies. These methods can lead to interesting conclusions about the functional mapping of the human brain, the effect of lesions or other abnormalities on the development of neurological and neuropsychological deficits, and the effect of certain diseases and gene expressions on structural morphology and function. Visualization can help reduce the inherently enormous search space in statistical analysis. Exploratory analysis through the use of a statistic for many tests produces reasonable results, although one has to deal with the multiple-comparison problem. Voxel-based approaches show encouraging results, but are computationally intensive and even more severely affected by the multiple-comparison problem (although the latter can be addressed with clustering analysis).
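The multiple-comparison issue noted above can be illustrated with a minimal sketch, assuming one Fisher exact test per candidate structure-function pair and a simple Bonferroni adjustment. The Bonferroni correction, the function names, and the counts are illustrative assumptions and are not taken from the studies reviewed here; the exact test is written out with the standard library so that the sketch is self-contained.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are structure abnormal / normal,
    columns are deficit present / absent.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(k: int) -> int:
        # Unnormalized hypergeometric probability of the table whose
        # top-left cell equals k, with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(n, col1)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# Hypothetical counts for one structure-function pair (illustrative only).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
threshold = bonferroni_threshold(0.05, 50)  # assumes 50 candidate pairs
print(f"p = {p_value:.4f}, significant after correction: {p_value < threshold}")
```

With these hypothetical counts the association would pass an uncorrected 0.05 level but not the corrected one, which is the trade-off between false positives and required sample size that the simulations discuss.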
Statistical simulations show that more advanced mining methods and large sample sizes are required to determine lesion-deficit associations accurately, with a reduced number of false-positive associations. Simulators can be used for verifying and comparing mining methods in brain imaging. Their use is especially important in determining the number of subjects needed to detect all associations while reducing false positives. In particular, in lesion-deficit analysis, simulators have shown that the number of subjects required to detect all and only those associations in the underlying model (i.e. the ground truth) may be in the thousands, even for strong associations, particularly if the spatial distribution of lesions does not extend to all structures. The further the prior probabilities fall below the 0.5 level, the more difficult it becomes to discover associations. These results underline the necessity of developing large image databases for the purpose of meta-analysis of data pooled from multiple studies, so that more meaningful results can be obtained. The testing procedure framework is very important, since it can be used to characterize the power of methods for detecting multivariate associations while taking into account the effects of registration and noise. Simulators can also be used in the evaluation of new analysis methods, as well as in the study of the effect of different registration and segmentation algorithms. Existing mining algorithms are limited in that they typically assume data will consist of individual numeric and symbolic features. We still lack effective algorithms
for learning from data that is represented as a combination of various types (i.e. multimedia data). Predictions based on the full medical record could potentially achieve much greater accuracy than those that are limited to one data type. In addition, prediction accuracy can be improved by inventing more appropriate features to describe the brain data. We need new methods that actively generate optimal experiments to collect the most informative data. Another obstacle is integrating data from different investigators and analysing them jointly. Brain imaging data are usually collected in a single database for a specific study and with a specific data mining task in mind, so an additional important issue is interoperability and the ability to learn from multiple databases. 178 Also, the mining algorithms developed so far tend to be fully automated and therefore do not allow active experimentation, i.e. guidance from experts at key stages in the search for brain data regularities. Ideally, human experts should be able to collaborate closely with a mining algorithm to form hypotheses and test them against the data. In addition, mining methods need to be able to scale to extremely large data sets. Research during the past few years has already produced more efficient algorithms for such problems as learning association rules 2 and efficient visualization of large data sets. 179 A closer integration of machine learning algorithms into database management systems is also needed.

Acknowledgements

The authors wish to thank Christos Davatzikos, Eddie Herskovits, Christos Faloutsos, Paul Thompson, David Isecke, Ling Cheng, and Tilmann Steinberg for providing pictures, comments, and other helpful information. This work was supported in part by the Ira DeCamp Foundation, NARSAD and New Hampshire Hospital. Support was also provided by the Dartmouth Experimental Visualization Laboratory (DEVLAB).

References

1 Koslow SH, Huerta MF eds.
Neuroinformatics: an overview of the Human Brain Project. Mahwah, NJ: Lawrence Erlbaum,
2 Agrawal R, Imielinski T, Swami A. Database mining: a performance perspective. IEEE Transactions on Knowledge and Data Engineering 1993; 5:
3 Huerta MF, Koslow SH, Leshner AI. The Human Brain Project: an international resource. Trends in Neuroscience 1993; 16:
4 Mitchell TM. Machine learning and data mining. Communications of the ACM 1999; 42:
5 Anderson S, Damasio H. Neuropsychological impairments associated with lesions caused by tumor or stroke. Archives in Neurology 1990; 47:
6 Fox P, Mintun M, Reiman E, Raichle M. Enhanced detection of brain responses using intersubject averaging and change-distribution analysis of subtracted PET images. Journal of Cerebral Blood Flow and Metabolism 1988; 8:
7 Fox P. Functional brain mapping with positron emission tomography. Seminars in Neurology 1989; 9:
8 Fox P, Mintun M. Noninvasive functional brain mapping by change-distribution analysis of averaged PET images of H₂¹⁵O tissue activity. Journal of Nuclear Medicine 1989; 30:
9 Fox P. Physiological ROI definition by image subtraction. Journal of Cerebral Blood Flow and Metabolism 1991; 11: A
10 Evans A, Beil C, Marrett S, Thompson C, Hakim A. Anatomical-function correlation using an adjustable MRI-based region of interest atlas with positron emission tomography. Journal of Cerebral Blood Flow and Metabolism 1988; 8:
11 Evans A, Marrett S, Torrescorzo J, Ku S, Collins L. MRI-PET correlation in three dimensions using a volume-of-interest (VOI) atlas. Journal of Cerebral Blood Flow and Metabolism 1991; 11: A
12 Arya M, Cody W, Faloutsos C, Richardson J, Toga A. A 3D medical image database management system. International Journal of Computerized Medical Imaging and Graphics 1996; 20:
13 Letovsky SI, Whitehead SH, Paik CH et al. A brain image database for structure-function analysis. American Journal of Neuroradiology 1998; 19:
14 Herskovits EH, Megalooikonomou V, Davatzikos C, Chen A, Bryan RN, Gerring JP. Is the spatial distribution of brain lesions associated with closed-head injury predictive of subsequent development of attention-deficit hyperactivity disorder?: Analysis with brain-image database. Radiology 1999; 213:
15 Nielsen FA, Hansen LK. Modeling of brainmap data. In: NIPS 99. Denver, Colorado;
16 Nowinski WL, Fang A, Nguyen BT et al. Multiple brain atlas database and atlas-based neuroimaging system. Computer Aided Surgery 1997; 2:
17 Levrier O, Poline J, Tzourio N, Mazoyer B, Salamon G. Individual functional neuroanatomy using PET-MRI integration. In: 31st Annual Meeting of the American Society of Neuroradiology. Vancouver, BC,
18 Rajapakse J, Giedd J, Rapoport J. Statistical approach to segmentation of single-channel cerebral MR images. IEEE Transactions on Medical Imaging 1997; 16:
19 Pal N, Pal S. A review on image segmentation techniques. Pattern Recognition 1993; 26:
20 Zhang Y. A survey on evaluation methods for image segmentation. Pattern Recognition 1996; 29:
21 Worth A, Makris N, Caviness V, Kennedy D. Neuroanatomical segmentation in MRI: Technological objectives. International Journal of Pattern Recognition and Artificial Intelligence 1997; 11:
22 Vannier M, Butterfield R, Rickman D, Jordan D, Murphy W, Biondetti P. Multispectral magnetic resonance image analysis. CRC Critical Reviews in Biomedical Engineering 1987; 15:
23 Held K, Korps E, Krause B, Wells W, Kikinis R, Muller-Gartner H. Markov random field segmentation of brain MR images. IEEE Transactions on Medical Imaging 1997; 16:
24 Chang M, Sezan M, Tekalp A, Berg M. Bayesian segmentation of multislice brain magnetic resonance imaging using three-dimensional Gibbsian priors. Optical Engineering 1996; 35:
25 Collins D, Evans A. ANIMAL: validation and applications of nonlinear registration-based segmentation. International Journal of Pattern Recognition and Artificial Intelligence 1997; 11:
26 Gee J, Reivich M, Bajcsy R. Elastically deforming 3D atlas to match anatomical brain images. Journal of Computer Assisted Tomography 1993; 17:
27 Miller M, Christensen G, Amit Y, Grenander U. Mathematical textbook of deformable neuroanatomies. Proceedings of the National Academy of Sciences 1993; 90:
28 Kamber M, Shingal R, Collins D, Francis G, Evans A. Model-based 3-D segmentation of multiple sclerosis lesions in magnetic resonance brain images. IEEE Transactions on Medical Imaging 1995; 14:
29 Talairach J, Tournoux P. Co-planar stereotaxic atlas of the human brain. Stuttgart: Thieme,
30 Bookstein F. Principal warps: thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence 1989; 11:
31 Collins D, Neelin P, Peters T, Evans A. Automatic 3D intersubject registration of MR volumetric data in standardized Talairach space. Journal of Computer Assisted Tomography 1994; 18:
32 Davatzikos C. Spatial transformation and registration of brain images using elastically deformable models. Computer Vision and Image Understanding 1997; 66:
33 Damasio H. Human brain anatomy in computerized images. Oxford: Oxford University Press,
34 Talairach J, Szikla G. Atlas d'anatomie stereotaxique du telencephale: etudes anatomoradiologiques. Paris: Masson,
35 Ono M, Kubik S, Abernathey C. Atlas of the cerebral sulci. Stuttgart: Thieme,
36 Brodmann K. Vergleichende Lokalisationslehre der Grosshirnrinde in ihren Principien dargestellt auf Grund des Zellenbaues, Barth, Leipzig. In: Some Papers on the Cerebral Cortex. Springfield, IL: Thomas, 1960:
37 Minoshima S, Koeppe R, Frey K, Ishihara M, Kuhl D. Stereotactic PET atlas of the human brain: aid for visual interpretation of functional brain images. Journal of Nuclear Medicine 1994; 35:
38 Bihan DL. Functional MRI of the brain: principles, applications and limitations. Neuroradiology 1996; 23:
39 Essen DV, Maunsell J. Hierarchical organization of functional streams in the visual cortex. Trends in Neurological Sciences 1983; 6:
40 Wible CG, Shenton ME, Fischer IA et al. Parcellation of the human prefrontal cortex using MRI. Psychiatry Research 1997; 76:
41 Ogawa S, Menon RS, Kim SG, Ugurbil K. On the characteristics of functional magnetic resonance imaging of the brain. Annual Review of Biophysical and Biomolecular Structures 1998; 27:
42 Bahn MM. A linear relationship exists among brain diffusion eigenvalues measured by diffusion tensor magnetic resonance imaging. Journal of Magnetic Resonance 1999; 137:
43 Friston KJ, Williams S, Howard R, Frackowiak RSJ, Turner R. Movement related effects in fMRI time series. Magnetic Resonance in Medicine 1996; 35:
44 Thacker NA, Burton E, Lacey AJ, Jackson A. The effects of motion on parametric fMRI analysis techniques. Physiological Measurement 1999; 20:
45 Ashburner J, Friston KJ. The role of registration and spatial normalization in detecting activations in functional imaging. Clinical MRI/Developments in MR 1997; 7:
46 Kim B, Boes JL, Bland PH, Chenevert TL, Meyer CR. Motion correction in fMRI via registration of individual slices into an anatomical volume. Magnetic Resonance in Medicine 1999; 41:
47 Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC. A unified statistical approach to determining significant signals in images of cerebral activation. Human Brain Mapping 1996; 4:
48 Descombes X, Kruggel F, Cramon DYv. fMRI signal restoration using a spatio-temporal Markov random field preserving transitions. NeuroImage 1998; 8:
49 Descombes X, Kruggel F, Cramon DYv. Spatio-temporal fMRI analysis using Markov random fields. IEEE Transactions on Medical Imaging 1998; 17:
50 Lindeberg T, Lidberg Par, Roland PE. Analysis of brain activation patterns using a 3-D scale-space primal sketch. Human Brain Mapping 1999; 7:
51 Lowe MJ, Sorenson JA. Spatially filtering functional magnetic resonance imaging data. Magnetic Resonance in Medicine 1997; 37:
52 Raz J, Turetsky B. Wavelet ANOVA and fMRI. In: Wavelet Applications in Signal and Image Processing VIII. Denver, Colorado,
53 Sijbers J, Dekker AJd, Van der Linden A, Verhoye TM, Van Dyck D. Adaptive anisotropic noise filtering for magnitude MR data. Magnetic Resonance Imaging 1999; 17:
54 Skudlarski P, Constable RT, Gore JC. ROC analysis of statistical methods used in functional MRI: individual subjects. NeuroImage 1999; 9:
55 Coulon O, Mangin J-F, Poline J-B, Frouin V, Bloch I. Structural group analysis of functional maps. In: Kuba MSaAT-P A ed. 16th International Conference on Information Processing in Medical Imaging. Visegrad, 1999:
56 Worsley KJ, Marrett S, Neelin P, Evans AC. Searching scale space for activation in PET images. Human Brain Mapping 1996; 4:
57 Poline J-B, Mazoyer BM. Enhanced detection in brain activation maps using a multifiltering approach. Journal of Cerebral Blood Flow and Metabolism 1994; 14:
58 Brammer MJ. Multidimensional wavelet analysis of functional magnetic resonance images. Human Brain Mapping 1998; 6:
59 Jansen M, Uytterhoeven G, Bultheel A. Image de-noising by integer wavelet transforms and generalized cross validation. Medical Physics 1999; 26:
60 Ruttimann UE, Unser M, Rawlings RR et al. Statistical analysis of functional MRI data in the wavelet domain. IEEE Transactions on Medical Imaging 1998; 17:
61 Zaroubi S, Goelman G. Complex denoising of MR data via wavelet analysis: application for functional MRI. Magnetic Resonance Imaging 2000; 18:
62 Kruggel F, Cramon DYv. Physiologically oriented models of the hemodynamic response in functional MRI. Lecture Notes in Computer Science. Berlin: Springer,
63 Kruggel F, Cramon DYv. Modeling the hemodynamic response in single-trial functional MRI experiments. Magnetic Resonance in Medicine 1999; 42:
64 Friston KJ, Holmes AP, Worsley KJ, Poline JP, Frith CD, Frackowiak RSJ. Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 1995; 2:
65 Friston K. Statistical parametric mapping and other analyses of functional imaging data. In: Toga A, Mazziotta J eds. Brain mapping: the methods. San Diego, CA: Academic Press,
66 Friston KJ, Holmes AP, Price CJ, Buchel C, Worsley KJ. Multisubject fMRI studies and conjunction analyses. NeuroImage 1999; 10:
67 Bosch V. Statistical analysis of multi-subject fMRI data: assessment of focal activations. Journal of Magnetic Resonance Imaging 2000; 11:
68 Cohen MS, DuBois RM. Stability, repeatability, and the expression of signal magnitude in functional magnetic resonance imaging. Journal of Magnetic Resonance Imaging 1999; 10:
69 Tegeler C, Strother SC, Anderson JR, Kim SG. Reproducibility of BOLD-based functional MRI obtained at 4 T. Human Brain Mapping 1999; 7:
70 Casey BJ, Cohen JD, O'Craven K et al. Reproducibility of fMRI results across four institutions using a spatial working memory task. NeuroImage 1998; 8:
71 Constable RT, Skudlarski P, Mencl E et al. Quantifying and comparing region-of-interest activation patterns in functional brain MR imaging: methodology considerations. Magnetic Resonance Imaging 1998; 16:
72 Lange N, Strother SC, Anderson JR et al. Plurality and resemblance in fMRI data analysis. NeuroImage 1999; 10:
73 Purdon PL, Weisskoff RM. Effect of temporal autocorrelation due to physiological noise and stimulus paradigm on voxel-level false-positive rates in fMRI. Human Brain Mapping 1998; 6:
74 Lowe MJ, Russell DP. Treatment of baseline drifts in fMRI time series analysis. Journal of Computer Assisted Tomography 1999; 23:
75 Maas LC, Frederick Bd, Yurgelun-Todd DA, Renshaw PF. Autocovariance based analysis of functional MRI data. Biological Psychiatry 1996; 39:
76 McIntosh AR, Brookstein FL, Haxby JV, Grady CL. Spatial pattern analysis of functional brain images using partial least squares. NeuroImage 1996; 3:
77 Owen CB. Multiple media correlation: theory and applications. Hanover: Dartmouth College,
78 Owen CB, Makedon F. Computed synchronization for multimedia applications. Dordrecht: Kluwer,
79 Worsley KJ, Poline JB, Friston KJ, Evans AC. Characterizing the response of PET and fMRI data using multivariate linear models. NeuroImage 1997; 6:
80 Worsley KJ, Poline J-B, Vandal AC, Friston KJ. Tests for distributed, non-focal brain activations. NeuroImage 1995; 2:
81 Frank LR, Buxton RB, Wong EC. Probabilistic analysis of functional magnetic resonance imaging data. Magnetic Resonance in Medicine 1998; 39:
82 Fu Z, Hui Y, Liang Z-P. Joint spatiotemporal statistical analysis of functional MRI data. In: Proceedings of IPCIP 98 International Conference on Image Processing, 1998:
83 Svensen M, Kruggel F, Von Cramon DY. Markov random field modelling of fMRI data using a mean field EM-algorithm. In: Pelillo ERHaM ed. Second International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition. York. London: Springer, 1999:
84 Poline J-B, Worsley KJ, Evans AC, Friston KJ. Combining spatial extent and peak intensity to test for activations in functional imaging. NeuroImage 1997; 5:
85 Worsley KJ, Cao J, Paus T, Petrides M, Evans AC. Applications of random field theory to functional connectivity. Human Brain Mapping 1998; 6:
86 Horwitz B. Modeling of functional brain imaging data. In: SPIE Ninth Workshop on Virtual Intelligence/Dynamic Neural Networks. Stockholm, Sweden,
87 Holmes AP, Blair RC, Watson JD, Ford I. Nonparametric analysis of statistic images from functional mapping experiments. Journal of Cerebral Blood Flow and Metabolism 1996; 16:
88 Bandettini PA, Jesmanowicz A, Wong EC, Hyde JS. Processing strategies for time-course data sets in functional MRI of the human brain. Magnetic Resonance in Medicine 1993; 30:
89 Bullmore E, Brammer M, Williams S et al. Statistical methods of estimation and inference for functional MR image analysis. Magnetic Resonance in Medicine 1996; 35:
90 Strother S, Kanno I, Rottenberg D. Principal component analysis, variance partitioning, and functional connectivity. Journal of Cerebral Blood Flow and Metabolism 1995; 15:
91 McKeown MJ, Makeig S, Brown CG et al. Analysis of fMRI data by blind separation into independent spatial components. Human Brain Mapping 1998; 6:
92 Buchel C, Coull J, Friston K. The predictive value of changes in effective connectivity for human learning. Science 1999; 283:
93 Cressie N. Statistics for spatial data. New York: John Wiley,
94 Tsukimoto H, Morita C. The discovery of rules from brain images. In: Discovery Science First International Conference, 1998:
95 Quinlan J. Induction of decision trees. Machine Learning 1986; 1:
96 Eubank R. Spline smoothing and nonparametric regression. New York: Marcel Dekker,
97 Megalooikonomou V, Davatzikos C, Herskovits E. Mining lesion-deficit associations in a brain image database. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, CA, 1999:
98 Andersen E. Introduction to the statistical analysis of categorical data. Berlin: Springer,
99 Bryan R, Manolio T. A method for using MR to evaluate the effects of cardiovascular disease of the brain: the cardiovascular health study. American Journal of Neuroradiology 1994; 15:
100 Gerring J, Brady K, Chen A et al. Neuroimaging variables related to the development of secondary attention deficit hyperactivity disorder in children who have moderate and severe closed head injury. Journal of the American Academy of Child and Adolescent Psychiatry 1998; 37:
101 Bryan R, Wells S, Miller T et al. Infarct-like lesions in the brain: prevalence and anatomic characteristics at MR imaging of the elderly: data from cardiovascular health study. Radiology 1997; 202:
102 Mummery CJ, Patterson K, Price CJ, Ashburner J, Frackowiak RS, Hodges JR. A voxel-based morphometry study of semantic dementia: relationship between temporal lobe atrophy and semantic memory. Annals in Neurology 2000; 47:
103 Burgel U, Schormann T, Schleicher A, Zilles K. Mapping of histologically identified long fiber tracts in human cerebral hemispheres to the MRI volume of a reference brain: position and spatial variability of the optic radiation. NeuroImage 1999; 10:
104 May A, Ashburner J, Buchel C et al. Correlation between structural and functional changes in brain in an idiopathic headache syndrome [see comments]. Nature Medicine 1999; 5:
105 Mummery CJ, Patterson K, Wise RJ, Vandenbergh R, Price CJ, Hodges JR. Disrupted temporal lobe connections in semantic dementia. Brain 1999; 122:
106 Davatzikos C, Vaillant M, Resnick SM, Prince JL, Letovsky S, Bryan RN. A computerized approach for morphological analysis of the corpus callosum. Journal of Computer Assisted Tomography 1996; 20:
107 Csernansky J, Joshi S, Wang L et al. Hippocampal morphometry in schizophrenia by high dimensional brain mapping. Proceedings of the National Academy of Sciences of the United States of America 1998; 95:
108 Thompson PM, Schwartz C, Toga AW. High-resolution random mesh algorithms for creating a probabilistic 3D surface atlas of the human brain. NeuroImage 1996; 3:
109 Thompson PM, Schwartz C, Lin RT, Khan AA, Toga AW. Three-dimensional statistical analysis of sulcal variability in the human brain. Journal of Neuroscience 1996; 16:
110 Alen L, Richey M, Cahi Y, Gorski R. Sex differences in the corpus callosum of the living human being. Journal of Neuroscience 1991; 11:
111 Delisi L. Brain imaging studies of cerebral morphology and activation in schizophrenia. In: Steinhauer SR, Gruzelier H eds. Neuropsychology, psychophysiology, and information processing. Amsterdam: Elsevier, 1991:
112 Nelson MD, Saykin AJ, Flashman LA, Riordan HJ. Hippocampal volume reduction in schizophrenia as assessed by magnetic resonance imaging: a meta-analytic study. Archives in General Psychiatry 1998; 55:
113 Thompson PM, Moussai J, Zohoori S et al. Cortical variability and asymmetry in normal aging and Alzheimer's disease. Cerebral Cortex 1998; 8:
114 Mazziotta JC, Toga AW, Evans A, Fox P, Lancaster J. A probabilistic atlas of the human brain: theory and rationale for its development. The International Consortium for Brain Mapping (ICBM). NeuroImage 1995; 2:
115 Bookstein F. Biometrics, biomathematics, and the morphometric synthesis. Bulletin of Mathematical Biology 1996; 58:
116 Davatzikos C, Resnick S. Sex differences in anatomic measures of interhemispheric connectivity: correlations with cognition in men but not in women. Cerebral Cortex 1998; 8:
117 Miller M, Banerjee A, Christensen G et al. Statistical methods in computational anatomy. Statistical Methods in Medical Research 1997; 6:
118 Gee J, LeBriquer L, Barillot C. Probabilistic matching of brain images. In: Bizais Y, Barillot C, Di Paola R eds. Information processing in medical imaging. Dordrecht: Kluwer, 1995:
119 Haller J, Christensen G, Joshi S et al. Hippocampal MR imaging morphometry by means of general pattern matching. Radiology 1996; 199:
120 Grenander U, Miller MI. Computational anatomy: an emerging discipline. Statistical Computing and Graphics Newsletter 1996; 7:
121 Grenander U, Miller MI. Computational anatomy: an emerging discipline. Quarterly of Applied Mathematics 1998; LVI:
122 Sholl D. Dendritic organization in the neurons of the visual and motor cortices of the cat. Journal of Anatomy 1953; 87:
123 Hu MK. Visual pattern recognition by moment invariants. IRE Transactions on Information Theory 1962; 8:
124 Sadjadi F, Hall E. Three-dimensional moment invariants. IEEE Transactions on Pattern Analysis and Machine Intelligence 1980; 2:
125 Teh C-H, Chin R. On image analysis by the methods of moments. IEEE Transactions on Pattern Analysis and Machine Intelligence 1988; 10:
126 Karten HJ, Kelly P. Content-based query and retrieval in neuroscience databases,
127 Jacobs GA, Theunissen FE. Extraction of sensory parameters from a neural map by primary sensory interneurons. Journal of Neuroscience 2000; 20:
128 Symanzik J, Ascoli GA, Washington SS, Krichmar JL. Visual data mining of brain cells. Computing Science and Statistics 1999; 31:
129 Rossmanith C, Handels H, Poppl S, Rinast E, Weiss D. Characterisation and classification of brain tumors in three-dimensional MR image sequences. In: Fourth International Conference on Visualization in Biomedical Computing, 1996:
130 Korn F, Sidiropoulos N, Faloutsos C, Siegel E, Protopapas Z. Fast and effective similarity search in medical tumor databases using morphology. In: SPIE Proceedings. Boston, MA,
131 Guttman A. R-trees: a dynamic index structure for spatial searching. ACM SIGMOD 1984;
132 Agrawal R, Faloutsos C, Swami A. Efficient similarity search in sequence databases. In: Foundations of Data Organization and Algorithms (FODO) Conference. Evanston, IL,
133 Eden M. A two-dimensional growth process. In: Fourth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, CA: University of California Press,
134 Ringwald M, Baldock R, Bard J et al. A database for mouse development. Science 1994; 265:
135 Williams B, Doyle M. An internet atlas of mouse development. Computerized Medical Imaging and Graphics 1996; 20:
136 Toga A, Santori E, Hazani R, Ambach K. A 3D digital map of the rat brain. Brain Research Bulletin 1995; 38:
137 Cohen F, Yang Z, Huang Z, Nissanov J. Automatic matching of homologous histological sections. IEEE Transactions on Biomedical Engineering 1998; 45:
138 Ali W, Cohen F. Registering coronal histological 2-D sections of a rat brain with coronal sections of a 3-D atlas using geometric curve invariants and B-spline representation. IEEE Transactions on Medical Imaging 1998; 17:
139 Rangarajan A, Chui H, Mjolsness E et al. A robust point matching algorithm for autoradiograph alignment. Medical Image Analysis 1997; 4:
140 Rangarajan A, Chui H, Bookstein F. The softassign Procrustes matching algorithm. Information Processing in Medical Imaging. London: Springer, 1997:
141 Besl P, McKay N. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 1992; 14:
142 Pearl J. Fusion, propagation and structuring in belief networks. Artificial Intelligence 1986; 29:
143 Herskovits E. Computer-based probabilistic network construction. PhD thesis. Stanford University, CA, 1991.
144 Cooper G, Herskovits E. A Bayesian method for the induction of probabilistic networks from data. Machine Learning 1992; 9:
145 Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. San Mateo, CA: Morgan Kaufmann,
146 Bouckaert R. Properties of measures for Bayesian belief network learning. In: Tenth Conference on Uncertainty in Artificial Intelligence. Seattle, WA,
147 Lam W, Bacchus F. Learning Bayesian belief networks: an approach based on the MDL principle. Computational Intelligence 1994; 10:
148 Rissanen J. Modeling by shortest data description. Automatica 1978; 14:
149 Lam W. Bayesian network refinement via machine learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 1998; 20:
150 Gur RC, Trivedi SS, Saykin AJ, Gur RE. Behavioral imaging: a procedure for analysis and display of neuropsychological test scores: I. Construction of algorithm and initial clinical application. Neuropsychiatry, Neuropsychology, and Behavioral Neurology 1988; 1:
151 Gur RC, Saykin AJ, Blonder LX, Gur RE. Behavioral imaging: II. Application of the quantitative algorithm to hypothesis testing in the population of hemiparkinson patients. Neuropsychiatry, Neuropsychology, and Behavioral Neurology 1988; 1:
152 Gur RC, Saykin AJ, Benton A et al. Behavioral imaging: III. Interexpert agreement and reliability of weightings. Neuropsychiatry, Neuropsychology, and Behavioral Neurology 1990; 3:
153 Fisher LD, Belle GV. Biostatistics: a methodology for the health sciences. New York: John Wiley,
154 Friston KJ. Testing for anatomically specified regional effects. Human Brain Mapping 1997; 5:
155 Forman SD, Cohen JD, Fitzgerald M, Eddy WF, Mintun MA, Noll DC. Improved assessment of significant activation in functional magnetic resonance imaging (fMRI): use of a cluster-size threshold. Magnetic Resonance in Medicine 1995; 33:
156 Baumgartner R, Somorjai R, Summers R, Richter W. Assessment of cluster homogeneity in fMRI data using Kendall's coefficient of concordance. Magnetic Resonance Imaging 1999; 17:
157 Baumgartner R, Ryner L, Richter W, Summers R, Jarmasz M, Somarjai R. Comparison of two exploratory data analysis methods for fMRI: fuzzy clustering vs. principal component analysis. Magnetic Resonance Imaging 2000; 18:
158 Golay X, Kollias S, Stoll G, Meier D, Valavanis A, Boesiger P. A new correlation-based fuzzy logic clustering algorithm for fMRI. Magnetic Resonance in Medicine 1998; 40:
159 Baumgartner R, Somorjai R, Summers R, Richter W, Ryner L, Jarmasz M. Resampling as a cluster validation technique in fMRI. Journal of Magnetic Resonance Imaging 2000; 11:
160 Baune A, Sommer FT, Erb M et al. Dynamical cluster analysis of cortical fMRI activation. NeuroImage 1999; 9:
161 Fadili MJ, Ruan S, Bloyet D, Mazoyer B. Unsupervised fuzzy clustering analysis of fMRI series. In: Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Vol. 20, Biomedical Engineering Towards the Year 2000 and Beyond. Hong Kong, 1998:
162 Filzmoser P, Baumgartner R, Moser E. A hierarchical clustering method for analyzing functional MR images. Magnetic Resonance Imaging 1999; 17:
163 Goutte C, Toft P, Rostrup E, Nielsen F, Hansen L. On clustering fMRI time series. NeuroImage 1999; 9:
164 Ledberg A, Akerman S, Roland PE. Estimation of the probabilities of 3D clusters in functional brain images. NeuroImage 1998; 8:
165 Moser E, Baumgartner R, Barth M, Windischberger C. Explorative signal processing in functional MR imaging. International Journal of Imaging Systems and Technology 1999; 10:
166 Fischer H, Hennig J. Neural network-based analysis of MR time series. Magnetic Resonance in Medicine 1999; 41:
167 Horwitz B, Tagamets MA. Predicting human functional maps with neural net modeling. Human Brain Mapping 1999; 8:
168 Baumgartner R, Windischberger C, Moser E. Quantification in functional magnetic resonance imaging: fuzzy clustering vs. correlation analysis. Magnetic Resonance Imaging 1998; 16:
169 Larntz K. Small-sample comparisons of exact levels for chi-squared goodness-of-fit statistics. Journal of the American Statistical Association 1978; 73:
170 Lee C, Shen S. Convergence-rates and power of 6 power-divergence statistics for testing independence in 2 by 2 contingency table. Communications in Statistics: Theory and Methods 1994; 23:
171 Oluyede B. A modified chi-square test of independence against a class of ordered alternatives in a RxC contingency table. Canadian Journal of Statistics 1994; 22:
172 Harwell M, Serlin R. An empirical study of five multivariate tests for the single-factor repeated measures model. Communications in Statistics: Simulation and Computation 1997; 26:
173 Tanizaki H. Power comparison of nonparametric tests: small-sample properties from Monte Carlo experiments. Journal of Applied Statistics 1997; 24:
174 Osius G, Rojek D. Normal goodness-of-fit tests for multinomial models with large degrees of freedom. Journal of the American Statistical Association 1992; 87:
175 Thomas R, Conlon M. Sample-size determination based on Fisher exact test for use in 2 × 2 comparative trials with low event rates. Controlled Clinical Trials 1992; 13:
176 Mannan M, Nassar R. Size and power of test statistics for gene correlation in 2 × 2 contingency-tables. Biometrical Journal 1995; 37:
177 Megalooikonomou V, Davatzikos C, Herskovits E. A simulator for evaluating methods for the detection of lesion-deficit associations. Human Brain Mapping in press,
178 Shu Y, Liaw J-S, Berger T, Shahabi C. Data mining for neuroscience databases. In: 29th Annual Meeting of Society for Neuroscience. Miami Beach, FL,
179 Faloutsos C, Lin K-I. FastMap: a fast algorithm for indexing, data mining and visualization of traditional and multimedia datasets. In: ACM SIGMOD Conference on Management of Data. San Jose, CA,
180 Cohen J. Statistical power analysis for the behavioral sciences. Mahwah, NJ: Lawrence Erlbaum,
181 Megalooikonomou V, Herskovits E. Mining structure-function associations in a brain image database. In: Cios K ed. Medical data mining and knowledge discovery. Berlin: Springer,
182 Narr K, Sharma T, Moussai J et al.
3D maps of cortical surface variability and sulcal asymmetries in schizophrenia and normal populations. In: 5th International Conference on Functional Mapping of the Human Brain. Dusseldorf,
Thompson P, Mega M, Toga A. Disease-specific brain atlases. In: Toga A, Mazziotta J, Frackowiak R eds. Brain mapping: the disorders. San Diego, CA: Academic Press,
Thompson P, Woods R, Mega M, Toga A. Mathematical/computational challenges in creating deformable and probabilistic atlases of the human brain. Human Brain Mapping 2000; 9:
Thompson P, Toga A. Detection, visualization and animation of abnormal anatomic structure with a deformable probabilistic brain atlas based on random vector field transformations. Medical Image Analysis 1997; 1:
Ashburner J, Friston KJ. Voxel-based morphometry: the methods. NeuroImage 2000; 11:
Turkheimer E, Yeo RA, Jones C, Bigler ED. Quantitative assessment of covariation between neuropsychological function and location of naturally occurring lesions in humans. Journal of Clinical and Experimental Neuropsychology 1990; 12:

Appendix A1: application of t-test in SPM

The t-test compares two groups of samples and determines whether there is a significant difference; here the groups are MR signal readings during two different states (conditions): one R state (rest or control) and one T state (test or task). Let xyz denote a voxel (volume element) of a 3-D image and let N be the total number of voxels. Then the 3-D images $[r_{xyz,R}]$, $[r_{xyz,T}]$ exist for each subject k. Let $[\Delta r_{xyz,k}]$, where

$$\Delta r_{xyz,k} = r_{xyz,k,T} - r_{xyz,k,R},$$

be the voxel-by-voxel subtraction picture of the differences in r between R and T. If all $\Delta r_{xyz}$ representing the brain in anatomically standardized pictures can be regarded as normally distributed, it is possible to calculate a descriptive t-test:

$$t(\Delta r_{xyz}) = \frac{E(\Delta r_{xyz})}{\sqrt{\mathrm{Var}(\Delta r_{xyz})/n}}$$

where $E(\Delta r_{xyz})$ and $\mathrm{Var}(\Delta r_{xyz})$ are the mean and variance of $\Delta r_{xyz}$, respectively:

$$E(\Delta r_{xyz}) = \frac{1}{n}\sum_{k=1}^{n} \Delta r_{xyz,k}$$

and

$$\mathrm{Var}(\Delta r_{xyz}) = \frac{1}{n-1}\sum_{k=1}^{n} \left(\Delta r_{xyz,k} - E(\Delta r_{xyz})\right)^{2}$$

Appendix A2: use of the deformation function in analysis of morphological variability

A deformation function $d(u,v)$ is defined at each point $(u,v)$ of the atlas structure, S, of interest. It measures the enlargement or shrinkage associated with the transformation from an infinitesimal region around a point in the atlas space to its corresponding infinitesimal region in the subject space. If the point $(u,v)$ of the atlas space is mapped to the point $[U(u,v), V(u,v)]$ in the subject space, then $d(u,v)$ is defined by

$$d(u,v) = \det\{\nabla[U(u,v), V(u,v)]\},$$

where $\nabla$ denotes the gradient of a vector function and $\det(\cdot)$ denotes the determinant of a matrix.

Intersubject comparisons are made by comparing deformation functions. Specifically, let $[U_1(u,v), V_1(u,v)], \ldots, [U_N(u,v), V_N(u,v)]$ be the maps from the structure S of the atlas to the structure S of each of the N subjects of a population. Let also $U_p(u,v)$ and $V_p(u,v)$ be the average of the N functions $U_1, \ldots, U_N$ and $V_1, \ldots, V_N$, respectively:

$$U_p(u,v) = \frac{1}{N}\sum_{i=1}^{N} U_i(u,v), \qquad V_p(u,v) = \frac{1}{N}\sum_{i=1}^{N} V_i(u,v)$$

The average structure $S_p$ of that population is defined as the collection of the points where the atlas structure points are mapped:

$$S_p = \bigcup_{(u,v)\in S_a} [U_p(u,v), V_p(u,v)]$$

where $S_a$ is the collection of points belonging to structure S of the atlas. Let $\mu_p(u,v)$ be the point-wise mean of the deformation function of the population:

$$\mu_p(u,v) = \frac{1}{N}\sum_{i=1}^{N} d_i(u,v)$$
where $d_1(u,v), \ldots, d_N(u,v)$ are the deformation functions of the N subjects. Then the difference between the two populations, denoted with subscripts 1 and 2, can be measured for each structural region as an effect size 180 defined as

$$e(u,v) = \frac{\mu_{p1}(u,v) - \mu_{p2}(u,v)}{\sigma(u,v)}$$

where $\sigma(u,v)$ is the point-wise standard deviation of the two populations combined.

Appendix A3: generation of a Bayesian network from a database

Without loss of generality, we assume in this section that each variable is Boolean, i.e. it represents a logical statement that is either true (T) or false (F). The problem of generating the Bayesian-network structure that is most likely to have generated the cases in the database D can be restated as

$$B_{S\max} = \arg\max_{B_S} P(B_S \mid D)$$

where $B_{S\max}$ is the network structure (i.e. set of associations) we seek. Using Bayes' theorem, we obtain

$$B_{S\max} = \arg\max_{B_S} \frac{P(D \mid B_S)\,P(B_S)}{P(D)}$$

Since the prior probability of observing the data is constant for all models, the problem reduces to solving

$$B_{S\max} = \arg\max_{B_S} P(D \mid B_S)\,P(B_S)$$

Here, we describe an approach for generating a Bayesian network from a database, based on the above equation. 143 Although we will discuss the application of this method to a lesion-deficit study, the method can be applied to the more general problem of finding associations. Let D be a database of lesion-deficit cases, let Z be the set of discrete variables represented by D, and let $B_S$ represent an arbitrary Bayesian-network structure containing just the variables in Z. In this section, we shall write as though database D were generated by Monte Carlo sampling of a Bayesian network with structure $B_S$ that is hidden from us, or, equivalently, from a multivariate distribution with conditional-independence among variables determined by $B_S$. The primary goal is to use D to discover $B_S$.

The assumptions that explicitly delineate the problem are: (1) the process that generated D is modelled as a Bayesian network containing just the variables in Z; (2) cases occur independently, given a Bayesian-network model; (3) cases are complete (not missing data); (4) the distributions $f(B_P \mid B_S)$ are independent of each other; and (5) the second-order probabilities are uniform. The application of assumption 1 yields

$$P(B_S, D) = \int_{B_P} P(D \mid B_S, B_P)\, f(B_P \mid B_S)\, P(B_S)\, dB_P$$
where $B_P$ is a vector whose values denote the conditional-probability assignments associated with Bayesian-network structure $B_S$, and f is the second-order conditional-probability density function over $B_P$ given $B_S$. The integral is over all possible value assignments to $B_P$. Thus, the integration is over all possible Bayesian networks that can have structure $B_S$. The integral in the above equation is a multiple integral in which the variables of integration are the conditional probabilities, $B_P$, associated with structure $B_S$. Note that this formulation explicitly admits the concept of a prior-probability distribution, $P(B_S)$, over the possible Bayesian-network structures. One can assume uniform priors over structures, or calculate these prior probabilities. 143,144,177 From these five assumptions Cooper and Herskovits derive an equation for computing $P(B_S, D)$:

$$P(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!$$

where n is the number of variables in D (or nodes in $B_S$), $r_i$ is the number of values that node i can assume, $q_i$ is the number of different instantiations of the parents of node i found in D, $N_{ij}$ is the number of cases in D with the parents of node i assuming the jth value in the list of their instantiations that are found in D, and $N_{ijk}$ is the number of the $N_{ij}$ cases in D with node i assuming value k. This equation allows the calculation of $P(B_S, D)$ using $P(B_S)$ combined with enumeration over the cases in the database. Maximizing $P(B_S, D)$ over all possible $B_S$ is equivalent to maximizing $P(B_S \mid D)$ over all possible $B_S$, which is the goal of this approach. Cooper and Herskovits 144 prove that this metric has polynomial computational complexity, and is asymptotically optimal in terms of delineating all and only associations among the variables in the underlying distribution.

Because the number of possible associations among the variables is more than exponential in the number of variables under consideration, it is critical that this metric be computationally efficient, and that heuristic search (e.g. simulated annealing or greedy search) be used.
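
As a rough illustration of the scoring equation in this appendix, the following Python sketch evaluates the logarithm of P(B_S, D) for a candidate structure by counting cases, as the text describes. It is only a sketch under stated assumptions, not the authors' implementation: the function name `log_k2_score`, the dictionary-based encoding of cases and structures, and the four-case lesion-deficit table are all hypothetical and invented for this example. Factorials are computed in log form with `math.lgamma` so that the products do not overflow.

```python
from collections import Counter
from math import exp, lgamma


def log_k2_score(cases, parents, arity, log_prior=0.0):
    """Log of P(B_S, D) for one candidate structure (illustrative sketch).

    cases     : list of dicts, variable name -> observed value
    parents   : dict, variable name -> tuple of its parent variables (the structure B_S)
    arity     : dict, variable name -> r_i, the number of values the variable can take
    log_prior : log P(B_S); 0.0 corresponds to an unnormalized uniform prior
    """
    total = log_prior
    for node, node_parents in parents.items():
        r = arity[node]
        n_ij = Counter()   # cases per parent instantiation j found in D
        n_ijk = Counter()  # cases per (parent instantiation j, node value k)
        for case in cases:
            j = tuple(case[p] for p in node_parents)
            n_ij[j] += 1
            n_ijk[(j, case[node])] += 1
        for count in n_ij.values():
            # log[(r_i - 1)! / (N_ij + r_i - 1)!], using lgamma(m + 1) = log(m!)
            total += lgamma(r) - lgamma(count + r)
        for count in n_ijk.values():
            # log(N_ijk!); unseen combinations contribute log(0!) = 0
            total += lgamma(count + 1)
    return total


# Hypothetical toy database: Boolean lesion and deficit indicators (made-up data).
cases = [
    {"lesion": 1, "deficit": 1},
    {"lesion": 1, "deficit": 1},
    {"lesion": 0, "deficit": 0},
    {"lesion": 0, "deficit": 0},
]
arity = {"lesion": 2, "deficit": 2}

no_edge = {"lesion": (), "deficit": ()}                    # no association
lesion_to_deficit = {"lesion": (), "deficit": ("lesion",)}  # lesion -> deficit

score_no_edge = log_k2_score(cases, no_edge, arity)
score_edge = log_k2_score(cases, lesion_to_deficit, arity)
print(exp(score_no_edge), exp(score_edge))
```

With equal structure priors, the structure containing the lesion-deficit association receives the higher score on this toy table, which is the comparison a heuristic search would repeat over many candidate structures.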
Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease
1 Neuroimaging module I: Modern neuroimaging methods of investigation of the human brain in health and disease The following contains a summary of the content of the neuroimaging module I on the postgraduate
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
The Wondrous World of fmri statistics
Outline The Wondrous World of fmri statistics FMRI data and Statistics course, Leiden, 11-3-2008 The General Linear Model Overview of fmri data analysis steps fmri timeseries Modeling effects of interest
runl I IUI%I/\L Magnetic Resonance Imaging
runl I IUI%I/\L Magnetic Resonance Imaging SECOND EDITION Scott A. HuetteS Brain Imaging and Analysis Center, Duke University Allen W. Song Brain Imaging and Analysis Center, Duke University Gregory McCarthy
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Obtaining Knowledge. Lecture 7 Methods of Scientific Observation and Analysis in Behavioral Psychology and Neuropsychology.
Lecture 7 Methods of Scientific Observation and Analysis in Behavioral Psychology and Neuropsychology 1.Obtaining Knowledge 1. Correlation 2. Causation 2.Hypothesis Generation & Measures 3.Looking into
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com
SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING
Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Image Segmentation and Registration
Image Segmentation and Registration Dr. Christine Tanner ([email protected]) Computer Vision Laboratory, ETH Zürich Dr. Verena Kaynig, Machine Learning Laboratory, ETH Zürich Outline Segmentation
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Environmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected].
Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected] Outline Motivation why do we need GPUs? Past - how was GPU programming
How are Parts of the Brain Related to Brain Function?
How are Parts of the Brain Related to Brain Function? Scientists have found That the basic anatomical components of brain function are related to brain size and shape. The brain is composed of two hemispheres.
Statistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
II. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
Norbert Schuff Professor of Radiology VA Medical Center and UCSF [email protected]
Norbert Schuff Professor of Radiology Medical Center and UCSF [email protected] Medical Imaging Informatics 2012, N.Schuff Course # 170.03 Slide 1/67 Overview Definitions Role of Segmentation Segmentation
Subjects: Fourteen Princeton undergraduate and graduate students were recruited to
Supplementary Methods Subjects: Fourteen Princeton undergraduate and graduate students were recruited to participate in the study, including 9 females and 5 males. The mean age was 21.4 years, with standard
Software Packages The following data analysis software packages will be showcased:
Analyze This! Practicalities of fmri and Diffusion Data Analysis Data Download Instructions Weekday Educational Course, ISMRM 23 rd Annual Meeting and Exhibition Tuesday 2 nd June 2015, 10:00-12:00, Room
Morphological analysis on structural MRI for the early diagnosis of neurodegenerative diseases. Marco Aiello On behalf of MAGIC-5 collaboration
Morphological analysis on structural MRI for the early diagnosis of neurodegenerative diseases Marco Aiello On behalf of MAGIC-5 collaboration Index Motivations of morphological analysis Segmentation of
CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19
PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations
Data Preprocessing. Week 2
Data Preprocessing Week 2 Topics Data Types Data Repositories Data Preprocessing Present homework assignment #1 Team Homework Assignment #2 Read pp. 227 240, pp. 250 250, and pp. 259 263 the text book.
Chapter 20: Data Analysis
Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification
Descriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major
Tutorial for proteome data analysis using the Perseus software platform
Tutorial for proteome data analysis using the Perseus software platform Laboratory of Mass Spectrometry, LNBio, CNPEM Tutorial version 1.0, January 2014. Note: This tutorial was written based on the information
Statistical Considerations in Magnetic Resonance Imaging of Brain Function
Statistical Considerations in Magnetic Resonance Imaging of Brain Function Brian D. Ripley Professor of Applied Statistics University of Oxford [email protected] http://www.stats.ox.ac.uk/ ripley Acknowledgements
Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal
Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether
Integration and Visualization of Multimodality Brain Data for Language Mapping
Integration and Visualization of Multimodality Brain Data for Language Mapping Andrew V. Poliakov, PhD, Kevin P. Hinshaw, MS, Cornelius Rosse, MD, DSc and James F. Brinkley, MD, PhD Structural Informatics
business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
NEURO M203 & BIOMED M263 WINTER 2014
NEURO M203 & BIOMED M263 WINTER 2014 MRI Lab 1: Structural and Functional Anatomy During today s lab, you will work with and view the structural and functional imaging data collected from the scanning
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
COMMENTS AND CONTROVERSIES Why Voxel-Based Morphometry Should Be Used
NeuroImage 14, 1238 1243 (2001) doi:10.1006/nimg.2001.0961, available online at http://www.idealibrary.com on COMMENTS AND CONTROVERSIES Why Voxel-Based Morphometry Should Be Used John Ashburner 1 and
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Final Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
Part-Based Recognition
Part-Based Recognition Benedict Brown CS597D, Fall 2003 Princeton University CS 597D, Part-Based Recognition p. 1/32 Introduction Many objects are made up of parts It s presumably easier to identify simple
Sanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 [email protected] 1. Introduction The field of data mining and knowledgee discovery is emerging as a
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
MEDIMAGE A Multimedia Database Management System for Alzheimer s Disease Patients
MEDIMAGE A Multimedia Database Management System for Alzheimer s Disease Patients Peter L. Stanchev 1, Farshad Fotouhi 2 1 Kettering University, Flint, Michigan, 48504 USA [email protected] http://www.kettering.edu/~pstanche
CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall
Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Blind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections
Blind Deconvolution of Barcodes via Dictionary Analysis and Wiener Filter of Barcode Subsections Maximilian Hung, Bohyun B. Kim, Xiling Zhang August 17, 2013 Abstract While current systems already provide
Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca
Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?
A Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 [email protected] Abstract The main objective of this project is the study of a learning based method
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
The Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
Advanced MRI methods in diagnostics of spinal cord pathology
Advanced MRI methods in diagnostics of spinal cord pathology Stanisław Kwieciński Department of Magnetic Resonance MR IMAGING LAB MRI /MRS IN BIOMEDICAL RESEARCH ON HUMANS AND ANIMAL MODELS IN VIVO Equipment:
Gerry Hobbs, Department of Statistics, West Virginia University
Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
life science data mining
life science data mining - '.)'-. < } ti» (>.:>,u» c ~'editors Stephen Wong Harvard Medical School, USA Chung-Sheng Li /BM Thomas J Watson Research Center World Scientific NEW JERSEY LONDON SINGAPORE.
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
Organizing Your Approach to a Data Analysis
Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize
How To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
2. MATERIALS AND METHODS
Difficulties of T1 brain MRI segmentation techniques M S. Atkins *a, K. Siu a, B. Law a, J. Orchard a, W. Rosenbaum a a School of Computing Science, Simon Fraser University ABSTRACT This paper looks at
Cognitive Neuroscience. Questions. Multiple Methods. Electrophysiology. Multiple Methods. Approaches to Thinking about the Mind
Cognitive Neuroscience Approaches to Thinking about the Mind Cognitive Neuroscience Evolutionary Approach Sept 20-22, 2004 Interdisciplinary approach Rapidly changing How does the brain enable cognition?
Introduction to Pattern Recognition
Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University [email protected] CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap
Palmprint Recognition By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap Palm print Palm Patterns are utilized in many applications: 1. To correlate palm patterns with medical disorders, e.g. genetic
Lecture 10: Regression Trees
Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,
Why do we have so many brain coordinate systems? Lilla ZölleiZ WhyNHow seminar 12/04/08
Why do we have so many brain coordinate systems? Lilla ZölleiZ WhyNHow seminar 12/04/08 About brain atlases What are they? What do we use them for? Who creates them? Which one shall I use? Brain atlas
SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
240ST014 - Data Analysis of Transport and Logistics
Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 240 - ETSEIB - Barcelona School of Industrial Engineering 715 - EIO - Department of Statistics and Operations Research MASTER'S
MetaMorph Software Basic Analysis Guide The use of measurements and journals
MetaMorph Software Basic Analysis Guide The use of measurements and journals Version 1.0.2 1 Section I: How Measure Functions Operate... 3 1. Selected images... 3 2. Thresholding... 3 3. Regions of interest...
Introduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
Data analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
A successful market segmentation initiative answers the following critical business questions: * How can we a. Customer Status.
MARKET SEGMENTATION The simplest and most effective way to operate an organization is to deliver one product or service that meets the needs of one type of customer. However, to the delight of many organizations
Introducing MIPAV. In this chapter...
1 Introducing MIPAV In this chapter... Platform independence on page 44 Supported image types on page 45 Visualization of images on page 45 Extensibility with Java plug-ins on page 47 Sampling of MIPAV
Classic EEG (ERPs)/ Advanced EEG. Quentin Noirhomme
Classic EEG (ERPs)/ Advanced EEG Quentin Noirhomme Outline Origins of MEEG Event related potentials Time frequency decomposition i Source reconstruction Before to start EEGlab Fieldtrip (included in spm)
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)
Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared
5 Factors Affecting the Signal-to-Noise Ratio
5 Factors Affecting the Signal-to-Noise Ratio 29 5 Factors Affecting the Signal-to-Noise Ratio In the preceding chapters we have learned how an MR signal is generated and how the collected signal is processed
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:
4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"
Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses
How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm
IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode
SIGNATURE VERIFICATION
Dr. H.B. Kekre, Dr. Dhirendra Mishra, Ms. Shilpa Buddhadev, Ms. Bhagyashree Mall, Mr. Gaurav Jangid, Ms. Nikita Lakhotia, Computer Engineering Department, MPSTME, NMIMS University
LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE
MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-
Chapter 12 Discovering New Knowledge Data Mining
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall. Additional material 2007 Dekai Wu. Chapter Objectives: Introduce the student to
Insurance Analytics - data analysis and predictive modelling in insurance. Pavel Kříž. Actuarial Science Seminar, MFF, 4.
Pavel Kříž, Actuarial Science Seminar, MFF, 4 April 2014. Summary: 1. Application areas of Insurance Analytics 2. Insurance Analytics
Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard
Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express
Introduction. Karl J Friston
Introduction Experimental design and Statistical Parametric Mapping Karl J Friston The Wellcome Dept. of Cognitive Neurology, University College London Queen Square, London, UK WCN 3BG Tel 44 00 7833 7456
Introduction to Statistics and Quantitative Research Methods
Purpose of Presentation: To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.
Principles of Data Mining. Pham Tho Hoan [email protected]
References: [1] David Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, MIT press, 2002 [2] Jiawei Han and Micheline Kamber,
Lecture 2: Descriptive Statistics and Exploratory Data Analysis
Further Thoughts on Experimental Design: 16 Individuals (8 each from two populations) with replicates. Pop 1, Pop 2. Randomly sample 4 individuals
