The Art of Spatial Data Mining A Review of A New Algorithm for Discovery of Spatial Association Rules


Expert System, Prof. Glenn Shafer, Fall 00
Anonymous

Abstract

This paper is a literature review of a new algorithm for mining association rules in spatial databases. Spatial data mining, or knowledge discovery in large spatial databases, is the process of extracting implicit knowledge, spatial relations, or other patterns that are not explicitly stored in spatial databases. Recent research in data mining has produced a set of interesting techniques, including methods for mining strong association and dependency rules and attribute-oriented induction for mining characteristic and discriminant rules. These studies lay a foundation and provide useful methods for exploring promising spatial data mining techniques.

Building on previous studies of spatial data mining and of association rule mining in transaction-based databases, this paper introduces and studies a method for mining strong spatial association rules in large spatial databases [6]. Discovering spatial association rules may disclose interesting relationships among spatial and non-spatial data in large spatial databases, and it therefore represents a new and promising direction in spatial data warehousing and spatial data mining.

The method presented in this paper explores efficient mining of spatial association rules at multiple approximation and abstraction levels. It first performs less costly, approximate spatial computation to obtain approximate spatial relationships at a high abstraction level, and then refines the spatial computation only for those data or predicates whose refined computation may contribute to the discovery of strong association rules. This two-step spatial mining algorithm facilitates mining strong spatial association rules at multiple concept levels by a top-down, progressive deepening technique [6].

The method assumes that the user has reasonably good knowledge of the kind of rules to be found in the database, and that good background knowledge, such as concept or operation hierarchies, exists for spatial or non-spatial generalization. These assumptions may rule out naive users and complex spatial databases with poorly understood structures.

This paper is related to my current research on spatial databases, spatial database warehouse modeling, and spatial database mining techniques under the supervision of Prof. Adam and Prof. Atluri. I am currently conducting a comprehensive survey to understand and organize the most recent theories and techniques in this area, and in my opinion the algorithm presented in this paper is one of those that deserve more research effort.

1. Introduction to spatial data mining

a. General introduction

A spatial database stores a large amount of space-related data, such as maps, preprocessed remote sensing or medical imaging data, and VLSI chip layout data. Spatial databases have many features that distinguish them from relational databases. They carry topological or distance information, are usually organized by sophisticated multidimensional spatial indexing structures that are accessed by spatial data access methods, and often require spatial reasoning, geometric computation, and spatial knowledge representation techniques.

Spatial data mining refers to the extraction of knowledge, spatial relationships, or other interesting patterns not explicitly stored in spatial databases. Such mining demands an integration of data mining with spatial database technologies. It can be used for understanding spatial data, discovering spatial relationships and relationships between spatial and non-spatial data, constructing spatial knowledge bases, reorganizing spatial databases, and optimizing spatial queries. It is expected to have wide applications in geographic information systems (GIS), geomarketing, remote sensing, image database exploration, medical imaging, navigation, traffic control, environmental studies, and many other areas where spatial data are used. However, extracting interesting and useful patterns from spatial databases is much more difficult than extracting the corresponding patterns from traditional numeric and categorical data, because of the complexity of spatial data types, spatial relationships, and spatial autocorrelation [8].

b. How to categorize spatial data mining based on the kinds of rules?

Current spatial data mining is commonly divided into three fields: the first examines the classification of spatial datasets, the second generalizes association rules to spatial co-location patterns, and the third focuses on detecting spatial outliers.

Spatial data mining can also be categorized by the kinds of rules to be discovered in spatial databases. A spatial characteristic rule is a general description of a set of spatial-related data. For example, a description of the general weather patterns in a set of geographic regions is a spatial characteristic rule. A spatial discriminant rule is a general description of the features that contrast or discriminate one class of spatial-related data from other classes. For example, a comparison of the weather patterns in two geographic regions is a spatial discriminant rule. A spatial association rule, the subject of this paper, is a rule that describes the implication of one feature or set of features by another set of features in spatial databases. For example, a rule like "most cities in Canada are close to the Canada-US border" is a spatial association rule [6].

There have been some interesting studies related to the mining of spatial databases. In this paper, I study the extension of techniques for mining association rules in transaction-based databases.

c. What is a spatial association rule?

A spatial association rule is a rule of the form A → B, where A and B are sets of predicates, some of which are spatial. In a large database, many association relationships may exist, but some of them occur rarely or do not hold in most cases. People are interested only in strong association rules, i.e., those that occur frequently and hold in most cases. For this reason, the concepts of minimum support and minimum confidence are introduced. Informally, the support of a pattern A in a set of spatial objects S is the probability that a member of S satisfies pattern A, and the confidence of A → B is the probability that pattern B occurs if pattern A occurs. A user or an expert may specify thresholds to confine the rules to be discovered to strong ones.

For example, we may find that 9% of cities within British Columbia (BC) and adjacent to water are close to the USA, which associates the predicates is_a, within, and adjacent_to with the spatial predicate close_to in the following format:

is_a(x, city) ∧ within(x, BC) ∧ adjacent_to(x, water) → close_to(x, USA). (9%)

Although such rules are usually not 100% true, they carry nontrivial and valuable knowledge about spatial associations, so it is interesting to discover them from large spatial databases. Various kinds of spatial predicates can constitute a spatial association rule. Examples include distance information such as close_to and far_away, topological relations such as intersect, overlap, and disjoint, and spatial orientation such as left_of and west_of.

Since spatial association mining needs to evaluate multiple spatial relationships among a large number of spatial objects, the process can be quite costly. This paper studies an efficient method for mining spatial association rules that uses a top-down, progressive deepening search technique.
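The informal definitions of support and confidence above can be illustrated with a minimal Python sketch. The four cities and their predicate sets are hypothetical toy data invented for illustration, and the function names `support` and `confidence` are assumptions, not part of the method in [6].

```python
from typing import Dict, FrozenSet, Set

# Hypothetical toy data: each spatial object (a city) maps to the set of
# predicates it satisfies. Names and values are invented for illustration.
objects: Dict[str, Set[str]] = {
    "city_a": {"within(BC)", "adjacent_to(water)", "close_to(USA)"},
    "city_b": {"within(BC)", "adjacent_to(water)", "close_to(USA)"},
    "city_c": {"within(BC)", "adjacent_to(water)"},
    "city_d": {"within(BC)"},
}


def support(pattern: FrozenSet[str], objs: Dict[str, Set[str]]) -> float:
    """Fraction of objects that satisfy every predicate in the pattern."""
    hits = sum(1 for preds in objs.values() if pattern <= preds)
    return hits / len(objs)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               objs: Dict[str, Set[str]]) -> float:
    """Estimate of P(consequent | antecedent) = support(A and B) / support(A)."""
    s_a = support(antecedent, objs)
    if s_a == 0:
        return 0.0
    return support(antecedent | consequent, objs) / s_a


A = frozenset({"within(BC)", "adjacent_to(water)"})
B = frozenset({"close_to(USA)"})

print(f"support(A)        = {support(A, objects):.2f}")
print(f"confidence(A → B) = {confidence(A, B, objects):.2f}")
```

A rule A → B would be reported as strong only if both values reach the thresholds chosen by the user or expert.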
The technique first searches at a high concept level for strong patterns and implication relationships among the large patterns at a coarse resolution scale. Then, only for those large patterns, it deepens the search to lower concept levels (i.e., their lower-level descendants). This deepening search continues until no large patterns can be found. An important optimization is that the search for large patterns at high concept levels may apply efficient spatial computation algorithms at a coarse resolution scale, for example computing a generalized close_to with approximate techniques such as R-trees or plane-sweep algorithms operating on minimum bounding rectangles (MBRs). Only the candidate spatial predicates that are worth detailed examination are then computed by refined spatial techniques. Such a multiple-level approach saves much computation, because it is very expensive to perform detailed spatial computation for all possible spatial association relationships [6].

2. Some existing methods related to spatial data mining

Statistical analysis is widely used for data mining, so it is reasonable to consider statistical techniques for spatial data mining. In fact, statistical spatial data analysis has been a popular approach to analyzing spatial data. The approach handles numerical data well and usually proposes realistic models of spatial phenomena [].

However, it typically assumes statistical independence among the spatially distributed data, which is not true in reality, since spatial objects are interrelated. This assumption violates Tobler's first law of geography: everything is related to everything else, but nearby things are more related than distant things. In other words, the attribute values of nearby spatial objects tend to systematically affect each other. In spatial statistics, an area of statistics devoted to the analysis of spatial data, this property is called spatial autocorrelation, and researchers have created, adapted, and applied statistical techniques to handle it. For example, in image processing and vision, the Markov Random Field (MRF) is a popular model for incorporating context into image segmentation and classification.

Another major approach in spatial data mining is to apply generalization techniques to spatial and non-spatial data, generalizing detailed spatial data to a certain high level and studying the general characteristics and data distribution at that level. Attribute-oriented induction is a popular method of this kind: it generalizes data to high-level concepts and describes general relationships between spatial and non-spatial data. Two algorithms were proposed: nonspatial-dominant generalization and spatial-dominant generalization.

The nonspatial-dominant generalization algorithm first performs attribute-oriented generalization on the task-relevant nonspatial data describing the properties of spatial objects. In this step, numerical data can be generalized to ranges or descriptive high-level concepts, and symbolic values to higher-level concepts. As a result, distinct low-level values may be generalized to identical high-level values, and tuples that share such identical high-level values can be merged, with their spatial pointers clustered into one slot in the spatial attribute. In the end, the map consists of a small number of regions with high-level descriptions.

The spatial-dominant generalization algorithm, on the other hand, first performs generalization on the query-related spatial data. Data are generalized using spatial data hierarchies, such as geographic or administrative regions provided by users, or hierarchical data structures such as quadtrees or R-trees. The generalized spatial entities cluster the related nonspatial data together. After generalization of the spatial data, every region can be described at a high concept level by one predicate or a set of predicates.

Knowledge mining in image databases, which can be treated as a major type of spatial database, has also been studied recently. A method for classifying sky objects and another for recognizing volcanoes on the surface of Venus have been studied, in both of which classification trees were used to make the final decisions.

Finally, spatial data mining techniques are closely related to traditional data mining methods in relational databases. In most cases, mining algorithms are first studied for the traditional case and then applied to spatial data to see whether they are feasible or need further modification.

3. A new method for mining spatial association rules

a. Deep insights into spatial association rules

Various kinds of spatial predicates can be involved in spatial association rules. They may represent topological relationships between spatial objects, such as disjoint, intersects, inside/outside, adjacent_to, covers/covered_by, and equal. They may also represent spatial orientation or ordering, such as left, right, north, and east, or distance information, such as close_to and far_away. To look more closely at the mining of spatial association rules, let us first introduce a formal definition.

Definition 1: A spatial association rule is a rule of the form

P1 ∧ ... ∧ Pm → Q1 ∧ ... ∧ Qn (c%)

where at least one of the predicates P1, ..., Pm, Q1, ..., Qn is a spatial predicate, and c% is the confidence of the rule, which indicates that c% of the objects satisfying the antecedent of the rule also satisfy the consequent.

Most people are interested only in patterns that occur relatively frequently (with large support) and in rules that have strong implications (with high confidence). Rules with large support and high confidence are called strong rules. Accordingly, two kinds of thresholds, minimum support and minimum confidence, are introduced and set in advance by users or experts. Moreover, since many predicates and concepts may have strong association relationships only at a relatively high concept level, the thresholds should be defined separately for different concept levels. For example, it is difficult to find regular association patterns between a particular house and a particular beach, but there may be a strong association between many expensive houses and luxurious beaches. It is therefore expected that many spatial association rules are expressed at a relatively high concept level.
To facilitate the specification of the primitives for spatial data mining, an SQL-like spatial data mining query interface, designed on the basis of a spatial SQL, has been proposed. It is used in the following example, which is studied thoroughly in [6].

Example 1: Let the spatial database to be studied adopt an extended-relational data model and a SAND (spatial-and-nonspatial database) architecture; that is, it consists of a set of spatial objects and a relational database describing the nonspatial properties of these objects. The example is confined to British Columbia (BC), a province of Canada, with the following database relations (tables) for organizing and representing spatial objects:

1. town (name, type, population, geo, ...)
2. road (name, type, geo, ...)
3. water (name, type, geo, ...)
4. mine (name, type, geo, ...)
5. boundary (name, type, admin_region_1, admin_region_2, geo, ...)

The attribute geo represents a spatial object (a point, line, area, etc.) whose spatial pointer is stored in a tuple of the relation (a row of the table) and points to a geographic map. The attribute type of a relation is used to categorize the types of spatial objects in the relation. For example, the types of road could be (national highway, local highway, street), and the types of water could be (ocean, sea, inlets, lakes, rivers, bay, creeks). The boundary relation specifies the boundary between two administrative regions. The omitted fields could hold other pieces of information, such as the area of a lake or the flow of a river.

Suppose a user is interested in finding, within the map of British Columbia, the strong spatial association relationships between large towns and other nearby objects, including mines, country boundaries, water, and major highways. The query could be written as follows:

    discover spatial association rules
    inside BC
    from road R, water W, mines M, boundary B
    in relevance to town T
    where g_close_to(T.geo, X.geo) and X in {R, W, M, B}
      and T.type = large and R.type = divided_highway
      and W.type in {sea, ocean, large_lake, large_river}
      and B.admin_region_1 in BC and B.admin_region_2 in USA

In this query, the relation variable X stands for any one of the four variables {R, W, M, B}; the predicate close_to(A, B) says that spatial object A is close to spatial object B; and g_close_to is a predefined generalized predicate that covers a set of spatial predicates: intersect, adjacent_to, contains, and close_to. Moreover, close_to is a condition-dependent predicate defined by a set of knowledge rules. For example, if X is a town and Y is a country, then X is close to Y if their distance is within 80 miles, whereas close_to between a town and a road is defined by a smaller distance, such as 5 miles.

To facilitate mining multiple-level association rules and efficient processing, concept hierarchies are provided for both data and spatial predicates. The hierarchies for data are defined as follows:

(town (large_town (big_city, medium_city), small_town (...), ...))
(water (sea (strait (georgia_strait, ...), inlet (...), ...), river (large_river (fraser_river, ...), ...), lake (large_lake (okanagan_lake, ...), ...), ...))

(road (national_highway (routel, ...), provincial_highway (highway_3, ...), city_drive (hasting St., Kingsway, ...), city_street (e_ st, Ave, ...), ...))

Spatial predicates (topological relations) should also be arranged into a hierarchy, so that approximate spatial relations can be computed at a coarse resolution and a high concept level, and the computation refined once it is confined to a more focused set of candidate objects. An example of such a hierarchy follows:

[Figure: hierarchy of spatial predicates rooted at g_close_to, with nodes not_disjoint, close_to, intersects, inside, contains, equal, adjacent_to, covered_by, and covers.]

b. A new algorithm for mining spatial association rules

We examine how the data mining query posed in Example 1 is processed, which illustrates intuitively the method for mining spatial association rules.

First, the set of relevant data is retrieved by executing the data retrieval part of the data mining query, which extracts the following data sets whose spatial portion is inside BC: (1) towns: only large towns; (2) roads: only divided highways; (3) water: only seas, oceans, large lakes, and large rivers; (4) mines: any mines; (5) boundary: only the boundary between BC and the USA.

Second, the generalized close_to (g_close_to) relationship between large towns and the other four classes of entities is computed at a relatively coarse resolution level using a less expensive spatial algorithm, such as a plane-sweep algorithm over MBRs (minimum bounding rectangles), R-trees, or other approximation methods. The derived spatial predicates are collected in a g_close_to table (Table 1), which follows an extended relational model: each slot of the table may contain a set of entries. The support of each entry is then computed, and those whose support is below the minimum support threshold, such as the column mine, are removed from the table. From the computed g_close_to relation, interesting large item sets can be discovered at different concept levels, and the spatial association rules can be presented accordingly.

    Town            Water                 Road                  Boundary   Mine
    Victoria        Juan_de_Fuca_Strait   highway_, highway_7   US
    Saanich         Juan_de_Fuca_Strait   highway_, highway_7   US
    Prince_George                         highway_97
    Penticton       Okanagan_Lake         highway_97            US         Alalla

Table 1: The computed g_close_to relation
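The coarse step described above can be sketched as a filter-and-refine pass in Python. This is a minimal sketch under stated assumptions, not the implementation in [6]: geometries are represented only by their vertex lists, the refinement uses vertex-to-vertex distance as a stand-in for an exact geometric test, every pair is compared by brute force instead of using an R-tree or plane sweep, and all names, coordinates, and the threshold are hypothetical.

```python
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(points: List[Point]) -> Rect:
    """Minimum bounding rectangle of a vertex list."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_distance(a: Rect, b: Rect) -> float:
    """Distance between two rectangles (0 if they touch or overlap).
    It never exceeds the true distance between the enclosed geometries,
    so filtering on it cannot discard a pair that is really close."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return hypot(dx, dy)


def vertex_distance(p: List[Point], q: List[Point]) -> float:
    """Assumed stand-in for a refined test: smallest vertex-to-vertex distance."""
    return min(hypot(x1 - x2, y1 - y2) for x1, y1 in p for x2, y2 in q)


def coarse_g_close_to(towns, others, threshold):
    """Coarse step: keep pairs whose MBRs are within the threshold."""
    return [(t, o) for t, tg in towns.items() for o, og in others.items()
            if mbr_distance(mbr(tg), mbr(og)) <= threshold]


def refine(pairs, towns, others, threshold):
    """Refined step: run the more expensive test on the candidates only."""
    return [(t, o) for t, o in pairs
            if vertex_distance(towns[t], others[o]) <= threshold]


# Hypothetical geometries, invented for illustration.
towns: Dict[str, List[Point]] = {
    "town_a": [(0, 0), (1, 1)],
    "town_b": [(50, 50), (51, 51)],
}
others: Dict[str, List[Point]] = {
    "lake_x": [(3, 0), (3, 4), (6, 4)],
    "road_y": [(0, 10), (10, 0)],  # diagonal line: large MBR, far vertices
}

candidates = coarse_g_close_to(towns, others, threshold=5.0)
refined = refine(candidates, towns, others, threshold=5.0)
print(candidates)
print(refined)
```

In this toy data the diagonal road passes the MBR filter because its rectangle overlaps that of town_a, but it is rejected by the refined test, which is the kind of false candidate the second step is meant to remove.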

Since many people may not be satisfied with approximate spatial relationships such as g_close_to, more detailed spatial computation needs to be performed to find the refined, precise spatial relationships in the spatial predicate hierarchy. This refined computation is performed only on the large predicate sets, i.e., those retained in the g_close_to table. Each g_close_to predicate is replaced by one or a set of concrete predicates such as intersect, adjacent_to, close_to, or inside. This process results in Table 2.

    Town            Water                              Road                                            Boundary
    Victoria        <adjacent_to, J.Fuca_Strait>       <intersects, highway_>, <intersects, highway_7>   <close_to, US>
    Saanich         <adjacent_to, J.Fuca_Strait>       <intersects, highway_>, <intersects, highway_7>   <close_to, US>
    Prince_George                                      <intersects, highway_97>
    Penticton       <adjacent_to, Okanagan_Lake>       <intersects, highway_97>                          <close_to, US>

Table 2: Detailed spatial relationships for large sets

Table 2 forms the basis for computing detailed spatial relationships at multiple concept levels. Based on it, the large predicates and the corresponding association rules are computed level by level. The computation starts at the top-most concept level and computes the large predicates at that level. For example, for each row of Table 2, i.e., for each large town, if the water attribute is nonempty, the count of water is incremented by one. Accumulating the counts in this way forms the 1-predicate rows (k = 1) of Table 3, where the support count is registered. If the support count of a row is smaller than the minimum support threshold, the row is removed from the table. Suppose the minimum support is set to 50% at level 1; then, for the 40 large towns, a row whose count is less than 20 is removed. Similarly, the 2-predicate rows (k = 2) are formed by pairwise combination of the large 1-predicates, with their support counts accumulated by checking against Table 2, and the rows whose count is smaller than the minimum support are removed. The same procedure applies to the computation of 3-predicates. Finally, the computation of large k-predicates results in Table 3.

    k   large k-predicate set                                                      count
    1   <adjacent_to, water>
    1   <intersects, highway>
    1   <close_to, highway>
    1   <close_to, us_boundary>
    2   <adjacent_to, water>, <intersects, highway>
    2   <adjacent_to, water>, <close_to, us_boundary>
    2   <close_to, us_boundary>, <intersects, highway>
    3   <adjacent_to, water>, <close_to, us_boundary>, <intersects, highway>

Table 3: Large k-predicate sets at the top concept level (for 40 large towns in BC)

Third, spatial association rules can be extracted directly from Table 3. For example, since <intersects, highway> has a support count of 29, and the pair <adjacent_to, water>, <intersects, highway> has a count of 25, and 25/29 ≈ 86%, we get the following association rule:

is_a(x, large_town) ∧ intersects(x, highway) → adjacent_to(x, water). (86%)

Since we are dealing only with large towns, is_a(x, large_town) is added to the antecedent of the rule. If the minimum confidence threshold were set at 90%, this rule would be removed from the list of association rules to be generated.

Finally, after rules have been mined at the highest level of the concept hierarchy, large k-predicates can be computed in the same way at the lower concept levels, giving Table 4 and Table 5. Spatial association rules for detail levels 2 and 3 can then be derived directly from these tables.

    k   large k-predicate set                                                              count
    1   <adjacent_to, sea>
    1   <adjacent_to, large_river>
    1   <close_to, us_boundary>
    1   <intersects, provincial highway>
    1   <close_to, provincial highway>
    2   <adjacent_to, sea>, <close_to, us_boundary>
    2   <close_to, us_boundary>, <intersects, provincial highway>
    2   <adjacent_to, sea>, <close_to, provincial highway>
    2   <close_to, us_boundary>, <close_to, provincial highway>
    3   <adjacent_to, sea>, <close_to, us_boundary>, <close_to, provincial highway>

Table 4: Large k-predicate sets at the second level (for 40 large towns in BC)

    k   large k-predicate set                                     count
    1   <adjacent_to, Georgia strait>
    1   <adjacent_to, fraser_river>
    1   <close_to, us_boundary>
    2   <adjacent_to, Georgia strait>, <close_to, us_boundary>

Table 5: Large k-predicate sets at the third level (for 40 large towns in BC)
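The counting and rule-extraction steps behind these tables can be sketched in Python for a single concept level. This is a minimal Apriori-style sketch, not the implementation in [6]; it does not perform the progressive deepening across levels. The 40 rows are hypothetical: their distribution is invented so that <intersects, highway> is satisfied by 29 towns and the pair with <adjacent_to, water> by 25, giving a confidence of 25/29, roughly the 86% used in the example above. All function names are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Pred = Tuple[str, str]  # (spatial relation, object class)


def large_predicate_sets(rows: List[Set[Pred]],
                         min_count: int) -> Dict[FrozenSet[Pred], int]:
    """Level-wise search for predicate sets satisfied by >= min_count rows."""
    large: Dict[FrozenSet[Pred], int] = {}
    items = sorted({p for row in rows for p in row})
    candidates = [frozenset([p]) for p in items]  # all 1-predicates
    k = 1
    while candidates:
        counts = {c: sum(1 for row in rows if c <= row) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_count}
        large.update(current)
        k += 1
        # Join: unions of two large (k-1)-sets that contain exactly k predicates.
        joined = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be large.
        candidates = [c for c in joined
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))]
    return large


def strong_rules(large: Dict[FrozenSet[Pred], int], min_conf: float):
    """Yield (antecedent, consequent, confidence) for rules meeting min_conf."""
    out = []
    for itemset, n in large.items():
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                a = frozenset(ante)
                conf = n / large[a]  # subsets of a large set are also large
                if conf >= min_conf:
                    out.append((a, itemset - a, conf))
    return out


W: Pred = ("adjacent_to", "water")
H: Pred = ("intersects", "highway")

# Hypothetical 40 large towns at the top concept level.
rows: List[Set[Pred]] = ([{W, H}] * 25 + [{H}] * 4 + [{W}] * 7
                         + [set() for _ in range(4)])

large = large_predicate_sets(rows, min_count=20)  # 50% of 40 towns
rules = strong_rules(large, min_conf=0.8)
for ante, cons, conf in rules:
    print(sorted(ante), "→", sorted(cons), f"({conf:.0%})")
```

With a minimum confidence of 80%, only the rule from <intersects, highway> to <adjacent_to, water> survives in this toy data; raising the threshold to 90% would remove it, as the text notes.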

For example, the following two rules can be derived from these tables:

is_a(x, large_town) → adjacent_to(x, sea). (52.5%: 21/40 towns), level 2
is_a(x, large_town) ∧ adjacent_to(x, georgia_strait) → close_to(x, US). (78%), level 3

Notice that only the descendants of the large 1-predicates are examined at a lower concept level, and that the mining process stops at the lowest level of the hierarchies or when an empty large 1-predicate set is derived.

The rule mining process above can be summarized in the following algorithm.

Algorithm: mining the spatial association rules defined by Definition 1 in a large spatial database.

Input: a spatial database, a mining query, and a set of thresholds:

1. The database consists of three parts: a spatial database SDB containing a set of spatial objects; a relational database RDB describing nonspatial properties of the spatial objects; and a set of concept hierarchies.
2. The query consists of three parts: a reference class S; a set of task-relevant classes for spatial objects C1, ..., Cn; and a set of task-relevant spatial relations.
3. Two thresholds: minimum support and minimum confidence for each level l of description.

Output: strong multiple-level spatial association rules for the relevant sets of objects and relations.

Method: mining spatial association rules proceeds as follows:

    Step 1: Task_relevant_DB := extract_task_objects(SDB, RDB);
    Step 2: Coarse_predicate_DB := coarse_spatial_computation(Task_relevant_DB);
    Step 3: Large_coarse_predicate_DB := filtering_with_minimum_support(Coarse_predicate_DB);
    Step 4: Fine_predicate_DB := refined_spatial_computation(Large_coarse_predicate_DB);
    Step 5: Find_large_predicates_and_mine_rules(Fine_predicate_DB).

Pseudo code: here LL[i] is the large predicate set table at level i, and L[i, k] is the large k-predicate set table at level i. The syntax is similar to Pascal.
    procedure find_large_predicates_and_mine_rules(DB);
    for (i := 1; L[i, 1] ≠ ∅ and i < max_level; i++) do begin
        L[i, 1] := get_large_1_predicate_sets(DB, i);
        for (k := 2; L[i, k-1] ≠ ∅; k++) do begin
            P_k := get_candidate_set(L[i, k-1]);
            foreach object s in S do begin
                P_s := get_subsets(P_k, s);   {candidates satisfied by s}
                foreach candidate p ∈ P_s do p.support++;
            end;
            L[i, k] := {p ∈ P_k | p.support ≥ minsup[i]};
        end;
        LL[i] := ∪_k L[i, k];
        output := generate_association_rules(LL[i]);
    end
    end

c. A discussion of the algorithm

First, we discuss the correctness of the method, as is normally done when evaluating algorithms. The method discovers the correct and complete set of association rules through the following steps. At the beginning, query processing extracts all data that are relevant to the spatial data mining task; this relies on the completeness and correctness of query processing. The method then applies a coarse spatial computation to the whole set of relevant data, which preserves completeness and correctness. After that, it filters out those 1-predicates whose support is smaller than the minimum support. It then applies a fine spatial computation that computes predicates from the set of derived coarse predicates, which again preserves completeness and correctness. Finally, the method finds the complete set of association rules at multiple concept levels, based on previous studies of mining multiple-level association rules. Each step therefore preserves the correct and complete set of association rules above the minimum support threshold.

Second, a theorem is presented on the time complexity of the method. Let the average costs of computing each spatial predicate at the coarse and fine resolution levels be Cc and Cf, respectively. The worst-case time complexity of steps 1-5 is O(Cc × Nc + Cf × Nf + Cnonspatial), where Nc is the number of predicates to be coarsely computed in the relevant spatial data sets, Nf is the number of predicates to be finely computed from the coarse predicate database, and Cnonspatial is the total cost of rule mining in a predicate database, which is not discussed in this paper.

Third, the spatial data mining algorithm developed above has the following major strengths for mining spatial association rules, as stated in [6]:

1). Focused data mining guided by the user's query
The data mining process is directed by a user's query that specifies the relevant objects and the spatial association relationships to be explored. This not only confines the mining process to a relatively small set of data and rules, for efficient processing, but also leads to desirable results.

2). User-controlled interactive mining
Users may control, usually via a graphical user interface, the minimum support and confidence thresholds at each abstraction level interactively, based on the mining results returned so far.

3). Approximate spatial computation: substantial reduction of the candidate set
Less costly but approximate spatial computation is performed first, at a high abstraction level and on a relatively large set of data, which substantially reduces the set of candidate data to be examined later.

4). Detailed spatial computation: performed once and used for knowledge mining at multiple levels
The computation of support counts at each level can be performed by scanning through the same computed spatial predicate table.

5). Optimization of the computation of k-predicate sets and of multiple-level mining
These two optimization techniques are shared with the techniques for mining other, nonspatial multiple-level association rules. First, the method uses the (k-1)-predicate sets to derive the candidate k-predicate sets at each level, which is similar to the Apriori algorithm. Second, it starts at the top-most concept level and applies a progressive deepening technique, examining at a lower level only the descendants of the large 1-predicates.

Furthermore, many variations and extensions of the method can be explored to enhance the power and performance of spatial association rule mining:

1). Integration with nonspatial attributes and predicates
The relevant set of predicates consists mainly of spatial ones, such as close_to and inside. The process can be integrated with the generalization and association of nonspatial data.

2). Mining spatial association rules in multiple thematic maps
In principle, the method developed here can be applied to spatial databases with multiple thematic maps. The rule mining process will be similar to the one presented above, since the judgment of g_close_to(X, Y) or intersect(X, Y) can be performed by an approximate or detailed map overlay. The mining algorithm itself remains intact.

3). Multiple and dynamic concept hierarchies
The method can also deal with cases in which multiple concept hierarchies exist, or in which the concept hierarchies need to be adjusted dynamically based on data distributions. For example, towns can be classified as large or small according to an existing hierarchy, as coastal or inland according to their distance to the ocean, or as southwest or southeast according to their geographic area.
Different characteristics will be discovered based on different hierarchies or their adjustments, which is similar to executing the same algorithm on different knowledge bases.

4. Conclusions

The algorithm presented in this paper provides efficient mining procedures for spatial association rules by exploring techniques at multiple approximation and abstraction levels. It first performs less costly, approximate spatial computation to obtain approximate spatial relationships at a high abstraction level, and then refines the spatial computation only for those data or predicates whose refined computation may contribute to the discovery of strong association rules. This two-step spatial mining algorithm facilitates mining strong spatial association rules at multiple concept levels by a top-down, progressive deepening technique.

The method assumes that the user has reasonably good knowledge of the kind of rules to be found in the database, and that good background knowledge, such as concept or operation hierarchies, exists for spatial or non-spatial generalization. These assumptions may rule out naive users and complex spatial databases with poorly understood structures or knowledge, an issue that needs more study in the future.

References:

[1] Tom Barclay, Jim Gray, and Don Slutz, "Microsoft TerraServer: A Spatial Data Warehouse," Proceedings of the 2000 ACM SIGMOD on Management of Data.
[2] Peter Baumann, "Web-enabled Raster GIS Services for Large Image and Map Databases," Proceedings of the ACM DEXA00.
[3] Wendolin Bosques, Ricardo Rodriguez, Angelica Rondon, and Ramon Vasquez, "A Spatial Data Retrieval and Image Processing Expert System for the World Wide Web," 21st International Conference on Computers and Industrial Engineering, 1997.
[4] Volker Coors and Volker Jung, "Using VRML as an Interface to the 3D Data Warehouse," Proceedings of the Third Symposium on Virtual Reality Modeling Language, 1998, Page - 9.
[5] Martin Ester, Hans-Peter Kriegel, and Jörg Sander, "Knowledge Discovery in Spatial Databases," invited paper at the 23rd German Conf. on Artificial Intelligence (KI 99), Bonn, Germany.
[6] Jiawei Han and Krzysztof Koperski, "Discovery of Spatial Association Rules in Geographic Information Databases," Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, 1998.
[7] Shashi Shekhar, Sanjay Chawla, Siva Ravada, Andrew Fetterer, Xuan Liu, and Chang-tien Lu, "Spatial Databases: Accomplishments and Research Needs," IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 1, 1999.
[8] N. Widmann and P. Baumann, "Towards Comprehensive Database Support for Geoscientific Raster Data," Proceedings of ACM-GIS '97, Las Vegas, USA, November 1997.


Weka-GDPM Integrating Classical Data Mining Toolkit to Geographic Information Systems Weka-GDPM Integrating Classical Data Mining Toolkit to Geographic Information Systems Vania Bogorny, Andrey Tietbohl Palma, Paulo Martins Engel, Luis Otavio Alvares Instituto de Informática Universidade

More information

SECONDARY STORAGE TERRAIN VISUALIZATION IN A CLIENT-SERVER ENVIRONMENT: A SURVEY

SECONDARY STORAGE TERRAIN VISUALIZATION IN A CLIENT-SERVER ENVIRONMENT: A SURVEY SECONDARY STORAGE TERRAIN VISUALIZATION IN A CLIENT-SERVER ENVIRONMENT: A SURVEY Kai Xu and Xiaofang Zhou School of Information Technology and Electrical Engineering The University of Queensland, Brisbane,

More information

A Study of Web Log Analysis Using Clustering Techniques

A Study of Web Log Analysis Using Clustering Techniques A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Integrating Pattern Mining in Relational Databases

Integrating Pattern Mining in Relational Databases Integrating Pattern Mining in Relational Databases Toon Calders, Bart Goethals, and Adriana Prado University of Antwerp, Belgium {toon.calders, bart.goethals, adriana.prado}@ua.ac.be Abstract. Almost a

More information

Mapping Linear Networks Based on Cellular Phone Tracking

Mapping Linear Networks Based on Cellular Phone Tracking Ronen RYBOWSKI, Aaron BELLER and Yerach DOYTSHER, Israel Key words: Cellular Phones, Cellular Network, Linear Networks, Mapping. ABSTRACT The paper investigates the ability of accurately mapping linear

More information

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?

More information