Spatial Data Mining Methods and Problems




Abstract

This paper summarizes the characteristics and principal methods of spatial data mining and their application in GIS, points out the limitations of current spatial data mining, analyzes its open problems, and explores its development trends.

Introduction

In recent years, the rapid development of earth observation, database, network, and other spatial information technologies has produced large volumes of spatial data, collected from remote sensing, GIS, GPS, multimedia systems, medical and satellite imagery, and other applications. The complexity and volume of these data far exceed the analytical capacity of the human brain. Although spatial databases can store spatial objects and represent them through spatial data types and spatial relationships, users cannot examine all of the data in detail to extract the knowledge that interests them. Data mining is an effective tool here, and spatial data mining offers an opportunity to solve this problem.

I. Spatial Data Mining Overview

1 Definition of spatial data mining

Spatial data mining, also known as data mining and knowledge discovery in spatial databases, is a new branch of data mining. It refers to the extraction from spatial databases of spatial patterns and features of interest to the user, of general relationships between spatial and non-spatial data, and of other general data characteristics implicit in the database.

2 Features of spatial data mining

Spatial data mining is an inevitable result of the development of spatial information technology. It is a particular area of data mining and differs from general transactional or relational data mining. It has the following characteristics:
(1) Data sources are rich, data volumes are huge, information is fuzzy, and data types and access methods are complex;
(2) Spatial indexing mechanisms are used to organize the data;
(3) The range of applications is wide: any data related to spatial location can be mined;
(4) Mining methods and algorithms are numerous, and most of the algorithms are complex;
(5) Knowledge is expressed in diverse ways, and its understanding and evaluation depend on a person's perception of the objective world;
(6) Spatial data are multi-scale, high-dimensional, and highly autocorrelated.

II. Main Methods of Spatial Data Mining

Spatial data mining is a new, multidisciplinary field that integrates techniques from artificial intelligence, machine learning, databases, pattern recognition, statistics, GIS, knowledge-based systems, visualization, and related areas. The methods in common use today are the following.

1 Spatial analysis methods: These apply the various GIS spatial analysis models and spatial operations to the data in a spatial database to produce new information and knowledge. Commonly used spatial analysis methods include integrated attribute data analysis, topology analysis, buffer analysis, density analysis, distance analysis, overlay analysis, network analysis, terrain analysis, trend surface analysis, and predictive analysis. They can discover association rules among objects that are connected, adjacent, or co-occurring in space, or find knowledge such as the shortest or optimal path between objects to support decisions. Spatial analysis is often used as a preprocessing and feature extraction step in conjunction with other data mining methods.
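As one illustration of the network analysis mentioned above, the following minimal sketch finds a shortest path with Dijkstra's algorithm. The road graph, the node names, and the function name `shortest_path` are hypothetical and chosen only for the example; a real GIS would run this on its own network data model.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative edge lengths."""
    dist = {source: 0.0}          # best known distance to each reached node
    prev = {}                     # predecessor of each node on its best path
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue              # stale heap entry, already settled
        visited.add(node)
        if node == target:
            break
        for neighbor, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]

# Hypothetical road network: edge weights are segment lengths in km.
roads = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"A": 4.0, "C": 1.0, "D": 5.0},
    "C": {"A": 2.0, "B": 1.0, "D": 8.0},
    "D": {"B": 5.0, "C": 8.0},
}
path, length = shortest_path(roads, "A", "D")
print(path, length)
```

The result can then serve as a derived feature (for example, travel distance to the nearest facility) for other mining methods.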

2 Statistical analysis methods: Statistical methods have long been used to analyze spatial data, with the analysis focused on the non-spatial characteristics of spatial objects and phenomena. They rest on a strong theoretical foundation and offer a large body of mature algorithms, including many optimization techniques. When statistical methods are used for mining, the spatial nature of the data is generally not treated as a constraint: the specific location of the things the data describe plays no limiting role. Although this mode of mining does not differ in essence from general data mining, its results are presented as maps, and interpreting them necessarily depends on geographic space, so the explanation must reflect spatial regularities. Statistical methods have difficulty handling character (non-numeric) data and generally require statistical experts with rich experience. Their biggest drawback is the assumption that spatially distributed data are statistically independent, which causes problems in practice because much spatial data is interrelated. Geostatistics, represented by the variogram and Kriging, is currently among the more popular statistical analysis methods.
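To make the variogram concrete, here is a minimal sketch of an empirical semivariogram, assuming the common definition gamma(h) = sum((z_i - z_j)^2) / (2 N(h)) over the N(h) point pairs whose separation falls in distance bin h. The sample coordinates, values, and the function name are hypothetical; a rising gamma with distance indicates that nearby observations are more alike than distant ones, which is exactly the spatial dependence that the independence assumption ignores.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, bin_width):
    """Return {bin_index: gamma} where bin b covers separations in
    [b * bin_width, (b + 1) * bin_width)."""
    sums = defaultdict(float)     # sum of squared value differences per bin
    counts = defaultdict(int)     # number of point pairs per bin
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])   # Euclidean separation
            b = int(h // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}

# Hypothetical samples along a transect (coordinates and measured values).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(points, values, bin_width=1.0)
for b, g in gamma.items():
    print(f"separation bin {b}: gamma = {g:.3f}")
```

Kriging would go on to fit a model curve to these binned values; that step is outside this sketch.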

3 Neural networks: A neural network is an adaptive nonlinear dynamic system formed by a large number of richly interconnected neurons. It offers distributed storage, associative memory, massively parallel processing, self-learning, self-organization, and self-adaptation. A neural network consists of an input layer, an intermediate (hidden) layer, and an output layer. Through training, the neurons collectively learn the patterns in the data under analysis and form a nonlinear function describing a complex nonlinear system, which suits mining classification knowledge from nonlinear spatial systems with complex backgrounds and inference rules that are fuzzy or not explicit. In spatial data mining, neural networks can be used for classification, clustering, and feature mining. The networks currently used fall into three categories: feedforward networks for prediction and pattern recognition, such as the back-propagation model, function networks, and fuzzy neural networks; feedback networks for associative memory and optimization, such as the discrete and continuous Hopfield models; and self-organizing networks for clustering, such as the ART model and the Kohonen model. Neural networks must be analyzed case by case, and their convergence, stability, local minima, and parameter tuning need more in-depth research, especially for cases with many input variables, high system complexity, and strong nonlinearity.
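The three-layer structure can be sketched as a forward pass through a tiny feedforward network. The weights below are hand-picked assumptions that make the network compute XOR, a classic pattern no single linear unit can separate; in practice they would be learned by training, for example with back-propagation, which this sketch does not implement.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Input layer -> one hidden layer -> single output neuron."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# Hand-set (not trained) parameters, chosen only for illustration:
# hidden unit 1 acts like OR, hidden unit 2 like NAND, the output like AND.
HIDDEN_WEIGHTS = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_BIASES = [-10.0, 30.0]
OUTPUT_WEIGHTS = [20.0, 20.0]
OUTPUT_BIAS = -30.0

def classify(x1, x2):
    """Round the network output to a 0/1 class label."""
    y = forward([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
    return round(y)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, classify(a, b))
```

The hidden layer is what supplies the nonlinearity: remove it and the same inputs can no longer be separated.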

4 Data visualization methods: Visualization technology serves several purposes, including visualizing the analytical thinking process, prompting insight through visual analysis, and refining concepts, and it stands as a distinct research method. Data visualization represents large amounts of data in various graphical forms to help people find structure, features, patterns, trends, anomalies, and relationships in the data. Data visualization is not just a computational method. More importantly, it is a cognitive tool that greatly enhances people's capacity to process data: it allows the massive amounts of data being generated to be used effectively, conveys information between people and data and between people, lets people observe information hidden in the data, and provides a powerful tool for discovering and understanding scientific laws. It also supports guiding and steering computation and programs, since conditions can be changed through interactive tools and the effects observed.

5 Rough set theory: Rough set theory is an intelligent data analysis and decision-making tool proposed in 1982 by Professor Z. Pawlak of the University of Warsaw. It has been widely studied and applied to the analysis and classification of imprecise, uncertain, and incomplete information and to knowledge acquisition. In spatial data mining, rough set theory is used to analyze the importance of attributes, to analyze attribute dependencies, to build minimal decision tables, and to generate classification algorithms. Combined with other knowledge discovery methods, it can extract more knowledge from spatial databases whose data are uncertain. Rough set theory is currently a hot topic in spatial data mining research.
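The core rough set construction can be sketched in a few lines: objects that share the same condition attribute values are indiscernible, and a target set is bracketed by a lower approximation (classes entirely inside it) and an upper approximation (classes that overlap it). The land parcels, attribute names, and the `landslide` target set below are hypothetical data made up for the example.

```python
from collections import defaultdict

def approximations(objects, condition_attrs, target):
    """objects: {name: {attribute: value}}; target: set of object names.
    Returns (lower, upper) approximations of target."""
    # Group objects into indiscernibility classes by their attribute values.
    classes = defaultdict(set)
    for name, attrs in objects.items():
        key = tuple(attrs[a] for a in condition_attrs)
        classes[key].add(name)
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:      # class lies wholly inside the target
            lower |= members
        if members & target:       # class overlaps the target
            upper |= members
    return lower, upper

# Hypothetical parcels described by two condition attributes.
parcels = {
    "p1": {"slope": "steep", "cover": "forest"},
    "p2": {"slope": "steep", "cover": "forest"},
    "p3": {"slope": "flat", "cover": "forest"},
    "p4": {"slope": "flat", "cover": "crop"},
    "p5": {"slope": "steep", "cover": "crop"},
}
landslide = {"p1", "p5"}           # parcels where a landslide was recorded

lower, upper = approximations(parcels, ["slope", "cover"], landslide)
print("certainly in the set:", sorted(lower))
print("possibly in the set: ", sorted(upper))
print("boundary region:     ", sorted(upper - lower))
```

Here p1 and p2 are indistinguishable on the chosen attributes yet only p1 is in the target set, so both fall in the boundary region; that boundary is how the theory expresses uncertainty.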

In addition to the methods described above, spatial data mining methods include spatial characterization and trend detection, cloud theory, image analysis and pattern recognition, evidence theory, the geo-informatic Tupu method, fuzzy set theory, and others.

III. Spatial Data Mining Architecture and Process

1 Architecture of spatial data mining

A widely used general architecture is the multi-component spatial data mining architecture of Matheus, shown in the figure:

Mining is carried out mainly by four modules: the SDB interface, focus, pattern extraction, and evaluation. Here SDB (Spatial Database) is the spatial database, SDBMS (Spatial Database Management System) is the spatial database management system, and KDB (Knowledge Database) is the knowledge base. The SDB interface uses spatial index structures (such as R-trees or R*-trees) and query optimization to retrieve data from the data source; the focus module determines which objects and attributes to extract; the pattern extraction module works on the output of the focus module, using machine learning, neural networks, decision trees, and other methods to find patterns, or "knowledge"; and the evaluation module assesses the mined "knowledge", removing information that is redundant or already known. The four modules do not operate in a single direction only; they interact through the controller. Spatial data mining under this architecture is therefore a process of continuous feedback and adjustment. Finally, the mining results are presented to the user.

2 Spatial data mining process

Spatial data mining is an essential step in the spatial knowledge discovery process, because it reveals hidden, previously unknown patterns. The process consists of the following steps:
(1) Data cleaning: fill in missing values, smooth noisy data, identify and remove outliers, and resolve inconsistent data;
(2) Data integration: combine multiple data sources;
(3) Data selection: retrieve from the database the data relevant to the task;
(4) Data transformation: transform the data, through summary or aggregation operations, into a form suitable for mining;
(5) Data mining: use intelligent methods to extract data patterns. First determine the goal of the mining and the type of knowledge sought, then select a suitable mining algorithm for that type of knowledge, and finally apply the selected algorithm to obtain the required knowledge from the database;
(6) Pattern evaluation: use interestingness measures to identify the patterns that represent genuinely interesting knowledge;
(7) Knowledge presentation: use visualization and knowledge representation techniques to present the mined knowledge to the user.

By cycling through this process repeatedly, the mined knowledge can be continuously refined and deepened.
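The preparatory steps of this process can be sketched as small composable functions. The example below covers only cleaning, selection, and transformation (steps 1, 3, and 4). The station records, the field names, the median-based fill, and the fixed deviation threshold are all illustrative assumptions, not a prescribed method.

```python
from collections import defaultdict
from statistics import mean, median

def clean(records, field, max_deviation):
    """Step 1: fill missing values with the median of the known values and
    drop records that deviate from that median by more than max_deviation."""
    known = [r[field] for r in records if r[field] is not None]
    center = median(known)
    cleaned = []
    for r in records:
        value = center if r[field] is None else r[field]
        if abs(value - center) <= max_deviation:
            cleaned.append({**r, field: value})
    return cleaned

def select(records, regions):
    """Step 3: keep only the records relevant to the task."""
    return [r for r in records if r["region"] in regions]

def transform(records, field):
    """Step 4: aggregate to one mean value per region."""
    groups = defaultdict(list)
    for r in records:
        groups[r["region"]].append(r[field])
    return {region: mean(vals) for region, vals in groups.items()}

# Hypothetical rainfall readings with one missing value and one outlier.
stations = [
    {"region": "north", "rain_mm": 10.0},
    {"region": "north", "rain_mm": None},
    {"region": "north", "rain_mm": 17.0},
    {"region": "south", "rain_mm": 12.0},
    {"region": "south", "rain_mm": 500.0},
    {"region": "east", "rain_mm": 11.0},
]
cleaned = clean(stations, "rain_mm", max_deviation=50.0)
summary = transform(select(cleaned, {"north", "south"}), "rain_mm")
print(summary)
```

Keeping each step as a separate function makes it easy to rerun the chain with different parameters, which matches the iterative character of the process.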

IV. Spatial Data Mining Applications in GIS

Combining spatial data mining with GIS has very broad application potential. There are three modes of combination. The first is loose coupling, also called the external spatial data mining mode: the GIS is treated essentially as a spatial database, mining is carried out outside the GIS environment with other software or programming languages, and the two are linked by data communication. The second is the embedded mode, also called the internal spatial data mining mode, in which spatial data mining techniques are integrated into the spatial analysis functions of the GIS. The third is the hybrid mode, which combines the first two: it uses the functions the GIS provides wherever possible, to minimize the workload and difficulty of user development, while retaining the flexibility of the external mode.

Spatial data mining can discover the following main types of knowledge in spatial databases: general geometric knowledge, spatial distribution rules, spatial association rules, spatial clustering rules, spatial characteristic rules, spatial discriminant rules, spatial evolution rules, and object-oriented knowledge. This knowledge is already applied with some maturity in exploration, the military, land management, electric power, telecommunications, oil and gas, urban planning, transportation, environmental monitoring and protection, 110 and 120 (police and ambulance) rapid response systems, and urban management. It has also received wide attention and application in market analysis, customer relationship management, banking, insurance, demographics, real estate development, personal location services, and other areas. In fact, it has reached into every aspect of people's work and life.
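As an illustration of one of these knowledge types, the sketch below scores a candidate spatial association rule of the form "feature of type A is close to some feature of type B" by support and confidence. The features, the radius, and the function names are hypothetical, and support is taken here over all features; other definitions (for example, relative to the reference objects only) are also in use.

```python
import math

def near_any(point, others, radius):
    """True if point lies within radius of at least one of the other points."""
    return any(math.dist(point, o) <= radius for o in others)

def rule_metrics(features, antecedent_type, consequent_type, radius):
    """features: list of (type, (x, y)).
    Scores the rule: is_a(x, antecedent_type) -> close_to(x, consequent_type)."""
    targets = [xy for t, xy in features if t == consequent_type]
    matching = [xy for t, xy in features if t == antecedent_type]
    both = [xy for xy in matching if near_any(xy, targets, radius)]
    support = len(both) / len(features)
    confidence = len(both) / len(matching) if matching else 0.0
    return support, confidence

# Hypothetical map features with planar coordinates in km.
features = [
    ("school", (0.0, 0.0)),
    ("school", (5.0, 5.0)),
    ("school", (20.0, 20.0)),
    ("school", (9.0, 0.0)),
    ("park", (1.0, 0.0)),
    ("park", (5.0, 6.0)),
    ("shop", (30.0, 30.0)),
    ("shop", (8.0, 8.0)),
]
support, confidence = rule_metrics(features, "school", "park", radius=2.0)
print(f"school -> close_to(park): support={support:.2f}, confidence={confidence:.2f}")
```

A mining algorithm would enumerate many such candidate rules and keep those whose support and confidence exceed user-chosen thresholds.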

V. Current Problems in Spatial Data Mining

Spatial data mining has become an important research direction in databases and information-based decision-making. Despite some progress, it remains both attractive and challenging, and many issues still need study:

1 Most spatial data mining algorithms are carried over from general data mining algorithms and do not take into account how spatial data are stored and processed or the spatial characteristics of the data themselves. Unlike data in a relational database, spatial data are organized with complex, multi-dimensional spatial index structures and have their own access methods, so traditional data mining techniques often cannot analyze complex spatial phenomena and spatial objects well.

2 Spatial data mining algorithms are inefficient, and the patterns they discover are not refined. In massive database systems, the mining process involves much uncertainty, and both the number of possible erroneous patterns and the dimensionality of the problem are large, which enlarges the search space of the algorithm and raises the likelihood of blind search. Domain knowledge must therefore be used to remove data irrelevant to the discovery task, effectively reducing the dimensionality of the problem, and more efficient knowledge discovery algorithms must be designed.

3 There is no accepted standard query language for spatial data mining. One reason database technology developed rapidly is the continual improvement of database query languages. To improve and develop spatial data mining, a spatial data mining query language must likewise be developed, laying the foundation for efficient spatial data mining.

4 Spatial data mining knowledge discovery systems are weakly interactive. It is difficult to make full and effective use of domain expert knowledge during knowledge discovery, and users cannot control the mining process well.

5 Spatial data mining is insufficiently integrated with other systems, and the role of GIS in the spatial knowledge discovery process is neglected. A spatial data mining system that relies on a single method and function has a restricted range of application. Current knowledge discovery systems are confined to the database field; to discover knowledge across a wider range, a knowledge discovery system should integrate databases, knowledge bases, expert systems, decision support systems, visualization tools, networks, and many other technologies.

6 Spatial data mining methods and tasks are narrow: most are designed for a specific problem, so the knowledge they can discover is limited.

VI. Trends in Spatial Data Mining

Because spatial data are massive, nonlinear, multi-scale, and fuzzy, extracting knowledge from spatial databases is more difficult than extracting it from traditional relational databases, which poses challenges for spatial data mining research. Many theories and methods in spatial data mining need further study:

1 Spatial data mining algorithms and techniques. Spatial association rule mining algorithms, time series mining techniques, spatial co-location algorithms, spatial classification techniques, and spatial outlier mining algorithms are the focus of research, and improving the efficiency of spatial data mining algorithms is also very important.

2 Preprocessing of multi-source spatial data. Spatial data include DLG (digital line graph) data, image data, digital elevation models, and feature attribute data. Because of their inherent complexity and the difficulty of data collection, spatial data inevitably contain missing values, noise, and inconsistencies, so preprocessing multi-source spatial data is particularly important.

3 Other future research directions include spatial data mining in network environments, visual data mining, integrated raster-vector spatial data mining, automatic generation of background concept trees (for location, attribute, time, etc.), uncertainty-based spatial data mining, incremental data mining, multi-resolution and multi-level data mining, parallel data mining, mining of remote sensing image databases, knowledge discovery in multimedia spatial databases, and the integration of different spatial data mining methods and techniques.

It is foreseeable that spatial data mining will not only advance spatial science and computer science but also enhance human understanding of the world and the discovery of knowledge, helping people to better transform the world and serve human society.
