Data Mining in the Application of Criminal Cases Based on Decision Tree
|
|
|
- Kristian Floyd
- 10 years ago
- Views:
Transcription
1 8 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December 2013 Data Mining in the Application of Criminal Cases Based on Decision Tree Ruijuan Hu 1 Abstract A briefing on data mining technology used in criminal investigation work and the importance of using ID3 decision tree to structure the Decision Tree algorithm method is given by this paper. It makes a combination of criminal cases criminal suspects training data sets, using its decision tree analysis of the classification and uses Microsoft SQL Server 2005 Office 2007 Add to the data mining of Visio 2007 graphics. It shows and shares the form of mining model. Keywords: data mining; decision tree; ID3 algorithm; Visio Introduction With the rapid development of politics, economy and scientific technology, modern crime shows its high-speed, intelligent and high-tech features. Judging from the overall trend of crime, we can observe that the crime case number increases significantly. Since 1949, there have been five peaks of crime. Particularly in the fifth peak, which emerged after the reform and opening up policy coming out. What s more, every crime wave brought new kinds of sins(li Xian, 2009, p ). The rate of economic crimes, financial crimes and intelligent high-tech crimes rose dramatically the high rising speed surpassed a vicious, primitive, human instinct of crime, while the proportion of major cases also has increased. Under such circumstances, it is an urgent problem for the police department to figure out how to employ the data mining technology in criminal investigation work, to discover new rules to improve efficiency and rapid responding capability of law enforcement, and to prevent and strike criminal actions in time(luo Yi,2009.pp ). Data mining will serve as a tool for the leaders and police officers of basic unit to find the hidden information. 2. Data Mining Data Mining (DM) is the process of extracting useful information and knowledge from large quantities of structured and unstructured data, which is also an effective means of discovering knowledge( Han J,2002,Chapter 2). Data mining sometimes is also known as knowledge discovering or knowledge extracting. Data mining sometimes can be defined as the process of extracting implicit, previously unknown and useful information and knowledge from large quantities of incomplete, noisy, ambiguous, random data for practical application. This definition includes the following layers of meanings: the data source must be real in large quantity; the information that is discovered must arouse the users interests; the knowledge must be acceptable, understandable and applicable; the knowledge that is discovered is required only to cover specific business problem. In the process of data mining, the most important step is the data preprocessing, because there exist such bad data as noisy data, void data and inconsistent data due to the diversity and heterogeneity of historical data. 1 Dept. of Language Engineering, PLA University of Foreign Languages, 2 Guang Wen Road, Luoyang , China
2 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December The bad data must be cleaned up because they will affect the accuracy of data mining. And the secondary process is the correlation analysis since many properties of data may concern themselves with the data classification and mission prediction. For example, the height of the suspect may indicate the irrelevance with certain cases. Therefore, the relevance analysis can be conducted to delete irrelevant or redundant properties during the studying process. And at last, the process of data exchanging, which may generalize the data onto a higher level of concept. And the concept hierarchy can be used for this purpose. This step is very important for the property with continuous values. For example, the numeral value of the attribute of age may be generalized to the discrete intervals such as youth, middle-aged and old-aged. Similarly, the nominal values, such as domicile, can be generalized to high-level concepts, such as city ID3 decision tree algorithm The basic algorithm of decision tree is greedy algorithm which makes use of recursion from the top down to construct the decision tree. In decision tree learning, ID3 (Iterative Dichotomiser 3) is a well-known algorithm used to generate a decision tree invented by Ross Quinlan. The ID3 algorithm can be summarized as follows( Kantardzic M,2002): Algorithm: Generate_decision_tree. Generate a decision tree by given training data. Input: samples(training samples) which are indicated by discrete attributes. attribute_list which is the set of attributes. Output: a decision tree. Method: 1) Create a root node for the tree. 2) If all samples are in the same class C,then 3) Return N as a leaf node and mark by class C. 4) If attribute_list is null then 5) Return N as a leaf node and mark by the commonest class of samples.// majority vote 6) Set GainMax=max(Gain1,Gain2 Gainn), if(gainmax)<criticality threshold (e.g.0.001). 7) Return N as a leaf node and mark by the commonest class of samples. 8) Select test_ attribute having the most information gain of the attribute_list. 9) Make node N as test_attribute. 10) For each known value ai of test_attribute.//to divide samples 11) Grow a branch by node N which meet test_attribute=ai. 12) Set si a sample set of test_attribute=ai. 13) If si is null then 14) Add a leaf, mark by the commonest class of samples. 15) Else add a return node by Generate_decision_tree(si,attribute_list,test_attribute). End ror 16) Return N. Make use of the attribute of the most information gain as the test_attribute of current node in each node of the tree. The expression of information gain which is a measure of information in attributes is information entropy. The higher the information gain is, the larger the differences among the attributes are, and the stronger the capability of distinguishment is. The method of information gain is as follows (XianMin Wei, 2011, p.78-80). Set S be a set of s data samples. Suppose the attribute having m different values and define m different classes C i (i=1,2,,m).s i is the number of samples in C i. The needing expected information of a giving sample can be given by formula (1) (J. Han, 1992, pp ).
3 10 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December 2013 I S, S,, S = P log P (1) In formula (1), P i is probability that any sample belong to C i. P i = S i / S, and 2 is the bottom of the logarithm because we use binary system to code the information. The entropy of the subsets divided by A is as follows ( ZOU Yuan,2010). Set attribute A having V different values (A i,a2, av). Divide S into V subsets (S 1,S2, Sv) among which S j contains the samples of S while their values are aj of A. If A is test attribute, the subsets correspond to the branch growing by nodes of set S. Suppose Sij be the number of samples that are the subset sj in class Ci. The entropy or the expected information of the subsets divided by A is: E(A) = ((S + + S ) /S) I(S,, S ) (2) In formula (2),(S ij + +Smj)/S serves as the authority of the jth subset, and equals to the number of the samples in subsets dividing the total of samples in S. The smaller the entropy is, the higher the pureness of divided subsets is. I S,, S = P log (P ) (3) In formula (3), Pij = Sij/ Sj is probability that samples in Sj belong to Ci. Information gain: Gain(A) = I(S1,S2,,Sm) - E(A) (4) 2.2. ID3 algorithm in the work of criminal cases Here is an example of decision tree classification algorithm. A certain area criminal cases training data sets listed as table 1. From table 1, the attribute of the sample sets is criminal level which has two values as light and serious. Set C1 corresponding to light and C2 corresponding to serious. There are 9 samples in C1 while 6 samples in C2. To compute the information gain of each attribute, we should compute the needing expected information of the given sample classes. I(S1,S2) = I(9,6) = -(9/15)*log(9/15)-(6/15)*log(6/15) = Next, compute information entropy of each attribute. Let s begin at family background and observe the distributions of light and serious. Then compute the expected information in connection with each distribution. Family background = middle, S11=6, S21=0, I(S11,S21)=0 Family background = poor, S12=3, S22=6, I(S12,S22)=0.918 If samples are divided according to family background, the needing expected information of the given sample classes is: E(family background) = (0/15)*I(S11,S21) + (9/15)*I(S12,S22) = The information gain: Gain(family background) =I(S1,S2) - E(family background) =0.420 The similar compution: Gain(criminal experiences or not)=0.116 Gain(specialty or not)=0.146 Gain(foreigner or not)=0.249 In this way, Gain( family background ) has the largest value. It shows this attribution perform an important function in composing data to subclasses. We can create the first node family background and divide it to two parts, and so forth, we can get figure 1 as follows.
4 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December From figure1,we can see one rule created in each route from root to leaf. When the family background is good, the criminal level of the suspect is light; when the family background is poor and he is foreigner having no specialty, the criminal level of the suspect is light; when the family background is poor and he has specialty, the criminal level of the suspect is serious; when one has specialty and he is native, the criminal level of the suspect is serious too. 3. Demonstration of the mining Results Office 2007 Visio is data mining template which has the function of prediction and analysis. Using Microsoft SQL Server 2005 Office 2007 data mining external connection program and the draw form of Office Visio 2007, the results of DM based on decision tree can be appeared and shared as Figure 2 below (Zhu Deli. 2007, Chapter 11). 4. Conclusions Data mining technology is a new kind of science. Its effective application in the criminal investigation is the trend of times and the necessity of the public security work. In face of information sea, Data mining application in criminal cases can help to pick out the intelligence that reflects the security situation timely, comprehensively and accurately, and the intelligent staff will be freed from the massive statistic work, so they can devote more energy and time to focusing on information analysis, which will bring much more improvement to the judgment and efficiency. References Luo Yi,&Chen Lian.(2009). Research on crime case using rough set theory and data mining technology. Microcomputer Information, doi: CNKI: SUN: WJSJ Du Haizhou, &Ma Chong. (2009). Study on Constructing Generalized Decision Tree by Using DNA Coding Genetic Algorithm. Web Information Systems and Mining, 2009.WISM International Conference on. 31, doi: /WISM , XianMin Wei(2011). Research and application of conditional probability decision tree algorithm in data mining. Circuits,Communications and System (PACCS), 2010 Second Pacific-Asia Conference on. 2, doi: /PACCS Li Xian,& Chen Ye gang(2009). Computer Crime Data Mining Based on FP-array. Journal of University of Electronic Science and Technology of China.(38), Han J, Kamber M. (2002). Data mining: Concept and technologies. (Chapter 2). AGRAWAL R, IMIELINSKI T,&SWAMI A(1993). Mining association rules between sets of items in large databases. Proc of the ACM SIGMOD Conf on Management of Data(SIGMOD 93).New York:ACM Press, HAN J, PEI J, &YIN Y(2000).Mining frequent patternswithout candidate generation. Proc of the 2000 ACM SIGMOD Int l Conf on Managenent of Data(SIGMOD 2000).New York:ACM Press,1-12. J. Han, Y. Cai, &N. Cercone. (1992). Knowledge discovery in databases: An attribute oriented approach. In Proc. of the VLDB Conference, Vancouver, British Columbia, Canada, Kantardzic M. (2002). Data Mining Concept, Models, Methods and Algorithms. IEEE Press. S. Muggleton, &C. Feng. (1992). E-cient induction of logic programs. In S. Muggleton, editor, Inductive Logic Programming. Academic Press. LU S F,&LU Z D(2001) Fastmining maximum frequent itemsets. Journal of Software.12(2): ZHOU Yan, LIU Jie, &SUN Ke. (2011). Improved Decision Tree Algorithm Based on Decision Attribute Selection Strategy. Journal of Shenyang Normal University (Natural Science Edition), (01),60-64 ZOU Yuan. (2010).Data Mining Algorithm Based on Decision Tree Application and Research. Science Technology and Engineering, (18) Zhu Deli. (2007). SQL Server 2005 Data Mining complete solutions and business intelligence. Beijing: Electronic Industry Press, (Chapter 11).
5 12 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December 2013 Table 1: a certain area criminal cases training data sets Family Criminal Criminal No Specialty Foreigner background experience level R01 middle yes yes yes light R02 poor no yes yes serious R03 poor no yes yes serious R04 middle yes no yes light R05 poor no no no serious R06 poor no no yes light R07 middle no yes no light R08 poor yes yes no serious R09 middle yes no yes light R10 middle no no yes light R11 poor no yes no serious R12 poor yes no yes light R13 poor yes no yes light R14 poor no no no serious R15 middle no no yes light Figure 1: criminal cases based on Decision Tree family background middle light poor specialty yes no serious foreigner no yes serious light
6 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December Figure 2: the results of DM based on Decision Tree
Data Mining based on Rough Set and Decision Tree Optimization
Data Mining based on Rough Set and Decision Tree Optimization College of Information Engineering, North China University of Water Resources and Electric Power, China, [email protected] Abstract This paper
College information system research based on data mining
2009 International Conference on Machine Learning and Computing IPCSIT vol.3 (2011) (2011) IACSIT Press, Singapore College information system research based on data mining An-yi Lan 1, Jie Li 2 1 Hebei
Binary Coded Web Access Pattern Tree in Education Domain
Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: [email protected] M. Moorthi
How To Solve The Kd Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]
Method of Fault Detection in Cloud Computing Systems
, pp.205-212 http://dx.doi.org/10.14257/ijgdc.2014.7.3.21 Method of Fault Detection in Cloud Computing Systems Ying Jiang, Jie Huang, Jiaman Ding and Yingli Liu Yunnan Key Lab of Computer Technology Application,
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data
AnalysisofData MiningClassificationwithDecisiontreeTechnique
Global Journal of omputer Science and Technology Software & Data Engineering Volume 13 Issue 13 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
Random forest algorithm in big data environment
Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm
R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*
II. OLAP(ONLINE ANALYTICAL PROCESSING)
Association Rule Mining Method On OLAP Cube Jigna J. Jadav*, Mahesh Panchal** *( PG-CSE Student, Department of Computer Engineering, Kalol Institute of Technology & Research Centre, Gujarat, India) **
PREDICTING STUDENTS PERFORMANCE USING ID3 AND C4.5 CLASSIFICATION ALGORITHMS
PREDICTING STUDENTS PERFORMANCE USING ID3 AND C4.5 CLASSIFICATION ALGORITHMS Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha and Vipul Honrao ABSTRACT Department of Computer Engineering, Fr.
Optimization of C4.5 Decision Tree Algorithm for Data Mining Application
Optimization of C4.5 Decision Tree Algorithm for Data Mining Application Gaurav L. Agrawal 1, Prof. Hitesh Gupta 2 1 PG Student, Department of CSE, PCST, Bhopal, India 2 Head of Department CSE, PCST, Bhopal,
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 [email protected]
ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL
International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR
Classification and Prediction
Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser
Mining Association Rules: A Database Perspective
IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology
Static Data Mining Algorithm with Progressive Approach for Mining Knowledge
Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive
Data Preprocessing. Week 2
Data Preprocessing Week 2 Topics Data Types Data Repositories Data Preprocessing Present homework assignment #1 Team Homework Assignment #2 Read pp. 227 240, pp. 250 250, and pp. 259 263 the text book.
DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE
DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE SK MD OBAIDULLAH Department of Computer Science & Engineering, Aliah University, Saltlake, Sector-V, Kol-900091, West Bengal, India [email protected]
Selection of Optimal Discount of Retail Assortments with Data Mining Approach
Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.
A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model
A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model ABSTRACT Mrs. Arpana Bharani* Mrs. Mohini Rao** Consumer credit is one of the necessary processes but lending bears
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
Application of Data Mining Techniques in Intrusion Detection
Application of Data Mining Techniques in Intrusion Detection LI Min An Yang Institute of Technology [email protected] Abstract: The article introduced the importance of intrusion detection, as well as
International Journal of Advance Research in Computer Science and Management Studies
Volume 2, Issue 12, December 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE
www.arpapress.com/volumes/vol13issue3/ijrras_13_3_18.pdf NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE Hebah H. O. Nasereddin Middle East University, P.O. Box: 144378, Code 11814, Amman-Jordan
Business Lead Generation for Online Real Estate Services: A Case Study
Business Lead Generation for Online Real Estate Services: A Case Study Md. Abdur Rahman, Xinghui Zhao, Maria Gabriella Mosquera, Qigang Gao and Vlado Keselj Faculty Of Computer Science Dalhousie University
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
Blog Post Extraction Using Title Finding
Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS
ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS Abstract D.Lavanya * Department of Computer Science, Sri Padmavathi Mahila University Tirupati, Andhra Pradesh, 517501, India [email protected]
Data Mining Classification: Decision Trees
Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,
Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner
24 Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner Rekha S. Nyaykhor M. Tech, Dept. Of CSE, Priyadarshini Bhagwati College of Engineering, Nagpur, India
Association rules for improving website effectiveness: case analysis
Association rules for improving website effectiveness: case analysis Maja Dimitrijević, The Higher Technical School of Professional Studies, Novi Sad, Serbia, [email protected] Tanja Krunić, The
Implementation of Data Mining Techniques to Perform Market Analysis
Implementation of Data Mining Techniques to Perform Market Analysis B.Sabitha 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, P.Balasubramanian 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing
www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University
Classification On The Clouds Using MapReduce
Classification On The Clouds Using MapReduce Simão Martins Instituto Superior Técnico Lisbon, Portugal [email protected] Cláudia Antunes Instituto Superior Técnico Lisbon, Portugal [email protected]
Comparison of Data Mining Techniques for Money Laundering Detection System
Comparison of Data Mining Techniques for Money Laundering Detection System Rafał Dreżewski, Grzegorz Dziuban, Łukasz Hernik, Michał Pączek AGH University of Science and Technology, Department of Computer
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
Data Mining for Knowledge Management. Classification
1 Data Mining for Knowledge Management Classification Themis Palpanas University of Trento http://disi.unitn.eu/~themis Data Mining for Knowledge Management 1 Thanks for slides to: Jiawei Han Eamonn Keogh
Efficient Integration of Data Mining Techniques in Database Management Systems
Efficient Integration of Data Mining Techniques in Database Management Systems Fadila Bentayeb Jérôme Darmont Cédric Udréa ERIC, University of Lyon 2 5 avenue Pierre Mendès-France 69676 Bron Cedex France
Towards applying Data Mining Techniques for Talent Mangement
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,
Distributed forests for MapReduce-based machine learning
Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication
DATA MINING USING INTEGRATION OF CLUSTERING AND DECISION TREE
DATA MINING USING INTEGRATION OF CLUSTERING AND DECISION TREE 1 K.Murugan, 2 P.Varalakshmi, 3 R.Nandha Kumar, 4 S.Boobalan 1 Teaching Fellow, Department of Computer Technology, Anna University 2 Assistant
Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines
, 22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing
ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION
ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical
Implementation of a New Approach to Mine Web Log Data Using Mater Web Log Analyzer
Implementation of a New Approach to Mine Web Log Data Using Mater Web Log Analyzer Mahadev Yadav 1, Prof. Arvind Upadhyay 2 1,2 Computer Science and Engineering, IES IPS Academy, Indore India Abstract
Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data
Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream
International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET
DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand
A Survey on Association Rule Mining in Market Basket Analysis
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 4 (2014), pp. 409-414 International Research Publications House http://www. irphouse.com /ijict.htm A Survey
Optimization of Image Search from Photo Sharing Websites Using Personal Data
Optimization of Image Search from Photo Sharing Websites Using Personal Data Mr. Naeem Naik Walchand Institute of Technology, Solapur, India Abstract The present research aims at optimizing the image search
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data
Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.
Building A Smart Academic Advising System Using Association Rule Mining
Building A Smart Academic Advising System Using Association Rule Mining Raed Shatnawi +962795285056 [email protected] Qutaibah Althebyan +962796536277 [email protected] Baraq Ghalib & Mohammed
Searching frequent itemsets by clustering data
Towards a parallel approach using MapReduce Maria Malek Hubert Kadima LARIS-EISTI Ave du Parc, 95011 Cergy-Pontoise, FRANCE [email protected], [email protected] 1 Introduction and Related Work
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One
Discretization and grouping: preprocessing steps for Data Mining
Discretization and grouping: preprocessing steps for Data Mining PetrBerka 1 andivanbruha 2 1 LaboratoryofIntelligentSystems Prague University of Economic W. Churchill Sq. 4, Prague CZ 13067, Czech Republic
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,
The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
Discovery of Maximal Frequent Item Sets using Subset Creation
Discovery of Maximal Frequent Item Sets using Subset Creation Jnanamurthy HK, Vishesh HV, Vishruth Jain, Preetham Kumar, Radhika M. Pai Department of Information and Communication Technology Manipal Institute
IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD
Journal homepage: www.mjret.in ISSN:2348-6953 IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD Deepak Ramchandara Lad 1, Soumitra S. Das 2 Computer Dept. 12 Dr. D. Y. Patil School of Engineering,(Affiliated
Healthcare Data Mining: Prediction Inpatient Length of Stay
3rd International IEEE Conference Intelligent Systems, September 2006 Healthcare Data Mining: Prediction Inpatient Length of Peng Liu, Lei Lei, Junjie Yin, Wei Zhang, Wu Naijun, Elia El-Darzi 1 Abstract
An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset
P P P Health An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang
DATA PREPARATION FOR DATA MINING
Applied Artificial Intelligence, 17:375 381, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 $12.00 +.00 DOI: 10.1080/08839510390219264 u DATA PREPARATION FOR DATA MINING SHICHAO ZHANG and CHENGQI
IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES
IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES Bruno Carneiro da Rocha 1,2 and Rafael Timóteo de Sousa Júnior 2 1 Bank of Brazil, Brasília-DF, Brazil [email protected] 2 Network Engineering
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Dynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Content Problems of managing data resources in a traditional file environment Capabilities and value of a database management
A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster
, pp.11-20 http://dx.doi.org/10.14257/ ijgdc.2014.7.2.02 A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster Kehe Wu 1, Long Chen 2, Shichao Ye 2 and Yi Li 2 1 Beijing
Application of Data Warehouse and Data Mining. in Construction Management
Application of Data Warehouse and Data Mining in Construction Management Jianping ZHANG 1 ([email protected]) Tianyi MA 1 ([email protected]) Qiping SHEN 2 ([email protected])
Integrating Pattern Mining in Relational Databases
Integrating Pattern Mining in Relational Databases Toon Calders, Bart Goethals, and Adriana Prado University of Antwerp, Belgium {toon.calders, bart.goethals, adriana.prado}@ua.ac.be Abstract. Almost a
Research of Data Mining Algorithm Based on Cloud Database
S Orders for Reprints to [email protected] The Open Automation and Control Systems Journal, 2013, 5, 187-193 187 Research of Data Mining Algorithm Based on Cloud Database Open Access Tianxiang
Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems
Effective Data Mining Using Neural Networks
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 8, NO. 6, DECEMBER 1996 957 Effective Data Mining Using Neural Networks Hongjun Lu, Member, IEEE Computer Society, Rudy Setiono, and Huan Liu,
A Serial Partitioning Approach to Scaling Graph-Based Knowledge Discovery
A Serial Partitioning Approach to Scaling Graph-Based Knowledge Discovery Runu Rathi, Diane J. Cook, Lawrence B. Holder Department of Computer Science and Engineering The University of Texas at Arlington
Mining Generalized Query Patterns from Web Logs
Mining Generalized Query Patterns from Web Logs Charles X. Ling* Dept. of Computer Science, Univ. of Western Ontario, Canada [email protected] Jianfeng Gao Microsoft Research China [email protected] Huajie
A Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
Research on Trust Management Strategies in Cloud Computing Environment
Journal of Computational Information Systems 8: 4 (2012) 1757 1763 Available at http://www.jofcis.com Research on Trust Management Strategies in Cloud Computing Environment Wenjuan LI 1,2,, Lingdi PING
Introduction to Data Mining Techniques
Introduction to Data Mining Techniques Dr. Rajni Jain 1 Introduction The last decade has experienced a revolution in information availability and exchange via the internet. In the same spirit, more and
KNOWLEDGE DISCOVERY and SAMPLING TECHNIQUES with DATA MINING for IDENTIFYING TRENDS in DATA SETS
KNOWLEDGE DISCOVERY and SAMPLING TECHNIQUES with DATA MINING for IDENTIFYING TRENDS in DATA SETS Prof. Punam V. Khandar, *2 Prof. Sugandha V. Dani Dept. of M.C.A., Priyadarshini College of Engg., Nagpur,
Mining various patterns in sequential data in an SQL-like manner *
Mining various patterns in sequential data in an SQL-like manner * Marek Wojciechowski Poznan University of Technology, Institute of Computing Science, ul. Piotrowo 3a, 60-965 Poznan, Poland [email protected]
Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis
, 23-25 October, 2013, San Francisco, USA Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis John David Elijah Sandig, Ruby Mae Somoba, Ma. Beth Concepcion and Bobby D. Gerardo,
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
A Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
Data Mining on Streams
Data Mining on Streams Using Decision Trees CS 536: Machine Learning Instructor: Michael Littman TA: Yihua Wu Outline Introduction to data streams Overview of traditional DT learning ALG DT learning ALGs
Design call center management system of e-commerce based on BP neural network and multifractal
Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(6):951-956 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Design call center management system of e-commerce
