A Fraud Detection Approach in Telecommunication using Cluster GA

Size: px
Start display at page:

Download "A Fraud Detection Approach in Telecommunication using Cluster GA"

Transcription

1 A Fraud Detection Approach in Telecommunication using Cluster GA V.Umayaparvathi Dept of Computer Science, DDE, MKU Dr.K.Iyakutti CSIR Emeritus Scientist, School of Physics, MKU Abstract: In trend mobile is one of the important devices in public sector. In mobile community there are N numbers of vendors, manufacturers available. The telecommunications industry was first to adopt data mining technology. Telecommunication companies day by day generate and store enormous amounts of high-quality data, have a very large customer data base and operate in a rapidly changing and highly competitive business environment. Telecommunication companies utilize data mining technique to improve their marketing efforts, identify fraud and better manage their telecommunication network connection. In this paper we provide an effective solution to identify the fraud detection in telecommunication using Data Mining clustering techniques with GA as well as misbehavior users. We believe this approach will help in our society. Keywords: Sequential pattern clustering, Genetic Algorithm (GA), Data Mining, Telecommunication. Introduction: The international telecommunications play an important role in message sharing and information passing. In the last decade a dramatic change in the structure of telecommunications companies has been taken place, from public monopolies to private companies. The quick development of mobile telephone networks and video calling and Internet technologies has created enormous competitive pressure on the companies sector. As new competitors arise in market, telecom need intelligent tools to gain profit and withstand. Also, stock market expectations are huge and investors, financial analysts need tested tools to gain information about how companies perform financially compared to their competitors, what they are good at, who are the major competitors are, etc. In other term, the telecom companies need to benchmark their performances against compete trends in order to remain important role in this market. There is a enormous amount of information about these companies financial performance that is now publicly available. This amount greatly exceeds our capacity to analyze it; the problem is that we often lack tools to quickly and accurately process these data. Data mining for telecommunications companies involve the use of simple, traditional and advanced mathematical techniques used to analyze large populations of data and deliver insights, forecasts, explanations and predictions of how systems, customers, network and marketplace are likely to react to different situations. The hypercompetitive nature of the telecom industry has created a need to understand customers, to keep them, and to model effective ways to market new products. This creates a great demand for innovation like data mining technique to help understand the new business trends involved, catch fraudulent activities, identify telecommunication patterns, make better use of resources and improve the quality of services. ISSN: IJCTT

2 Types of telecom data The Initial step in the data mining process is to understand the data. Here we discuss three main types of telecom data. call summary data Every time a call is placed on a telecom network, descriptive information about the call is saved as a call detail for future record. At a minimum, each call detail record will include the originating and terminating phone numbers, the date and time of the call and the duration of the call. Network data Telecommunication networks are extremely complex configurations of equipment, comprised of interconnected components. Each network element is capable of generating error and status messages, which leads to a tremendous amount of network data. customer data Telecommunication companies, like other businesses have millions of customer s. For necessity they have to maintaining a database of information on these customers. this information will include name, address and may include other information such as service plan, credit score, contract information, family income and payment history. Data mining applications in telecom The telecommunication industry was an early adopter of data mining technology and therefore many applications exist. Four major applications include: marketing customer profiling, fraud detection, churn management and network fault isolation. Each one will be described in more detail. Also many other applications exists which can be classified in these four categories such as : telecom market research, telecom customer segmentation, telecom market sizing, telecom territory design and alignment, telecom market forecasting, telecom fraud prediction, telecom sales force optimization and telecom marketing campaign optimization. Data mining costs Unfortunately, telecom data mining tools and algorithm makings are very expensive proposition. The big costs are associated with finding the data cleansing, required it and making it useful. Privacy concerns are an important issue for data mining especially in telecom industry, since telecom companies maintain highly private information, such as which each customer calls. Also there are many legal restrictions in each country about data mining, much of the rationale for these prohibitions relates to competition. The aim of this paper is to find out the fraud detection in telecommunication using Data Mining clustering techniques with GA. Related Works: There are several algorithms for mining sequential patterns. Apriori, GSP, SPADE, Prefix Span and Spam are just the simpler ones. From these, GSP and Prefix Span are the best known algorithms, and represent the two main approaches to the problem: apriori-based and pattern growth methods. Next, we will describe both approaches and compare their advantages and disadvantages. Apriori based Approaches. GSP follows the candidate generation and test philosophy. It begins with the discovery of frequent 1 sequences, and then generates the set of potentially frequent (k+1) sequences from the set off frequent k-sequences. The generation of potentially frequent k-sequences uses the ISSN: IJCTT

3 frequent (k-1) sequences discovered in the previous step, which may reduce significantly the number of sequences to consider at each moment. Note that to decide if one sequence s is frequent or not, it is necessary to scan the entire database, verifying if s is contained in each sequence in the database. In a way to reduce its processing time, GSP adopts three optimizations. First, it maintains all candidates in a hash-tree to scan the database once per iteration. Second, it only creates a new k candidate when there are two frequent (k-1) sequences with the prefix of one equal to the suffix of the other. Third, it eliminates all candidates that have some non frequent maximal subsequence. By using these strategies, GSP reduces the time spent in scanning the database, increasing its general performance. In general, apriori based methods can be seen as breath-first traversal algorithms, since they construct all k patterns simultaneously. At each step GSP only maintains in memory the already discovered patterns and the k candidates. Pattern growth methods are a more trendy approach to deal with sequential pattern mining problems. The key idea is to avoid the candidate generation step altogether, and to focus the search on a restricted portion of the initial database. Prefix Span is the most promising of the pattern-growth methods and is based on recursively constructing the patterns, and simultaneously, restricting the search to projected databases. A projected database is the set of subsequences in the database, which are suffixes of the sequences that have prefix a. At each step, the algorithm looks for the frequent sequences with prefix a, in the corresponding projected database. In this way, the search space is reduced at each step, allowing for better performances in the presence of small support thresholds. In general, pattern growth methods can be seen as depth first traversal algorithms, since they construct each pattern separately, in a recursive way. As pointed in, when a gap constraint is used, neither Prefix Span nor Prefix Growth can be applied directly. The generalization proposed Gen Prefix Span, is based on the redefinition of the method used to construct the projected databases. Instead of looking only for the first occurrence of the item, every occurrence is considered. Proposed Method: Sequential Pattern Mining algorithms solve the problem of discovering the maximum frequent sequences in a given database. Algorithms for this problem are relevant when the data to be mined has some sequential nature, when each piece of data is an ordered set of elements, like events in the case of temporal information. The goal of sequential pattern mining is to discover all frequent sequences of item sets from the dataset. In particular, an item set is a non empty subset of elements from a set the item collection, called items. In this manner, an item set represents the set of items that occur together. The item set composed of items a and b is denoted by ab. A sequence is an ordered list of item sets. A sequence is maximal if it is not contained in any other sequence. A sequence with k items is called a k sequence. The number of elements in a sequence s is the length of the sequence and is denoted by s. The ith item set in the sequence is represented by si and the set of considered sequences is usually designated by database (DB), and the number of sequences by database size. Sequential Pattern Clustering Definition: ISSN: IJCTT

4 Let I ={x1...xn} be a set of items. An item set is a non-empty subset of items, and an item set with k items is called k-item set. A sequence s=(x1...xl) is an order list of item sets, and an item set Xi (1 i l) in a sequence is called a transaction. In a set of sequences, a sequence s is maximal if s is not contained in other sequences. Genetic Algorithm: Genetic algorithms are one of the best way to solve a problem for which little is known. They are a very general algorithm and will work well in any search space. All you need to know is what you need the solution to be able to do well, and a genetic algorithm will be able to create a high quality solution. Genetic algorithms use the principles of selection and evolution to produce several solutions to a given problem. Genetic algorithms tend to thrive in an environment in which there is a very large set of candidate solutions and in which the search space is uneven and has many hills and valleys. True, genetic algorithms will do well in any environment but they will be greatly outclassed by more situation specific algorithms in the simpler search spaces. Therefore you must keep in mind that genetic algorithms are not always the best choice. Sometimes they can take quite a while to run and are therefore not always feasible for real time use. They are, however, one of the most powerful methods with which to quickly create high quality solutions to a problem. Now, before we start, I'm going to provide you with some key terms so that this article makes sense. Sequential pattern clustering with GA: We are going mingle with sequential pattern clustering with Genetic algorithm to optimize the SPC (sequential pattern clustering). In this work telephone no code act as chromosome Based on that we can generate the initial population size from that we can identify the group of corresponding number groups. In the large no group we will apply the GA for generating the new optimized group based on some fitness function value. Fitness function value is nothing but it s a criterion for generating the new solution or new candidate functions. Before get into the technique we have to follow some initial pre activities like defining the fitness value, initial population size, mutation point, crossover point and ordering data. After define everything we will apply the technique into the data. In our telecommunication data base contain 2000 records, which are got from one of the Indian call center sector. Next thing what we going to define is initial population size that s near to 1000 records in that we are going to apply the genetic algorithm(ga) technique. In GA there are two important concepts is there like mutation, crossover. We will see how this technique will work Samples Input: Crossover Mutation ISSN: IJCTT

5 Initial samples size: N Maximum generations: G Threshold value: T Minimum fitness: minf Output: Malpractice user no lists Begin Step1: Initialize counter count = 0 Step2: Initial population IN of size N Step3: For each chromosome i IN If S1& S2 is given then Measure the fitness F(i, S1, S2,T) Else If S1 is given then Measure the fitness F(i, S1, T) Else If S2 is given then Measure the fitness F(i, S2, T) Else Measure the fitness F(i, T) Step4: Mutate and crossover P. Step5: IF (fitness minf) Select fittest rules from P Step6: Set temp = temp +1 Step7: IF (t > G) then S = P Stop Else Go to Step 3 End Fitness function measurement will evaluate the sequential pattern clustering. Based on that we will fix threshold value to determine the spoofed calls and malpractice customer no. threshold value determinate the whether the call is legitimate or not. Its value derived based on the time duration of the call, country code, destination and frequency. Destination code acts as important point in threshold value because customer can modify or clone them mobile no but destination no won t change at any cost. It s tedious process to identify which customer legible or hacker. Even though they change the IMEI no we can identify the based on the destination no and country code. S.N o Result Analysis: Phone number Date & Time 18:46:52 21:25:45 17:30: :55:31 05:32:59 Destinatio n Countr y code This simulation done in mat lab editor while executing this algorithm we will get list of customer those who are consider as malpractice user or hacker. The sample out put displayed here instead of complete model. Some time our algorithm results legitimate customer as hacker or malpractice person. We have to choose exact person no who is hacker because its just list out all possibilities of hacker no. Conclusion: In this paper we provide an effective solution to identify the misuse customer and malpractice customer in the telecommunication area using sequential pattern clustering and genetic algorithm. This paper will help to society for utilizing mobile in secure way and identify the Call durat ion 00:01 :48 00:00 :41 00:02 :47 00:01 :21 00:01 :25 ISSN: IJCTT

6 misbehavior customer in telecommunication department. Future work: In future we are going to develop model for business trends, marketing strategy, analysis report from the existing database. Some time our algorithm result legitimate customer as hacker or malpractice person. We have to overcome this in our future work. In this paper we provide an effective solution for identify the fraud and malpractice user identification. In future we going to develop the complete model for telecom industry to enhance and compete with other vendors. References: [1] Saleh Al Kodhair, Mining Association Rules using Genetic Algorithm,, Master Thesis. [2] S. Sakurai, Y. Kitahara, and R. Orihara, A Sequential Pattern Mining Method based on Sequential Interestingness, Fall., International Journal of Computational Intelligence, pp [3] Jay Ay res, Johannes Gehr ke, Tomi Yiu, and Jason Flannick, Sequential Pattern Mining using A Bitmap Representation, 2002, SIGKDD 02 Edmonton, Alberta, Canada. [4]Malone, J., etc.. (2005). Data mining Using Rule Extraction from Kohonen Self-organising Maps. Neural Comput & Applic 15: [5]Wu, S., Gao, X. and Bastian, M. (2003). Data Warehousing and Data Mining. Metallurgical Industry Press. Beijing. [6]Wang, Y. (2004). The Study on Data Warehouse and Data Mining in Telecom industry Management Analysis and Application. College of Computer Science, Chongqing University. Chongqing. ISSN: IJCTT

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

An Efficient GA-Based Algorithm for Mining Negative Sequential Patterns

An Efficient GA-Based Algorithm for Mining Negative Sequential Patterns An Efficient GA-Based Algorithm for Mining Negative Sequential Patterns Zhigang Zheng 1, Yanchang Zhao 1,2,ZiyeZuo 1, and Longbing Cao 1 1 Data Sciences & Knowledge Discovery Research Lab Centre for Quantum

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

D A T A M I N I N G C L A S S I F I C A T I O N

D A T A M I N I N G C L A S S I F I C A T I O N D A T A M I N I N G C L A S S I F I C A T I O N FABRICIO VOZNIKA LEO NARDO VIA NA INTRODUCTION Nowadays there is huge amount of data being collected and stored in databases everywhere across the globe.

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Session 10 : E-business models, Big Data, Data Mining, Cloud Computing

Session 10 : E-business models, Big Data, Data Mining, Cloud Computing INFORMATION STRATEGY Session 10 : E-business models, Big Data, Data Mining, Cloud Computing Tharaka Tennekoon B.Sc (Hons) Computing, MBA (PIM - USJ) POST GRADUATE DIPLOMA IN BUSINESS AND FINANCE 2014 Internet

More information

A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

More information

Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

More information

Healthcare Measurement Analysis Using Data mining Techniques

Healthcare Measurement Analysis Using Data mining Techniques www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

More information

Computational Intelligence in Data Mining and Prospects in Telecommunication Industry

Computational Intelligence in Data Mining and Prospects in Telecommunication Industry Journal of Emerging Trends in Engineering and Applied Sciences (JETEAS) 2 (4): 601-605 Scholarlink Research Institute Journals, 2011 (ISSN: 2141-7016) jeteas.scholarlinkresearch.org Journal of Emerging

More information

DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE

DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE SK MD OBAIDULLAH Department of Computer Science & Engineering, Aliah University, Saltlake, Sector-V, Kol-900091, West Bengal, India sk.obaidullah@gmail.com

More information

Data Mining System, Functionalities and Applications: A Radical Review

Data Mining System, Functionalities and Applications: A Radical Review Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially

More information

Discovering Sequential Rental Patterns by Fleet Tracking

Discovering Sequential Rental Patterns by Fleet Tracking Discovering Sequential Rental Patterns by Fleet Tracking Xinxin Jiang (B), Xueping Peng, and Guodong Long Quantum Computation and Intelligent Systems, University of Technology Sydney, Ultimo, Australia

More information

Advanced Analytics. The Way Forward for Businesses. Dr. Sujatha R Upadhyaya

Advanced Analytics. The Way Forward for Businesses. Dr. Sujatha R Upadhyaya Advanced Analytics The Way Forward for Businesses Dr. Sujatha R Upadhyaya Nov 2009 Advanced Analytics Adding Value to Every Business In this tough and competitive market, businesses are fighting to gain

More information

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge

More information

Supply chain management by means of FLM-rules

Supply chain management by means of FLM-rules Supply chain management by means of FLM-rules Nicolas Le Normand, Julien Boissière, Nicolas Méger, Lionel Valet LISTIC Laboratory - Polytech Savoie Université de Savoie B.P. 80439 F-74944 Annecy-Le-Vieux,

More information

DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM

DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM M. Mayilvaganan 1, S. Aparna 2 1 Associate

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps White provides GRASP-powered big data predictive analytics that increases marketing effectiveness and customer satisfaction with API-driven adaptive apps that anticipate, learn, and adapt to deliver contextual,

More information

Data mining and complex telecommunications problems modeling

Data mining and complex telecommunications problems modeling Paper Data mining and complex telecommunications problems modeling Janusz Granat Abstract The telecommunications operators have to manage one of the most complex systems developed by human beings. Moreover,

More information

Predictive Analytics using Genetic Algorithm for Efficient Supply Chain Inventory Optimization

Predictive Analytics using Genetic Algorithm for Efficient Supply Chain Inventory Optimization 182 IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.3, March 2010 Predictive Analytics using Genetic Algorithm for Efficient Supply Chain Inventory Optimization P.Radhakrishnan

More information

Review of Modern Techniques of Qualitative Data Clustering

Review of Modern Techniques of Qualitative Data Clustering Review of Modern Techniques of Qualitative Data Clustering Sergey Cherevko and Andrey Malikov The North Caucasus Federal University, Institute of Information Technology and Telecommunications cherevkosa92@gmail.com,

More information

Data Mining in Telecommunication:A Review

Data Mining in Telecommunication:A Review Data Mining in Telecommunication:A Review Suman H. Pal, Jignasa N. Patel Department of Information Technology, Shri S ad Vidya Mandal Institute of Technology Abstract Telecommunication is one of the first

More information

Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm

Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College

More information

Effect of Using Neural Networks in GA-Based School Timetabling

Effect of Using Neural Networks in GA-Based School Timetabling Effect of Using Neural Networks in GA-Based School Timetabling JANIS ZUTERS Department of Computer Science University of Latvia Raina bulv. 19, Riga, LV-1050 LATVIA janis.zuters@lu.lv Abstract: - The school

More information

SPMF: a Java Open-Source Pattern Mining Library

SPMF: a Java Open-Source Pattern Mining Library Journal of Machine Learning Research 1 (2014) 1-5 Submitted 4/12; Published 10/14 SPMF: a Java Open-Source Pattern Mining Library Philippe Fournier-Viger philippe.fournier-viger@umoncton.ca Department

More information

A Survey on Intrusion Detection System with Data Mining Techniques

A Survey on Intrusion Detection System with Data Mining Techniques A Survey on Intrusion Detection System with Data Mining Techniques Ms. Ruth D 1, Mrs. Lovelin Ponn Felciah M 2 1 M.Phil Scholar, Department of Computer Science, Bishop Heber College (Autonomous), Trichirappalli,

More information

An Effective System for Mining Web Log

An Effective System for Mining Web Log An Effective System for Mining Web Log Zhenglu Yang Yitong Wang Masaru Kitsuregawa Institute of Industrial Science, The University of Tokyo 4-6-1 Komaba, Meguro-Ku, Tokyo 153-8305, Japan {yangzl, ytwang,

More information

WHITEPAPER. Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk

WHITEPAPER. Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk WHITEPAPER Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk Overview Angoss is helping its clients achieve significant revenue growth and measurable return

More information

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains Dr. Kanak Saxena Professor & Head, Computer Application SATI, Vidisha, kanak.saxena@gmail.com D.S. Rajpoot Registrar,

More information

ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Directed Graph based Distributed Sequential Pattern Mining Using Hadoop Map Reduce

Directed Graph based Distributed Sequential Pattern Mining Using Hadoop Map Reduce Directed Graph based Distributed Sequential Pattern Mining Using Hadoop Map Reduce Sushila S. Shelke, Suhasini A. Itkar, PES s Modern College of Engineering, Shivajinagar, Pune Abstract - Usual sequential

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

Data Mining in Telecommunication

Data Mining in Telecommunication Data Mining in Telecommunication Mohsin Nadaf & Vidya Kadam Department of IT, Trinity College of Engineering & Research, Pune, India E-mail : mohsinanadaf@gmail.com Abstract Telecommunication is one of

More information

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning

More information

Keywords revenue management, yield management, genetic algorithm, airline reservation

Keywords revenue management, yield management, genetic algorithm, airline reservation Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Revenue Management

More information

ISSN: 2319-5967 ISO 9001:2008 Certified International Journal of Engineering Science and Innovative Technology (IJESIT) Volume 2, Issue 3, May 2013

ISSN: 2319-5967 ISO 9001:2008 Certified International Journal of Engineering Science and Innovative Technology (IJESIT) Volume 2, Issue 3, May 2013 Transistor Level Fault Finding in VLSI Circuits using Genetic Algorithm Lalit A. Patel, Sarman K. Hadia CSPIT, CHARUSAT, Changa., CSPIT, CHARUSAT, Changa Abstract This paper presents, genetic based algorithm

More information

The Masters of Science in Information Systems & Technology

The Masters of Science in Information Systems & Technology The Masters of Science in Information Systems & Technology College of Engineering and Computer Science University of Michigan-Dearborn A Rackham School of Graduate Studies Program PH: 313-593-5361; FAX:

More information

Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

More information

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

Enhanced Boosted Trees Technique for Customer Churn Prediction Model IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

More information

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

More information

Process Modelling from Insurance Event Log

Process Modelling from Insurance Event Log Process Modelling from Insurance Event Log P.V. Kumaraguru Research scholar, Dr.M.G.R Educational and Research Institute University Chennai- 600 095 India Dr. S.P. Rajagopalan Professor Emeritus, Dr. M.G.R

More information

Data Mining. Vera Goebel. Department of Informatics, University of Oslo

Data Mining. Vera Goebel. Department of Informatics, University of Oslo Data Mining Vera Goebel Department of Informatics, University of Oslo 2011 1 Lecture Contents Knowledge Discovery in Databases (KDD) Definition and Applications OLAP Architectures for OLAP and KDD KDD

More information

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Please note the following IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications 133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

Learning is a very general term denoting the way in which agents:

Learning is a very general term denoting the way in which agents: What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);

More information

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 FREE echapter C H A P T E R1 Big Data and Analytics Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 percent of the data in the

More information

Use of Data Mining in Banking

Use of Data Mining in Banking Use of Data Mining in Banking Kazi Imran Moin*, Dr. Qazi Baseer Ahmed** *(Department of Computer Science, College of Computer Science & Information Technology, Latur, (M.S), India ** (Department of Commerce

More information

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Building A Smart Academic Advising System Using Association Rule Mining

Building A Smart Academic Advising System Using Association Rule Mining Building A Smart Academic Advising System Using Association Rule Mining Raed Shatnawi +962795285056 raedamin@just.edu.jo Qutaibah Althebyan +962796536277 qaalthebyan@just.edu.jo Baraq Ghalib & Mohammed

More information

A New Method for Traffic Forecasting Based on the Data Mining Technology with Artificial Intelligent Algorithms

A New Method for Traffic Forecasting Based on the Data Mining Technology with Artificial Intelligent Algorithms Research Journal of Applied Sciences, Engineering and Technology 5(12): 3417-3422, 213 ISSN: 24-7459; e-issn: 24-7467 Maxwell Scientific Organization, 213 Submitted: October 17, 212 Accepted: November

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

Future Trend Prediction of Indian IT Stock Market using Association Rule Mining of Transaction data

Future Trend Prediction of Indian IT Stock Market using Association Rule Mining of Transaction data Volume 39 No10, February 2012 Future Trend Prediction of Indian IT Stock Market using Association Rule Mining of Transaction data Rajesh V Argiddi Assit Prof Department Of Computer Science and Engineering,

More information

Driving Value From Big Data

Driving Value From Big Data Big Data Executive Forum Data Discovery, Modern Architecture & Visualization Driving Value From Big Data Bill Franks Chief Analytics Officer, Teradata It s Not So Much Big Data As it is different data.

More information

The Applications of Genetic Algorithms in Stock Market Data Mining Optimisation

The Applications of Genetic Algorithms in Stock Market Data Mining Optimisation The Applications of Genetic Algorithms in Stock Market Data Mining Optimisation Li Lin, Longbing Cao, Jiaqi Wang, Chengqi Zhang Faculty of Information Technology, University of Technology, Sydney, NSW

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

An Improved Algorithm for Fuzzy Data Mining for Intrusion Detection

An Improved Algorithm for Fuzzy Data Mining for Intrusion Detection An Improved Algorithm for Fuzzy Data Mining for Intrusion Detection German Florez, Susan M. Bridges, and Rayford B. Vaughn Abstract We have been using fuzzy data mining techniques to extract patterns that

More information

Business Process Services. White Paper. Predictive Analytics in HR: A Primer

Business Process Services. White Paper. Predictive Analytics in HR: A Primer Business Process Services White Paper Predictive Analytics in HR: A Primer About the Authors Tuhin Subhra Dey Tuhin is a member of the Analytics and Insights team at Tata Consultancy Services (TCS), where

More information

Improving Apriori Algorithm to get better performance with Cloud Computing

Improving Apriori Algorithm to get better performance with Cloud Computing Improving Apriori Algorithm to get better performance with Cloud Computing Zeba Qureshi 1 ; Sanjay Bansal 2 Affiliation: A.I.T.R, RGPV, India 1, A.I.T.R, RGPV, India 2 ABSTRACT Cloud computing has become

More information

Process Intelligence: An Exciting New Frontier for Business Intelligence

Process Intelligence: An Exciting New Frontier for Business Intelligence February/2014 Process Intelligence: An Exciting New Frontier for Business Intelligence Claudia Imhoff, Ph.D. Sponsored by Altosoft, A Kofax Company Table of Contents Introduction... 1 Use Cases... 2 Business

More information

IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users

IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users 1 IT and CRM A basic CRM model Data source & gathering Database Data warehouse Information delivery Information users 2 IT and CRM Markets have always recognized the importance of gathering detailed data

More information

Distributed Data Mining Algorithm Parallelization

Distributed Data Mining Algorithm Parallelization Distributed Data Mining Algorithm Parallelization B.Tech Project Report By: Rishi Kumar Singh (Y6389) Abhishek Ranjan (10030) Project Guide: Prof. Satyadev Nandakumar Department of Computer Science and

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY QÜESTIIÓ, vol. 25, 3, p. 509-520, 2001 PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY GEORGES HÉBRAIL We present in this paper the main applications of data mining techniques at Electricité de France,

More information

Contents. Dedication List of Figures List of Tables. Acknowledgments

Contents. Dedication List of Figures List of Tables. Acknowledgments Contents Dedication List of Figures List of Tables Foreword Preface Acknowledgments v xiii xvii xix xxi xxv Part I Concepts and Techniques 1. INTRODUCTION 3 1 The Quest for Knowledge 3 2 Problem Description

More information

Evolutionary Detection of Rules for Text Categorization. Application to Spam Filtering

Evolutionary Detection of Rules for Text Categorization. Application to Spam Filtering Advances in Intelligent Systems and Technologies Proceedings ECIT2004 - Third European Conference on Intelligent Systems and Technologies Iasi, Romania, July 21-23, 2004 Evolutionary Detection of Rules

More information

Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence. Integration, Design and Implementation Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

More information

Supply chain intelligence: benefits, techniques and future trends

Supply chain intelligence: benefits, techniques and future trends MEB 2010 8 th International Conference on Management, Enterprise and Benchmarking June 4 5, 2010 Budapest, Hungary Supply chain intelligence: benefits, techniques and future trends Zoltán Bátori Óbuda

More information

Data Isn't Everything

Data Isn't Everything June 17, 2015 Innovate Forward Data Isn't Everything The Challenges of Big Data, Advanced Analytics, and Advance Computation Devices for Transportation Agencies. Using Data to Support Mission, Administration,

More information

Management Science Letters

Management Science Letters Management Science Letters 4 (2014) 905 912 Contents lists available at GrowingScience Management Science Letters homepage: www.growingscience.com/msl Measuring customer loyalty using an extended RFM and

More information

This software agent helps industry professionals review compliance case investigations, find resolutions, and improve decision making.

This software agent helps industry professionals review compliance case investigations, find resolutions, and improve decision making. Lost in a sea of data? Facing an external audit? Or just wondering how you re going meet the challenges of the next regulatory law? When you need fast, dependable support and company-specific solutions

More information

Performance Analysis of Sequential Pattern Mining Algorithms on Large Dense Datasets

Performance Analysis of Sequential Pattern Mining Algorithms on Large Dense Datasets Performance Analysis of Sequential Pattern Mining Algorithms on Large Dense Datasets Dr. Sunita Mahajan 1, Prajakta Pawar 2 and Alpa Reshamwala 3 1 Principal, 2 M.Tech Student, 3 Assistant Professor, 1

More information

Data Mining Techniques

Data Mining Techniques 15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses

More information

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*

More information

Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control

Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Andre BERGMANN Salzgitter Mannesmann Forschung GmbH; Duisburg, Germany Phone: +49 203 9993154, Fax: +49 203 9993234;

More information

Obtaining Value from Big Data

Obtaining Value from Big Data Obtaining Value from Big Data Course Notes in Transparency Format technology basics for data scientists Spring - 2014 Jordi Torres, UPC - BSC www.jorditorres.eu @JordiTorresBCN Data deluge, is it enough?

More information

TEXT ANALYTICS INTEGRATION

TEXT ANALYTICS INTEGRATION TEXT ANALYTICS INTEGRATION A TELECOMMUNICATIONS BEST PRACTICES CASE STUDY VISION COMMON ANALYTICAL ENVIRONMENT Structured Unstructured Analytical Mining Text Discovery Text Categorization Text Sentiment

More information

VILNIUS UNIVERSITY FREQUENT PATTERN ANALYSIS FOR DECISION MAKING IN BIG DATA

VILNIUS UNIVERSITY FREQUENT PATTERN ANALYSIS FOR DECISION MAKING IN BIG DATA VILNIUS UNIVERSITY JULIJA PRAGARAUSKAITĖ FREQUENT PATTERN ANALYSIS FOR DECISION MAKING IN BIG DATA Doctoral Dissertation Physical Sciences, Informatics (09 P) Vilnius, 2013 Doctoral dissertation was prepared

More information

Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA

Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA ABSTRACT Current trends in data mining allow the business community to take advantage of

More information

IncSpan: Incremental Mining of Sequential Patterns in Large Database

IncSpan: Incremental Mining of Sequential Patterns in Large Database IncSpan: Incremental Mining of Sequential Patterns in Large Database Hong Cheng Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 hcheng3@uiuc.edu Xifeng

More information

Data Mining: Motivations and Concepts

Data Mining: Motivations and Concepts POLYTECHNIC UNIVERSITY Department of Computer Science / Finance and Risk Engineering Data Mining: Motivations and Concepts K. Ming Leung Abstract: We discuss here the need, the goals, and the primary tasks

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS

SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS Carlos Andre Reis Pinheiro 1 and Markus Helfert 2 1 School of Computing, Dublin City University, Dublin, Ireland

More information

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING M.Gnanavel 1 & Dr.E.R.Naganathan 2 1. Research Scholar, SCSVMV University, Kanchipuram,Tamil Nadu,India. 2. Professor

More information

CHAPTER - 5 CONCLUSIONS / IMP. FINDINGS

CHAPTER - 5 CONCLUSIONS / IMP. FINDINGS CHAPTER - 5 CONCLUSIONS / IMP. FINDINGS In today's scenario data warehouse plays a crucial role in order to perform important operations. Different indexing techniques has been used and analyzed using

More information

Predicting the Stock Market with News Articles

Predicting the Stock Market with News Articles Predicting the Stock Market with News Articles Kari Lee and Ryan Timmons CS224N Final Project Introduction Stock market prediction is an area of extreme importance to an entire industry. Stock price is

More information

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

More information

New Matrix Approach to Improve Apriori Algorithm

New Matrix Approach to Improve Apriori Algorithm New Matrix Approach to Improve Apriori Algorithm A. Rehab H. Alwa, B. Anasuya V Patil Associate Prof., IT Faculty, Majan College-University College Muscat, Oman, rehab.alwan@majancolleg.edu.om Associate

More information

Investigating Clinical Care Pathways Correlated with Outcomes

Investigating Clinical Care Pathways Correlated with Outcomes Investigating Clinical Care Pathways Correlated with Outcomes Geetika T. Lakshmanan, Szabolcs Rozsnyai, Fei Wang IBM T. J. Watson Research Center, NY, USA August 2013 Outline Care Pathways Typical Challenges

More information

Data Mining as Part of Knowledge Discovery in Databases (KDD)

Data Mining as Part of Knowledge Discovery in Databases (KDD) Mining as Part of Knowledge Discovery in bases (KDD) Presented by Naci Akkøk as part of INF4180/3180, Advanced base Systems, fall 2003 (based on slightly modified foils of Dr. Denise Ecklund from 6 November

More information

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH 205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology

More information

Mining Sequence Data. JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych

Mining Sequence Data. JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych Mining Sequence Data JERZY STEFANOWSKI Inst. Informatyki PP Wersja dla TPD 2009 Zaawansowana eksploracja danych Outline of the presentation 1. Realtionships to mining frequent items 2. Motivations for

More information

Keywords: Mobility Prediction, Location Prediction, Data Mining etc

Keywords: Mobility Prediction, Location Prediction, Data Mining etc Volume 4, Issue 4, April 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Data Mining Approach

More information