# Using Data Mining for Mobile Communication Clustering and Characterization

Save this PDF as:

Size: px
Start display at page:

## Transcription

2 2. Let C(n) be a set of clusters. The number n must be provided by user and represents the number of clusters needed to process these data. 3. Step 1 Initialization: Choose random n observations from entire set O and initialize the cluster members to these observations. 4. Step 2 Assign each observation from O set to the nearest cluster. The distance between a cluster and an observation can be a Euclidian distance, a Manhattan distance or another distance. 5. Step 3 Recomputed the entire cluster set C(n) based on their subset of observation. 6. Step 4 If there are any changes on members from cluster set, go to Step 2 7. Final StepShow the structure of the clusters and classify the observations for each cluster. The differences to the classical algorithm arise from the Step 1 and Step 2. Thus, for a faster convergence, the clusters can be chosen to be uniformly distributed in observation space. It is also common practice to initialize clusters with members that are most representative of the entire set of observations. The Euclidean distance was chosen to run the algorithm. For a better representation, all the data was scaled in the range [0,1]. To do this scaling, for each of the components that compose an observation, we determine the maximum value. For an efficient running of the algorithm, the components that make up observations should not be redundant. It is difficult to identify the data redundancy, because it requires additional processing that can result in loss of important information. One possible solution to this would be to apply an algorithm to identify important components (e.g. PCA - Principal Component Analysis), but this is not the part of this article. We mention that the main point of the article is finding available relationships between users and that is why we wanted to analyze the initial data (gross - without alteration) with no changes. For this purpose, an observation is made up of all the data collected from the mobile operator. III. EXPERIMENTAL SETUP AND DATA INPUTS The IP multimedia and telecommunication industry generates and stores a high amount of data, therefore a manual analysis is not possible. Call detail records include descriptive information about each call made in a telecommunication network. Data mining algorithms take as source of information Call Detail Records (CDR s) extracted during calls. To uncover usage patterns, the information extracted requires data processing. The CDR reveals information at the level of individual phone calls. In data mining applications the aim is to extract knowledge at the customer level. One solution is to summarize the call detail records associated with a user into a single record that describes the user calling behaviour [2]. The users records (the attributes) are given as inputs to the clustering algorithm. Data sets used for this analysis contain information describing the calling behaviour of a user. Two sets of CDR data were drawn during a period of one year. Collected data was saved in a table that contains the entire year s records. Additionally, other tables contain customer information for each individual month. The first data set contains user records characterized by 24 attributes, while the second one by 20 attributes. The measures chosen to represent the user attributes in our analysis are described below. Below is the list of attributes included in both data sets: 1. Number of calls for different services used (voice, SMS, data): describe the frequency of usage of each type of call. Calls are restricted according to the direction of call: we will take into consideration only outgoing calls. The total duration and the total number of calls describe the user s activity level. voice_out_calls number of outgoing voice calls; sms_out_calls number of outgoing SMS calls; data_out_calls number of outgoing data calls; 2. Total Duration of calls: voice_out_duration duration of outgoing voice calls; data_out_size size of -outgoing data calls; 3. Calling Behaviour by Day/Time of the day for different call types: describes the calling behaviour of a user during a workday or weekend for different services used weekend_voice_out_calls number of outgoing voice calls during weekends; weekend_sms_out_calls number of outgoing SMS calls during weekend days; weekend_data_out_calls number of outgoing data calls during weekend days; workdays_voice_out_calls number of outgoing voice calls on working days; workdays_sms_out_calls number of outgoing SMS calls on working days; workdays_data_out_calls number of outgoing data calls on working days; worktime_voice_out_calls duration of outgoing voice calls on working hours (8:00 a.m. 6:00 p.m.); worktime_sms_out_calls (duration of outgoing voice calls on working hours (8:00 a.m. 6:00 p.m.)) worktime_data_out_calls duration of outgoing voice calls on working hours (8:00 a.m. 6:00 p.m.); offtime_voice_out_calls duration of voice calls made outside office hours; offtime_sms_out_calls duration of SMS calls made outside office hours; offtime_data_out_calls duration of data calls made outside office hours;

4 exceptions are observed for almost all users, exceptions that can be correlated with changes in their communication behaviour (e.g. vacancy periods or peak business activities). B. Clusters analysis Next, we analyzed the clusters and the relation between attributes inside clusters (Fig. 2). This analysis allows us to identify similar characteristics of employees in every cluster. Aspects like preferred communication channel (voice, SMS, data), length of communication (duration, size), time of communication (day hours) have been investigated. Clusters voice_out_calls sms_out_calls data_out_calls voice_out_duration data_out_size In Fig. 4 the clusters are plotted by their voice and data attributes. Employees in clusters 1, 4, 6 and 5 do not use or use less data communication plans, compared with the ones in 3 and 2 which are more data oriented. The graphic in Fig. 5 shows the clusters by their messaging and data traffic attributes. Figure 2. Clusters centroids The following graphics plot clusters by various couples of attributes. Each circle represents a cluster and its size represents the number of elements contained by it. In Fig. 3, clusters are presented by their number of voice calls and SMS count. It can be observed a direct relation between number of calls and number of messages for clusters 1, 4, 5 and 2. But employees in cluster 3 prefer SMS communication unlike the employees in cluster 6 which prefer voice communication. Figure 5. SMS traffic vs. initiated data traffic Figure 6. Voice out calls vs. average call duration Figure 3. Voice out calls vs. SMS sends Figure 7. Initiated data traffic vs. traffic size Figure 4. Voice out calls vs. initiated data traffic C. Business analysis In the end we tried to identify relations between employees communication behaviour and their business position or activity. We evaluated the job description of the employees in every cluster. In this analysis we found

5 similar positions of the employees inside clusters. For example in Fig. 2, cluster 2 (red) contains employees in top management positions. Cluster 5 (orange) is formed mainly by people that work in sales. Finally, cluster 1 (cyan) contains the less communicative people, the technical ones. V. CONCLUSIONS In this paper we investigated how data mining algorithms can be used to characterize employers communication patterns and their influence on business related activities. ACKNOWLEDGMENT This work was partially supported by the strategic grant POSDRU/89/1.5/S/57649, Project ID (PERFORM- ERA), co-financed by the European Social Fund Investing in People, within the Sectorial Operational Programme Human Resources Development REFERENCES [1] I. O. Folasade, Computational Intelligence in Data Mining and Prospects in Telecommunication Industry, Journal of Emerging Trends in Engineering and Applied Sciences, vol. 2, no. 4, pp , Aug [2] G. M. Weiss, Data Mining in the Telecommunications Industry, IGI Global Section: Service, [3] J. K. Pal, Usefulness and applications of data mining in extracting information from different perspective, Annals of Library and Information Studies, Vol. 58, pp. 7-16, Mar [4] K. Tulankar, M. Kshirsagar, R. Wajgi, Clustering Telecom Customers using Emergent Self Organizing Maps for Business Profitability, International Journal of Computer Science and Technology, Vol. 3, Iss. 1, pp , Mar

### HOSPIRA (HSP US) HISTORICAL COMMON STOCK PRICE INFORMATION

30-Apr-2004 28.35 29.00 28.20 28.46 28.55 03-May-2004 28.50 28.70 26.80 27.04 27.21 04-May-2004 26.90 26.99 26.00 26.00 26.38 05-May-2004 26.05 26.69 26.00 26.35 26.34 06-May-2004 26.31 26.35 26.05 26.26

### Median and Average Sales Prices of New Homes Sold in United States

Jan 1963 \$17,200 (NA) Feb 1963 \$17,700 (NA) Mar 1963 \$18,200 (NA) Apr 1963 \$18,200 (NA) May 1963 \$17,500 (NA) Jun 1963 \$18,000 (NA) Jul 1963 \$18,400 (NA) Aug 1963 \$17,800 (NA) Sep 1963 \$17,900 (NA) Oct

### THE UNIVERSITY OF BOLTON

JANUARY Jan 1 6.44 8.24 12.23 2.17 4.06 5.46 Jan 2 6.44 8.24 12.24 2.20 4.07 5.47 Jan 3 6.44 8.24 12.24 2.21 4.08 5.48 Jan 4 6.44 8.24 12.25 2.22 4.09 5.49 Jan 5 6.43 8.23 12.25 2.24 4.10 5.50 Jan 6 6.43

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### AT&T Global Network Client for Windows Product Support Matrix January 29, 2015

AT&T Global Network Client for Windows Product Support Matrix January 29, 2015 Product Support Matrix Following is the Product Support Matrix for the AT&T Global Network Client. See the AT&T Global Network

### NAV HISTORY OF DBH FIRST MUTUAL FUND (DBH1STMF)

NAV HISTORY OF DBH FIRST MUTUAL FUND () Date NAV 11-Aug-16 10.68 8.66 0.38% -0.07% 0.45% 3.81% 04-Aug-16 10.64 8.66-0.19% 0.87% -1.05% 3.76% 28-Jul-16 10.66 8.59 0.00% -0.34% 0.34% 3.89% 21-Jul-16 10.66

### An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

### Machine Learning using MapReduce

Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

### COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS*

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) 2 Fixed Rates Variable Rates FIXED RATES OF THE PAST 25 YEARS AVERAGE RESIDENTIAL MORTGAGE LENDING RATE - 5 YEAR* (Per cent) Year Jan Feb Mar Apr May Jun

### COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS*

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) 2 Fixed Rates Variable Rates FIXED RATES OF THE PAST 25 YEARS AVERAGE RESIDENTIAL MORTGAGE LENDING RATE - 5 YEAR* (Per cent) Year Jan Feb Mar Apr May Jun

### DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

### Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 1 of 138. Exhibit 8

Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 1 of 138 Exhibit 8 Case 2:08-cv-02463-ABC-E Document 1-4 Filed 04/15/2008 Page 2 of 138 Domain Name: CELLULARVERISON.COM Updated Date: 12-dec-2007

### ANNEXURE 1 STATUS OF 518 DEMAT REQUESTS PENDING WITH NSDL

ANNEXURE 1 STATUS OF 518 DEMAT REQUESTS PENDING WITH NSDL Sr. No. Demat Request No.(DRN) DP ID Client ID Date of Demat Request Received Quantity Requested Date of Demat Request Processed No. of days of

### FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,

### Data Mining Techniques in CRM

Data Mining Techniques in CRM Inside Customer Segmentation Konstantinos Tsiptsis CRM 6- Customer Intelligence Expert, Athens, Greece Antonios Chorianopoulos Data Mining Expert, Athens, Greece WILEY A John

### Qi Liu Rutgers Business School ISACA New York 2013

Qi Liu Rutgers Business School ISACA New York 2013 1 What is Audit Analytics The use of data analysis technology in Auditing. Audit analytics is the process of identifying, gathering, validating, analyzing,

### A Study of Web Log Analysis Using Clustering Techniques

A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept

### Web Usage Mining: Identification of Trends Followed by the user through Neural Network

International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 617-624 International Research Publications House http://www. irphouse.com /ijict.htm Web

### ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications

### Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis

### Analysis One Code Desc. Transaction Amount. Fiscal Period

Analysis One Code Desc Transaction Amount Fiscal Period 57.63 Oct-12 12.13 Oct-12-38.90 Oct-12-773.00 Oct-12-800.00 Oct-12-187.00 Oct-12-82.00 Oct-12-82.00 Oct-12-110.00 Oct-12-1115.25 Oct-12-71.00 Oct-12-41.00

### S&P Year Rolling Period Total Returns

S&P 500 10 Year Rolling Period Total Returns Summary: 1926 June 2013 700% 600% 500% 400% 300% 200% 100% 0% 100% Scatter chart of all 931 ten year periods. There were 931 ten year rolling periods from January

### Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1

Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints

### Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Data Mining for Customer Service Support Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin Traditional Hotline Services Problem Traditional Customer Service Support (manufacturing)

### Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

### Data Mining Solutions for the Business Environment

Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

### An Introduction to Data Mining

An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

### Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition

### Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov

Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

### Data Mining Part 5. Prediction

Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

### COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

### International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

### An Enhanced Clustering Algorithm to Analyze Spatial Data

International Journal of Engineering and Technical Research (IJETR) ISSN: 2321-0869, Volume-2, Issue-7, July 2014 An Enhanced Clustering Algorithm to Analyze Spatial Data Dr. Mahesh Kumar, Mr. Sachin Yadav

### Enhanced Vessel Traffic Management System Booking Slots Available and Vessels Booked per Day From 12-JAN-2016 To 30-JUN-2017

From -JAN- To -JUN- -JAN- VIRP Page Period Period Period -JAN- 8 -JAN- 8 9 -JAN- 8 8 -JAN- -JAN- -JAN- 8-JAN- 9-JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- 8-JAN- 9-JAN- -JAN- -JAN- -FEB- : days

### ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti

### CLASSIFICATION AND CLUSTERING. Anveshi Charuvaka

CLASSIFICATION AND CLUSTERING Anveshi Charuvaka Learning from Data Classification Regression Clustering Anomaly Detection Contrast Set Mining Classification: Definition Given a collection of records (training

### Clustering in Machine Learning. By: Ibrar Hussain Student ID:

Clustering in Machine Learning By: Ibrar Hussain Student ID: 11021083 Presentation An Overview Introduction Definition Types of Learning Clustering in Machine Learning K-means Clustering Example of k-means

### An Analysis on Density Based Clustering of Multi Dimensional Spatial Data

An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,

### Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning

### Data Mining 資 料 探 勘. 分 群 分 析 (Cluster Analysis)

Data Mining 資 料 探 勘 Tamkang University 分 群 分 析 (Cluster Analysis) DM MI Wed,, (:- :) (B) Min-Yuh Day 戴 敏 育 Assistant Professor 專 任 助 理 教 授 Dept. of Information Management, Tamkang University 淡 江 大 學 資

### COE BIDDING RESULTS 2009 Category B Cars >1600 cc

Quota System A COE BIDDING RESULTS 2009 B Jan-2009 Quota 1,839 1,839 1,100 1,099 274 268 409 411 767 758 Successful bids 1,784 1,832 1,100 1,097 274 260 401 386 763 748 Bids received 2,541 2,109 1,332

### Clustering & Association

Clustering - Overview What is cluster analysis? Grouping data objects based only on information found in the data describing these objects and their relationships Maximize the similarity within objects

### The Business Value of Call Accounting

WHITE PAPER The Business Value of Call Accounting How Call Accounting Software Helps Reduce Business Expenses and Improve Productivity Introduction Call accounting software has been available for more

### Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer

Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next

### Lecture 20: Clustering

Lecture 20: Clustering Wrap-up of neural nets (from last lecture Introduction to unsupervised learning K-means clustering COMP-424, Lecture 20 - April 3, 2013 1 Unsupervised learning In supervised learning,

### Learning is a very general term denoting the way in which agents:

What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);

### K-means Clustering Technique on Search Engine Dataset using Data Mining Tool

International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 6 (2013), pp. 505-510 International Research Publications House http://www. irphouse.com /ijict.htm K-means

### Ashley Institute of Training Schedule of VET Tuition Fees 2015

Ashley Institute of Training Schedule of VET Fees Year of Study Group ID:DECE15G1 Total Course Fees \$ 12,000 29-Aug- 17-Oct- 50 14-Sep- 0.167 blended various \$2,000 CHC02 Best practice 24-Oct- 12-Dec-

### Analytics on Big Data

Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

### An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

### USING DATA SCIENCE TO DISCOVE INSIGHT OF MEDICAL PROVIDERS CHARGE FOR COMMON SERVICES

USING DATA SCIENCE TO DISCOVE INSIGHT OF MEDICAL PROVIDERS CHARGE FOR COMMON SERVICES Irron Williams Northwestern University IrronWilliams2015@u.northwestern.edu Abstract--Data science is evolving. In

### Introduction to Data Mining

Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

### DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

### Fig. 1 A typical Knowledge Discovery process [2]

Volume 4, Issue 7, July 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Review on Clustering

### ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

### EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

### Architectural Services Data Summary March 2011

Firms Typically Small in Size According to the latest U.S. Census Survey of Business Owners, majority of the firms under the description Architectural Services are less than 500 in staff size (99.78%).

### Essential Components of an Integrated Data Mining Tool for the Oil & Gas Industry, With an Example Application in the DJ Basin.

Essential Components of an Integrated Data Mining Tool for the Oil & Gas Industry, With an Example Application in the DJ Basin. Petroleum & Natural Gas Engineering West Virginia University SPE Annual Technical

### Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,

### MS1b Statistical Data Mining

MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

### USING SELF-ORGANISING MAPS FOR ANOMALOUS BEHAVIOUR DETECTION IN A COMPUTER FORENSIC INVESTIGATION

USING SELF-ORGANISING MAPS FOR ANOMALOUS BEHAVIOUR DETECTION IN A COMPUTER FORENSIC INVESTIGATION B.K.L. Fei, J.H.P. Eloff, M.S. Olivier, H.M. Tillwick and H.S. Venter Information and Computer Security

### Computing & Telecommunications Services Monthly Report March 2015

March 215 Monthly Report Computing & Telecommunications Services Monthly Report March 215 CaTS Help Desk (937) 775-4827 1-888-775-4827 25 Library Annex helpdesk@wright.edu www.wright.edu/cats/ Last Modified

### Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

### Data Mining Applications in Higher Education

Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

### BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

### Cluster Analysis: Basic Concepts and Algorithms

Cluster Analsis: Basic Concepts and Algorithms What does it mean clustering? Applications Tpes of clustering K-means Intuition Algorithm Choosing initial centroids Bisecting K-means Post-processing Strengths

### How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

### The Data Mining Process

Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

### Computational Complexity between K-Means and K-Medoids Clustering Algorithms for Normal and Uniform Distributions of Data Points

Journal of Computer Science 6 (3): 363-368, 2010 ISSN 1549-3636 2010 Science Publications Computational Complexity between K-Means and K-Medoids Clustering Algorithms for Normal and Uniform Distributions

### Robust Outlier Detection Technique in Data Mining: A Univariate Approach

Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,

### Exploratory data analysis approaches unsupervised approaches. Steven Kiddle With thanks to Richard Dobson and Emanuele de Rinaldis

Exploratory data analysis approaches unsupervised approaches Steven Kiddle With thanks to Richard Dobson and Emanuele de Rinaldis Lecture overview Page 1 Ø Background Ø Revision Ø Other clustering methods

### Churn problem in retail banking Current methods in churn prediction models Fuzzy c-means clustering algorithm vs. classical k-means clustering

CHURN PREDICTION MODEL IN RETAIL BANKING USING FUZZY C- MEANS CLUSTERING Džulijana Popović Consumer Finance, Zagrebačka banka d.d. Bojana Dalbelo Bašić Faculty of Electrical Engineering and Computing University

### Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

### A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

### 15.564 Information Technology I. Business Intelligence

15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses

### not possible or was possible at a high cost for collecting the data.

Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

### A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication

2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management

### WhitePaper. The Business Value of Call Accounting Software How Call Accounting Helps Reduce Telecom Expenses and Improve Productivity

The Business Value of Call Accounting Software How Call Accounting Helps Reduce Telecom Expenses and Improve Productivity WhitePaper We innovate. You benefit. The Business Value of Call Accounting Software

### MACHINE LEARNING AN INTRODUCTION

AN INTRODUCTION JOSEFIN ROSÉN, SENIOR ANALYTICAL EXPERT, SAS INSTITUTE JOSEFIN.ROSEN@SAS.COM TWITTER: @ROSENJOSEFIN AGENDA What is machine learning? When, where and how is machine learning used? Exemple

### Chapter 7. Cluster Analysis

Chapter 7. Cluster Analysis. What is Cluster Analysis?. A Categorization of Major Clustering Methods. Partitioning Methods. Hierarchical Methods 5. Density-Based Methods 6. Grid-Based Methods 7. Model-Based

### DATA MINING - SELECTED TOPICS

DATA MINING - SELECTED TOPICS Peter Brezany Institute for Software Science University of Vienna E-mail : brezany@par.univie.ac.at 1 MINING SPATIAL DATABASES 2 Spatial Database Systems SDBSs offer spatial

### PERFORMANCE ANALYSIS OF CLUSTERING ALGORITHMS IN DATA MINING IN WEKA

PERFORMANCE ANALYSIS OF CLUSTERING ALGORITHMS IN DATA MINING IN WEKA Prakash Singh 1, Aarohi Surya 2 1 Department of Finance, IIM Lucknow, Lucknow, India 2 Department of Computer Science, LNMIIT, Jaipur,

### A Review of Data Mining Techniques

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

### Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

### Overview. Clustering. Clustering vs. Classification. Supervised vs. Unsupervised Learning. Connectionist and Statistical Language Processing

Overview Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes clustering vs. classification supervised vs. unsupervised

### BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

### Role of Neural network in data mining

Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)

### Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important

Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important Floyd Ray Martin, FSA, MAAA Thomas A. McInteer, FSA, MAAA Jonathan P. Polon, FSA Dental Insurance Fraud Detection

### Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

### Mining Educational Data to Improve Students Performance: A Case Study

Mining Educational Data to Improve Students Performance: A Case Study Mohammed M. Abu Tair, Alaa M. El-Halees Faculty of Information Technology Islamic University of Gaza Gaza, Palestine ABSTRACT Educational

### Clustering Anonymized Mobile Call Detail Records to Find Usage Groups

Clustering Anonymized Mobile Call Detail Records to Find Usage Groups Richard A. Becker, Ramón Cáceres, Karrie Hanson, Ji Meng Loh, Simon Urbanek, Alexander Varshavsky, and Chris Volinsky AT&T Labs - Research

### COURSE RECOMMENDER SYSTEM IN E-LEARNING

International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

### A Survey on Intrusion Detection System with Data Mining Techniques

A Survey on Intrusion Detection System with Data Mining Techniques Ms. Ruth D 1, Mrs. Lovelin Ponn Felciah M 2 1 M.Phil Scholar, Department of Computer Science, Bishop Heber College (Autonomous), Trichirappalli,

### Data Mining: A Preprocessing Engine

Journal of Computer Science 2 (9): 735-739, 2006 ISSN 1549-3636 2005 Science Publications Data Mining: A Preprocessing Engine Luai Al Shalabi, Zyad Shaaban and Basel Kasasbeh Applied Science University,

### Predictive Modeling Techniques in Insurance

Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics

### Mining the Software Change Repository of a Legacy Telephony System

Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,

### A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

### Clustering Genetic Algorithm

Clustering Genetic Algorithm Petra Kudová Department of Theoretical Computer Science Institute of Computer Science Academy of Sciences of the Czech Republic ETID 2007 Outline Introduction Clustering Genetic