Preparation and analysis of industrial data
|
|
- Marjory Moore
- 7 years ago
- Views:
Transcription
1 Preparation and Analysis of Industrial Data Swedish Institute of Computer Science
2 Outline Data Analysis of Industrial Data 1 Data Analysis of Industrial Data Analysing Industrial and Commercial Data The Data Analysis Process The Data Preparation and Merging Problem 2 Analysing Industrial Data Some Common Difficulties in Data Preparation Examples 3 A For Data Preparation An Overview of the Data Preparation and Transforms
3 Data Analysis and Data Mining Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging The purpose of data analysis and data mining is to extract answers and useful patterns, such as regularities and rules, in data. These patterns can then be exploited in making predictions, diagnoses, classifications, anomaly detection, hypothesis testing, process structure detection etc. Several terms describe essentially the same process: Data analysis, data mining, knowledge discovery in databases (KDD) etc. To some degree, data mining and knowledge discovery can be described as practical application of Machine Learning methods on (often large) data sets.
4 Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging Working Industrial and Commercial Applications (1) Virtual sensors, i.e. an indirect measurement computed based on values that are easier to access. Predictive maintenance and weak point analysis through e.g. maintenance and warranty databases. Incremental step-wise diagnosis of equipment such as car engines or process plants. Intelligent alarm filtering and prioritization of information to operators of complex systems.
5 Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging Working Industrial and Commercial Applications (2) Fraud and fault detection in e.g. data communication systems and ebusiness. Sales and demand prediction, e.g. in power grids or retail. Speed-up trough model approximation in control systems. Clustering and classification of customers, e.g. for targeted pricing and advertising, and identification of churners, i.e. customers likely to change provider.
6 The Data Analysis Process Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging Several attempts have been made to describe the complete analysis process, e.g. process models such as CRISP-DM and SEMMA, as well as several textbooks. The complete process usually includes 1 Problem understanding. 2 Data preparation and understanding. 3 Modelling. 4 Evaluation. 5 Model deployment. We will focus on the second step, data preparation and understanding.
7 Data Transformation and Modeling Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging Quite often, the exact choice of model / learning system is not critical. Once preprocessing has turned data into something reasonable, simple models may be sufficient. With limited amounts of independent data, the number of free parameters must be kept low.
8 Preparation and Transformation of Data Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging The process of collecting and preprocessing data as well as understanding the domain and the problem is a huge and the by far largest part of a data analysis project. Preparation of data for analysis is often very time consuming since Available data has been collected for various other purposes and is unlikely to fit the new needs perfectly. Data bases need to be merged, using different encodings and ways of identifying items. Efficient methodologies and tools are critical.
9 Data preparation Data Analysis of Industrial Data Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging Data preparation is the process of constructing well structured data sets for modelling. The data preparation process includes Acquisition of relevant data. Cleaning data from noise, errors and outliers. Resampling, transposition and other transformations. Dimensionality and volume reduction. Merging several data sets into one.
10 Descriptions of Data Preparation Analysing Industrial and Commercial Data The Data Analysis Process Data Preparation and Merging The guidelines given in current data mining process descriptions and textbooks are described on a too general level to be useful in practice. More detailed descriptions exist, but Do not provide a practical guide on what steps to perform and when. Do not provide more than hints on how to create an application that supports these steps. Focus on corporate and business related data rather than industrial data.
11 Analysis of Industrial Data Analysing Industrial Data Some Common Difficulties in Data Preparation Examples Generally a mixture of logs from sensors, operations, events and transactions. Several completely different data sources, modes and encodings are not uncommon. Sensor data will reflect that sensors drift, deteriorate, and are replaced and serviced. Anomalies often depend on earlier parts of the process. Still, data analysis is critical when Trying to increase production process efficiency. Explaining phenomena that are not completely understood.
12 Interpretation Data Analysis of Industrial Data Analysing Industrial Data Some Common Difficulties in Data Preparation Examples Correct typing of parameters is essential for effective modelling. Plain-text encoding can cause severe interpretation problems. Often several encodings for one type of parameter. Encodings may be surprisingly complex and unusable for direct modelling. Many choices on representation can only be made with a good understanding of the domain or with the help of a domain expert.
13 Noise, Errors and Anomalies Analysing Industrial Data Some Common Difficulties in Data Preparation Examples Industrial and commercial data usually contain a large amount of noise and errors. Some of these errors may be corrected using domain knowledge or other information sources. This can be very important in data bases with few effective samples. Remaining errors should usually be removed to ensure correct modelling. Noise can sometimes be reduced using redundancy in parameters. Noise can appear in the most surprising forms and for very unusual reasons.
14 Missing values Data Analysis of Industrial Data Analysing Industrial Data Some Common Difficulties in Data Preparation Examples Real databases more commonly than not contain missing or invalid values. These may be explicitly encoded, but often use a rather ad-hoc representation. Missing values are handled in the same way as invalid values: The record can be removed. The value can be derived from other data. The value can be set to a default value. Missing or partial data can sometimes be found via external sources.
15 Data Relevance Data Analysis of Industrial Data Analysing Industrial Data Some Common Difficulties in Data Preparation Examples Sensors drift and are replaced. Sales patterns show periodic behaviour. There may be sampling bias in the data set. A large amount of parameters is no guarantee for successful modelling. What we actually might need to measure may still not be included. Validation is often very complicated due to a large and complex state space. We may have very few effective samples.
16 Examples of difficult encodings / errors Analysing Industrial Data Some Common Difficulties in Data Preparation Examples Attribute 1 Attribute 2 Attribute 3 Attribute 4 Attribute Nej red Valsbyte red hasp temp 680 blue Hasp-temp green kyln blue Kylning rde karkylning på alltid Blue Ja red ej i dragläge red
17 The Data Preparation Process An Overview of the Data Preparation Generally include a number of steps, some of which have to be iterated several times. Data preparation is often revisited quite far into the analysis and modelling process. To handle this, the methodology includes a number of Distinct operations. Explicit iterations. Intended to provide a structured context to support the data preparation process.
18 An Overview of the Data Preparation Overview of the Data Preparation Figure: Overview
19 Preparatory Steps Data Analysis of Industrial Data An Overview of the Data Preparation Figure: Preparatory Steps
20 Type Analysis and Data Loading An Overview of the Data Preparation Analysis of type, range, and format of parameters is often surprisingly difficult. The analysis may be complicated by e.g. Damaged or missing data. Ad-hoc encodings. Should serve as input to the repair and recoding step.
21 Repair and Recoding An Overview of the Data Preparation Data generally contain deficiencies and anomalies that can be rectified. An entry for a particular parameter being out of range or missing may mean that a subset of all other parameters are useless. Automatic detection of possible errors and data cleaning are highly useful.
22 Selection Data Analysis of Industrial Data An Overview of the Data Preparation A refinement of choices made in the selection of data sources. After the type analysis and repair steps we are in a better position to asses quality and usefulness of data. Reducing the sizes of data sets might be considered.
23 Merging Data Analysis of Industrial Data An Overview of the Data Preparation Often the most complex step of data preparation. Common merge problems include Different sample rates. Transposition. Event / series representations. Differences and errors in textual representations can cause severe problems.
24 Final Steps Data Analysis of Industrial Data An Overview of the Data Preparation Often a good idea to revisit data selection at this point. Suitable point for data partitioning. Introduction of domain knowledge using calculated parameters.
25 An Overview of the Data Preparation The methodology outlined earlier specifies what needs to be done to the data. What remains to describe is how to perform the necessary preparations. We have identified a small set of generic operations, with a focus on data transformation, of which all data preparation tasks are special cases. These operations can be composed into a reproducible sequence of transforms, making it possible to move between preparation steps in a controlled manner. The set of operations is for all practical purposes complete, and contains overlap in the functionality of the operations.
26 Overview of An Overview of the Data Preparation Preparation step Operation Description Visualisation and inspection inspect Inspect data plot Plot / visualise data Type analysis and data loading guess Guess parameter types sample read Extract and load sample of data source Read complete data Repair and recoding replace Replace entries derive match Compute derived parameter Match parameters Selection select Select entries and parameters Merging merge Resample and merge two data sets Table: Overview of generic operations and transforms
27 An Overview of the Data Preparation Visualisation, Inspection and Other Supporting Operations Very important for data preparation, but outside the scope of this presentation. Not only for discovering structure and dependencies, but to get a more basic understanding of what the parameters represent and on what form. Examples of useful plots are scatter plots, parallel coordinates plots, histograms, and simple series plots. Anomaly detection and structure analysis are critical operations.
28 Type Analysis and Data Loading An Overview of the Data Preparation We need a mechanism to specify and semi-automatically derive, or guess, parameter types. A generic read operation should support Sufficiently general input formats. Ability to guess the type of each parameter and modify it from manual specification. A mechanism to extract only a sample of a data source. These mechanisms can be provided by parameterising a single read operation.
29 Repair, Recoding and Derivation An Overview of the Data Preparation There is frequently a need to transform data into another unit or representation. This operation, replace, should Use a single transformation function returning values in accordance with a single resultant type. Let the transformation function inspect any number of parameter values. There is also often a need to derive new parameters, and this operation should Return for each entry in the input data a derived value as a function of a fixed selection of parameters in the old data. Return data conforming to a predetermined type, but allow for handling of anomalies.
30 Selection of Entries and Parameters An Overview of the Data Preparation Data sources frequently contain redundant or irrelevant parameters, and entries that are for some reason not usable. A primitive operation select should allow us to do simultaneous selection of parameters and entries by specifying indexes or names and a boolean filter function.
31 Merging and Resampling An Overview of the Data Preparation Resampling can be treated as a special case of merging. It is possible to specify a very general merge operation, that for non-trivial cases will require programming of functional parameters. In time, a library of such parameters can be built. For the majority of merge tasks, we have made the assumption that A merge is always done between two data sources. The entries in the merged data source will always consist of data derivable from exactly one entry in the first data object and a number of matching entries in the second. The match should be efficiently computable by comparing a specified number of parameter pairs. For each entry in the first input data source, a specified number of derived entries will be computed from the matching entries in the second.
32 Merging Steps Data Analysis of Industrial Data An Overview of the Data Preparation The assumptions allow us to partition the computation of the merge into two steps: 1 Computation of a number of matching entries in the second input data source for each entry in the first data source. 2 Computation of each derived entry by a supplied merge function from the single entry in the first and the corresponding match of entries in the second. Several types of common merge functions have been identified, such as interpolation, gradients, and derivatives. In short, a primitive merge operation should take as input exactly two data sources, a matching specification and a merge function, and produce a single data source as output.
33 Data Analysis of Industrial Data Data preparation and transformation is critical for the quality of the data analysis. Data preparation and understanding is time consuming and is usually the largest part of a data analysis project. Although data preparation and merging are exploratory by nature, the process can benefit greatly from a structured approach.
Practical Machine Learning and Data Analysis
Data Analysis of Industrial Data Practical Machine Learning and Data Analysis Swedish Institute of Computer Science Outline Data Analysis of Industrial Data 1 Data Analysis of Industrial Data Analysing
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationIntroduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
More informationThe Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
More informationIntroduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
More informationIMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
More informationData mining as a tool of revealing the hidden connection of the plant
Data mining as a tool of revealing the hidden connection of the plant Honeywell AIDA Advanced Interactive Data Analysis Introduction What is AIDA? AIDA: Advanced Interactive Data Analysis Developped in
More informationData Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control
Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Andre BERGMANN Salzgitter Mannesmann Forschung GmbH; Duisburg, Germany Phone: +49 203 9993154, Fax: +49 203 9993234;
More informationUsing reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
More informationUnsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
More informationSPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
More informationnot possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
More informationLluis Belanche + Alfredo Vellido. Intelligent Data Analysis and Data Mining
Lluis Belanche + Alfredo Vellido Intelligent Data Analysis and Data Mining a.k.a. Data Mining II Office 319, Omega, BCN EET, office 107, TR 2, Terrassa avellido@lsi.upc.edu skype, gtalk: avellido Tels.:
More informationData Mining for Fun and Profit
Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools
More information480093 - TDS - Socio-Environmental Data Science
Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 480 - IS.UPC - University Research Institute for Sustainability Science and Technology 715 - EIO - Department of Statistics and
More informationExtend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationExample application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health
Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining
More informationASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL
International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR
More informationMachine Learning and Data Mining. Fundamentals, robotics, recognition
Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,
More informationHow To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
More informationHow Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK
How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK Agenda Analytics why now? The process around data and text mining Case Studies The Value of Information
More informationCustomer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
More informationREFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION
REFLECTIONS ON THE USE OF BIG DATA FOR STATISTICAL PRODUCTION Pilar Rey del Castillo May 2013 Introduction The exploitation of the vast amount of data originated from ICT tools and referring to a big variety
More informationData Exploration and Preprocessing. Data Mining and Text Mining (UIC 583 @ Politecnico di Milano)
Data Exploration and Preprocessing Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
More informationData Mining. SPSS Clementine 12.0. 1. Clementine Overview. Spring 2010 Instructor: Dr. Masoud Yaghini. Clementine
Data Mining SPSS 12.0 1. Overview Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Types of Models Interface Projects References Outline Introduction Introduction Three of the common data mining
More informationData Mining: Overview. What is Data Mining?
Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,
More informationData Mining Applications in Manufacturing
Data Mining Applications in Manufacturing Dr Jenny Harding Senior Lecturer Wolfson School of Mechanical & Manufacturing Engineering, Loughborough University Identification of Knowledge - Context Intelligent
More informationNetwork Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016
Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00
More informationMachine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
More informationInternational Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,
More informationBUSINESS ANALYTICS. Overview. Lecture 0. Information Systems and Machine Learning Lab. University of Hildesheim. Germany
Tomáš Horváth BUSINESS ANALYTICS Lecture 0 Overview Information Systems and Machine Learning Lab University of Hildesheim Germany BA and its relation to BI Business analytics is the continuous iterative
More informationPerspectives on Data Mining
Perspectives on Data Mining Niall Adams Department of Mathematics, Imperial College London n.adams@imperial.ac.uk April 2009 Objectives Give an introductory overview of data mining (DM) (or Knowledge Discovery
More informationAn Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
More informationWHAT DEVELOPERS ARE TALKING ABOUT?
WHAT DEVELOPERS ARE TALKING ABOUT? AN ANALYSIS OF STACK OVERFLOW DATA 1. Abstract We implemented a methodology to analyze the textual content of Stack Overflow discussions. We used latent Dirichlet allocation
More informationBusiness Intelligence and Decision Support Systems
Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley
More informationUse of Data Mining in the field of Library and Information Science : An Overview
512 Use of Data Mining in the field of Library and Information Science : An Overview Roopesh K Dwivedi R P Bajpai Abstract Data Mining refers to the extraction or Mining knowledge from large amount of
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationEnhanced data mining analysis in higher educational system using rough set theory
African Journal of Mathematics and Computer Science Research Vol. 2(9), pp. 184-188, October, 2009 Available online at http://www.academicjournals.org/ajmcsr ISSN 2006-9731 2009 Academic Journals Review
More informationData Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
More informationVisionet IT Modernization Empowering Change
Visionet IT Modernization A Visionet Systems White Paper September 2009 Visionet Systems Inc. 3 Cedar Brook Dr. Cranbury, NJ 08512 Tel: 609 360-0501 Table of Contents 1 Executive Summary... 4 2 Introduction...
More informationEFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
More informationSanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a
More informationData Mining in Telecommunication
Data Mining in Telecommunication Mohsin Nadaf & Vidya Kadam Department of IT, Trinity College of Engineering & Research, Pune, India E-mail : mohsinanadaf@gmail.com Abstract Telecommunication is one of
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationWelcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA
Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/
More informationGrid Density Clustering Algorithm
Grid Density Clustering Algorithm Amandeep Kaur Mann 1, Navneet Kaur 2, Scholar, M.Tech (CSE), RIMT, Mandi Gobindgarh, Punjab, India 1 Assistant Professor (CSE), RIMT, Mandi Gobindgarh, Punjab, India 2
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary
More informationDEALING WITH THE DATA An important assumption underlying statistical quality control is that their interpretation is based on normal distribution of t
APPLICATION OF UNIVARIATE AND MULTIVARIATE PROCESS CONTROL PROCEDURES IN INDUSTRY Mali Abdollahian * H. Abachi + and S. Nahavandi ++ * Department of Statistics and Operations Research RMIT University,
More informationDATA PREPARATION FOR DATA MINING
Applied Artificial Intelligence, 17:375 381, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 $12.00 +.00 DOI: 10.1080/08839510390219264 u DATA PREPARATION FOR DATA MINING SHICHAO ZHANG and CHENGQI
More informationUnderstanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
More informationMaschinelles Lernen mit MATLAB
Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical
More informationGerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I
Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy
More informationIntroduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
More informationStructural Health Monitoring Tools (SHMTools)
Structural Health Monitoring Tools (SHMTools) Getting Started LANL/UCSD Engineering Institute LA-CC-14-046 c Copyright 2014, Los Alamos National Security, LLC All rights reserved. May 30, 2014 Contents
More informationBig Data with Rough Set Using Map- Reduce
Big Data with Rough Set Using Map- Reduce Mr.G.Lenin 1, Mr. A. Raj Ganesh 2, Mr. S. Vanarasan 3 Assistant Professor, Department of CSE, Podhigai College of Engineering & Technology, Tirupattur, Tamilnadu,
More informationL25: Ensemble learning
L25: Ensemble learning Introduction Methods for constructing ensembles Combination strategies Stacked generalization Mixtures of experts Bagging Boosting CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna
More informationECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam
ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open
More informationDATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY
More informationHELP DESK SYSTEMS. Using CaseBased Reasoning
HELP DESK SYSTEMS Using CaseBased Reasoning Topics Covered Today What is Help-Desk? Components of HelpDesk Systems Types Of HelpDesk Systems Used Need for CBR in HelpDesk Systems GE Helpdesk using ReMind
More informationClustering Technique in Data Mining for Text Documents
Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor
More informationCOM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3
COMP 5318 Data Exploration and Analysis Chapter 3 What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping
More informationKnowledge Discovery and Data Mining. Structured vs. Non-Structured Data
Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.
More informationLearning is a very general term denoting the way in which agents:
What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);
More informationWhat is Data Mining? Data Mining (Knowledge discovery in database) Data mining: Basic steps. Mining tasks. Classification: YES, NO
What is Data Mining? Data Mining (Knowledge discovery in database) Data Mining: "The non trivial extraction of implicit, previously unknown, and potentially useful information from data" William J Frawley,
More informationResearch of Postal Data mining system based on big data
3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication
More information1.1 Difficulty in Fault Localization in Large-Scale Computing Systems
Chapter 1 Introduction System failures have been one of the biggest obstacles in operating today s largescale computing systems. Fault localization, i.e., identifying direct or indirect causes of failures,
More informationData Engineering for the Analysis of Semiconductor Manufacturing Data
Data Engineering for the Analysis of Semiconductor Manufacturing Data Peter Turney Knowledge Systems Laboratory Institute for Information Technology National Research Council Canada Ottawa, Ontario, Canada
More informationICT Perspectives on Big Data: Well Sorted Materials
ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in
More informationChapter ML:XI. XI. Cluster Analysis
Chapter ML:XI XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained Cluster
More informationData Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
More informationVisualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
More informationClass 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots?
Class 1 Data Mining Data Mining and Artificial Intelligence We are in the 21 st century So where are the robots? Data mining is the one really successful application of artificial intelligence technology.
More informationRESEARCH PAPERS FACULTY OF MATERIALS SCIENCE AND TECHNOLOGY IN TRNAVA SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA
RESEARCH PAPERS FACULTY OF MATERIALS SCIENCE AND TECHNOLOGY IN TRNAVA SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA 2013 Number 33 BUSINESS INTELLIGENCE IN PROCESS CONTROL Alena KOPČEKOVÁ, Michal KOPČEK,
More informationCAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING
CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING Mary-Elizabeth ( M-E ) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc. Is there valuable
More informationRequirements Analysis Concepts & Principles. Instructor: Dr. Jerry Gao
Requirements Analysis Concepts & Principles Instructor: Dr. Jerry Gao Requirements Analysis Concepts and Principles - Requirements Analysis - Communication Techniques - Initiating the Process - Facilitated
More informationInferential Statistics. Data Mining. ASC September Proving value in complex analytics. 2 Rivers. Information and Data Management
ASC September Proving value in complex analytics 26 th September 2014 John McConnell Information and Data Management 1 1 2 Rivers Research Operational/Transactional Inferential Statistics Inferring parameter
More informationData Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
More informationHealthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
More informationCase studies: Outline. Requirement Engineering. Case Study: Automated Banking System. UML and Case Studies ITNP090 - Object Oriented Software Design
I. Automated Banking System Case studies: Outline Requirements Engineering: OO and incremental software development 1. case study: withdraw money a. use cases b. identifying class/object (class diagram)
More informationChapter 3: Cluster Analysis
Chapter 3: Cluster Analysis 3.1 Basic Concepts of Clustering 3.2 Partitioning Methods 3.3 Hierarchical Methods 3.4 Density-Based Methods 3.5 Model-Based Methods 3.6 Clustering High-Dimensional Data 3.7
More informationData Preprocessing. Week 2
Data Preprocessing Week 2 Topics Data Types Data Repositories Data Preprocessing Present homework assignment #1 Team Homework Assignment #2 Read pp. 227 240, pp. 250 250, and pp. 259 263 the text book.
More informationYour logbook. Choosing a topic
This booklet contains information that will be used to complete a science fair project for the César Chávez science fair. It is designed to help participants to successfully complete a project. This booklet
More informationrelevant to the management dilemma or management question.
CHAPTER 5: Clarifying the Research Question through Secondary Data and Exploration (Handout) A SEARCH STRATEGY FOR EXPLORATION Exploration is particularly useful when researchers lack a clear idea of the
More informationIAI : Expert Systems
IAI : Expert Systems John A. Bullinaria, 2005 1. What is an Expert System? 2. The Architecture of Expert Systems 3. Knowledge Acquisition 4. Representing the Knowledge 5. The Inference Engine 6. The Rete-Algorithm
More informationData Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over
More informationDATA ANALYSIS USING BUSINESS INTELLIGENCE TOOL. A Thesis. Presented to the. Faculty of. San Diego State University. In Partial Fulfillment
DATA ANALYSIS USING BUSINESS INTELLIGENCE TOOL A Thesis Presented to the Faculty of San Diego State University In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science
More informationINTRODUCTION TO DATA MINING SAS ENTERPRISE MINER
INTRODUCTION TO DATA MINING SAS ENTERPRISE MINER Mary-Elizabeth ( M-E ) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc. AGENDA Overview/Introduction to Data Mining
More informationData Mining Analysis of a Complex Multistage Polymer Process
Data Mining Analysis of a Complex Multistage Polymer Process Rolf Burghaus, Daniel Leineweber, Jörg Lippert 1 Problem Statement Especially in the highly competitive commodities market, the chemical process
More informationSocio technical Systems. Ian Sommerville 2006 Software Engineering, 8th edition. Chapter 2 Slide 1
Socio technical Systems Ian Sommerville 2006 Software Engineering, 8th edition. Chapter 2 Slide 1 Objectives To explain what a socio technical system is and the distinction between this and a computer
More informationData Warehousing and Data Mining in Business Applications
133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business
More informationPREDICTIVE TECHNIQUES IN SOFTWARE ENGINEERING : APPLICATION IN SOFTWARE TESTING
PREDICTIVE TECHNIQUES IN SOFTWARE ENGINEERING : APPLICATION IN SOFTWARE TESTING Jelber Sayyad Shirabad Lionel C. Briand, Yvan Labiche, Zaheer Bawar Presented By : Faezeh R.Sadeghi Overview Introduction
More informationA Framework for Intelligent Online Customer Service System
A Framework for Intelligent Online Customer Service System Yiping WANG Yongjin ZHANG School of Business Administration, Xi an University of Technology Abstract: In a traditional customer service support
More informationApplication of Data Mining Techniques in Intrusion Detection
Application of Data Mining Techniques in Intrusion Detection LI Min An Yang Institute of Technology leiminxuan@sohu.com Abstract: The article introduced the importance of intrusion detection, as well as
More informationData Project Extract Big Data Analytics course. Toulouse Business School London 2015
Data Project Extract Big Data Analytics course Toulouse Business School London 2015 How do you analyse data? Project are often a flop: Need a problem, a business problem to solve. Start with a small well-defined
More information2.1. Data Mining for Biomedical and DNA data analysis
Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: simmibagga12@gmail.com) Dr. G.N. Singh Department of Physics and
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More information