Defect Analytics in a High-End Server Manufacturing Environment


Proceedings of the 2015 Industrial and Systems Engineering Research Conference
S. Cetinkaya and J. Ryan, eds.

Faisal Aqlan
Industrial Engineering Department
The Pennsylvania State University, The Behrend College
Erie, PA

Chanchal Saha
Department of Systems Science and Industrial Engineering
State University of New York at Binghamton
Binghamton, NY

Sreekanth Ramakrishnan
IBM Corporation
1 Rogers St, Cambridge, MA

Abstract

Server manufacturing is characterized by extensive test processes that ensure high quality and reliability of the servers. Server components are obtained from different suppliers who may have different specifications. Although outsourcing of components provides many potential benefits to the company, it can also cause quality issues. If quality issues are not addressed effectively at the initial stages, defects can propagate through the supply chain. Thus, quality control is one of the major challenges for the high-end server manufacturing industry. Defective parts are disposed of, repaired, or returned to the supplier, depending on the type of defect. Product quality is ensured through multiple test processes at the manufacturing and design stages, which are substantially expensive. The defect-related quality test results are stored in different databases in both structured and unstructured formats. In this study, defect analytics models are used to assess more than 5,000 defect instances collected from different database sources of a high-end server manufacturing environment. Analytics models including cluster analysis, neural networks, and text mining are used to characterize the defects and to predict their root causes and solutions. The proposed defect analytics framework replaced the current manual defect analysis method, which is based on trial and error.
Keywords

Defect analytics, defect characterization, cluster analysis, artificial neural network, text mining, server manufacturing

1. Introduction

Data analytics has emerged as one of the main research areas in the last few years, and companies see it as a major opportunity to improve their performance. In integrated manufacturing environments such as high-end server manufacturing, parts and components are supplied by different suppliers who may have different specifications. Extensive test processes are required to ensure high quality and to treat any defect at the earliest stages. Manual and automated systems have been developed to detect and resolve defects in such environments. However, even with automated defect detection systems, defects can still arise whose root causes and solutions are not known. In many manufacturing environments, defect resolution tends to be based on trial and error, which consumes time and effort in troubleshooting the root causes and identifying proper solutions.

The typical characteristics of server manufacturing include aggressive new product introduction cycles, continuous quality improvements, extremely skewed demand patterns, high penalty costs from end-product order fulfillment, lower forecasting accuracies due to the nature of the production process and long lead times, thin profit margins, and a continuously increasing number of parts and features [1-3]. As a result of these characteristics, extensive test processes that ensure high quality and reliability of the servers are critical in this environment.

Quality issues are major disruptions of operations in high-end server manufacturing. Defective parts may be disposed of, repaired, or returned to the supplier, depending on the issue. Removal of defective parts is necessary to protect the company's product, image and reputation, and customer satisfaction. Product quality is ensured through test processes, manufacturing, and design. Since server components are very expensive and must be of high quality, they are tested multiple times at both the suppliers' and the manufacturers' sites. Figure 1 shows the material flow and test processes for the server manufacturing environment. Major quality risk events can disrupt the smooth flow of products and operations in the supply chain. Quality management is responsible for stopping the flow of defective materials to the customers.

Figure 1: Material flow and test processes for server manufacturing environment

Defect management in manufacturing environments requires effective identification of the defects, finding the proper solutions for these defects, and providing the resources and tools required to repair them. Predicting and preventing defects or quality issues before they occur is the focus of quality risk management. Several tools are used for analyzing defects, such as Risk Ranking and Filtering (RRF), Failure Mode and Effect Analysis (FMEA), Hazard and Operability Analysis (HAZOP), and Fault Tree Analysis (FTA). Furthermore, automated systems have been proposed to identify defects and retrieve related solutions from a database. However, these systems do not consider the skills and resources required to solve the problem.

The remainder of this paper is organized as follows: Section 2 discusses the literature related to data analytics methods for defect management. Section 3 presents the proposed framework for defect analytics. Section 4 discusses the case study for defect analytics in a high-end server manufacturing environment.
Finally, conclusions and recommendations are discussed in Section 5.

2. Literature Review

The challenges that companies face in quality risk management arise mainly from the lack of early defect detection mechanisms. The problem is exacerbated by the time-consuming yet error-prone solution retrieval mechanisms that are currently available. Thus, accurate defect prediction and prompt retrieval of defect resolutions are important for the quality risk management system of a company. Many predictive models for defect prediction can be found in the literature, including discriminant analysis, statistical methods, logistic regression, factor analysis, fuzzy classification, classification trees, Bayesian networks, Artificial Neural Networks (ANNs), and support vector machines. It has been claimed that ANNs are more effective in prediction than statistical tools and expert systems [4]. However, a predictive model should be chosen based on the complexity of the problem, i.e., the types of inputs and outputs (data structure), their relationships, data availability, the nature of the problem, and the expected outcomes. This section reviews the literature on predictive models for defect detection using structured data and their application areas, more specifically the application of ANN-based models, and on the scope of defect prediction and resolution through text mining of unstructured data.

An intelligent defect analysis framework was proposed that automatically gathers manufacturing process data from all the related databases to determine the root cause of a process excursion. The proposed model combined both spatial and temporal data and analyzed them using artificial intelligence methods. The real-time output was presented through a multi-dimensional cubic structure. Although the framework outlined an intelligent defect analysis method, the author did not measure its effectiveness by implementing the model in a real environment [5].
Thus, this study can be extended by conducting performance measurements for the proposed framework, i.e., user surveys, accuracy and reliability analysis, and time-saving experiments.

An ANN-based classification method was proposed to classify software into defect-prone and non-defect-prone classes. This early defect detection approach compared three algorithms to capture the misclassifications of non-defect-prone software considering time and cost metrics. The threshold-moving algorithm was claimed to be the most cost-sensitive for software defect prediction [6]. A data mining approach was proposed to identify the attributes responsible for defective software modules. The extracted knowledge was applied to defect prediction using a data mining model that is a weighted voting rule of four data mining algorithms, namely Naïve Bayes, ANN, Association Rules, and Decision Tree [7]. Generalizing the proposed model to detect defects in manufacturing processes is a potential future direction.

A case-based reasoning system was proposed to predict defects in Printed Circuit Board (PCB) design. In the case-based method, a case database stores all the past defect cases along with their design specifications, defect items, and corresponding costs. The past cases were clustered and ranked using a vantage-based case indexing mechanism to speed up case retrieval for a new case similar to past cases. Finally, a reasoning algorithm proposed the defect costs for the defective items [8]. In the future, a factorial analysis of the design parameters could be conducted to determine the values of the threshold parameters of the reasoning algorithm. Another study proposed a Naïve Bayes classifier-based statistical method for defect prediction. The authors recommended paying more attention to calibrating the defect prediction model for the particular problem rather than searching for complex algorithms [9].

An ANN is an effective tool for prediction because it can analyze the behavior of a system given a certain amount of training data and correlate it with other system parameters. Accurate predictions are important for a Supply Chain Network (SCN), as an incorrect prediction affects not only a single stage of a company's Supply Chain (SC) but also the entire SC of that company as well as its stakeholders' companies.
As stated earlier, ANNs have been claimed to be more effective in prediction than statistical tools and expert systems [4]. From the users' perspective, prediction is probably the most discussed application in the ANN domain. ANNs are increasingly used for short- and long-term demand forecasting and for automatic defect prediction in electric loads, energy consumption, pattern recognition, and stock markets [4, 10]. Reducing the total cost in the SC has become a crucial issue. ANNs can help reduce or eliminate defects that affect the production or supply network of an SC by supporting better forecasting models. ANNs are used in the SC for optimization (logistics management, resource allocation, and scheduling), modeling and simulation (discrete event simulation, dynamic systems theory), defect prediction, globalization (interactions among different activities at different locations), decision support (data query, analysis, and management), and forecasting (any state from one echelon propagates to others in an SC) [10, 11]. However, ANN-based Artificial Intelligence (AI) models are effective mainly for analyzing structured data. For analyzing unstructured data, attention can be extended to Natural Language Processing (NLP).

Many sources, including social media, mobile transactions, business networks, and scientific experiments, as well as operational domains such as healthcare, bioinformatics, finance, and manufacturing, now generate a remarkable amount of data, and the amount is increasing rapidly. In response, studies on collecting, storing, cleaning, analyzing, and presenting new, meaningful, and real-time insights from these data have grown tremendously. The analytics associated with big data not only complements traditional statistics, surveys, archival data sources, and hypothesis testing but also aims to explore novel patterns or predict future trends from the data [12, 13].
In the research paradigm of big data analytics, one application area of growing interest is text analytics, which can be used for opinion mining and sentiment analysis [14]. In general, sentiment analysis and opinion mining refer to the same techniques, which are derived from and based upon NLP, Information Retrieval (IR), Information Extraction (IE), and AI. Typical tasks of sentiment analysis include: (1) finding data relevant to a specific topic or purpose; (2) pre-processing collected data, e.g., summarizing data into single words and extracting relevant information from them; and (3) identifying the sentiment surrounding a product or service [15]. Sentiment analysis technologies, a special type of text mining, can be applied to extract opinions and sentiments from unstructured human-authored documents [16]. Thus, NLP can be a useful tool for many business intelligence tasks, including reputation management, public relations, defect prediction and resolution, tracking public viewpoints, and market trend prediction.

In NLP, sentiment analysis takes on the challenge of classifying the orientation of texts as either positive or negative to help machines understand texts as humans do. The texts are analyzed at different levels: word or phrase, sentence, document, or user. Word-level sentiment analysis explores the orientation of the words or phrases in the text as well as their effect on the overall sentiment, while sentence-level analysis treats each sentence as expressing a single opinion and tries to determine its orientation. Document-level opinion mining looks at the overall sentiment of the whole document, and user-level sentiment analysis considers the possibility that connected users on a social network hold the same opinion [17]. Three approaches are applied in sentiment analysis to classify texts: machine learning, lexicon-based, and linguistic analysis.
Machine learning methods train an algorithm, mostly a classifier, on a set of selected features for a specific task and then test on another set whether it detects the right features and gives the right classification. Naïve Bayes, maximum entropy, and SVM are used as sentiment classifiers in this method. A lexicon-based method depends on a predefined list or corpus of words with a certain polarity. An algorithm then searches for those words, counts them or estimates their weight, and measures the overall polarity of the text. Lastly, the linguistic approach uses the syntactic characteristics of the words or phrases, the negation, and the structure of the text to determine the text orientation. This approach is usually combined with a lexicon-based method [17, 18].

A study was conducted to find the relationship between public sentiment and stock market prices using Twitter streams. The authors proposed an active learning approach that uses a Support Vector Machine (SVM) classifier to query the news feed of the Twitter streams for sentiment analysis [19]. Their proposed model was able to predict stock market price movements a few days in advance. Another study also applied SVM to classify topics for sentiment analysis [18]. The authors claimed that pre-processing the texts can improve the accuracy of the SVM results.

Many studies on defect prediction and resolution have applied structured data in risk management. However, few studies consider both structured and unstructured data formats for model development. To the best of the authors' knowledge, no previous study has applied both data formats to defect prediction and resolution. Therefore, this study proposes a defect analytics framework for defect prediction and resolution that considers both structured and unstructured data.

3. Proposed Framework for Defect Analytics

The proposed framework for defect analytics utilizes both structured and unstructured data for defect characterization and assessment. Figure 2 shows the proposed framework, in which analytics models are used to predict and resolve the defects.
For the unstructured data, the individual defect files are kept together to form the corpus, which is a collection of documents. Text analytics models are then used to characterize the defects. Predictive analytics models are also used to characterize the defects based on the structured data. The outputs of both the text analytics and the predictive analytics models are then used to predict the defect root cause and potential solutions.

Figure 2: Proposed defect analytics framework

3.1 Unstructured Data Analytics

Unstructured data analytics is used to characterize and classify the defects. The proposed framework for unstructured defect data analytics is shown in Figure 3. The unstructured data framework consists of the following steps:

1. Document collection: collects the documents that contain the unstructured data on defects.
2. Text analysis and concept extraction: analyzes the text using NLP.
3. Text link analysis: identifies relationships between the concepts using pattern matching.
4. Building defect categories: relies on the concepts extracted by the text link analysis. In this step, a clustering method groups the defects into categories based on the similarities in the extracted concepts.
5. Defect characterization: characterizes the defects based on the concepts in each category.

Figure 3: Unstructured data analytics for defect assessment

3.2 Structured Data Analytics

Structured data are used to predict defect root causes and potential solutions using the ANN. Structured data analytics consists of two main steps: 1) predicting the root cause of the defect and 2) predicting the potential solutions of the defect. For predicting the root causes, the main defect attributes used as inputs include the defect type, product characteristics, production environment variables, and the defect categories obtained by the text analytics model. For predicting the potential solutions, the inputs are the resource attributes and the predicted root causes. The proposed ANN structure for defect root cause and solution prediction is shown in Figure 4. Examples of defect attributes used as inputs for root cause prediction include part type, part size or capacity, and part supplier. Examples of production environment variables include production stage and time-to-failure. Examples of resource availability attributes used as inputs for solution prediction include available spare parts for repair and cost of disposal.

Figure 4: ANN based approach for predicting defect root cause and solution

4. Case Study: Defect Analytics in High-End Server Manufacturing Environment

The server manufacturing environment is relatively complex and is prone to many quality problems that can be caused by external suppliers and internal processes. Since the environment requires extremely high reliability and quality assurance, its test processes are expected to be very accurate and free of downtime. Figure 5 shows a high-level overview of the main stages of the high-end server production process considered in this study. In this production process, there are three test stages: panel test, assembly or fabrication test, and fulfillment test. Structured and unstructured data for 5,000 defect data points from the three test stages were collected from different databases.

Figure 5: Process flow of high-end server manufacturing

The part considered in this study is the memory card, also known as the Dual In-line Memory Module (DIMM). The process flow of the DIMM inspection, test, and assembly is shown in Figure 6. Non-value-added processes are highlighted with a red frame, while value-added processes are highlighted with a green frame. The figure shows the assembly and test processes that are performed on the DIMMs and the movements of the DIMMs between the inventory and production area locations.

The defect analytics framework is implemented using IBM SPSS Modeler software. The analytics models for defect root cause and solution prediction are shown in Figure 7. The unstructured data were characterized using text mining analytics. The text mining model uses linguistic and frequency techniques from NLP to extract the key concepts from the unstructured data and to categorize the data according to its concepts and patterns. The text mining model extracted 479 concepts from the unstructured data. Based on careful observation of the usage percentage and technical importance of the extracted concepts, 179 key concepts were selected for cluster analysis. The concept-wise categorized unstructured data were clustered using the two-step clustering method. The two-step clustering algorithm was used because it handles mixed data types and larger data sets efficiently. In addition, it has the advantage of automatically deciding the optimal number of clusters.
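The selection of key concepts by usage percentage can be illustrated with a minimal sketch. It is not the SPSS Modeler procedure: the defect notes, the stopword list, the threshold, and the function names are all hypothetical, plain words stand in for linguistically extracted concepts, and the manual judgment of technical importance described above is not modeled.

```python
from collections import Counter

# Hypothetical defect notes; the real corpus is not reproduced in the paper.
defect_notes = {
    "D001": "dimm failed memory test at panel test, reseat dimm",
    "D002": "connector pin bent, replace dimm",
    "D003": "memory test timeout, reseat dimm and retest",
    "D004": "supplier label missing on dimm",
}

# Hypothetical stopword list used only for this illustration.
STOPWORDS = {"at", "and", "on", "the", "a"}


def extract_concepts(text):
    """Return the set of candidate concepts in a note (here: lowercase words)."""
    words = (w.strip(",.").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def select_key_concepts(notes, min_usage):
    """Keep concepts that appear in at least `min_usage` (a fraction) of the notes."""
    doc_counts = Counter()
    for text in notes.values():
        # Each note contributes at most once per concept (document frequency).
        doc_counts.update(extract_concepts(text))
    n_docs = len(notes)
    return {c: n / n_docs for c, n in doc_counts.items() if n / n_docs >= min_usage}


key_concepts = select_key_concepts(defect_notes, min_usage=0.5)
print(sorted(key_concepts))
```

With this toy corpus, only concepts that occur in at least half of the notes survive the filter; in the case study the analogous step reduced 479 extracted concepts to 179.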
The clustering algorithm grouped the 179 key concepts into 15 clusters, the number that gave the best cluster quality as measured by the Silhouette index. The obtained Silhouette index of 0.7 indicates good clustering quality.

Root cause and solution prediction models are developed using ANN models. The output of the two-step clustering algorithm (the 15 clusters obtained from the concept extraction of the unstructured data) was combined with the structured data using the Defect IDs. The combined data were used as inputs to train and test the ANN models for root cause and solution prediction. Figure 8 shows that the accuracy rates of the ANN models for root cause and solution prediction are 86% and 74.4%, respectively. In the absence of unstructured data, the accuracy rates for root cause and solution prediction are 75.7% and 50%, respectively. Therefore, including unstructured data in the defect assessment increased the root cause and solution prediction accuracies by about 14% and 49% in relative terms, respectively.
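The reported improvements can be checked with a short calculation. The sketch below assumes the 14% and 49% figures are relative gains over the accuracy obtained without unstructured data, which is the reading that matches the four reported accuracy values; the function name is illustrative.

```python
def relative_gain(with_text, without_text):
    """Relative improvement, in percent, of an accuracy over its baseline."""
    return (with_text - without_text) / without_text * 100.0


# Accuracy values (in percent) as reported in the case study.
root_cause_gain = relative_gain(86.0, 75.7)
solution_gain = relative_gain(74.4, 50.0)

print(f"root cause: {root_cause_gain:.1f}%")  # 13.6%
print(f"solution:   {solution_gain:.1f}%")    # 48.8%
```

In absolute terms the same results correspond to gains of 10.3 and 24.4 percentage points.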

Figure 6: Process flow for DIMMs

Figure 7: Analytics model for defect assessment
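The Silhouette index used in the case study to judge cluster quality can be sketched as follows. This is the textbook mean silhouette coefficient with Euclidean distance, written for illustration; the paper does not describe how SPSS Modeler computes its value, and the points and labels below are made up.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette coefficient over all points (needs at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so dividing by len - 1 is correct)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated toy groups: a sensible labeling scores close to 1,
# while a labeling that mixes the groups scores below 0.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_index(points, [0, 0, 1, 1])
bad = silhouette_index(points, [0, 1, 0, 1])
print(round(good, 2), bad < 0)
```

Values near 1 indicate compact, well-separated clusters and values near or below 0 indicate overlapping or misassigned ones, which is why a value of 0.7 is read as good quality.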

Figure 8: Accuracy of the ANN models for the root cause and solution predictions

5. Conclusions and Future Work

In this study, an analytics-based framework is proposed for defect assessment in a high-end server manufacturing environment. Both structured and unstructured data were utilized to build prediction and assessment models for the defects. Identifying the causes of defects and proposing solutions using the proposed framework was found to be effective for early detection of production-related faults in the server manufacturing environment. The accuracy levels of the analytics models used in this framework are 86% for root cause prediction and 74.4% for solution prediction. The proposed defect analytics framework replaces the current manual defect analysis method, which is based on trial and error. It predicts defect characteristics and root causes from historical data and could be incorporated into the decision support system of the server manufacturing environment.

There are several avenues that future research could follow to overcome the limitations of the proposed models. The accuracy of the models could be increased by tuning the model parameters through Design of Experiments. Furthermore, a larger data set, as well as data from other defect-prone sectors, can be analyzed by adjusting the model parameters of the proposed framework.

References

1. Ramakrishnan, S., Tsai, P.-F., Srihari, K., and Foltz, C., 2008, Using Design of Experiments and Simulation Modeling to Study the Facility Layout for a Server Assembly Process, Proc. of the 2008 Industrial Engineering Research Conference, May 17-21, Vancouver, BC.
2. Cao, H., Xi, H., and Smith, S.F., 2003, A Reinforcement Learning Approach to Production Planning in the Fabrication/Fulfillment Manufacturing Process, Proc. of the 35th Winter Simulation Conference, December 7-10, New Orleans, LA.
3. Lendermann, P., 2006, About the Need for Distributed Simulation Technology for the Resolution of Real-World Manufacturing and Logistics Problems, Proc. of the 2006 Winter Simulation Conference, December 3-6, Monterey, CA.
4. Efendigil, T., Önüt, S., and Kahraman, C., 2009, A Decision Support System for Demand Forecasting with Artificial Neural Networks and Neuro-fuzzy Models: A Comparative Analysis, Expert Systems with Applications, 36(1).
5. Siglaz, 2011, Intelligent Defect Analysis, Framework for Integrated Data Management, Available at [...]. Accessed December 26, [...].
6. Zheng, J., 2010, Cost-sensitive Boosting Neural Networks for Software Defect Prediction, Expert Systems with Applications, 37(6).
7. Yousef, A.H., 2014, Extracting Software Static Defect Models Using Data Mining, Ain Shams Engineering Journal, 6(1).
8. Tsai, C.-Y., Chiu, C.-C., and Chen, J.-S., 2005, A Case-based Reasoning System for PCB Defect Prediction, Expert Systems with Applications, 28(4).
9. Tosun, A., Bener, A., Turhan, B., and Menzies, T., 2010, Practical Considerations in Deploying Statistical Methods for Defect Prediction: A Case Study within the Turkish Telecommunications Industry, Information and Software Technology, 52(11).
10. Mirapeix, J., García-Allende, P.B., Cobo, A., Conde, O.M., and López-Higuera, J.M., 2007, Real-Time Arc-Welding Defect Detection and Classification with Principal Component Analysis and Artificial Neural Networks, NDT & E International, 40(4).
11. Leung, H. C., 1995, Neural Networks in Supply Chain Management, Proc. of IEEE Annual International Engineering Management Conference, June 28-30.
12. George, G., Haas, M.R., and Pentland, A., 2014, Big Data and Management, Academy of Management Journal, 57(2).
13. Aiden, E., and Michel, J.-B., Dec [...], The Predictive Power of Big Data, Newsweek, Available at [...]. Accessed November 15, [...].
14. Pang, B. and Lee, L., 2008, Opinion Mining and Sentiment Analysis, Foundations and Trends in Information Retrieval, 2(1-2).
15. Schmunk, S., Höpken, W., Fuchs, M., and Lexhagen, M., 2013, Sentiment Analysis: Extracting Decision-relevant Knowledge from UGC. In: Xiang, Z., Tussyadiah, I. (Eds.), Information and Communication Technologies in Tourism, Springer International Publishing, New York, NY.
16. Choudhary, A.K., Oluikpe, P.I., Harding, J.A., and Carrillo, P.M., 2009, The Needs and Benefits of Text Mining Applications on Post-Project Reviews, Computers in Industry, 60(9).
17. Tan, C., Lee, L., Tang, J., Jiang, L., Zhou, M., and Li, P., 2011, User-level Sentiment Analysis Incorporating Social Networks, Proc. of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY.
18. Haddi, E., Liu, X., and Shi, Y., 2013, The Role of Text Pre-processing in Sentiment Analysis, Information and Quantitative Management, 17(1).
19. Smailović, J., Grčar, M., Lavrač, N., and Žnidaršič, M., 2014, Stream-based Active Learning for Sentiment Analysis in the Financial Domain, Information Sciences, 285(1).


More information

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis

Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,

More information

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Cleaned Data. Recommendations

Cleaned Data. Recommendations Call Center Data Analysis Megaputer Case Study in Text Mining Merete Hvalshagen www.megaputer.com Megaputer Intelligence, Inc. 120 West Seventh Street, Suite 10 Bloomington, IN 47404, USA +1 812-0-0110

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends Spring 2015 Thomas Hill, Ph.D. VP Analytic Solutions Dell Statistica Overview and Agenda Dell Software overview Dell in

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES International Journal of Scientific and Research Publications, Volume 4, Issue 4, April 2014 1 CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES DR. M.BALASUBRAMANIAN *, M.SELVARANI

More information

Sentiment Analysis on Big Data

Sentiment Analysis on Big Data SPAN White Paper!? Sentiment Analysis on Big Data Machine Learning Approach Several sources on the web provide deep insight about people s opinions on the products and services of various companies. Social

More information

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

Application of Business Intelligence in Transportation for a Transportation Service Provider

Application of Business Intelligence in Transportation for a Transportation Service Provider Application of Business Intelligence in Transportation for a Transportation Service Provider Mohamed Sheriff Business Analyst Satyam Computer Services Ltd Email: [email protected], [email protected]

More information

Towards applying Data Mining Techniques for Talent Mangement

Towards applying Data Mining Techniques for Talent Mangement 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,

More information

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Please note the following IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

Maximizing Return and Minimizing Cost with the Decision Management Systems

Maximizing Return and Minimizing Cost with the Decision Management Systems KDD 2012: Beijing 18 th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Rich Holada, Vice President, IBM SPSS Predictive Analytics Maximizing Return and Minimizing Cost with the Decision Management

More information

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH 205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Natural Language to Relational Query by Using Parsing Compiler

Natural Language to Relational Query by Using Parsing Compiler Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.

Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Pranali Chilekar 1, Swati Ubale 2, Pragati Sonkambale 3, Reema Panarkar 4, Gopal Upadhye 5 1 2 3 4 5

More information

A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks

A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks Text Analytics World, Boston, 2013 Lars Hard, CTO Agenda Difficult text analytics tasks Feature extraction Bio-inspired

More information

Analyzing Customer Churn in the Software as a Service (SaaS) Industry

Analyzing Customer Churn in the Software as a Service (SaaS) Industry Analyzing Customer Churn in the Software as a Service (SaaS) Industry Ben Frank, Radford University Jeff Pittges, Radford University Abstract Predicting customer churn is a classic data mining problem.

More information

Forecasting stock markets with Twitter

Forecasting stock markets with Twitter Forecasting stock markets with Twitter Argimiro Arratia [email protected] Joint work with Marta Arias and Ramón Xuriguera To appear in: ACM Transactions on Intelligent Systems and Technology, 2013,

More information

Data Mining System, Functionalities and Applications: A Radical Review

Data Mining System, Functionalities and Applications: A Radical Review Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo [email protected],[email protected]

More information

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

More information

Equity forecast: Predicting long term stock price movement using machine learning

Equity forecast: Predicting long term stock price movement using machine learning Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK [email protected] Abstract Long

More information

IT services for analyses of various data samples

IT services for analyses of various data samples IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical

More information

Course Syllabus For Operations Management. Management Information Systems

Course Syllabus For Operations Management. Management Information Systems For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third

More information

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of

More information

Online Content Optimization Using Hadoop. Jyoti Ahuja Dec 20 2011

Online Content Optimization Using Hadoop. Jyoti Ahuja Dec 20 2011 Online Content Optimization Using Hadoop Jyoti Ahuja Dec 20 2011 What do we do? Deliver right CONTENT to the right USER at the right TIME o Effectively and pro-actively learn from user interactions with

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Neural Networks for Sentiment Detection in Financial Text

Neural Networks for Sentiment Detection in Financial Text Neural Networks for Sentiment Detection in Financial Text Caslav Bozic* and Detlef Seese* With a rise of algorithmic trading volume in recent years, the need for automatic analysis of financial news emerged.

More information

Social Media Implementations

Social Media Implementations SEM Experience Analytics Social Media Implementations SEM Experience Analytics delivers real sentiment, meaning and trends within social media for many of the world s leading consumer brand companies.

More information

Big Data. Fast Forward. Putting data to productive use

Big Data. Fast Forward. Putting data to productive use Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize

More information

Role of Social Networking in Marketing using Data Mining

Role of Social Networking in Marketing using Data Mining Role of Social Networking in Marketing using Data Mining Mrs. Saroj Junghare Astt. Professor, Department of Computer Science and Application St. Aloysius College, Jabalpur, Madhya Pradesh, India Abstract:

More information

Prerequisites. Course Outline

Prerequisites. Course Outline MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications 133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

More information

Using News Articles to Predict Stock Price Movements

Using News Articles to Predict Stock Price Movements Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 [email protected] 21, June 15,

More information

MS1b Statistical Data Mining

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

More information

Document Image Retrieval using Signatures as Queries

Document Image Retrieval using Signatures as Queries Document Image Retrieval using Signatures as Queries Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Harish Srinivasan, Chen Huang CEDAR, University at Buffalo(SUNY) Amherst, New York 14228 Gady Agam and

More information

The Big Data Paradigm Shift. Insight Through Automation

The Big Data Paradigm Shift. Insight Through Automation The Big Data Paradigm Shift Insight Through Automation Agenda The Problem Emcien s Solution: Algorithms solve data related business problems How Does the Technology Work? Case Studies 2013 Emcien, Inc.

More information

A Big Data Analytical Framework For Portfolio Optimization Abstract. Keywords. 1. Introduction

A Big Data Analytical Framework For Portfolio Optimization Abstract. Keywords. 1. Introduction A Big Data Analytical Framework For Portfolio Optimization Dhanya Jothimani, Ravi Shankar and Surendra S. Yadav Department of Management Studies, Indian Institute of Technology Delhi {dhanya.jothimani,

More information

A Proposed Prediction Model for Forecasting the Financial Market Value According to Diversity in Factor

A Proposed Prediction Model for Forecasting the Financial Market Value According to Diversity in Factor A Proposed Prediction Model for Forecasting the Financial Market Value According to Diversity in Factor Ms. Hiral R. Patel, Mr. Amit B. Suthar, Dr. Satyen M. Parikh Assistant Professor, DCS, Ganpat University,

More information

SURVEY REPORT DATA SCIENCE SOCIETY 2014

SURVEY REPORT DATA SCIENCE SOCIETY 2014 SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Master s Program in Information Systems

Master s Program in Information Systems The University of Jordan King Abdullah II School for Information Technology Department of Information Systems Master s Program in Information Systems 2006/2007 Study Plan Master Degree in Information Systems

More information

Software Defect Prediction for Quality Improvement Using Hybrid Approach

Software Defect Prediction for Quality Improvement Using Hybrid Approach Software Defect Prediction for Quality Improvement Using Hybrid Approach 1 Pooja Paramshetti, 2 D. A. Phalke D.Y. Patil College of Engineering, Akurdi, Pune. Savitribai Phule Pune University ABSTRACT In

More information

Intrusion Detection via Machine Learning for SCADA System Protection

Intrusion Detection via Machine Learning for SCADA System Protection Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. [email protected] J. Jiang Department

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Data Isn't Everything

Data Isn't Everything June 17, 2015 Innovate Forward Data Isn't Everything The Challenges of Big Data, Advanced Analytics, and Advance Computation Devices for Transportation Agencies. Using Data to Support Mission, Administration,

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

Industrial Roadmap for Connected Machines. Sal Spada Research Director ARC Advisory Group [email protected]

Industrial Roadmap for Connected Machines. Sal Spada Research Director ARC Advisory Group sspada@arcweb.com Industrial Roadmap for Connected Machines Sal Spada Research Director ARC Advisory Group [email protected] Industrial Internet of Things (IoT) Based upon enhanced connectivity of this stuff Connecting

More information

Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA

Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA ABSTRACT Current trends in data mining allow the business community to take advantage of

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

APPLICATION OF DATA MINING TECHNIQUES FOR THE DEVELOPMENT OF NEW ROCK MECHANICS CONSTITUTIVE MODELS

APPLICATION OF DATA MINING TECHNIQUES FOR THE DEVELOPMENT OF NEW ROCK MECHANICS CONSTITUTIVE MODELS APPLICATION OF DATA MINING TECHNIQUES FOR THE DEVELOPMENT OF NEW ROCK MECHANICS CONSTITUTIVE MODELS T. Miranda 1, L.R. Sousa 2 *, W. Roggenthen 3, and R.L. Sousa 4 1 University of Minho, Guimarães, Portugal

More information

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19 PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations

More information

A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode

A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode Seyed Mojtaba Hosseini Bamakan, Peyman Gholami RESEARCH CENTRE OF FICTITIOUS ECONOMY & DATA SCIENCE UNIVERSITY

More information

Business Intelligence and Decision Support Systems

Business Intelligence and Decision Support Systems Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Predictive Modeling for Collections of Accounts Receivable Sai Zeng IBM T.J. Watson Research Center Hawthorne, NY, 10523. Abstract

Predictive Modeling for Collections of Accounts Receivable Sai Zeng IBM T.J. Watson Research Center Hawthorne, NY, 10523. Abstract Paper Submission for ACM SIGKDD Workshop on Domain Driven Data Mining (DDDM2007) Predictive Modeling for Collections of Accounts Receivable Sai Zeng [email protected] Prem Melville Yorktown Heights, NY,

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support

DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information

More information

A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING

A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING AZRUDDIN AHMAD, GOBITHASAN RUDRUSAMY, RAHMAT BUDIARTO, AZMAN SAMSUDIN, SURESRAWAN RAMADASS. Network Research Group School of

More information

INTELLIGENT DEFECT ANALYSIS, FRAMEWORK FOR INTEGRATED DATA MANAGEMENT

INTELLIGENT DEFECT ANALYSIS, FRAMEWORK FOR INTEGRATED DATA MANAGEMENT INTELLIGENT DEFECT ANALYSIS, FRAMEWORK FOR INTEGRATED DATA MANAGEMENT Website: http://www.siglaz.com Abstract Spatial signature analysis (SSA) is one of the key technologies that semiconductor manufacturers

More information

Pentaho Data Mining Last Modified on January 22, 2007

Pentaho Data Mining Last Modified on January 22, 2007 Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

A.I. in health informatics lecture 1 introduction & stuff kevin small & byron wallace

A.I. in health informatics lecture 1 introduction & stuff kevin small & byron wallace A.I. in health informatics lecture 1 introduction & stuff kevin small & byron wallace what is this class about? health informatics managing and making sense of biomedical information but mostly from an

More information

Data Mining Yelp Data - Predicting rating stars from review text

Data Mining Yelp Data - Predicting rating stars from review text Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University [email protected] Chetan Naik Stony Brook University [email protected] ABSTRACT The majority

More information

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016 Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00

More information

IBM SPSS Modeler Premium

IBM SPSS Modeler Premium IBM SPSS Modeler Premium Improve model accuracy with structured and unstructured data, entity analytics and social network analysis Highlights Solve business problems faster with analytical techniques

More information