Feature Engineering in Machine Learning
1 Research Fellow, Faculty of Information Technology, Monash University, Melbourne VIC 3800, Australia. August 21, 2015.
2 Outline: A Machine Learning Primer; Machine Learning and Data Science; Bias-Variance Phenomenon; Regularization; What is Feature Engineering (FE)?; Graphical Models and Bayesian Networks; Deep Learning and FE; Dimensionality Reduction; Wrap-up: Current Trends and Practical Advice on FE.
3 Machine Learning. Suppose there exists a function y = f(x); given examples of the form (y, x), can we determine the function f? [1] This is functional approximation, where the input is data. Machine learning has roots in statistics, is kin to data mining, and is a subset of data science. Countless applications in medicine, science, finance, industry, etc. Renewed interest due to the emergence of big data; part of the multi-billion-dollar analytics industry of the 21st century.
4 Typical Machine Learning Problems: Supervised Learning (Classification, Regression); Unsupervised Learning (Clustering); Recommender Systems; Market Basket Analysis (Association Rule Discovery); Ad Placement; Link Analysis; Text Analysis (e.g., mining, retrieval); Social Network Analysis; Natural Language Processing.
5 Applications of Machine Learning
6 Tale of Two Worlds
7 Data Science. The two worlds are merging into each other day by day: the database community needs analytics, and the analytics community needs a way to store and manage large quantities of data. There is an on-going debate about putting databases into analytics or analytics into databases (SQL vs. NoSQL). Database world pros: good at storing, accessing, and managing large quantities of data; cons: poorly suited to analytics (assumes a structure). Analytics world pros: good at analyzing; cons: poor at managing data.
8 Data Science. What constitutes data science: analytics, storage, visualization, munging.
9 Data-driven World. Four paradigms of science: observation-based, experiment-based, model-based, and data-based (The Fourth Paradigm: Data-Intensive Scientific Discovery, by Jim Gray). It is not about databases vs. analytics or SQL vs. NoSQL; it is all about data.
10 Fundamental Problem in Machine Learning. Regression: $\min_{\beta \in \mathbb{R}^n} \frac{1}{N}\sum_{i=1}^{N} (y_i - \beta^T x_i)^2 + \lambda \|\beta\|_2^2$. Classification: $\min_{\beta \in \mathbb{R}^n} \frac{1}{N}\sum_{i=1}^{N} L(y_i \beta^T x_i) + \lambda \|\beta\|_2^2$. In general: $\min_{\beta \in \mathbb{R}^n} (\text{loss} + \text{regularization}) = \min_{\beta \in \mathbb{R}^n} F(\beta)$. Important questions: model selection, and which optimization method to use to learn the parameters. Different loss functions lead to different classifiers: logistic regression, support vector classifier, artificial neural network.
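As a minimal sketch of the regression objective above (not from the slides; synthetic data chosen only for illustration), setting the gradient to zero gives the regularized normal equations, which can be solved directly in NumPy:

```python
import numpy as np

# Sketch: minimize (1/N) * sum_i (y_i - beta^T x_i)^2 + lambda * ||beta||_2^2.
# Zeroing the gradient gives (X^T X / N + lambda * I) beta = X^T y / N.
rng = np.random.default_rng(0)
N, n = 200, 3
X = rng.normal(size=(N, n))
true_beta = np.array([1.5, -2.0, 0.5])          # illustrative ground truth
y = X @ true_beta + 0.1 * rng.normal(size=N)

lam = 0.1
beta = np.linalg.solve(X.T @ X / N + lam * np.eye(n), X.T @ y / N)
print(beta)  # close to true_beta, shrunk slightly toward zero by lambda
```

Increasing `lam` shrinks the solution further toward zero, which is exactly the bias-variance control discussed later in the regularization slides.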
11 Model Selection. Let us visit the simplest of all machine learning problems: fitting data with linear, quadratic, and cubic models. [Plots of f(x) vs. x: the same data shown with linear, quadratic, and cubic fits.]
14 Model Selection. Observation (models vs. features): you can take the cube of the features and fit a linear model, $(x_1, \ldots, x_m) \to (x_1^3, x_1^2 x_2, x_1 x_2 x_3, \ldots)$; this is equivalent to applying a cubic model. Question: which model should you select? Or equivalently, which attribute interactions should you consider? Remember: in the real world you will have more than one feature, including categorical, discrete, etc. Hint: every model selection decision controls bias and variance, so why not select the model by controlling for both?
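The "cube the features, fit a linear model" observation can be checked numerically. This is an illustrative 1-D sketch (not from the slides): expanding x into the engineered features (1, x, x², x³) lets ordinary least squares recover an exactly cubic target function.

```python
import numpy as np

# A cubic target function, noise-free for clarity of the equivalence.
x = np.linspace(-2, 2, 50)
y = 2 * x**3 - x + 3

# Engineered polynomial features: a *linear* model over these columns
# is exactly a cubic model over the original x.
Phi = np.column_stack([np.ones_like(x), x, x**2, x**3])
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.round(coef, 6))  # ~ [3, -1, 0, 2]: the cubic's own coefficients
```

The model stayed linear; only the features changed, which is the point of the slide.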
15 Parameterizing the Model. How do you handle numeric attributes: one parameter per attribute per class? How do you handle categorical attributes: multiple parameters per attribute per class? How do you handle interactions among the variables? Missing attribute values? Redundant attributes? Over-parameterized vs. under-parameterized models.
16 Bias Variance Illustration
17 Model Selection. Use a low-variance model for small data and a low-bias model for big data; more details in [2]. Machine learning has historically been applied to small quantities of data, where low-variance algorithms are best: low-bias algorithms will over-fit, and you need to rely on regularization or feature selection. Feature selection: forward selection, backward elimination.
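Forward selection can be sketched in a few lines. This is an illustrative toy (not from the slides): greedily add the feature that most reduces held-out squared error of a least-squares fit, on synthetic data where only features 0 and 2 actually matter.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = 3 * X[:, 0] - 2 * X[:, 2] + 0.1 * rng.normal(size=300)
Xtr, Xva, ytr, yva = X[:200], X[200:], y[:200], y[200:]

def val_error(cols):
    # Least-squares fit on the chosen columns, scored on held-out data.
    w, *_ = np.linalg.lstsq(Xtr[:, cols], ytr, rcond=None)
    return np.mean((Xva[:, cols] @ w - yva) ** 2)

selected, remaining = [], list(range(5))
for _ in range(2):                      # greedily select two features
    best = min(remaining, key=lambda j: val_error(selected + [j]))
    selected.append(best)
    remaining.remove(best)
print(selected)  # → [0, 2]: the truly informative features, strongest first
```

Backward elimination is the mirror image: start from all features and greedily drop the one whose removal hurts validation error least.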
18 Regularizing your Model. A very powerful technique for controlling bias and variance. Many different regularizations exist; the most common are: L2 regularization, $\min_{\beta \in \mathbb{R}^n} \frac{1}{N}\sum_{i=1}^{N} L(y_i \beta^T x_i) + \lambda \|\beta\|_2^2$; L1 regularization (also known as a sparsity-inducing norm), $\min_{\beta \in \mathbb{R}^n} \frac{1}{N}\sum_{i=1}^{N} L(y_i \beta^T x_i) + \lambda \|\beta\|_1$; or elastic nets, $\min_{\beta \in \mathbb{R}^n} \frac{1}{N}\sum_{i=1}^{N} L(y_i \beta^T x_i) + \lambda (\|\beta\|_1 + \gamma \|\beta\|_2^2)$.
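The qualitative difference between the L2 and L1 penalties shows up even on a toy least-squares problem. The sketch below (illustrative, not from the slides) uses plain gradient steps for ridge and proximal gradient (ISTA-style soft-thresholding) for lasso; note how the L1 penalty drives the irrelevant coefficients exactly to zero, while L2 only shrinks them.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 6))
# Only features 0 and 3 matter; the other four are irrelevant.
y = X @ np.array([4.0, 0, 0, -3.0, 0, 0]) + 0.05 * rng.normal(size=200)
N = len(y)
step, lam = 0.05, 0.5

b2 = np.zeros(6)        # ridge: smooth gradient descent on loss + lam*||b||_2^2
b1 = np.zeros(6)        # lasso: gradient step on the loss, then soft-threshold
for _ in range(500):
    b2 -= step * (-2 / N * X.T @ (y - X @ b2) + 2 * lam * b2)
    g = b1 + step * 2 / N * X.T @ (y - X @ b1)
    b1 = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # prox of L1

print(b2)   # every coefficient shrunk toward zero, none exactly zero
print(b1)   # exact zeros on the four irrelevant features (sparsity)
```

This is why L1 regularization doubles as an embedded feature-selection method, a theme the dimensionality-reduction slides return to.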
19 Feature Engineering. "Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. It is fundamental to the application of machine learning, and is both difficult and expensive. The need for manual feature engineering can be obviated by automated feature learning." (Wikipedia) "Coming up with features is difficult, time-consuming, requires expert knowledge. Applied machine learning is basically feature engineering." (Andrew Ng) Is this what feature engineering is?
20 Feature Engineering (Contd). Feature engineering is the next buzz word after big data. But on the basis of the Wikipedia definition, one can say that feature engineering has been going on for decades, so why so much attention now? In my view, feature engineering and big data are related concepts: for big data, you need big models. A big model is any model with a very large number of parameters; note that a big model can be simply linear. For big models, handling these parameters effectively is largely an engineering problem, since hardware has not scaled up as well as data.
21 Feature Engineering (Contd). Given a model, the learning problem is learning the parameters of the model. The number of parameters depends on the number of features you have (exception: if you are solving the dual), plus some hyper-parameters. Parameter estimation is done by optimizing some objective function. Traditionally, there have been four elements of interest: features, model, objective function, optimization.
22 Feature Engineering (Contd). Models and features are related; let us not worry about that for the time being. There have been two main objective functions: conditional log-likelihood, based on P(y|x), and log-likelihood, based on P(y, x). This distinction has led to the generative-discriminative paradigms in machine learning: generative models learn P(y, x), discriminative models learn P(y|x). A very confusing distinction with no obvious benefits.
23 Generative vs. Discriminative Models/Learning. Bayes rule: $P(y|x) \propto P(y, x)$. Converting $\propto$ to $=$, we get $P(y|x) = \frac{P(y,x)}{P(x)}$, and therefore $P(y|x) = \frac{P(y,x)}{\sum_c P(c,x)}$.
24 Generative vs. Discriminative Models/Learning. A well-known generative model, naive Bayes: $P(y|x) = \frac{\pi_y \prod_i P(x_i|y)}{\sum_c \pi_c \prod_i P(x_i|c)}$ (1). A well-known discriminative model, logistic regression: $P(y|x) = \frac{\exp(\beta_y + \sum_i \beta_{i,x_i,y} x_i)}{\sum_c \exp(\beta_c + \sum_i \beta_{i,x_i,c} x_i)}$ (2).
25 On the Equivalence of Generative vs. Discriminative Models/Learning. We can write naive Bayes as: $P(y|x) = \exp\left(\log \pi_y + \sum_i \log P(x_i|y) - \log\left(\sum_c \pi_c \prod_i P(x_i|c)\right)\right)$. Applying the same exp-and-log trick to the denominator: $P(y|x) = \exp\left(\log \pi_y + \sum_i \log P(x_i|y) - \log \sum_c \exp\left(\log \pi_c + \sum_i \log P(x_i|c)\right)\right)$, hence $\log P(y|x) = \log \pi_y + \sum_i \log P(x_i|y) - \log \sum_c \exp\left(\log \pi_c + \sum_i \log P(x_i|c)\right)$.
26 On the Equivalence of Generative vs. Discriminative Models/Learning. Now let us take the log of LR: $\log P(y|x) = \beta_y + \sum_i \beta_{i,x_i,y} x_i - \log \sum_c \exp\left(\beta_c + \sum_i \beta_{i,x_i,c} x_i\right)$. Reminder, for NB we had: $\log P(y|x) = \log \pi_y + \sum_i \log P(x_i|y) - \log \sum_c \exp\left(\log \pi_c + \sum_i \log P(x_i|c)\right)$. NB and LR are just re-parameterizations of each other, for example: $\beta_y = \log \pi_y$ and $\beta_{i,x_i,y} = \log P(x_i|y)$.
27 On the Equivalence of Generative vs. Discriminative Models/Learning. Modifying NB with per-parameter weights: $\log P(y|x) = w_y \log \pi_y + \sum_i w_{x_i|y} \log P(x_i|y) - \log \sum_c \exp\left(w_c \log \pi_c + \sum_i w_{x_i|c} \log P(x_i|c)\right)$. This leads to: $P(y|x) = \frac{\pi_y^{w_y} \prod_i P(x_i|y)^{w_{x_i|y}}}{\sum_c \pi_c^{w_c} \prod_i P(x_i|c)^{w_{x_i|c}}}$. NB and LR are just re-parameterizations of each other, so why the fuss? More details in [3, 4, 5].
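The re-parameterization argument can be verified numerically. In this illustrative sketch (the priors and conditional probabilities are made up), a naive Bayes posterior computed in product form matches a logistic-regression-style softmax whose parameters are set to $\beta_y = \log \pi_y$ and $\beta_{i,x_i,y} = \log P(x_i|y)$:

```python
import numpy as np

pi = np.array([0.6, 0.4])                  # class priors, 2 classes
# P(x_i = 1 | y) for 3 binary attributes; rows index the class y.
theta = np.array([[0.9, 0.2, 0.7],
                  [0.3, 0.8, 0.5]])
x = np.array([1, 0, 1])                    # one observation

# Generative (naive Bayes) form: P(y|x) proportional to pi_y * prod_i P(x_i|y)
lik = np.prod(np.where(x == 1, theta, 1 - theta), axis=1)
p_nb = pi * lik / np.sum(pi * lik)

# Discriminative form: softmax over beta_y + sum_i beta_{i,x_i,y}
logits = np.log(pi) + np.sum(np.log(np.where(x == 1, theta, 1 - theta)), axis=1)
p_lr = np.exp(logits) / np.sum(np.exp(logits))

print(p_nb, p_lr)  # identical posteriors
```

The difference between NB and LR is therefore not the functional form but how the parameters are estimated, which is exactly the point of the summary slide.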
28 Summary so far. Model selection and feature selection are tightly coupled (feature engineering). The distinction between CLL and LL is confusing, or rather superfluous: they only differ in the way the parameters are learned. For LL (naive Bayes), the parameters are empirical probability estimates; for CLL (LR), the parameters are learned by iterative optimization of a soft-max objective. This leaves two things: how to do feature engineering, and how to actually optimize.
29 How to Engineer Features? Regularization; use domain knowledge; exploit existing structure in the data; dimensionality reduction; use data to build features.
30 Domain Knowledge. Use expert knowledge, or your own knowledge about the problem, and other datasets to explain your data. The main advantages are simplicity and intuitiveness, but this only applies to a small number of features, and access to a domain expert might be difficult. Summary: you use information at your disposal to build features before starting the learning process.
31 Dimensionality Reduction. Feature selection (filter, wrapper, embedded); Principal Component Analysis (PCA); Linear Discriminant Analysis (LDA); metric learning [6, 7]; auto-encoders.
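PCA, the workhorse of the list above, fits in a few lines of NumPy. This is an illustrative sketch (synthetic data with one dominant direction): center the data, then project onto the top eigenvectors of its covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
Z = rng.normal(size=(500, 2))
A = np.array([[3.0, 0.0], [1.0, 0.3]])   # stretch: creates a dominant direction
X = Z @ A.T

Xc = X - X.mean(axis=0)                  # center
cov = Xc.T @ Xc / len(Xc)                # sample covariance
vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
top = vecs[:, ::-1][:, :1]               # principal component (k = 1)
X1 = Xc @ top                            # 1-D representation of the data
print(vals[::-1])  # the first eigenvalue dominates
```

The variance of the projected data equals the leading eigenvalue, i.e. the projection keeps as much variance as any single direction can.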
32 Auto-Encoders. Similar to a multi-layer perceptron, but the output layer has as many nodes as the input layer, and the network is trained to reconstruct its own input. Learning algorithm: feed-forward backpropagation. [Structure diagram.]
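A deliberately tiny sketch of the idea (not from the slides, and linear for brevity; real auto-encoders use nonlinear activations): 4 inputs, a 2-unit bottleneck, 4 outputs, trained by gradient descent to reconstruct the input. The data lies on a 2-D subspace, so a 2-unit code suffices.

```python
import numpy as np

rng = np.random.default_rng(4)
code = rng.normal(size=(300, 2))
B = rng.normal(size=(2, 4))
X = code @ B                              # 4-D data with intrinsic dimension 2

W_enc = 0.1 * rng.normal(size=(4, 2))     # encoder weights
W_dec = 0.1 * rng.normal(size=(2, 4))     # decoder weights
lr = 0.01
for _ in range(2000):
    H = X @ W_enc                         # hidden code (the learned features)
    E = H @ W_dec - X                     # reconstruction error
    # Backpropagation of the mean squared reconstruction error:
    W_dec -= lr * H.T @ E / len(X)
    W_enc -= lr * X.T @ (E @ W_dec.T) / len(X)

mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(mse)  # near zero: the 2-unit bottleneck recovers the 2-D structure
```

The hidden layer `H` is the learned low-dimensional feature representation, which is why auto-encoders appear under dimensionality reduction.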
33 Exploit Structures in the Data. Some datasets have an inherent structure, e.g., in computer vision and NLP; you can use this information to build features. Convolutional Neural Networks (CNNs) are feed-forward neural networks that exploit spatially-local correlation by enforcing a local connectivity pattern between neurons of adjacent layers. Very successful in digit and object recognition (e.g., MNIST, CIFAR, etc.).
34 Convolutional Neural Networks
35 Use Data to Build Features. Restricted Boltzmann Machines, trained by contrastive divergence; Deep Belief Networks; and many more variants.
36 Lessons Learned from Big Data. Automatic feature engineering helps; capturing higher-order interactions in the data is beneficial; low-bias algorithms with some regularization lead to state-of-the-art results on big data. If you know the structure, leverage it to build (engineer) features; if you don't know the structure, use heuristics.
37 Optimization. Let us focus on the optimization. A first-order update: $w_{t+1} = w_t - \eta \, \nabla F(w_t)$ (3). Going second-order: $w_{t+1} = w_t - \eta \left[\nabla^2 F(w_t)\right]^{-1} \nabla F(w_t)$ (4). Add regularization to make the objective smooth enough for second-order differentiation.
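An illustrative comparison of the two updates above on a smooth, regularized 1-D objective $F(w) = (w-3)^2 + 0.1\,w^2$ (a made-up example, not from the slides): the first-order step makes gradual progress, while the second-order step divides by the curvature $F''(w)$ and, because the objective is quadratic, lands on the minimizer immediately.

```python
# F(w) = (w - 3)^2 + 0.1 * w^2, minimized at w* = 6 / 2.2.
F = lambda w: (w - 3) ** 2 + 0.1 * w ** 2
dF = lambda w: 2 * (w - 3) + 0.2 * w
d2F = lambda w: 2.2                      # constant curvature for a quadratic

w = 0.0
for _ in range(100):                     # first-order: w <- w - eta * F'(w)
    w -= 0.1 * dF(w)

w_newton = 0.0 - dF(0.0) / d2F(0.0)      # second-order: one step suffices here
print(w, w_newton)  # both approach the minimizer 6/2.2 ≈ 2.727
```

For high-dimensional models the Hessian inverse in update (4) is the expensive part, which is one reason the next slide turns to implementation issues.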
38 Implementation Issues in Optimization. Can we leverage the map-reduce paradigm? No: massive parallelization is required to handle the parameters. In the batch version you can distribute parameters across nodes; with SGD you can rely on mini-batches. Success in deep learning is attributed to advancements in optimization and implementation techniques.
39 Big Data vs. Big Models
42 Big Data vs. Big Models. Big data: a lot of hype around it (volume, velocity, variety, veracity) and many efforts at managing it. Big models: models that can learn from big data, for example [8, 9]; deep and broad; their ingredients are feature engineering + stable optimization. It is big models, rather than big data, that hold the key to a break-through.
43 Conclusion and Take-Home Message. The two most important things: feature engineering + optimization. Big models, not big data. Big model: low bias + out-of-core + minimal tuning parameters + multi-class; deep + broad. Questions?
44 References
[1] R. Duda, P. Hart, and D. Stork, Pattern Classification. John Wiley and Sons.
[2] D. Brain and G. I. Webb, "The need for low bias algorithms in classification learning from small data sets," in PKDD.
[3] T. Jebara, Machine Learning: Discriminative and Generative. Springer International Series.
[4] N. A. Zaidi, J. Cerquides, M. J. Carman, and G. I. Webb, "Alleviating naive Bayes attribute independence assumption by attribute weighting," Journal of Machine Learning Research, vol. 14.
[5] N. A. Zaidi, M. J. Carman, J. Cerquides, and G. I. Webb, "Naive-Bayes inspired effective pre-conditioners for speeding-up logistic regression," in IEEE International Conference on Data Mining, 2014.
[6] N. Zaidi and D. M. Squire, "Local adaptive SVM for object recognition," in Digital Image Computing: Techniques and Applications (DICTA), 2010.
[7] N. Zaidi and D. M. Squire, "A gradient-based metric learning algorithm for k-NN classifiers," in AI 2010: Advances in Artificial Intelligence.
[8] N. A. Zaidi and G. I. Webb, "Fast and effective single pass Bayesian learning," in Advances in Knowledge Discovery and Data Mining.
[9] S. Martinez, A. Chen, G. I. Webb, and N. A. Zaidi, "Scalable learning of Bayesian network classifiers," accepted for publication in Journal of Machine Learning Research.
Simple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,
Machine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
Data Mining Techniques Chapter 7: Artificial Neural Networks
Data Mining Techniques Chapter 7: Artificial Neural Networks Artificial Neural Networks.................................................. 2 Neural network example...................................................
SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH
330 SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH T. M. D.Saumya 1, T. Rupasinghe 2 and P. Abeysinghe 3 1 Department of Industrial Management, University of Kelaniya,
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
High Productivity Data Processing Analytics Methods with Applications
High Productivity Data Processing Analytics Methods with Applications Dr. Ing. Morris Riedel et al. Adjunct Associate Professor School of Engineering and Natural Sciences, University of Iceland Research
How To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
Semi-Supervised Support Vector Machines and Application to Spam Filtering
Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
Advanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
Bayesian networks - Time-series models - Apache Spark & Scala
Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly
Microsoft Azure Machine learning Algorithms
Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql [email protected] http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
