8. Machine Learning Applied Artificial Intelligence




Prof. Dr. Bernhard Humm
Faculty of Computer Science
Hochschule Darmstadt, University of Applied Sciences

Retrospective: Natural Language Processing
- Name and explain different areas of NLP.
- What are the 7 levels of language understanding?
- What is tokenizing, sentence splitting, POS tagging, and parsing?
- What do language resources offer to NLP? Give examples.
- What do NLP frameworks offer? Give examples.
- What do NLP services offer? Give examples.

Agenda
- Overview
- ML Applications
- ML Tasks
- ML Approaches
- ML Tools
- Services / Product Map

What is Machine Learning (ML)?
Generating a model based on inputs and using it for making decisions or predictions (rather than programming instructions explicitly).

Applications of ML: Spam filtering
Task: classify new e-mails as spam or not spam
Figure labels: New e-mails; Spam filter; Automatically classified; ML input; Manually classified; Corrections

Stock market analysis
Task: make recommendations on buying and selling stocks
Figure labels: Current stock values; Prediction; Recommendation; Decision; ML input; History of stock values
Image source: Wikimedia

Detecting credit card fraud
Task: detect fraud in credit card payments
Figure labels: CC payments; Fraud detection; Automatically classified; ML input; Manually classified; Corrections

Recommender systems
Task: recommend suitable products to customers
Figure labels: Order; Recommender system; Recommendation of related products; ML input; Purchasing behaviour of other customers or customer groups

Categories of ML tasks
Machine Learning Task
- Supervised Learning
  - Classification
  - Regression
- Unsupervised Learning
  - Clustering
  - Feature selection / extraction
  - Topic modeling
- Reinforcement Learning
P.S. Other categorizations / groupings are possible.

Categories of ML tasks
- Supervised learning
  - Given: Example inputs and desired outputs
  - Goal: Learn a general rule that maps inputs to outputs
- Unsupervised learning
  - Given: Data inputs (e.g., documents)
  - Goal: Find structure in the inputs
- Reinforcement learning
  - Setting: An agent interacts with a dynamic environment in which it must achieve a goal
  - Goal: Improve the agent's behaviour

Supervised learning subcategories
- Classification
  - Given: Training inputs (records) which are divided into two or more classes
  - Goal: Produce a model to classify new inputs
  - Examples: spam filter, fraud detection
- Regression
  - Given: Training data (records) with continuous (not discrete) output values
  - Goal: Produce a model to predict output values for new inputs
  - Example: stock value prediction
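The regression task can be sketched in a few lines. This is a minimal illustration, assuming a single numeric input and made-up training values (not real stock data); it uses ordinary least squares, which is only one of many possible approaches. The names `fit_line` and `predict` are illustrative.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the continuous output value for a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training records: inputs with continuous output values.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)        # training
print(predict(model, 5.0))      # prediction for a new input
```

The training data here lie exactly on a line, so the fitted model reproduces it; real data would leave a residual error.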

Unsupervised learning subcategories
- Clustering
  - Given: Set of input records
  - Goal: Identify clusters (groups of similar records)
  - Example: Customer grouping
- Feature selection / extraction
  - Given: Set of input records with attributes ("features")
  - Goal: Find a subset of the original attributes that is equally well suited for classification / clustering tasks
- Topic modeling
  - Given: Set of text documents
  - Goal: Find abstract topics that occur in several documents and classify documents accordingly
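Clustering can be illustrated with k-means, one widely used clustering algorithm (the slides do not name a specific one). The sketch below assumes 2-D numeric records and made-up customer values; seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                              + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return assignment, centroids


# Made-up customer records: (orders per year, average basket size).
customers = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]

assignment, centroids = kmeans(customers, k=2)
print(assignment)   # cluster index per customer
```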

Decision Tree Learning
- Used for supervised learning (classification, regression)
- Training input: Training data (records) with output values (discrete or continuous)
- Learning result: Decision tree that allows classifying / predicting output values of new data records
Example (figure): Decision tree for classifying passengers on the Titanic as survived / died
Image source: Wikipedia
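A minimal sketch of decision tree learning for classification follows. It uses the nominal attributes of the five golf weather records that appear in the WEKA dataset example later in these slides. The split criterion (Gini impurity) is an assumption made for this sketch; the slides do not name one, and the function names are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(records, attributes, target):
    """Recursively split on the attribute with the lowest weighted Gini impurity."""
    labels = [r[target] for r in records]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class

    def weighted_gini(attr):
        total = 0.0
        for value in set(r[attr] for r in records):
            subset = [r[target] for r in records if r[attr] == value]
            total += len(subset) / len(records) * gini(subset)
        return total

    best = min(attributes, key=weighted_gini)
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in set(r[best] for r in records):
        subset = [r for r in records if r[best] == value]
        branches[value] = build_tree(subset, remaining, target)
    return (best, branches)


def classify(tree, record):
    """Follow branches until a leaf (a class label) is reached.

    Raises KeyError for attribute values not seen during training.
    """
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[record[attr]]
    return tree


# Nominal attributes of the golf weather records (outlook, windy, play).
records = [
    {"outlook": "sunny", "windy": "FALSE", "play": "no"},
    {"outlook": "sunny", "windy": "TRUE", "play": "no"},
    {"outlook": "overcast", "windy": "FALSE", "play": "yes"},
    {"outlook": "rainy", "windy": "FALSE", "play": "yes"},
    {"outlook": "rainy", "windy": "FALSE", "play": "yes"},
]

tree = build_tree(records, ["outlook", "windy"], "play")
print(classify(tree, {"outlook": "overcast", "windy": "TRUE"}))
```

With only five records the tree splits once, on `outlook`; a real training set would usually produce a deeper tree and would need pruning or a depth limit to avoid overfitting.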

Artificial Neural Networks (ANN)
Inspired by brain / nervous system:
- Neurons connected via dendrites
- Reduce resistance if fired repeatedly
Artificial Neuron:
- Weighted inputs
- Function, e.g., weighted sum
- Filter, e.g., threshold output
Artificial Neural Network (ANN):
- Input layer, output layer, and possibly intermediate layers of neurons
- Training phase: weights are adjusted via known cases
- Recognition phase: output is produced for new cases
Source: Ivan Galkin, U. MASS Lowell (http://ulcar.uml.edu/~iag/cs/intro-to-ann.html)
Prof. Dr. Bernhard Humm, Darmstadt University of Applied Sciences. www.fbi.h-da.de/~b.humm. 18.11.2014
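The artificial neuron and the two phases can be sketched with a single neuron. This is a minimal illustration, assuming the classic perceptron learning rule as the way weights are adjusted (the slides do not specify a training rule) and the logical AND function as the set of known cases.

```python
def neuron(weights, bias, inputs):
    """Artificial neuron: weighted sum of the inputs, then a threshold filter."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(cases, epochs=20, rate=1):
    """Training phase: adjust weights via known cases (perceptron rule)."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in cases:
            error = target - neuron(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Known cases: inputs and desired output of the logical AND function.
cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(cases)

# Recognition phase: produce outputs with the trained weights.
outputs = [neuron(weights, bias, inputs) for inputs, _ in cases]
print(outputs)
```

A single neuron can only learn linearly separable functions such as AND; intermediate layers are needed for functions like XOR.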

Bayesian Networks
Directed acyclic graph (DAG) with:
- Nodes: random variables + probability function
- Edges: conditional dependencies
Example:
- Probability of rain
- Sprinkler is turned on if it hasn't rained for a while
- Grass is wet if it is raining or the sprinkler is turned on
Bayes Network inference allows answering questions like:
- What is the probability that it is raining, given the grass is wet?
- What is the impact of turning the sprinkler on?
Source: http://en.wikipedia.org/wiki/bayesian_network
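The first question can be answered by enumerating the joint distribution. The sketch below assumes the network structure rain → sprinkler, rain → grass wet, sprinkler → grass wet; all probability values are illustrative assumptions, since the slides give none.

```python
# Assumed (illustrative) probabilities; not taken from the slides.
P_RAIN = 0.2
P_SPRINKLER = {True: 0.01, False: 0.4}       # P(sprinkler on | rain)
P_WET = {                                    # P(grass wet | sprinkler, rain)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability, factorised along the edges of the DAG."""
    p = P_RAIN if rain else 1 - P_RAIN
    p_s = P_SPRINKLER[rain]
    p *= p_s if sprinkler else 1 - p_s
    p_w = P_WET[(sprinkler, rain)]
    p *= p_w if wet else 1 - p_w
    return p


def p_rain_given_wet():
    """P(rain | grass wet), summing out the sprinkler variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    denominator = sum(joint(r, s, True)
                      for r in (True, False) for s in (True, False))
    return numerator / denominator


print(round(p_rain_given_wet(), 4))
```

Enumeration is exponential in the number of variables, so it only suits very small networks like this one.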

Inductive Logic Programming
Given:
- Set of logic facts (background knowledge), e.g., male(tom), female(eve), parent(tom, eve)
- Positive and / or negative examples, e.g., daughter(eve, tom)
Learning goal:
- General rules that are consistent with the examples and the background knowledge, e.g., parent(p1, p2) and female(p2) → daughter(p2, p1)
Figure: family tree (George, Helen, Mary, Tom, Nancy, Eve) with male / female and parent relations
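The notion of a rule being "consistent with the examples and the background knowledge" can be made concrete. The sketch below does not learn a rule; it only checks the candidate rule from the slide against the facts. The negative example is a hypothetical addition for illustration and does not appear in the slides.

```python
# Background knowledge from the slide.
male = {"tom"}
female = {"eve"}
parent = {("tom", "eve")}        # parent(tom, eve): tom is a parent of eve

# Examples for the target predicate daughter(child, parent).
positive = {("eve", "tom")}      # daughter(eve, tom), from the slide
negative = {("tom", "eve")}      # hypothetical negative example


def derived_daughters():
    """Apply the candidate rule: parent(p1, p2) and female(p2) -> daughter(p2, p1)."""
    return {(p2, p1) for (p1, p2) in parent if p2 in female}


def consistent():
    """A rule is consistent if it covers all positive and no negative examples."""
    derived = derived_daughters()
    return positive <= derived and not (negative & derived)


print(consistent())
```

An actual ILP system would search a space of candidate rules and keep those for which a check of this kind succeeds.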

WEKA
http://www.cs.waikato.ac.nz/ml/weka/

Tasks supported by WEKA
Numerous approaches for supervised and unsupervised learning
- Preprocess: Choose and modify the data being acted on
- Classify: Train and test learning schemes that classify or perform regression
- Cluster: Learn clusters for the data
- Associate: Learn association rules for the data
- Select attributes: Select the most relevant attributes in the data
- Visualize: View an interactive 2D plot of the data

WEKA Datasets
- Collection of examples (instances)
- Each instance consists of attributes
Attribute types:
- Nominal (enumeration)
- Numeric (real or integer number)
- String
Example:
  @relation golfweathermichigan_1988/02/10_14days
  @attribute outlook {sunny, overcast, rainy}
  @attribute windy {TRUE, FALSE}
  @attribute temperature real
  @attribute humidity real
  @attribute play {yes, no}
  @data
  sunny,FALSE,85,85,no
  sunny,TRUE,80,90,no
  overcast,FALSE,83,86,yes
  rainy,FALSE,70,96,yes
  rainy,FALSE,68,80,yes
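To show how the dataset format maps to instances and attributes, here is a minimal sketch of a reader for the small subset of the format used in the example above (`@relation`, `@attribute`, `@data`, comma-separated rows). It is not WEKA's own parser and ignores quoting, sparse data, and other features of the full format; the function name `parse_arff` is illustrative.

```python
def parse_arff(text):
    """Parse a small ARFF subset: @relation, @attribute, @data."""
    relation, attributes, rows = None, [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):       # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            names = [name for name, _ in attributes]
            values = [v.strip() for v in line.split(",")]
            rows.append(dict(zip(names, values)))  # one instance per row
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):               # nominal: list allowed values
                kind = [v.strip() for v in kind.strip("{}").split(",")]
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# The example dataset; windy values are upper-case to match the declared
# nominal values {TRUE, FALSE}.
ARFF_TEXT = """
@relation golfweathermichigan_1988/02/10_14days
@attribute outlook {sunny, overcast, rainy}
@attribute windy {TRUE, FALSE}
@attribute temperature real
@attribute humidity real
@attribute play {yes, no}
@data
sunny,FALSE,85,85,no
sunny,TRUE,80,90,no
overcast,FALSE,83,86,yes
rainy,FALSE,70,96,yes
rainy,FALSE,68,80,yes
"""

relation, attributes, rows = parse_arff(ARFF_TEXT)
print(len(rows), "instances,", len(attributes), "attributes")
```

All values are kept as strings; converting numeric attributes to numbers would be a natural next step.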

WEKA GUI (figure)

ML Services Map
- ML services: Web services for experimenting with different ML approaches and configuring solutions
- ML development environments / frameworks: IDEs and frameworks for experimenting with different ML approaches and configuring solutions
- ML libraries: Algorithms for classification, regression, clustering, feature selection / extraction, topic modeling, etc. using different approaches, e.g., decision tree learning, Artificial Neural Networks, Bayes networks, inductive logic programming, Support Vector Machines, Hidden Markov Chains, etc.

ML Product Map
- ML services: bigml, wise.io, procog, ersatz
- ML development environments / frameworks: WEKA, Orange, Shogun, scikit-learn
- ML libraries: Eblearn, OpenNN, aisolver, CURRENNT

ML product map (table)
Columns: Product | ML library | ML development environment / framework | ML service
- Java Neural Network Framework Neuroph: x x
- Fast Artificial Neural Network Library
- eblearn: x x
- Jaden: x x
- OpenNN - Open Neural Networks Library, aisolver, CURRENNT: x x x
- WEKA: x x
- Orange: x x
- Shogun: x x
- scikit-learn: x x
- bigml, wise.io, procog, ersatz: x x x x