Practical Structured Learning for Natural Language Processing
1 Practical Structured Learning for Natural Language Processing Hal Daumé III Information Sciences Institute University of Southern California Committee: Daniel Marcu (Adviser), Kevin Knight, Eduard Hovy, Stefan Schaal, Gareth James, Andrew McCallum Slide 1
2 What Is Natural Language Processing? Slide 2
Processing of language to solve a problem. All have innate structure to varying degrees.
Task: Input -> Output
Machine Translation: Arabic sentence -> English sentence
Summarization: Long documents -> Short documents
Information Extraction: Document -> Database
Question Answering: Corpus + query -> Answer
Parsing: Sentence -> Syntactic tree
Understanding: Sentence -> Logical sentence
Corpus based NLP: Collect example input/output pairs and learn
3 What is Machine Learning? Slide 3
Automatically uncovering structure in data
Task: Input -> Output
Classification: Vector -> Bit (or bits)
Regression: Vector -> Real number
Dimensionality reduction: Vector -> Shorter vector
Clustering: Vectors -> Partition
Sequence labeling: Vectors -> Labels
Reinforcement learning: State space + observations -> Policy
4 NLP versus Machine Learning Slide 4
NLP tasks:
Machine Translation: Arabic sentence -> English sentence
Summarization: Long documents -> Short documents
Information Extraction: Document -> Database
Question Answering: Corpus + query -> Answer
Parsing: Sentence -> Syntactic tree
Understanding: Sentence -> Logical sentence
Machine learning tasks:
Classification: Vector -> Bit (or bits)
Regression: Vector -> Real number
Dimensionality reduction: Vector -> Shorter vector
Clustering: Vectors -> Partition
Sequence labeling: Vectors -> Labels
Reinforcement learning inspired: State space -> Policy
Diagram labels: Natural Decomposition; Encoding; Ad Hoc Search; Provably Good
5 Entity Detection and Tracking 1.3 (Example Problem: EDT) Slide 5
Shakespeare scholarship, lively at the best of times, saw the fur flying yesterday after a German academic claimed to have authenticated not just one but four contemporary images of the playwright and suggested, to boot, that he had died of cancer.
As the National Portrait Gallery planned to reveal that only one of half a dozen claimed portraits of William Shakespeare can now be considered genuine, Prof Hildegard Hammerschmidt-Hummel said she could prove that there were at least four surviving portraits of the playwright.
Why? Summarization, Machine Translation, etc...
6 Why is EDT Hard? 1.3 (Example Problem: EDT) Slide 6
Long range dependencies
World Knowledge
Syntax, Semantics and Discourse
Highly constrained outputs
Shakespeare scholarship, lively at the best of times, saw the fur flying yesterday after a German academic claimed to have authenticated not just one but four contemporary images of the playwright and suggested, to boot, that he had died of cancer.
7 How is EDT Typically Attacked? 5.2 (EDT: Prior Work) Slide 7
Shakespeare scholarship, lively at the best of times, saw the fur flying yesterday after a German academic claimed to have authenticated not just one but four contemporary images of the playwright and suggested, to boot, that he had died of cancer.
Sequence Labeling: Shakespeare scholarship,... a German academic claimed... PER * * * GPE PER *
Binary Classification + Clustering: Shakespeare, he, playwright, German, academic
8 Learning in NLP Applications Model Features Learning Search Slide 8
9 Result: Searn (= Search + Learn) 3 (Search-based Structured Prediction) Slide 9
Computationally efficient
Strong theoretical guarantees
State of the art performance
Broadly applicable
Easy to implement
10 Talk Outline Searn: Search based Structured Prediction Experimental Results: Sequence Labeling Entity Detection and Tracking Automatic Document Summarization Discussion and Future Work Slide 10
11 Structured Prediction 2.2 (Machine Learning: Structured Prediction) Slide 11
Formulated as a maximization problem:
$\hat{y} = \arg\max_{y \in \mathcal{Y}(x)} f(x, y; \theta)$
Annotations on the equation: Search, Model, Features, Learning
Search (and learning) tractable only for very simple $\mathcal{Y}(x)$ and $f$
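To see why search is tractable only for very simple output spaces, here is a minimal sketch that solves the maximization by brute force. The scoring function `score` and its weights are invented for illustration, and the tag set D, P, N, R, V is borrowed from the "I sing merrily" example on a later slide; the point is only that the candidate set grows as K^T for K tags and T words.

```python
from itertools import product

TAGS = ["D", "P", "N", "R", "V"]  # assumed tag set, taken from the tagging example

def score(x, y):
    """Hypothetical f(x, y): hand-set emission and transition preferences."""
    emit = {("I", "P"): 2.0, ("sing", "V"): 2.0, ("merrily", "R"): 2.0}
    trans = {("P", "V"): 1.0, ("V", "R"): 1.0}
    total = sum(emit.get((w, t), 0.0) for w, t in zip(x, y))
    total += sum(trans.get(pair, 0.0) for pair in zip(y, y[1:]))
    return total

def brute_force_argmax(x):
    """Enumerate every y in Y(x) = TAGS^len(x) and keep the highest-scoring one."""
    candidates = product(TAGS, repeat=len(x))
    return max(candidates, key=lambda y: score(x, y))

x = ["I", "sing", "merrily"]
best = brute_force_argmax(x)
num_candidates = len(TAGS) ** len(x)  # grows exponentially with sentence length
print(best, num_candidates)
```

Three words already give 125 candidates; a 30-word sentence would give 5^30, which is why the talk turns to search guided by a learned classifier.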
12 Reduction for Structured Prediction 3.3 (Search-based Structured Prediction) Slide 12
Idea: view structured prediction in light of search
[Figure: search lattice for the sentence "I sing merrily", with candidate tags D, P, N, R, V at each word]
Each step here looks like it could be represented as a weighted multiclass problem. Can we formalize this idea?
13 Error-Limiting Reductions [Beygelzimer et al., ICML 2005] 2.3 (Machine Learning: Reductions) Slide 13
A formal mapping between learning problems
Learning Problem A: what we want to solve
Learning Problem B: what we know how to solve
$F$ maps examples from A to examples from B
$F^{-1}$ maps a classifier h for B to a classifier that solves A
Error limiting: good performance on B implies good performance on A
14 Our Reduction Goal 2.3 (Machine Learning: Reductions) Slide 14
Structured Prediction -> (Searn) -> Cost-sensitive Classification -> (Weighted All Pairs [Beygelzimer et al., ICML 2005]) -> Importance Weighted Classification -> (Costing [Zadrozny, Langford & Abe, ICDM 2003]) -> Binary Classification
Error bounds get composed
New techniques for binary classification lead to new techniques for structured prediction
Any generalization theory for binary classification immediately applies to structured prediction
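15 Two Easy Reductions 2.3 (Machine Learning: Reductions) Slide 15
Costing [Zadrozny, Langford & Abe, ICDM 2003]: importance weighted examples in $X \times \{\pm 1\} \times \mathbb{R}_{\geq 0}$ -> binary examples in $X \times \{\pm 1\}$ (classifiers h); $L_{IW} \leq L_{BC} \cdot E[w]$
Weighted All Pairs [Beygelzimer, Dani, Hayes, Langford and Zadrozny, ICML 2005]: cost-sensitive examples in $X \times \mathbb{R}_{\geq 0} \times \cdots \times \mathbb{R}_{\geq 0}$ -> importance weighted examples in $X \times \{\pm 1\} \times \mathbb{R}_{\geq 0}$ for the pairwise problems 1v2, 1v3, ..., 1vK, ..., 2vK, ...; $L_{CS} \leq 2 L_{IW}$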
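The Costing reduction can be sketched as rejection sampling: keep each importance weighted example with probability proportional to its weight, train an ordinary binary learner on what survives, and vote over several resampled models. This is a minimal sketch of that idea, not the authors' code. The helper `train_majority` is a hypothetical stand-in for any binary learner, and normalizing by the largest weight in the sample is an assumption.

```python
import random
from collections import defaultdict

def costing_sample(examples, rng):
    """examples: list of (x, y, w) with y in {-1, +1} and w >= 0.
    Keep each example with probability w / w_max, then drop the weight."""
    w_max = max(w for _, _, w in examples)
    if w_max == 0:
        return []
    return [(x, y) for x, y, w in examples if rng.random() < w / w_max]

def train_majority(sample):
    """Hypothetical binary learner: majority vote per input, ties go to +1."""
    tally = defaultdict(int)
    for x, y in sample:
        tally[x] += y
    return lambda x: 1 if tally[x] >= 0 else -1

def costing_ensemble(examples, train_binary, rounds=10, seed=0):
    """Train one binary model per resampled set and predict by voting."""
    rng = random.Random(seed)
    models = [train_binary(costing_sample(examples, rng)) for _ in range(rounds)]
    def predict(x):
        votes = sum(m(x) for m in models)
        return 1 if votes >= 0 else -1
    return predict

examples = [("a", 1, 10.0), ("a", -1, 1.0), ("b", -1, 10.0), ("b", 1, 0.0)]
predict = costing_ensemble(examples, train_majority)
print(predict("a"), predict("b"))
```

The binary learner never sees a weight; the weights only change which examples it is shown, which is what lets any off-the-shelf binary classifier be plugged in.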
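16 A First Attempt 3.3 (Search-based Structured Prediction) Slide 16
[Figure: search lattice for the sentence "I sing merrily", with candidate tags D, N, A, R, V at each word]
At each correct state: Train classifier to choose next state
Thm (Kääriäinen): There is a 1st order binary Markov problem such that a classification error $\epsilon$ implies a Hamming loss of:
$\frac{T}{2} - \frac{1 - (1 - 2\epsilon)^{T+1}}{4\epsilon} + \frac{1}{2} \approx \frac{T}{2}$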
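The compounding of errors behind this theorem can be checked numerically. The sketch below assumes a simple two-state model (an illustrative assumption, not taken from the slide): the labeler is either on track or off track, and each classification error, made with probability `eps`, flips that state. Summing the probability of being off track over T positions gives the expected Hamming loss, which approaches T/2 even for small `eps`.

```python
def expected_hamming(T, eps):
    """Sum over positions of P(off track at step t) under the assumed
    two-state chain that flips state with probability eps at every step."""
    p_wrong, total = 0.0, 0.0
    for _ in range(T):
        p_wrong = p_wrong * (1 - eps) + (1 - p_wrong) * eps
        total += p_wrong
    return total

def closed_form(T, eps):
    """Closed form of the same sum: T/2 - (1 - (1 - 2 eps)^(T+1)) / (4 eps) + 1/2."""
    return T / 2 - (1 - (1 - 2 * eps) ** (T + 1)) / (4 * eps) + 0.5

T, eps = 100, 0.05
print(expected_hamming(T, eps), closed_form(T, eps), T / 2)
```

With a 5% per-step error rate and 100 steps, both functions give a loss above 45 out of 100, which is the motivation for training on states the learned policy actually reaches rather than only on correct states.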
17 Reducing Structured Prediction (Searn: Optimal Policy) Slide 17
[Figure: search lattice for the sentence "I sing merrily", with candidate tags D, N, A, R, V at each word]
Key Assumption: Optimal Policy for training data
Given: input, true output and state; Return: best successor state
Weak!
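For left-to-right sequence labeling under Hamming loss, an optimal policy is easy to write down: whatever has been predicted so far, the best next move is the true label at the current position. The sketch below assumes that setting; the state representation (a tuple of labels chosen so far) is a choice made here for illustration.

```python
def optimal_policy(x, y_true, state):
    """Given input x, true output y_true and a state (labels chosen so far),
    return the best successor state under Hamming loss."""
    next_label = y_true[len(state)]
    return state + (next_label,)

x = ("I", "sing", "merrily")
y_true = ("P", "V", "R")

# The policy is defined even when earlier decisions were wrong.
after_mistake = optimal_policy(x, y_true, ("D",))

# Following it from the start recovers the true output.
state = ()
while len(state) < len(x):
    state = optimal_policy(x, y_true, state)
print(after_mistake, state)
```

The important property is the first one: the policy must answer from any state, including states reached through earlier mistakes, because training later visits such states.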
18 Idea Optimal Policy Learned Policy (Features) 3.4 (Searn: Training) Slide 18
19 Iterative Algorithm: Searn (Searn: Algorithm) Slide 19
Set current policy = optimal policy
Repeat:
  Decode using current policy
  Generate classification problems
  Learn new multiclass classifier
  Interpolate: cur = β*cur + (1 - β)*new, where β is the interpolation weight
[Figure: example decodings of "... after a German academic claimed..." (* * GPE PER * PER; ORG * * PER * * PER *), annotated with losses L = 0, L = 0.3, L = 0.2, L = 0.25]
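The loop above can be sketched end to end on a toy tagging problem. Everything concrete here is an assumption made for illustration: the `CostTable` learner, the two features (current word, previous label), Hamming loss, and a stochastic mixture in which `None` stands for the optimal policy. Costs for each action come from a single rollout of the current policy, so they are noisy estimates.

```python
import random
from collections import defaultdict

LABELS = ["D", "P", "N", "R", "V"]

def optimal_policy(x, y_true, prefix):
    # Under Hamming loss the best next label is the true one at this position.
    return y_true[len(prefix)]

def features(x, prefix):
    # Hypothetical features: current word and previously predicted label.
    return (x[len(prefix)], prefix[-1] if prefix else "<s>")

class CostTable:
    """Toy cost-sensitive learner: per feature, pick the label with least total cost."""
    def __init__(self, examples):
        self.cost = defaultdict(lambda: defaultdict(float))
        for feat, costs in examples:
            for label, c in costs.items():
                self.cost[feat][label] += c

    def predict(self, feat):
        if feat not in self.cost:
            return LABELS[0]  # arbitrary default for unseen features
        return min(LABELS, key=lambda a: self.cost[feat][a])

def act(policy, x, y_true, prefix, rng):
    # policy is a stochastic mixture: a list of (weight, classifier or None).
    r, acc = rng.random(), 0.0
    for weight, clf in policy:
        acc += weight
        if r < acc:
            break
    if clf is None:  # None stands for the optimal policy
        return optimal_policy(x, y_true, prefix)
    return clf.predict(features(x, prefix))

def rollout(policy, x, y_true, prefix, rng):
    prefix = list(prefix)
    while len(prefix) < len(x):
        prefix.append(act(policy, x, y_true, prefix, rng))
    return prefix

def hamming(y, y_true):
    return sum(a != b for a, b in zip(y, y_true))

def searn(data, iterations=4, beta=0.7, seed=0):
    rng = random.Random(seed)
    policy = [(1.0, None)]  # set current policy = optimal policy
    for _ in range(iterations):
        examples = []
        for x, y_true in data:
            path = rollout(policy, x, y_true, [], rng)  # decode with current policy
            for t in range(len(x)):
                prefix = path[:t]
                # Cost of each action: loss after taking it and finishing with cur.
                costs = {a: hamming(rollout(policy, x, y_true, prefix + [a], rng), y_true)
                         for a in LABELS}
                best = min(costs.values())
                examples.append((features(x, prefix),
                                 {a: c - best for a, c in costs.items()}))
        new = CostTable(examples)  # learn new multiclass classifier
        # Interpolate: cur = beta*cur + (1 - beta)*new
        policy = [(w * beta, c) for w, c in policy] + [(1 - beta, new)]
    return policy

def learned_only(policy):
    # Drop the optimal policy (unavailable at test time) and renormalize.
    parts = [(w, c) for w, c in policy if c is not None]
    total = sum(w for w, _ in parts)
    return [(w / total, c) for w, c in parts]

data = [(["I", "sing", "merrily"], ["P", "V", "R"]),
        (["the", "man", "ate"], ["D", "N", "V"])]
policy = searn(data)
final = learned_only(policy)
decoded = [rollout(final, x, None, [], random.Random(1)) for x, _ in data]
print(decoded)
```

The returned policy still contains the optimal policy with a shrinking weight; `learned_only` removes it because the true output is not available when decoding new inputs.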
20 Theoretical Analysis 3.5 (Searn: Theoretical Analysis) Slide 20
Theorem: For conservative $\beta$ (the interpolation weight), after $2T^3 \ln T$ iterations, the loss of the learned policy is bounded as follows:
$L(h) \leq L(h_0) + 2 T \ln T \, \ell_{avg} + (1 + \ln T) \frac{c_{max}}{T}$
$L(h_0)$: Loss of the optimal policy
$\ell_{avg}$: Average multiclass classification loss
$c_{max}$: Worst case per step loss
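To get a feel for how the bound scales, the right-hand side can be evaluated directly. The function and argument names below are chosen here for illustration: the loss of the optimal policy, the sequence length T, the average multiclass classification loss, and the worst case per step loss.

```python
import math

def searn_bound(loss_optimal, T, l_avg, c_max):
    """Right-hand side of the bound:
    L(h_0) + 2 T ln(T) l_avg + (1 + ln T) c_max / T."""
    return loss_optimal + 2 * T * math.log(T) * l_avg + (1 + math.log(T)) * c_max / T

# The classification term grows as T ln T, the worst-case term shrinks with T.
for T in (1, 10, 100):
    print(T, searn_bound(0.0, T, 0.01, 1.0))
```

The middle term dominates for long sequences, so the bound is only informative when the average classification loss is small relative to 1 / (T ln T).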
21 Why Does Searn Work? 3.7 (Searn: Discussion and Conclusions) Slide 21
A classifier tells us how to search
Impossible to train on full space
Want to train only where we'll wind up
Searn tells us where we'll wind up
22 Talk Outline Searn: Search based Structured Prediction Experimental Results: Sequence Labeling Entity Detection and Tracking Automatic Document Summarization Discussion and Future Work Slide 22
23 Proof of Concept: Sequence Labels (Sequence Labeling: Comparison) Slide 23
[Chart: systems compared on each task]
Spanish Named Entity Recognition: SVM ISO, CRF, Searn, Searn+data
Handwriting Recognition: LR, SVM, M3N, Searn, Searn+data
Chunking + Tagging: Incremental Perceptron, LaSO, CRF, Searn
24 Talk Outline Searn: Search based Structured Prediction Experimental Results: Sequence Labeling Entity Detection and Tracking Automatic Document Summarization Discussion and Future Work Slide 24
25 Entity Detection and Tracking (EDT: Joint Detection and Coreference: Search) Slide 25
Shakespeare scholarship, lively at the best of times, saw the fur flying yesterday after a German academic claimed to have authenticated not just one but four contemporary images of the playwright and suggested, to boot, that he had died of cancer.
[Figure: successive search steps over the fragment "... he had died of cancer"]
26 EDT Features 5.[45].3 (EDT: Feature Functions) Slide 26
Shakespeare scholarship, lively at the best of times, saw the fur flying yesterday after a German academic claimed to have authenticated not just one but four contemporary images of the playwright and suggested, to boot, that he had died of cancer.
Feature types: Lexical, Syntactic, Semantic, Gazetteer, Knowledge based
27 EDT Results (ACE 2004 data) (EDT: Experimental Results) Slide 27
[Chart comparing Sys1, Sys2, Sys3, Sys4 and Searn; annotation: "Previous state of the art with extra proprietary data"]
28 Feature Contributions (EDT: Experimental Results) Slide 28
[Chart over feature configurations: Lex, -Syn, -Sem, -Gaz, -KB]
29 Talk Outline Searn: Search based Structured Prediction Experimental Results: Sequence Labeling Entity Detection and Tracking Automatic Document Summarization Discussion and Future Work Slide 29
30 New Task: Summarization Slide 30
"Give me information about the Falkland islands war"
"That's too much information to read!"
"That's perfect!"
Argentina was still obsessed with the Falkland Islands even in 1994, 12 years after its defeat in the 74 day war with Britain. The country's overriding foreign policy aim continued to be winning sovereignty over the islands.
The Falkland islands war, in 1982, was fought between Britain and Argentina.
Standard approach is sentence extraction, but that is often deemed too coarse to produce good, very short summaries. We wish to also drop words and phrases => document compression
31 Structure of Search 6.1 (Summarization: Vine-Growth Model) Slide 31
Argentina was still obsessed with the Falkland Islands even in 1994, 12 years after its defeat in the 74 day war with Britain. The country's overriding foreign policy aim continued to be winning sovereignty over the islands.
To make search more tractable, we run an initial round of sentence extraction at 5x length
Lay sentences out sequentially
Generate a dependency parse of each sentence
Mark each root as a frontier node
Repeat:
  Choose a frontier node to add to the summary
  Add all its children to the frontier
Finish when we have enough words
Optimal Policy not analytically available; Approximated with beam search.
[Figure: sentences S1 S2 S3 S4 S5 ... Sn laid out in sequence, with the dependency tree of "The man ate a big sandwich"; legend: frontier node, summary node]
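The frontier-growing search can be sketched as follows. The input encoding (each sentence as a word list plus a list of head indices, with -1 marking the root) and the `choose` callback are assumptions made for this example; in the talk the choice is made by a learned policy, while here a trivial first-in-first-out rule stands in for it.

```python
def vine_growth(sentences, choose, max_words):
    """sentences: list of (words, heads), where heads[i] is the index of word i's
    parent in the dependency parse, or -1 for the root.
    choose: hypothetical policy that picks one node from the frontier.
    Returns the chosen (sentence index, word index) pairs in document order."""
    children, frontier = [], []
    for s, (words, heads) in enumerate(sentences):
        kids = {i: [] for i in range(len(words))}
        for i, h in enumerate(heads):
            if h >= 0:
                kids[h].append(i)
            else:
                frontier.append((s, i))  # mark each root as a frontier node
        children.append(kids)
    summary = []
    while frontier and len(summary) < max_words:
        node = choose(frontier, summary)  # choose a frontier node to add
        frontier.remove(node)
        summary.append(node)
        s, i = node
        frontier.extend((s, c) for c in children[s][i])  # add its children
    return sorted(summary)

# "The man ate a big sandwich": ate is the root; man and sandwich attach to ate.
words = ["The", "man", "ate", "a", "big", "sandwich"]
heads = [1, 2, -1, 5, 5, 2]
first_in_first_out = lambda frontier, summary: frontier[0]
picked = vine_growth([(words, heads)], first_in_first_out, max_words=3)
compressed = [words[i] for _, i in picked]
print(compressed)
```

Because a word can only enter the summary after its parent, every output is a connected top portion of each dependency tree, which keeps the compressed sentences close to grammatical.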
32 Example Output (40 word limit) 6.7 (Summarization: Error Analysis) Slide 32
Sentence Extraction + Compression: Argentina and Britain announced an agreement, nearly eight years after they fought a 74 day war a populated archipelago off Argentina's coast. Argentina gets out the red carpet, official royal visitor since the end of the Falklands war in
Vine Growth: Argentina and Britain announced to restore full ties, eight years after they fought a 74 day war over the Falkland islands. Britain invited Argentina's minister Cavallo to London in 1992 in the first official visit since the Falklands war in
Diplomatic ties restored 3 Falkland war was in Major cabinet member visits 3 Cavallo visited UK 5 Exchanges were in War was 74 days long 3 War between Britain and Argentina
33 Results (Summarization: Experimental Results) Slide 33 Approx. Optimal Vine Growth Vine Growth (Searn) Optimal Extract [Daumé III and Marcu, DUC 2005]
34 Talk Outline Searn: Search based Structured Prediction Experimental Results: Sequence Labeling Entity Detection and Tracking Automatic Document Summarization Discussion and Future Work Slide 34
35 What Can Searn Do? 7 (Conclusions and Future Directions) Slide 35
Solve structured prediction problems...
...efficiently.
...with theoretical guarantees.
...with weak assumptions on structure.
...and outperform the competition.
...with little extra code required.
36 What Can Searn Not Do? (Yet) 7 (Conclusions and Future Directions) Slide 36
Solve structured prediction problems...
...with only weak feedback.
...with hidden variables.
...with enormous branching factors.
...without pre-defined search spaces.
...with combinatorial outputs.
...with only an optimal path.
37 Thanks! Questions? Slide 37
38 BACKUP SLIDES Slide 38
39 Optimal Policies Slide 39
Diagram labels: Weak Feedback; Optimal Path?; Most NLP Problems; Approximately Optimal Policy; ?; Optimal Policy; Optimally Completable; Loss augmented Search; Normalized Search
Searn [Daumé III, Langford, Marcu, submitted to NIPS 2006]
Incremental Perceptron [Collins & Roark, EMNLP 2004]
LaSO [Daumé III, Marcu, ICML 2005]
Max margin Markov Networks [Taskar, Guestrin, Koller, NIPS 2003]
Conditional Random Fields [Lafferty, McCallum, Pereira, ICML 2001]
More informationLecture 6 Online and streaming algorithms for clustering
CSE 291: Unsupervised learning Spring 2008 Lecture 6 Online and streaming algorithms for clustering 6.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line
More informationData Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:
More informationLecture 10: Regression Trees
Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,
More informationInternational Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationMachine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel
More informationChapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
More informationArtificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen
Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen Date: 2010-01-11 Time: 08:15-11:15 Teacher: Mathias Broxvall Phone: 301438 Aids: Calculator and/or a Swedish-English dictionary Points: The
More informationMapReduce Approach to Collective Classification for Networks
MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty
More informationTekniker för storskalig parsning
Tekniker för storskalig parsning Diskriminativa modeller Joakim Nivre Uppsala Universitet Institutionen för lingvistik och filologi joakim.nivre@lingfil.uu.se Tekniker för storskalig parsning 1(19) Generative
More informationFrustratingly Easy Domain Adaptation
Frustratingly Easy Domain Adaptation Hal Daumé III School of Computing University of Utah Salt Lake City, Utah 84112 me@hal3.name Abstract We describe an approach to domain adaptation that is appropriate
More informationEmployer Health Insurance Premium Prediction Elliott Lui
Employer Health Insurance Premium Prediction Elliott Lui 1 Introduction The US spends 15.2% of its GDP on health care, more than any other country, and the cost of health insurance is rising faster than
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationPOMPDs Make Better Hackers: Accounting for Uncertainty in Penetration Testing. By: Chris Abbott
POMPDs Make Better Hackers: Accounting for Uncertainty in Penetration Testing By: Chris Abbott Introduction What is penetration testing? Methodology for assessing network security, by generating and executing
More informationConcepts of digital forensics
Chapter 3 Concepts of digital forensics Digital forensics is a branch of forensic science concerned with the use of digital information (produced, stored and transmitted by computers) as source of evidence
More informationCS 6740 / INFO 6300. Ad-hoc IR. Graduate-level introduction to technologies for the computational treatment of information in humanlanguage
CS 6740 / INFO 6300 Advanced d Language Technologies Graduate-level introduction to technologies for the computational treatment of information in humanlanguage form, covering natural-language processing
More informationSequential lmove Games. Using Backward Induction (Rollback) to Find Equilibrium
Sequential lmove Games Using Backward Induction (Rollback) to Find Equilibrium Sequential Move Class Game: Century Mark Played by fixed pairs of players taking turns. At each turn, each player chooses
More informationEmail Spam Detection Using Customized SimHash Function
International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 1, Issue 8, December 2014, PP 35-40 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Email
More informationDecision Trees for Mining Data Streams Based on the Gaussian Approximation
International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-3 E-ISSN: 2347-2693 Decision Trees for Mining Data Streams Based on the Gaussian Approximation S.Babu
More informationSearch Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
More informationAdaBoost. Jiri Matas and Jan Šochman. Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz
AdaBoost Jiri Matas and Jan Šochman Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz Presentation Outline: AdaBoost algorithm Why is of interest? How it works? Why
More informationBoosting. riedmiller@informatik.uni-freiburg.de
. Machine Learning Boosting Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de
More informationLinear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S
Linear smoother ŷ = S y where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S 2 Online Learning: LMS and Perceptrons Partially adapted from slides by Ryan Gabbard
More informationEquilibrium computation: Part 1
Equilibrium computation: Part 1 Nicola Gatti 1 Troels Bjerre Sorensen 2 1 Politecnico di Milano, Italy 2 Duke University, USA Nicola Gatti and Troels Bjerre Sørensen ( Politecnico di Milano, Italy, Equilibrium
More information