Machine Learning for natural language processing



Similar documents
Introduction. BM1 Advanced Natural Language Processing. Alexander Koller. 17 October 2014

Clustering Connectionist and Statistical Language Processing

Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov

Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles

Natural Language Processing. Today. Logistic Regression Models. Lecture 13 10/6/2015. Jim Martin. Multinomial Logistic Regression

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research

Applying Co-Training Methods to Statistical Parsing. Anoop Sarkar anoop/

CS 6740 / INFO Ad-hoc IR. Graduate-level introduction to technologies for the computational treatment of information in humanlanguage

Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer

Using Artificial Intelligence to Manage Big Data for Litigation

INF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning jtl@ifi.uio.no

NLP Programming Tutorial 5 - Part of Speech Tagging with Hidden Markov Models

Log-Linear Models a.k.a. Logistic Regression, Maximum Entropy Models

Machine Learning. CUNY Graduate Center, Spring Professor Liang Huang.

Course: Model, Learning, and Inference: Lecture 5

Machine Learning. Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos)

Effective Self-Training for Parsing

Attribution. Modified from Stuart Russell s slides (Berkeley) Parts of the slides are inspired by Dan Klein s lecture material for CS 188 (Berkeley)

Statistical Machine Translation

Machine Learning: Overview

Sentiment Analysis and Topic Classification: Case study over Spanish tweets

An Incrementally Trainable Statistical Approach to Information Extraction Based on Token Classification and Rich Context Models

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

Building a Question Classifier for a TREC-Style Question Answering System

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2

8. Machine Learning Applied Artificial Intelligence

Machine learning for algo trading

Simple Type-Level Unsupervised POS Tagging

The Seven Practice Areas of Text Analytics

Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors

Resolving Common Analytical Tasks in Text Databases

Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining.

RRSS - Rating Reviews Support System purpose built for movies recommendation

POS Tagging 1. POS Tagging. Rule-based taggers Statistical taggers Hybrid approaches

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Supervised Learning (Big Data Analytics)

Basic Parsing Algorithms Chart Parsing

Identifying Focus, Techniques and Domain of Scientific Papers

Facilitating Business Process Discovery using Analysis

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Classification algorithm in Data mining: An Overview

SYNTAX AND SEMANTICS OF CAUSAL DENN IN GERMAN TATJANA SCHEFFLER

Social Media Mining. Data Mining Essentials

Topological Field Chunking in German

Introduction to Pattern Recognition

Multi-Engine Machine Translation by Recursive Sentence Decomposition

Learning is a very general term denoting the way in which agents:

Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR

Leveraging ASEAN Economic Community through Language Translation Services

Social Media Text and Predictive Analytics

Classification of Natural Language Interfaces to Databases based on the Architectures

Language and Computation

Projektgruppe. Categorization of text documents via classification

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Search Engines Chapter 2 Architecture Felix Naumann

Hidden Markov Models Chapter 15

An Introduction to Data Mining

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

Machine Learning and Statistics: What s the Connection?

Applications of Deep Learning to the GEOINT mission. June 2015

Web Content Mining and NLP. Bing Liu Department of Computer Science University of Illinois at Chicago

Special Topics in Computer Science

Data Mining Yelp Data - Predicting rating stars from review text

Machine Learning with MATLAB David Willingham Application Engineer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r

German Language Resource Packet

Natural Language Processing

SENTIMENT ANALYSIS: A STUDY ON PRODUCT FEATURES

MS1b Statistical Data Mining

Detecting Forum Authority Claims in Online Discussions

Automatic Text Analysis Using Drupal

Semi-Supervised Learning for Blog Classification

An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines)

A chart generator for the Dutch Alpino grammar

Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D.

Topic Extrac,on from Online Reviews for Classifica,on and Recommenda,on (2013) R. Dong, M. Schaal, M. P. O Mahony, B. Smyth

Elena Chiocchetti & Natascia Ralli (EURAC) Tanja Wissik & Vesna Lušicky (University of Vienna)

: Introduction to Machine Learning Dr. Rita Osadchy

Workshop. Neil Barrett PhD, Jens Weber PhD, Vincent Thai MD. Engineering & Health Informa2on Science

Text Processing with Hadoop and Mahout Key Concepts for Distributed NLP

ONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS

Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Sentiment analysis of Twitter microblogging posts. Jasmina Smailović Jožef Stefan Institute Department of Knowledge Technologies

MACHINE LEARNING IN HIGH ENERGY PHYSICS

Machine Translation and the Translator

Fine-grained German Sentiment Analysis on Social Media

Phrase-Based MT. Machine Translation Lecture 7. Instructor: Chris Callison-Burch TAs: Mitchell Stern, Justin Chiu. Website: mt-class.

Exemplar for Internal Achievement Standard. German Level 1

AP WORLD LANGUAGE AND CULTURE EXAMS 2012 SCORING GUIDELINES

Lernsituation 9. Giving information on the phone. 62 Lernsituation 9 Giving information on the phone

Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.

Chapter ML:XI (continued)

Probabilistic topic models for sentiment analysis on the Web

Outline of today s lecture

Chapter 8. Final Results on Dutch Senseval-2 Test Data

Sentiment Analysis of Twitter Data

Multi-Class and Structured Classification

Transcription:

Machine Learning for natural language processing Introduction Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 13

Introduction Goal of machine learning: Automatically learn how to assign structure to data. Structure can be for instance a class label, a POS tag, syntactic structure or semantic structure. We learn from data and then apply the resulting model to new data. Most of the course is based on Jurafsky & Martin (2015); Manning et al. (2008) 2 / 13

Table of contents 1 Applications of ML in NLP 2 Supervised vs. unsupervised methods 3 Topics 3 / 13

Applications Language Models (LM) are probabilistic models of sentences in a given language, for instance probabilistic models of German sentences. They give us the probability that we encounter a given sentence in German text. Language Models and Machine Translation LMs are used in various contexts, for instance in machine translation (MT), where a LM selects a translation among a list of alternatives output by the translation model. (1) a. Paul a des projets magnifiques pour embellir son appartement. b. Paul hat Projekte wunderbare um zu verschönern seine Wohnung. c. Paul hat wunderbare Projekte um zu verschönern seine Wohnung. d. Paul hat Projekte wunderbare um seine Wohnung zu verschönern. e. Paul hat wunderbare Projekte um seine Wohnung zu verschönern. 4 / 13

Applications Various text classification tasks, for example according to topic, authorship, sentiment, etc. Sentiment analysis Goal: Build a classifier that assigns a class c {positive, negative, neutral} to documents, for instance to comments on films: c negative negative negative neutral neutral positive positive positive document total langweilig auf den Film kann man echt verzichten ich habe selten so etwas schlechtes gesehen ich habe schon bessere Filme gesehen kann man angucken super Film sicher einer der besten Filme dieses Sommers gute Schauspieler und eine gute Story 5 / 13

Applications Tasks assigning categories to words, for instance Part-of-Speech (POS) Tagging POS Tagging Goal: assign POS tags to words in a text. For example: Pierre Vinken, 61 years old, will join the board NNP NNP, CD NNS JJ, MD VB DT NN as a nonexecutive director Nov. 29. IN DT JJ NN NNP CD. 6 / 13

Applications Tasks assigning structure to sentences, for instance dependency parsing Dependency Parsing Goal: Create dependencies (directed edges) between words in a sentence and assign dependency labels to these edges. PROAV VMFIN VVPP VAINF Darüber muss nachgedacht werden root pp aux aux r PROAV VMFIN VVPP VAINF Darüber muss nachgedacht werden 7 / 13

Supervised vs. unsupervised methods ML approaches can be distinguished with respect to the type of training data and with respect to the type of model that is learned. Important distinction concerning training data: Learning can be based on data containing (parts of) the structure we aim at ((semi-)supervised learning) or on data without any information about structure (unsupervised learning). 8 / 13

Supervised vs. unsupervised methods Some examples: Supervised learning POS tagging is mostly done with supervised approaches:traning data:... das muss man jetzt machen....... NN VMFIN NN AV VAINF PUNCT... Goal: assign the same types of POS tags to words in new data. 9 / 13

Supervised vs. unsupervised methods Semisupervised learning Constituency parsing with latent variables Petrov et al. (2006): Traning data:... X... X X Det N V X the man loves Det N the woman Goal: Learn a probabilistic CFG that generates the same type of structures but with more specific labels instead of X. 10 / 13

Supervised vs. unsupervised methods Unsupervised learning Word sense induction: Create vector representations for words in context and cluster them into word senses. Input: Raw text. Output: pear banana orange apple hardware processor machine software computer In this course, we are mainly concerned with supervised learning. 11 / 13

Topics Topics to be covered in the course: N-Grams and Language Models Classification: Naive Bayes Classification: k Nearest Neighbors Vector Semantics Logistic Regression: MaxEnt Hidden-Markov Models (HMM) POS Tagging EM algorithm... 12 / 13

References Jurafsky, Daniel & James H. Martin. 2015. Speech and language processing. an introduction to natural language processing, computational linguistics, and speech recognition. Draft of the 3rd edition. Manning, Christopher D., Prabhakar Raghavan & Hinrich Schütze. 2008. Introduction to information retrieval. Cambridge University Press. Petrov, Slav, Leon Barrett, Romain Thibaux & Dan Klein. 2006. Learning Accurate, Compact, and Interpretable Tree Annotation. In Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the acl, 433 440. Sydney. 13 / 13