An Introduction to Language Processing with Perl and Prolog



Similar documents
Special Topics in Computer Science

Voice Driven Animation System

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg

Ling 201 Syntax 1. Jirka Hana April 10, 2006

Natural Language to Relational Query by Using Parsing Compiler

NATURAL LANGUAGE TO SQL CONVERSION SYSTEM

31 Case Studies: Java Natural Language Tools Available on the Web

Quick Start Guide: Read & Write 11.0 Gold for PC

International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November ISSN

Endowing a virtual assistant with intelligence: a multi-paradigm approach

Natural Language Database Interface for the Community Based Monitoring System *

A Machine Translation System Between a Pair of Closely Related Languages

How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.

Processing: current projects and research at the IXA Group

Overview of MT techniques. Malek Boualem (FT)

Moving Enterprise Applications into VoiceXML. May 2002

Morphology. Morphology is the study of word formation, of the structure of words. 1. some words can be divided into parts which still have meaning

Lesson 7. Les Directions (Disc 1 Track 12) The word crayons is similar in English and in French. Oui, mais où sont mes crayons?

Semantic analysis of text and speech

1/20/2016 INTRODUCTION

Sample Test Questions

Parsing Technology and its role in Legacy Modernization. A Metaware White Paper

NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR

Paraphrasing controlled English texts

The Lexicon. The Lexicon. The Lexicon. The Significance of the Lexicon. Australian? The Significance of the Lexicon 澳 大 利 亚 奥 地 利

Psychology G4470. Psychology and Neuropsychology of Language. Spring 2013.

Language Meaning and Use

Design Grammars for High-performance Speech Recognition

VoiceXML-Based Dialogue Systems

Develop Software that Speaks and Listens

Introduction. Philipp Koehn. 28 January 2016

Noam Chomsky: Aspects of the Theory of Syntax notes

NLUI Server User s Guide

Syntactic Theory. Background and Transformational Grammar. Dr. Dan Flickinger & PD Dr. Valia Kordoni

TRANSLATION OF TELUGU-MARATHI AND VICE- VERSA USING RULE BASED MACHINE TRANSLATION

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

The SYSTRAN Linguistics Platform: A Software Solution to Manage Multilingual Corporate Knowledge

Year 1 reading expectations (New Curriculum) Year 1 writing expectations (New Curriculum)

Survey Results: Requirements and Use Cases for Linguistic Linked Data

Building a Question Classifier for a TREC-Style Question Answering System

A picture is worth five captions

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System

Comprendium Translator System Overview

Rethinking the relationship between transitive and intransitive verbs

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Semester Review. CSC 301, Fall 2015

USABILITY OF A FILIPINO LANGUAGE TOOLS WEBSITE

Syntax: Phrases. 1. The phrase

Text-To-Speech Technologies for Mobile Telephony Services

Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words

A Knowledge-based System for Translating FOL Formulas into NL Sentences

Text Mining - Scope and Applications

Professional Organization Checklist for the Computer Science Curriculum Updates. Association of Computing Machinery Computing Curricula 2008

CSCI 3136 Principles of Programming Languages

Compilers. Introduction to Compilers. Lecture 1. Spring term. Mick O Donnell: michael.odonnell@uam.es Alfonso Ortega: alfonso.ortega@uam.

Web 3.0 image search: a World First

A Case Study of Question Answering in Automatic Tourism Service Packaging

Introduction to Semantics. A Case Study in Semantic Fieldwork: Modality in Tlingit

Standard Languages for Developing Multimodal Applications

Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery

INCREASE YOUR PRODUCTIVITY WITH CELF 4 SOFTWARE! SAMPLE REPORTS. To order, call , or visit our Web site at

Automatic Text Analysis Using Drupal

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic

Reading Competencies

California Treasures High-Frequency Words Scope and Sequence K-3

Datavetenskapligt Program (kandidat) Computer Science Programme (master)

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari

Grammars and introduction to machine learning. Computers Playing Jeopardy! Course Stony Brook University

Historical Linguistics. Diachronic Analysis. Two Approaches to the Study of Language. Kinds of Language Change. What is Historical Linguistics?

CIKM 2015 Melbourne Australia Oct. 22, 2015 Building a Better Connected World with Data Mining and Artificial Intelligence Technologies

Modern Databases. Database Systems Lecture 18 Natasha Alechina

CHARTES D'ANGLAIS SOMMAIRE. CHARTE NIVEAU A1 Pages 2-4. CHARTE NIVEAU A2 Pages 5-7. CHARTE NIVEAU B1 Pages CHARTE NIVEAU B2 Pages 11-14

PARALLEL STRUCTURE S-10

Outline of today s lecture

Functional Auditory Performance Indicators (FAPI)

Culture and Language. What We Say Influences What We Think, What We Feel and What We Believe

Word Completion and Prediction in Hebrew

Hacking the Mind: NLP and Influence. by Mystic

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Types of meaning. KNOWLEDGE: the different types of meaning that items of lexis can have and the terms used to describe these

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast

Source Code Translation

CS4025: Pragmatics. Resolving referring Expressions Interpreting intention in dialogue Conversational Implicature

Context Grammar and POS Tagging

GED Language Arts, Writing Lesson 1: Noun Overview Worksheet

Fry Phrases Set 1. TeacherHelpForParents.com help for all areas of your child s education

COMPUTER TECHNOLOGY IN TEACHING READING

Guide to the essentials of creating accessible PDFs with Microsoft Word and Acrobat Professional 8

The Prolog Interface to the Unstructured Information Management Architecture

Creating and Implementing Conversational Agents

Transcription:

An Introduction to Language Processing with Perl and Prolog Pierre Nugues Lund University Pierre.Nugues@cs.lth.se http://www.cs.lth.se/home/pierre_nugues/ Pierre Nugues An Introduction to Language Processing with Perl and Prolog 1 / 19

Applications of Language Processing Spelling and grammatical checkers: MS Word Text indexing and information retrieval on the Internet: Google, Microsoft Bing, Yahoo Telephone information that understands some spoken questions: SJ (trains in Sweden) or Tellme.com in the United States Speech dictation of letters or reports: IBM ViaVoice, Windows Vista Translation: Google Translate, SYSTRAN Pierre Nugues An Introduction to Language Processing with Perl and Prolog 2 / 19

Applications of Language Processing (ctn d) Direct translation from spoken English to spoken Swedish in a restricted domain: SRI and SICS Voice control of domestic devices such as tape recorders: Philips or disc changers: MS Persona Conversational agents able to dialogue and to plan: TRAINS Spoken navigation in virtual worlds: Ulysse, Higgins Generation of 3D scenes from text: Carsim Pierre Nugues An Introduction to Language Processing with Perl and Prolog 3 / 19

Linguistics Layers Sounds Phonemes Words and morphology Syntax and functions Semantics Dialogue Pierre Nugues An Introduction to Language Processing with Perl and Prolog 4 / 19

Sounds and Phonemes Serious C est par là It is that way Pierre Nugues An Introduction to Language Processing with Perl and Prolog 5 / 19

Lexicon and Parts of Speech The big cat ate the gray mouse The/article big/adjective cat/noun ate/verb the/article gray/adjective mouse/noun Le/article gros/adjectif chat/nom mange/verbe la/article souris/nom grise/adjectif Die/Artikel große/adjektiv Katze/Substantiv ißt/verb die/artikel graue/adjektiv Maus/Substantiv Pierre Nugues An Introduction to Language Processing with Perl and Prolog 6 / 19

Morphology Word worked travaillé gearbeitet Root form to work + verb + preterit travailler + verb + past participle arbeiten + verb + past participle Pierre Nugues An Introduction to Language Processing with Perl and Prolog 7 / 19

Syntactic Tree sentence noun phrase verb phrase article noun verb noun phrase article noun The boy hit the ball Pierre Nugues An Introduction to Language Processing with Perl and Prolog 8 / 19

Syntax: A Classical View A graph of dependencies and functions Verb Subject Object The boy hit the ball Pierre Nugues An Introduction to Language Processing with Perl and Prolog 9 / 19

Semantics As opposed to syntax: 1 Colorless green ideas sleep furiously. 2 *Furiously sleep ideas green colorless. Determining the logical form: Sentence Frank is writing notes François écrit des notes Franz schreibt Notizen Logical representation writing(frank, notes). écrit(françois, notes). schreibt(franz, Notizen). Pierre Nugues An Introduction to Language Processing with Perl and Prolog 10 / 19

Lexical Semantics Word senses: 1 note (noun) short piece of writing; 2 note (noun) a single sound at a particular level; 3 note (noun) a piece of paper money; 4 note (verb) to take notice of; 5 note (noun) of note: of importance. Pierre Nugues An Introduction to Language Processing with Perl and Prolog 11 / 19

Reference 1. sentence 2. logical representation Pierre wrote notes wrote(pierre, notes) 3. real world referencing referencing Louis Pierre Charlotte operating systems computational linguistics Prolog programming Pierre Nugues An Introduction to Language Processing with Perl and Prolog 12 / 19

Ambiguity Many analyses are ambiguous. It makes language processing difficult. Ambiguity occurs in any layer: speech recognition, part-of-speech tagging, parsing, etc. Example of an ambiguous phonetic transcription: The boys eat the sandwiches That may correspond to: The boy seat the sandwiches; the boy seat this and which is; the buoys eat the sand which is Pierre Nugues An Introduction to Language Processing with Perl and Prolog 13 / 19

Models and Tools Linguistics has produced an impressive set of theories and models Language processing requires significant resources Models and tools have matured. Resources are available. Tools involve notably finite-state automata, regular expressions, rewriting rules, logic, statistics and machine learning. Pierre Nugues An Introduction to Language Processing with Perl and Prolog 14 / 19

The Carsim System: A Text-to-Scene Converter Texts XML Templates 3D Animation Véhicule B venant de ma gauche, je me trouve dans le carrefour, à faible vitesse environ 40 km/h, quand le véhicule B, percute mon véhicule, et me refuse la priorité à droite. Le premier choc atteint mon aile arrière gauche, // Static Objects STATIC [ ROAD TREE ] // Dynamic Objects DYNAMIC [ VEHICLE [ ID = vehicule b; INITDIRECTION = east; = = NLP engine Java 3D animation program Pierre Nugues An Introduction to Language Processing with Perl and Prolog 15 / 19

Dialogue: The Persona Project from Microsoft Research A conversation with Peedy Turn Utterance [Peedy is asleep on his perch] User: Good morning, Peedy. [Peedy rouses] Peedy: Good morning. User: Let s do a demo. [Peedy stands up, smiles] Peedy: Your wish is my command, what would you like to hear? User: What have you got by Bonnie Raitt? [Peedy waves in a stream of notes, and grabs one as they rush by.] Peedy: I have The Bonnie Raitt Collection from 1990. User: Pick something from that Peedy: How about Angel from Montgomery? Pierre Nugues An Introduction to Language Processing with Perl and Prolog 16 / 19

Dialogue: The Persona Project from Microsoft Research User: Peedy: User: Peedy: User: Peedy: User: Peedy: User: Sounds good. [Peedy drops note on pile] OK. Play some rock after that. [Peedy scans the notes again, selects one] How about Fools in love? Who wrote that? [Peedy cups one wing to his ear ] Huh? Who wrote that? [Peedy looks up, scrunches his brow] Joe Jackson Fine. [Drops note on pile] Pierre Nugues An Introduction to Language Processing with Perl and Prolog 17 / 19

Persona System Architecture Source: http: //research.microsoft.com/research/pubs/view.aspx?pubid=439 Pierre Nugues An Introduction to Language Processing with Perl and Prolog 18 / 19

Research Relevance Large companies like Microsoft, Google, Yahoo, IBM, or Xerox have a research activity in natural language processing. The 7th European framework program (2007-2013) names six technology pillars in information technologies. Two of them are related to language processing: Knowledge, cognitive and learning systems: semantic systems; capturing and exploiting knowledge embedded in web and multimedia content; bio-inspired artificial systems that perceive, understand, learn and evolve, and act autonomously; learning by convivial machines and humans based on a better understanding of human cognition. Simulation, visualization, interaction and mixed realities: tools for innovative design and creativity in products, services and digital media, and for natural, language-enabled and context-rich interaction and communication. Pierre Nugues An Introduction to Language Processing with Perl and Prolog 19 / 19