Effective Self-Training for Parsing
|
|
|
- Grant Garey Wells
- 9 years ago
- Views:
Transcription
1 Effective Self-Training for Parsing David McClosky Brown Laboratory for Linguistic Information Processing (BLLIP) Joint work with Eugene Charniak and Mark Johnson David McClosky - [email protected] - NAACL
2 Parsing David McClosky - [email protected] - NAACL
3 Parsing I need a sentence with ambiguity. David McClosky - [email protected] - NAACL
4 Parsing S NP VP. PRP VBP NP. I need NP PP DT NN IN NP a sentence with NN ambiguity I need a sentence with ambiguity. David McClosky - [email protected] - NAACL
5 Parsing s is a sentence π is a parse tree parse(s) = arg max π p(π s) such that yield(π) = s David McClosky - [email protected] - NAACL
6 Flow Chart David McClosky - [email protected] - NAACL
7 Flow Chart David McClosky - [email protected] - NAACL
8 n-best parsing S NP VP. PRP VBP NP. p(π 1 ) = I need NP PP a sentence with ambiguity S NP VP. PRP VBP NP PP. p(π 2 ) = I need a sentence with ambiguity David McClosky - [email protected] - NAACL
9 Reranking Parsers Best parses are not always first, but the correct parse is often in the top 50 Rerankers rescore parses from the n-best parser using more complex (not necessarily context-free) features Oracle rerankers on the Charniak parser s 50-best list can achieve over 95% f-score David McClosky - [email protected] - NAACL
10 Flow Chart David McClosky - [email protected] - NAACL
11 Our reranking parser Parser and reranker as described in Charniak and Johnson (ACL 2005) with new features Lexicalized context-free generative parser, maximum entropy discriminative reranker New reranking features improve reranking parser s performance by 0.3% on section 23 over ACL 2005 David McClosky - [email protected] - NAACL
12 Unlabelled data Question: Can we improve the reranking parser with cheap unlabeled data? David McClosky - [email protected] - NAACL
13 Unlabelled data Question: Can we improve the reranking parser with cheap unlabeled data? Self-training Co-training Clustering n-grams, use clusters as general class of n-grams Improve vocabulary, n-gram language model etc. David McClosky - [email protected] - NAACL
14 Self-training Train model from labeled data train reranking parser on WSJ Use model to annotate unlabeled data use model to parse NANC Combine annotated data with labeled training data merge WSJ training data with parsed NANC data Train a new model from the combined data train reranking parser on WSJ+NANC data Optional: repeat with new model on more unlabeled data David McClosky - [email protected] - NAACL
15 Flow Chart David McClosky - [email protected] - NAACL
16 Previous work Parsing: Charniak (1997), confirmed by Steedman et al. (2003) insignificant improvement Part of speech tagging: Clark et al. (2003) minor improvement/damage depending on amount of training data Parser adaptation: Bacchiani et al. (2006) helps when parsing WSJ when training on Brown corpus and self-training on news data David McClosky - [email protected] - NAACL
17 Experiments (overview) How should we annotate data? (parser or reranking parser) How much unlabelled data should we label? How should we combine annotated unlabeled data with true data? David McClosky - [email protected] - NAACL
18 Annotating unlabeled data Annotator Sentences added Parser Reranking parser 0 (baseline) k k ,000k ,500k ,000k 91.0 Parser (not reranking parser) f-scores on all sentences in section 22 David McClosky - [email protected] - NAACL
19 Annotating unlabeled data WSJ Section Sentences added (baseline) k k ,000k ,000k Reranking parser f-scores for all sentences David McClosky - [email protected] - NAACL
20 Weighting WSJ data Wall Street Journal data is more reliable than the self-trained data Multiply each event in Wall Street Journal data by a constant to give it a higher relative weight events = c events wsj + events nanc Increasing WSJ weight tends to improve f-scores. Based on development data, our best model is WSJ 5+1,750k sentences from NANC David McClosky - [email protected] - NAACL
21 Evaluation on test section Model f parser f reranker Charniak and Johnson (2005) 91.0 Current baseline Self-trained f-scores from all sentences in WSJ section 23 David McClosky - [email protected] - NAACL
22 The Story So Far... Retraining parser on its own output doesn t help Retraining parser on the reranker s output helps Retraining reranker on the reranker s output doesn t help David McClosky - [email protected] - NAACL
23 Analysis: Global changes Oracle f-scores increase, self-trained parser has greater potential Model 1-best 10-best 50-best Baseline WSJ k WSJ 5 + 1,750k Average of log 2 Pr(1-best) Pr(50th-best) increases from 12.0 (baseline parser) to 14.1 (self-trained parser) David McClosky - [email protected] - NAACL
24 Sentence-level Analysis Better No change Worse Better No change Worse Number of sentences (smoothed) Number of sentences Sentence length Unknown words Better No change Worse Better No change Worse Number of sentences Number of sentences Number of CCs Number of INs David McClosky - [email protected] - NAACL
25 Effect of Sentence Length Number of sentences (smoothed) Better No change Worse Sentence length David McClosky - [email protected] - NAACL
26 The Goldilocks Effect TM Number of sentences (smoothed) Better No change Worse Sentence length David McClosky - [email protected] - NAACL
27 ... and... Number of sentences Better No change Worse Number of CCs David McClosky - [email protected] - NAACL
28 Ongoing work Parser adaptation (McClosky, Charniak, and Johnson ACL 2006) Sentence selection Clustering local trees Other ways of combining data David McClosky - [email protected] - NAACL
29 Conclusions Self-training can improve on state-of-the-art parsing for Wall Street Journal Reranking parsers can self-train their first stage parser More analysis is needed to understand why reranking is necessary Self-trained reranking parser available from: ftp://ftp.cs.brown.edu/pub/nlparser David McClosky - [email protected] - NAACL
30 Acknowledgements This work was supported by NSF grants LIS , and IIS , and DARPA GALE contract HR Thanks to Michael Collins, Brian Roark, James Henderson, Miles Osborne, and the BLLIP team for their comments. Questions? David McClosky - [email protected] - NAACL
Domain Adaptation for Dependency Parsing via Self-training
Domain Adaptation for Dependency Parsing via Self-training Juntao Yu 1, Mohab Elkaref 1, Bernd Bohnet 2 1 University of Birmingham, Birmingham, UK 2 Google, London, UK 1 {j.yu.1, m.e.a.r.elkaref}@cs.bham.ac.uk,
Applying Co-Training Methods to Statistical Parsing. Anoop Sarkar http://www.cis.upenn.edu/ anoop/ [email protected]
Applying Co-Training Methods to Statistical Parsing Anoop Sarkar http://www.cis.upenn.edu/ anoop/ [email protected] 1 Statistical Parsing: the company s clinical trials of both its animal and human-based
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems cation systems. For example, NLP could be used in Question Answering (QA) systems to understand users natural
Statistical Machine Translation
Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language
DEPENDENCY PARSING JOAKIM NIVRE
DEPENDENCY PARSING JOAKIM NIVRE Contents 1. Dependency Trees 1 2. Arc-Factored Models 3 3. Online Learning 3 4. Eisner s Algorithm 4 5. Spanning Tree Parsing 6 References 7 A dependency parser analyzes
Log-Linear Models a.k.a. Logistic Regression, Maximum Entropy Models
Log-Linear Models a.k.a. Logistic Regression, Maximum Entropy Models Natural Language Processing CS 6120 Spring 2014 Northeastern University David Smith (some slides from Jason Eisner and Dan Klein) summary
Evalita 09 Parsing Task: constituency parsers and the Penn format for Italian
Evalita 09 Parsing Task: constituency parsers and the Penn format for Italian Cristina Bosco, Alessandro Mazzei, and Vincenzo Lombardo Dipartimento di Informatica, Università di Torino, Corso Svizzera
Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov
Search and Data Mining: Techniques Text Mining Anya Yarygina Boris Novikov Introduction Generally used to denote any system that analyzes large quantities of natural language text and detects lexical or
POS Tagsets and POS Tagging. Definition. Tokenization. Tagset Design. Automatic POS Tagging Bigram tagging. Maximum Likelihood Estimation 1 / 23
POS Def. Part of Speech POS POS L645 POS = Assigning word class information to words Dept. of Linguistics, Indiana University Fall 2009 ex: the man bought a book determiner noun verb determiner noun 1
Machine Learning for natural language processing
Machine Learning for natural language processing Introduction Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 13 Introduction Goal of machine learning: Automatically learn how to
Chapter 8. Final Results on Dutch Senseval-2 Test Data
Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised
Classification of Natural Language Interfaces to Databases based on the Architectures
Volume 1, No. 11, ISSN 2278-1080 The International Journal of Computer Science & Applications (TIJCSA) RESEARCH PAPER Available Online at http://www.journalofcomputerscience.com/ Classification of Natural
Language Model of Parsing and Decoding
Syntax-based Language Models for Statistical Machine Translation Eugene Charniak ½, Kevin Knight ¾ and Kenji Yamada ¾ ½ Department of Computer Science, Brown University ¾ Information Sciences Institute,
Benefits of HPC for NLP besides big data
Benefits of HPC for NLP besides big data Barbara Plank Center for Sprogteknologie (CST) University of Copenhagen, Denmark http://cst.dk/bplank Web-Scale Natural Language Processing in Northern Europe Oslo,
Sense-Tagging Verbs in English and Chinese. Hoa Trang Dang
Sense-Tagging Verbs in English and Chinese Hoa Trang Dang Department of Computer and Information Sciences University of Pennsylvania [email protected] October 30, 2003 Outline English sense-tagging
The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006
The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 Yidong Chen, Xiaodong Shi Institute of Artificial Intelligence Xiamen University P. R. China November 28, 2006 - Kyoto 13:46 1
Factored Translation Models
Factored Translation s Philipp Koehn and Hieu Hoang [email protected], [email protected] School of Informatics University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW Scotland, United Kingdom
31 Case Studies: Java Natural Language Tools Available on the Web
31 Case Studies: Java Natural Language Tools Available on the Web Chapter Objectives Chapter Contents This chapter provides a number of sources for open source and free atural language understanding software
An End-to-End Discriminative Approach to Machine Translation
An End-to-End Discriminative Approach to Machine Translation Percy Liang Alexandre Bouchard-Côté Dan Klein Ben Taskar Computer Science Division, EECS Department University of California at Berkeley Berkeley,
Object-Extraction and Question-Parsing using CCG
Object-Extraction and Question-Parsing using CCG Stephen Clark and Mark Steedman School of Informatics University of Edinburgh 2 Buccleuch Place, Edinburgh, UK stevec,steedman @inf.ed.ac.uk James R. Curran
POS Tagging 1. POS Tagging. Rule-based taggers Statistical taggers Hybrid approaches
POS Tagging 1 POS Tagging Rule-based taggers Statistical taggers Hybrid approaches POS Tagging 1 POS Tagging 2 Words taken isolatedly are ambiguous regarding its POS Yo bajo con el hombre bajo a PP AQ
An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines)
An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines) James Clarke, Vivek Srikumar, Mark Sammons, Dan Roth Department of Computer Science, University of Illinois, Urbana-Champaign.
Open Domain Information Extraction. Günter Neumann, DFKI, 2012
Open Domain Information Extraction Günter Neumann, DFKI, 2012 Improving TextRunner Wu and Weld (2010) Open Information Extraction using Wikipedia, ACL 2010 Fader et al. (2011) Identifying Relations for
Multi-Engine Machine Translation by Recursive Sentence Decomposition
Multi-Engine Machine Translation by Recursive Sentence Decomposition Bart Mellebeek Karolina Owczarzak Josef van Genabith Andy Way National Centre for Language Technology School of Computing Dublin City
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati
Shallow Parsing with Apache UIMA
Shallow Parsing with Apache UIMA Graham Wilcock University of Helsinki Finland [email protected] Abstract Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic
A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model
A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model Yue Zhang and Stephen Clark University of Cambridge Computer Laboratory William Gates Building, 15 JJ Thomson
Micro blogs Oriented Word Segmentation System
Micro blogs Oriented Word Segmentation System Yijia Liu, Meishan Zhang, Wanxiang Che, Ting Liu, Yihe Deng Research Center for Social Computing and Information Retrieval Harbin Institute of Technology,
Introduction. BM1 Advanced Natural Language Processing. Alexander Koller. 17 October 2014
Introduction! BM1 Advanced Natural Language Processing Alexander Koller! 17 October 2014 Outline What is computational linguistics? Topics of this course Organizational issues Siri Text prediction Facebook
Systematic Comparison of Professional and Crowdsourced Reference Translations for Machine Translation
Systematic Comparison of Professional and Crowdsourced Reference Translations for Machine Translation Rabih Zbib, Gretchen Markiewicz, Spyros Matsoukas, Richard Schwartz, John Makhoul Raytheon BBN Technologies
Automatic slide assignation for language model adaptation
Automatic slide assignation for language model adaptation Applications of Computational Linguistics Adrià Agustí Martínez Villaronga May 23, 2013 1 Introduction Online multimedia repositories are rapidly
Translating the Penn Treebank with an Interactive-Predictive MT System
IJCLA VOL. 2, NO. 1 2, JAN-DEC 2011, PP. 225 237 RECEIVED 31/10/10 ACCEPTED 26/11/10 FINAL 11/02/11 Translating the Penn Treebank with an Interactive-Predictive MT System MARTHA ALICIA ROCHA 1 AND JOAN
#hardtoparse: POS Tagging and Parsing the Twitterverse
Analyzing Microtext: Papers from the 2011 AAAI Workshop (WS-11-05) #hardtoparse: POS Tagging and Parsing the Twitterverse Jennifer Foster 1, Özlem Çetinoğlu 1, Joachim Wagner 1, Joseph Le Roux 2 Stephen
Leveraging Frame Semantics and Distributional Semantic Polieties
Leveraging Frame Semantics and Distributional Semantics for Unsupervised Semantic Slot Induction in Spoken Dialogue Systems (Extended Abstract) Yun-Nung Chen, William Yang Wang, and Alexander I. Rudnicky
D2.4: Two trained semantic decoders for the Appointment Scheduling task
D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive
Less Grammar, More Features
Less Grammar, More Features David Hall Greg Durrett Dan Klein Computer Science Division University of California, Berkeley {dlwh,gdurrett,klein}@cs.berkeley.edu Abstract We present a parser that relies
A Mixed Trigrams Approach for Context Sensitive Spell Checking
A Mixed Trigrams Approach for Context Sensitive Spell Checking Davide Fossati and Barbara Di Eugenio Department of Computer Science University of Illinois at Chicago Chicago, IL, USA [email protected], [email protected]
SWIFT Aligner, A Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer
SWIFT Aligner, A Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer Timur Gilmanov, Olga Scrivner, Sandra Kübler Indiana University
Outline of today s lecture
Outline of today s lecture Generative grammar Simple context free grammars Probabilistic CFGs Formalism power requirements Parsing Modelling syntactic structure of phrases and sentences. Why is it useful?
A chart generator for the Dutch Alpino grammar
June 10, 2009 Introduction Parsing: determining the grammatical structure of a sentence. Semantics: a parser can build a representation of meaning (semantics) as a side-effect of parsing a sentence. Generation:
A Framework-based Online Question Answering System. Oliver Scheuer, Dan Shen, Dietrich Klakow
A Framework-based Online Question Answering System Oliver Scheuer, Dan Shen, Dietrich Klakow Outline General Structure for Online QA System Problems in General Structure Framework-based Online QA system
Modeling coherence in ESOL learner texts
University of Cambridge Computer Lab Building Educational Applications NAACL 2012 Outline 1 2 3 4 The Task: Automated Text Scoring (ATS) ATS systems Discourse coherence & cohesion The Task: Automated Text
Basic Parsing Algorithms Chart Parsing
Basic Parsing Algorithms Chart Parsing Seminar Recent Advances in Parsing Technology WS 2011/2012 Anna Schmidt Talk Outline Chart Parsing Basics Chart Parsing Algorithms Earley Algorithm CKY Algorithm
Strategies for Training Large Scale Neural Network Language Models
Strategies for Training Large Scale Neural Network Language Models Tomáš Mikolov #1, Anoop Deoras 2, Daniel Povey 3, Lukáš Burget #4, Jan Honza Černocký #5 # Brno University of Technology, Speech@FIT,
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University [email protected] Kapil Dalwani Computer Science Department
Transition-Based Dependency Parsing with Long Distance Collocations
Transition-Based Dependency Parsing with Long Distance Collocations Chenxi Zhu, Xipeng Qiu (B), and Xuanjing Huang Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science,
2014/02/13 Sphinx Lunch
2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue
Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles
Why language is hard And what Linguistics has to say about it Natalia Silveira Participation code: eagles Christopher Natalia Silveira Manning Language processing is so easy for humans that it is like
CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test
CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed
BEER 1.1: ILLC UvA submission to metrics and tuning task
BEER 1.1: ILLC UvA submission to metrics and tuning task Miloš Stanojević ILLC University of Amsterdam [email protected] Khalil Sima an ILLC University of Amsterdam [email protected] Abstract We describe
Automatic Detection and Correction of Errors in Dependency Treebanks
Automatic Detection and Correction of Errors in Dependency Treebanks Alexander Volokh DFKI Stuhlsatzenhausweg 3 66123 Saarbrücken, Germany [email protected] Günter Neumann DFKI Stuhlsatzenhausweg
Parsing Software Requirements with an Ontology-based Semantic Role Labeler
Parsing Software Requirements with an Ontology-based Semantic Role Labeler Michael Roth University of Edinburgh [email protected] Ewan Klein University of Edinburgh [email protected] Abstract Software
Brill s rule-based PoS tagger
Beáta Megyesi Department of Linguistics University of Stockholm Extract from D-level thesis (section 3) Brill s rule-based PoS tagger Beáta Megyesi Eric Brill introduced a PoS tagger in 1992 that was based
Question Prediction Language Model
Proceedings of the Australasian Language Technology Workshop 2007, pages 92-99 Question Prediction Language Model Luiz Augusto Pizzato and Diego Mollá Centre for Language Technology Macquarie University
Terminology Extraction from Log Files
Terminology Extraction from Log Files Hassan Saneifar 1,2, Stéphane Bonniol 2, Anne Laurent 1, Pascal Poncelet 1, and Mathieu Roche 1 1 LIRMM - Université Montpellier 2 - CNRS 161 rue Ada, 34392 Montpellier
Language and Computation
Language and Computation week 13, Thursday, April 24 Tamás Biró Yale University [email protected] http://www.birot.hu/courses/2014-lc/ Tamás Biró, Yale U., Language and Computation p. 1 Practical matters
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability
Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability Ana-Maria Popescu Alex Armanasu Oren Etzioni University of Washington David Ko {amp, alexarm, etzioni,
This page intentionally left blank
This page intentionally left blank Statistical Machine Translation The field of machine translation has recently been energized by the emergence of statistical techniques, which have brought the dream
Identifying Focus, Techniques and Domain of Scientific Papers
Identifying Focus, Techniques and Domain of Scientific Papers Sonal Gupta Department of Computer Science Stanford University Stanford, CA 94305 [email protected] Christopher D. Manning Department of
Effectiveness of Domain Adaptation Approaches for Social Media PoS Tagging
Effectiveness of Domain Adaptation Approaches for Social Media PoS Tagging Tobias Horsmann, Torsten Zesch Language Technology Lab Department of Computer Science and Applied Cognitive Science University
Course: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 [email protected] Abstract Probability distributions on structured representation.
Machine Learning Approach To Augmenting News Headline Generation
Machine Learning Approach To Augmenting News Headline Generation Ruichao Wang Dept. of Computer Science University College Dublin Ireland [email protected] John Dunnion Dept. of Computer Science University
Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen
Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen Date: 2010-01-11 Time: 08:15-11:15 Teacher: Mathias Broxvall Phone: 301438 Aids: Calculator and/or a Swedish-English dictionary Points: The
IBM Research Report. Scaling Shrinkage-Based Language Models
RC24970 (W1004-019) April 6, 2010 Computer Science IBM Research Report Scaling Shrinkage-Based Language Models Stanley F. Chen, Lidia Mangu, Bhuvana Ramabhadran, Ruhi Sarikaya, Abhinav Sethy IBM Research
Task-oriented Evaluation of Syntactic Parsers and Their Representations
Task-oriented Evaluation of Syntactic Parsers and Their Representations Yusuke Miyao Rune Sætre Kenji Sagae Takuya Matsuzaki Jun ichi Tsujii Department of Computer Science, University of Tokyo, Japan School
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
SZTE-NLP: Aspect Level Opinion Mining Exploiting Syntactic Cues
ZTE-NLP: Aspect Level Opinion Mining Exploiting yntactic Cues Viktor Hangya 1, Gábor Berend 1, István Varga 2, Richárd Farkas 1 1 University of zeged Department of Informatics {hangyav,berendg,rfarkas}@inf.u-szeged.hu
PoS-tagging Italian texts with CORISTagger
PoS-tagging Italian texts with CORISTagger Fabio Tamburini DSLO, University of Bologna, Italy [email protected] Abstract. This paper presents an evolution of CORISTagger [1], an high-performance
Joint POS Tagging and Text Normalization for Informal Text
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015) Joint POS Tagging and Text Normalization for Informal Text Chen Li and Yang Liu University of Texas
Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features
, pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, MinzuUniversity of
Learning the Structure of Task-driven Human-Human Dialogs
Learning the Structure of Task-driven Human-Human Dialogs Srinivas Bangalore AT&T Labs-Research 180 Park Ave Florham Park, NJ 07932 [email protected] Giuseppe Di Fabbrizio AT&T Labs-Research 180 Park
