Syntactic Transfer Using a Bilingual Lexicon

Size: px
Start display at page:

Download "Syntactic Transfer Using a Bilingual Lexicon"

Transcription

1 Syntactic Transfer Using a Bilingual Lexicon Greg Durrett, Adam Pauls, and Dan Klein UC Berkeley

2 Parsing a New Language

3 Parsing a New Language Mozambique hope on trade with other members

4 Parsing a New Language NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)]

5 Parsing a New Language NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

6 Parsing a New Language NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

7 Parsing a New Language NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

8 Parsing a New Language NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

9 Parsing a New Language? NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

10 Parsing a New Language? NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

11 Parsing a New Language? NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members [Berg-Kirkpatrick and Klein (2010), Petrov et al. (2011)] [McDonald et al. (2011), Cohen et al. (2011)]

12 Token-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hopes for trade with other members [Hwa et al. (2005), Ganchev et al. (2009), McDonald et al. (2011)]

13 Token-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hopes for trade with other members Assumes access to parallel data [Hwa et al. (2005), Ganchev et al. (2009), McDonald et al. (2011)]

14 Token-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hopes for trade with other members Assumes access to parallel data Asymmetric train and test conditions [Hwa et al. (2005), Ganchev et al. (2009), McDonald et al. (2011)]

15 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN

16 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN DE EN dict Handel trade, deal hofft hope mit with

17 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN trade, deal with DE EN dict Handel trade, deal hofft hope mit with

18 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN trade, deal with DE EN dict Handel trade, deal hofft hope mit with EN trees... trade with those regions... NOUN ADP DET NOUN... a deal with China... DET NOUN ADP NOUN

19 Type-level Projection Feature: Goodness in English Value: high NOUN VERB PREP NOUN PREP ADJ NOUN trade, deal with DE EN dict Handel trade, deal hofft hope mit with EN trees... trade with those regions... NOUN ADP DET NOUN... a deal with China... DET NOUN ADP NOUN

20 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN DE EN dict Handel trade, deal hofft hope mit with

21 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN hope with DE EN dict Handel trade, deal hofft hope mit with

22 Type-level Projection NOUN VERB PREP NOUN PREP ADJ NOUN hope with DE EN dict Handel trade, deal hofft hope mit with EN trees no examples

23 Type-level Projection Feature: Goodness in English Value: low NOUN VERB PREP NOUN PREP ADJ NOUN hope with DE EN dict Handel trade, deal hofft hope mit with EN trees no examples

24 Model Edge-factored discriminative parser, based on MSTParser (McDonald et al. 2005) DELEX features: universal POS (McDonald et al. 2011) PROJ features: Goodness in English

25 Model: DELEX Features NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

26 Model: DELEX Features ATTACH NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

27 Model: DELEX Features ATTACH NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

28 Model: DELEX Features ATTACH Feature Value NOUN PREP 1 NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

29 Model: DELEX Features ATTACH Feature Value NOUN PREP 1 + indicators including direction and distance NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

30 Model: PROJ Features ATTACH NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

31 Model: PROJ Features ATTACH Feature Value Handel mit 1 NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

32 Model: PROJ Features ATTACH Insufficient target-language supervised data to use X Feature Value Handel mit 1 NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

33 Model: PROJ Features ATTACH Feature Value Handel mit NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

34 Model: PROJ Features ATTACH Feature Value Score(Handel mit) NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

35 Model: PROJ Features ATTACH Feature Value Score(Handel mit) Query NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

36 Model: PROJ Features ATTACH Feature Goodness in English Value Score(Handel mit) Query NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

37 DELEX+PROJ MULTI-PROJ Query Score Computation Score(Handel PREP) = log p(prep Handel) p(w s Handel)

38 DELEX+PROJ DELEX+PROJ MULTIPROJ MULTI-PROJ Query Score Computation Score(Handel Score(Handel PREP) = log p(prep Handel) PREP) = log p(prep Handel) = log X p(prep w s ) p(w s Handel) p(w s Handel) w s

39 DELEX+PROJ DELEX+PROJ MULTIPROJ MULTI-PROJ our approach our produces approach gains produces across gains a range across a ra of target languages, of target languages, using two using different twolow- resource training resource methodologies training methodologies (one weakly (one wea different lo supervised supervised and one indirectly and one supervised) indirectly supervised) and two different twodictionary different dictionary sources (one sources manually constructed ally constructed and one automatically and one automatically con- c (one ma structed). structed). Query Score Computation Score(Handel Score(Handel PREP) = log p(prep Handel) PREP) = log p(prep Handel) = log X p(prep w s ) p(w s Handel) p(w s Handel) w s Score(Handel Score(Handel PREP) =p(prep Handel) PREP) p(w s Handel) p(w s Handel) trade 0.5 deal p(x trade) p(x trade) 0.5

40 DELEX+PROJ DELEX+PROJ MULTIPROJ MULTI-PROJ = log X w s E(PREP w s ) p(w s Handel) Query Score Computation EN trees... trade with those regions... NOUN PREP DET NOUN... a good trade... DET ADJ NOUN our approach our produces approach gains produces across gains a range across a ra of target languages, of target languages, using two using different twolow- resource training resource methodologies training methodologies (one weakly (one wea different lo supervised supervised and one indirectly and one supervised) indirectly supervised) and two different twodictionary different dictionary sources (one sources manually constructed ally constructed and one automatically and one automatically con- c (one ma structed). structed). Score(Handel p(w s Handel) PREP) = log p(prep Handel) Score(Handel PREP) = log p(prep Handel) p(x trade) = log X p(prep w s ) p(w s Handel) p(w s Handel) w s w s = trade Score(Handel Score(Handel PREP) =p(prep Handel) PREP) p(w s Handel) p(w s Handel) trade 0.5 deal p(x trade) p(x trade) 0.5

41 DELEX+PROJ our approach our produces demand approach gains takes produces across nouns gains a on range its left and different supervised DELEX+PROJ and one indirectly supervised) and across a ra = log X dictionary MULTIPROJ E(PREP w sources (one manuconstructed demand takes nouns on it two different dictionary s ) p(w sources of target s Handel) languages, German of target languages, using verlangen two using different should twolow- resource training resource methodologies training methodologies (one weakly (one wea do the s different lo MULTI-PROJ and one automatically concted). s constructed and one automatically con- (one manually w German verlangen should ing attachments to Verzicht and G ing attachments to Verzic structed). Query Score supervised Computation supervised and one indirectly and one supervised) indirectly supervised) and two different twodictionary different dictionary sources (one sources manually constructed (one ma Score(Handel p(w s Handel) PREP) = log p(prep Handel) ally constructed and one automatically and one automatically constructed). = structed). log c Score(Handel PREP) p(prep Handel) (Handel Score(Handel PREP) =p(prep Handel) PREP) =p(prep Handel) p(x trade) = log X p(prep w s ) p(w s Handel) w s AUTOMATIC AUTOMA p(w Score(Handel s Handel) Voc OCC Voc O p(w s Handel) p(w s Handel) Score(Handel PREP) =p(prep Handel) PREP) DA 324K DA 324K w s = trade DE 320K DE 320K p(x trade) p(x trade) p(w s Handel) p(w EL s Handel) 196K EL 196K ES PREP 0.5 trade ES K 165K IT IT 158K 0 ADJ 0.5 deal p(x trade) p(x trade) 158K NL 251K NL 251K PT 165K PT 165K SV 307K SV 307K Ta Table 14:

42 Feature Specialization Feature: Goodness in English Value: Score(hofft PREP) Feature: Goodness in English Value: Score(Handel PREP) NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

43 Feature Specialization Feature: Goodness (VERB link) in English Value: Score(hofft PREP) Feature: Goodness in English Value: Score(Handel PREP) NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

44 Feature Specialization Feature: Goodness (VERB link) in English Value: Score(hofft PREP) Feature: Goodness (NOUN link) in English Value: Score(Handel PREP) NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members

45 Feature Specialization Feature: Goodness (VERB link) in English Value: Score(hofft PREP) Feature: Goodness (NOUN link) in English Value: Score(Handel PREP) NOUN VERB PREP NOUN PREP ADJ NOUN Mozambique hope on trade with other members Similar to selective sharing of Naseem et al. (2012)

46 Experiments Sources of bilingual dictionaries Training Results

47 MANUAL Lexicon

48 MANUAL Lexicon

49 DELEX+PROJ MULTI-DIR MULTI-PROJ MANUAL Lexicon DELEX+PROJ structed). MULTI-PROJ resource training methodologies (one we supervised and one indirectly supervised two different dictionary sources (one m ally constructed and one automatically Score(Handel Score(Handel PREP) =p(prep Handel) PREP) =p(prep H p(w s Handel) p(w s Handel) deal trade p(x trade) trading 0.333

50 AUTOMATIC Lexicon Da ist zum Beispiel der Handel mit Drogen Trafficking in drugs is one example Unkontrollierter Handel mit leichten Waffen Uncontrolled trade in light weapons Ich will mit dem Handel anfangen I will start with trade

51 structed). structed). AUTOMATIC Lexicon Score(Handel Score(Handel PREP) =p(prep Handel) PREP) =p(prep Da ist zum Beispiel der Handel mit Drogen Trafficking in drugs is one example p(w s Handel) p(w s Handel) trade 0.66 trafficking p(x trade) p(x trade) 0.33 Unkontrollierter Handel mit leichten Waffen Uncontrolled trade in light weapons Ich will mit dem Handel anfangen I will start with trade

52 structed). structed). AUTOMATIC Lexicon Score(Handel Score(Handel PREP) =p(prep Handel) PREP) =p(prep Da ist zum Beispiel der Handel mit Drogen Trafficking in drugs is one example Unkontrollierter Handel mit leichten Waffen Uncontrolled trade in light weapons Ich will mit dem Handel anfangen I will start with trade p(w s Handel) p(w s Handel) trade trafficking p(x trade) p(x trade) commerce 0.05 trading 0.03 market 0.02 marketing 0.02 business traffic sales deal

53 Small Target Training Set 8 target languages, CoNLL shared task treebanks Train on 400 trees from CoNLL training set Test on CoNLL test set

54 Small Target Training Set 8 target languages, CoNLL shared task treebanks Train on 400 trees from CoNLL training set Test on CoNLL test set Universal POS tags generated from gold tags using mapping of Petrov et al. (2011)

55 Small Target Training Set 8 target languages, CoNLL shared task treebanks Train on 400 trees from CoNLL training set Test on CoNLL test set Universal POS tags generated from gold tags using mapping of Petrov et al. (2011) Source corpus: parsed English Gigaword

56 UAS (8 languages averaged) behavior of words at a type level. In a discriminative dependency parsing framework, our approach produces gains across a range of target languages, using two different lowresource training methodologies (one weakly supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed). Small Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MULTI-DIR MULTI-PROJ Score(Handel train trees PREP) =p(prep Handel) p(w s Handel) p(x trade) Figure 1: Sentences in English and Germ ing words that mean demand. The fact demand takes nouns on its left and right German verlangen should do the same, c ing attachments to Verzicht and Gewerks AUTOMATIC Voc OCC Voc MAN DA 324K K DE 320K K EL 196K K ES 165K K IT 158K K NL 251K K PT 165K K SV 307K K Table 14:

57 UAS (8 languages averaged) behavior of words at a type level. In a discriminative dependency parsing framework, our approach produces gains across a range of target languages, using two different lowresource training methodologies (one weakly supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed). Small Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MULTI-DIR MULTI-PROJ Score(Handel 400 train trees PREP) =p(prep Handel) p(w s Handel) p(x trade) Figure 1: Sentences in English and Germ ing words that mean demand. The fact demand takes nouns on its left and right German verlangen should do the same, c ing attachments to Verzicht and Gewerks AUTOMATIC Voc OCC Voc MAN DA 324K K DE 320K K EL 196K K ES 165K K IT 158K K NL 251K K PT 165K K SV 307K K Table 14:

58 UAS (8 languages averaged) behavior of words at a type level. In a discriminative dependency parsing framework, our approach produces gains across a range of target languages, using two different lowresource training methodologies (one weakly supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed). Small Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MULTI-DIR MULTI-PROJ Score(Handel 400 train trees PREP) =p(prep Handel) p(w s Handel) p(x trade) Figure 1: Sentences in English and Germ ing words that mean demand. The fact demand takes nouns on its left and right German verlangen should do the same, c ing attachments to Verzicht and Gewerks AUTOMATIC Voc OCC Voc MAN DA 324K K DE 320K K EL 196K K ES 165K K IT 158K K NL 251K K PT 165K K SV 307K K Table 14:

59 UAS (8 languages averaged) behavior of words at a type level. In a discriminative dependency parsing framework, resource training methodologies (one weakly our approach produces gains across a range supervised and one indirectly supervised) and of target languages, using two different lowresource training two different Figure dictionary 1: Sentences sources in English (one manually constructed ing words and Germ Small Target methodologies (one Training weakly Set andthat onemean automatically demand. The con-facstructed). demand takes nouns on its left and right supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically German verlangen should do the same, c DELEX constructed). ing attachments to Verzicht and Gewerks DELEX+PROJ(MANUAL) 100 DELEX+PROJ(AUTOMATIC) DELEX MANUAL DELEX+PROJ(MANUAL) AUTOMATIC DELEX+PROJ(AUTOMATIC) 89.9 DELEX MULTI-DIR AUTOMATIC MAN 75 MULTI-PROJ DELEX+PROJ 77.4 Voc OCC Voc MULTI-DIR DA 324K K MULTI-PROJ DE Score(Handel +1.6 PREP) =p(prep Handel) K K 50 EL 196K K 75.1 ES 165K K Score(Handel PREP) =p(prep Handel) IT 158K K p(w s Handel) NL 251K K 25 PT 165K K p(w s Handel) p(x trade) SV 307K K 400 train trees p(x trade) 0 Open-class token coverage Table 14:

60 two different dictionary sources ( ally constructed and one automat structed). UAS (8 languages averaged) Small Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTI-DIR MULTI-PROJ Score(Handel train trees PREP) =p(pre p(w s Handel) p(x trade)

61 two different dictionary sources ( ally constructed and one automat structed). UAS (8 languages averaged) Small Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTI-DIR MULTI-PROJ Score(Handel train trees 200 target trees 400 train trees PREP) =p(pre p(w s Handel) p(x trade)

62 No Target Training Set Train ES trees Test DE CoNLL test set PT trees SV trees DET NOUN VERB ADJ El amarillismo es imparcial NOUN VERB NOUN PREP DET NOUN Bruxelas adia fiscalidade de os combustíveis PRON VERB PRON ADV Jag tycker det inte... [McDonald et al. (2011)]

63 No Target Training Set Train ES trees Test DE CoNLL test set DET NOUN VERB ADJ PT trees NOUN VERB NOUN PREP DET NOUN SV trees PRON VERB PRON ADV... [McDonald et al. (2011)]

64 No Target Training Set Train ES trees Test DE CoNLL test set PT trees SV trees DET NOUN VERB ADJ El amarillismo es imparcial NOUN VERB NOUN PREP DET NOUN Bruxelas adia fiscalidade de os combustíveis PRON VERB PRON ADV Jag tycker det inte... [McDonald et al. (2011)]

65 UAS (8 languages averaged) supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed). No Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTI-DIR 62.6 MULTI-PROJ Score(Handel PREP) =p(prep Handel) p(w s Handel) p(x trade) demand takes nouns on its l German verlangen should d ing attachments to Verzicht AUTOMAT Voc OC DA 324K 0.9 DE 320K 0.8 EL 196K 0.9 ES 165K 0.8 IT 158K 0.9 NL 251K 0.8 PT 165K 0.8 SV 307K 0.9 Tabl

66 UAS (8 languages averaged) supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed). No Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) Type-level* (this work) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTI-DIR 62.6 MULTI-PROJ Score(Handel PREP) =p(prep Handel) p(w s Handel) p(x trade) demand takes nouns on its l German verlangen should d ing attachments to Verzicht AUTOMAT Voc OC DA 324K 0.9 DE 320K 0.8 EL 196K 0.9 ES 165K 0.8 IT 158K 0.9 NL 251K 0.8 PT 165K 0.8 SV 307K 0.9 Tabl

67 UAS (8 languages averaged) supervised and one indirectly supervised) and structed). two different dictionary sources (one manually constructed and one automatically constructed). DELEX No Target Training Set DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) Type-level* (this work) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTI-DIR 62.6 MULTI-PROJ Score(Handel Token-level (McDonald et al. 2011) 61.1 PREP) =p(prep Handel) Score(Handel p(w s Handel) p(x trade) DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTIDIR MULTIPROJ DELEX+PROJ MULTIPROJ +2.7 demand takes nouns on its l German verlangen should d ing attachments to Verzicht AUTOMAT Voc OC DA 324K DE 320K 0.8 EL 196K 0.9 ES 165K 0.8 IT 158K 0.9 NL 251K 0.8 PREP) = log p(prep Hande PT 165K 0.8 SV 307K 0.9 p(w s Handel) Tabl

68 Gain over delexicalized baseline DELEX DELEX+PROJ(MANUAL) DELEX+PROJ(AUTOMATIC) MANUAL AUTOMATIC DELEX DELEX+PROJ MULTIDIR MULTIPROJ DELEX+PROJ MULTIPROJ Score(Handel Gains PREP) = log p(prep Handel) p(w s Handel) DA DE EL ES IT NL PT SV

69 Related Work Täckström et al. (2012): type-level transfer using word clusters

70 Related Work Täckström et al. (2012): type-level transfer using word clusters Naseem et al. (2012): adaptation of universal POS transfer to the target language

71 Conclusion Bilingual dictionaries allow transfer of linguistic knowledge at a type level

72 Conclusion Bilingual dictionaries allow transfer of linguistic knowledge at a type level Type-level transfer: Comparable to token-level transfer Simpler to implement Does not require bitext

73 Conclusion Bilingual dictionaries allow transfer of linguistic knowledge at a type level Type-level transfer: Comparable to token-level transfer Simpler to implement Does not require bitext Thank you!

Simple Type-Level Unsupervised POS Tagging

Simple Type-Level Unsupervised POS Tagging Simple Type-Level Unsupervised POS Tagging Yoong Keok Lee Aria Haghighi Regina Barzilay Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology {yklee, aria42, regina}@csail.mit.edu

More information

Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles

Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles Why language is hard And what Linguistics has to say about it Natalia Silveira Participation code: eagles Christopher Natalia Silveira Manning Language processing is so easy for humans that it is like

More information

Universal Dependencies

Universal Dependencies Universal Dependencies Joakim Nivre! Uppsala University Department of Linguistics and Philology Based on collaborative work with Jinho Choi, Timothy Dozat, Filip Ginter, Yoav Goldberg, Jan Hajič, Chris

More information

Learning Translation Rules from Bilingual English Filipino Corpus

Learning Translation Rules from Bilingual English Filipino Corpus Proceedings of PACLIC 19, the 19 th Asia-Pacific Conference on Language, Information and Computation. Learning Translation s from Bilingual English Filipino Corpus Michelle Wendy Tan, Raymond Joseph Ang,

More information

Wiki-ly Supervised Part-of-Speech Tagging

Wiki-ly Supervised Part-of-Speech Tagging Wiki-ly Supervised Part-of-Speech Tagging Shen Li Computer & Information Science University of Pennsylvania shenli@seas.upenn.edu João V. Graça L 2 F INESC-ID Lisboa, Portugal javg@l2f.inesc-id.pt Ben

More information

Transition-Based Dependency Parsing with Long Distance Collocations

Transition-Based Dependency Parsing with Long Distance Collocations Transition-Based Dependency Parsing with Long Distance Collocations Chenxi Zhu, Xipeng Qiu (B), and Xuanjing Huang Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science,

More information

Automatic Detection and Correction of Errors in Dependency Treebanks

Automatic Detection and Correction of Errors in Dependency Treebanks Automatic Detection and Correction of Errors in Dependency Treebanks Alexander Volokh DFKI Stuhlsatzenhausweg 3 66123 Saarbrücken, Germany alexander.volokh@dfki.de Günter Neumann DFKI Stuhlsatzenhausweg

More information

The Evalita 2011 Parsing Task: the Dependency Track

The Evalita 2011 Parsing Task: the Dependency Track The Evalita 2011 Parsing Task: the Dependency Track Cristina Bosco and Alessandro Mazzei Dipartimento di Informatica, Università di Torino Corso Svizzera 185, 101049 Torino, Italy {bosco,mazzei}@di.unito.it

More information

EVALITA 07 parsing task

EVALITA 07 parsing task EVALITA 07 parsing task Cristina BOSCO Alessandro MAZZEI Vincenzo LOMBARDO (Dipartimento di Informatica Università di Torino) 1 overview 1. task 2. development data 3. evaluation 4. conclusions 2 task

More information

Machine Learning for natural language processing

Machine Learning for natural language processing Machine Learning for natural language processing Introduction Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2016 1 / 13 Introduction Goal of machine learning: Automatically learn how to

More information

A Mixed Trigrams Approach for Context Sensitive Spell Checking

A Mixed Trigrams Approach for Context Sensitive Spell Checking A Mixed Trigrams Approach for Context Sensitive Spell Checking Davide Fossati and Barbara Di Eugenio Department of Computer Science University of Illinois at Chicago Chicago, IL, USA dfossa1@uic.edu, bdieugen@cs.uic.edu

More information

A chart generator for the Dutch Alpino grammar

A chart generator for the Dutch Alpino grammar June 10, 2009 Introduction Parsing: determining the grammatical structure of a sentence. Semantics: a parser can build a representation of meaning (semantics) as a side-effect of parsing a sentence. Generation:

More information

CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test

CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed

More information

Introduction. BM1 Advanced Natural Language Processing. Alexander Koller. 17 October 2014

Introduction. BM1 Advanced Natural Language Processing. Alexander Koller. 17 October 2014 Introduction! BM1 Advanced Natural Language Processing Alexander Koller! 17 October 2014 Outline What is computational linguistics? Topics of this course Organizational issues Siri Text prediction Facebook

More information

Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov

Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov Search and Data Mining: Techniques Text Mining Anya Yarygina Boris Novikov Introduction Generally used to denote any system that analyzes large quantities of natural language text and detects lexical or

More information

Applying Co-Training Methods to Statistical Parsing. Anoop Sarkar http://www.cis.upenn.edu/ anoop/ anoop@linc.cis.upenn.edu

Applying Co-Training Methods to Statistical Parsing. Anoop Sarkar http://www.cis.upenn.edu/ anoop/ anoop@linc.cis.upenn.edu Applying Co-Training Methods to Statistical Parsing Anoop Sarkar http://www.cis.upenn.edu/ anoop/ anoop@linc.cis.upenn.edu 1 Statistical Parsing: the company s clinical trials of both its animal and human-based

More information

Improving Data Driven Part-of-Speech Tagging by Morphologic Knowledge Induction

Improving Data Driven Part-of-Speech Tagging by Morphologic Knowledge Induction Improving Data Driven Part-of-Speech Tagging by Morphologic Knowledge Induction Uwe D. Reichel Department of Phonetics and Speech Communication University of Munich reichelu@phonetik.uni-muenchen.de Abstract

More information

Sense-Tagging Verbs in English and Chinese. Hoa Trang Dang

Sense-Tagging Verbs in English and Chinese. Hoa Trang Dang Sense-Tagging Verbs in English and Chinese Hoa Trang Dang Department of Computer and Information Sciences University of Pennsylvania htd@linc.cis.upenn.edu October 30, 2003 Outline English sense-tagging

More information

Finding Syntactic Characteristics of Surinamese Dutch

Finding Syntactic Characteristics of Surinamese Dutch Finding Syntactic Characteristics of Surinamese Dutch Erik Tjong Kim Sang Meertens Institute erikt(at)xs4all.nl June 13, 2014 1 Introduction Surinamese Dutch is a variant of Dutch spoken in Suriname, a

More information

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati

More information

CONSTRAINING THE GRAMMAR OF APS AND ADVPS. TIBOR LACZKÓ & GYÖRGY RÁKOSI http://ieas.unideb.hu/ rakosi, laczko http://hungram.unideb.

CONSTRAINING THE GRAMMAR OF APS AND ADVPS. TIBOR LACZKÓ & GYÖRGY RÁKOSI http://ieas.unideb.hu/ rakosi, laczko http://hungram.unideb. CONSTRAINING THE GRAMMAR OF APS AND ADVPS IN HUNGARIAN AND ENGLISH: CHALLENGES AND SOLUTIONS IN AN LFG-BASED COMPUTATIONAL PROJECT TIBOR LACZKÓ & GYÖRGY RÁKOSI http://ieas.unideb.hu/ rakosi, laczko http://hungram.unideb.hu

More information

ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURKISH CORPUS

ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURKISH CORPUS ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURKISH CORPUS Gürkan Şahin 1, Banu Diri 1 and Tuğba Yıldız 2 1 Faculty of Electrical-Electronic, Department of Computer Engineering

More information

POS Tagsets and POS Tagging. Definition. Tokenization. Tagset Design. Automatic POS Tagging Bigram tagging. Maximum Likelihood Estimation 1 / 23

POS Tagsets and POS Tagging. Definition. Tokenization. Tagset Design. Automatic POS Tagging Bigram tagging. Maximum Likelihood Estimation 1 / 23 POS Def. Part of Speech POS POS L645 POS = Assigning word class information to words Dept. of Linguistics, Indiana University Fall 2009 ex: the man bought a book determiner noun verb determiner noun 1

More information

C o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER

C o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER INTRODUCTION TO SAS TEXT MINER TODAY S AGENDA INTRODUCTION TO SAS TEXT MINER Define data mining Overview of SAS Enterprise Miner Describe text analytics and define text data mining Text Mining Process

More information

Identifying Focus, Techniques and Domain of Scientific Papers

Identifying Focus, Techniques and Domain of Scientific Papers Identifying Focus, Techniques and Domain of Scientific Papers Sonal Gupta Department of Computer Science Stanford University Stanford, CA 94305 sonal@cs.stanford.edu Christopher D. Manning Department of

More information

Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

More information

Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing

Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing 1 Symbiosis of Evolutionary Techniques and Statistical Natural Language Processing Lourdes Araujo Dpto. Sistemas Informáticos y Programación, Univ. Complutense, Madrid 28040, SPAIN (email: lurdes@sip.ucm.es)

More information

Treebank Search with Tree Automata MonaSearch Querying Linguistic Treebanks with Monadic Second Order Logic

Treebank Search with Tree Automata MonaSearch Querying Linguistic Treebanks with Monadic Second Order Logic Treebank Search with Tree Automata MonaSearch Querying Linguistic Treebanks with Monadic Second Order Logic Authors: H. Maryns, S. Kepser Speaker: Stephanie Ehrbächer July, 31th Treebank Search with Tree

More information

Extraction and Visualization of Protein-Protein Interactions from PubMed

Extraction and Visualization of Protein-Protein Interactions from PubMed Extraction and Visualization of Protein-Protein Interactions from PubMed Ulf Leser Knowledge Management in Bioinformatics Humboldt-Universität Berlin Finding Relevant Knowledge Find information about Much

More information

Trameur: A Framework for Annotated Text Corpora Exploration

Trameur: A Framework for Annotated Text Corpora Exploration Trameur: A Framework for Annotated Text Corpora Exploration Serge Fleury (Sorbonne Nouvelle Paris 3) serge.fleury@univ-paris3.fr Maria Zimina(Paris Diderot Sorbonne Paris Cité) maria.zimina@eila.univ-paris-diderot.fr

More information

Maskinöversättning 2008. F2 Översättningssvårigheter + Översättningsstrategier

Maskinöversättning 2008. F2 Översättningssvårigheter + Översättningsstrategier Maskinöversättning 2008 F2 Översättningssvårigheter + Översättningsstrategier Flertydighet i källspråket poäng point, points, credit, credits, var verb ->was, were pron -> each adv -> where adj -> every

More information

Ling 201 Syntax 1. Jirka Hana April 10, 2006

Ling 201 Syntax 1. Jirka Hana April 10, 2006 Overview of topics What is Syntax? Word Classes What to remember and understand: Ling 201 Syntax 1 Jirka Hana April 10, 2006 Syntax, difference between syntax and semantics, open/closed class words, all

More information

The treatment of noun phrase queries in a natural language database access system

The treatment of noun phrase queries in a natural language database access system The treatment of noun phrase queries in a natural language database access system Alexandra Klein and Johannes Matiasek and Harald Trost Austrian Research nstitute for Artificial ntelligence Schottengasse

More information

Tekniker för storskalig parsning

Tekniker för storskalig parsning Tekniker för storskalig parsning Diskriminativa modeller Joakim Nivre Uppsala Universitet Institutionen för lingvistik och filologi joakim.nivre@lingfil.uu.se Tekniker för storskalig parsning 1(19) Generative

More information

Chunk Parsing. Steven Bird Ewan Klein Edward Loper. University of Melbourne, AUSTRALIA. University of Edinburgh, UK. University of Pennsylvania, USA

Chunk Parsing. Steven Bird Ewan Klein Edward Loper. University of Melbourne, AUSTRALIA. University of Edinburgh, UK. University of Pennsylvania, USA Chunk Parsing Steven Bird Ewan Klein Edward Loper University of Melbourne, AUSTRALIA University of Edinburgh, UK University of Pennsylvania, USA March 1, 2012 chunk parsing: efficient and robust approach

More information

Simple maths for keywords

Simple maths for keywords Simple maths for keywords Adam Kilgarriff Lexical Computing Ltd adam@lexmasterclass.com Abstract We present a simple method for identifying keywords of one corpus vs. another. There is no one-sizefits-all

More information

The Role of Sentence Structure in Recognizing Textual Entailment

The Role of Sentence Structure in Recognizing Textual Entailment Blake,C. (In Press) The Role of Sentence Structure in Recognizing Textual Entailment. ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, Prague, Czech Republic. The Role of Sentence Structure

More information

31 Case Studies: Java Natural Language Tools Available on the Web

31 Case Studies: Java Natural Language Tools Available on the Web 31 Case Studies: Java Natural Language Tools Available on the Web Chapter Objectives Chapter Contents This chapter provides a number of sources for open source and free atural language understanding software

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language

More information

CALICO Journal, Volume 9 Number 1 9

CALICO Journal, Volume 9 Number 1 9 PARSING, ERROR DIAGNOSTICS AND INSTRUCTION IN A FRENCH TUTOR GILLES LABRIE AND L.P.S. SINGH Abstract: This paper describes the strategy used in Miniprof, a program designed to provide "intelligent' instruction

More information

Context Grammar and POS Tagging

Context Grammar and POS Tagging Context Grammar and POS Tagging Shian-jung Dick Chen Don Loritz New Technology and Research New Technology and Research LexisNexis LexisNexis Ohio, 45342 Ohio, 45342 dick.chen@lexisnexis.com don.loritz@lexisnexis.com

More information

Motivation. Korpus-Abfrage: Werkzeuge und Sprachen. Overview. Languages of Corpus Query. SARA Query Possibilities 1

Motivation. Korpus-Abfrage: Werkzeuge und Sprachen. Overview. Languages of Corpus Query. SARA Query Possibilities 1 Korpus-Abfrage: Werkzeuge und Sprachen Gastreferat zur Vorlesung Korpuslinguistik mit und für Computerlinguistik Charlotte Merz 3. Dezember 2002 Motivation Lizentiatsarbeit: A Corpus Query Tool for Automatically

More information

Shallow Parsing with Apache UIMA

Shallow Parsing with Apache UIMA Shallow Parsing with Apache UIMA Graham Wilcock University of Helsinki Finland graham.wilcock@helsinki.fi Abstract Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic

More information

How To Translate English To Yoruba Language To Yoranuva

How To Translate English To Yoruba Language To Yoranuva International Journal of Language and Linguistics 2015; 3(3): 154-159 Published online May 11, 2015 (http://www.sciencepublishinggroup.com/j/ijll) doi: 10.11648/j.ijll.20150303.17 ISSN: 2330-0205 (Print);

More information

Search and Information Retrieval

Search and Information Retrieval Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search

More information

Natural Language Database Interface for the Community Based Monitoring System *

Natural Language Database Interface for the Community Based Monitoring System * Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University

More information

Coffee Break German Lesson 06

Coffee Break German Lesson 06 LESSON NOTES WIE VIEL KOSTET DAS? In this episode of Coffee Break German we ll start by learning the numbers from zero to ten and then learn to deal with transactional situations involving paying for things

More information

TS3: an Improved Version of the Bilingual Concordancer TransSearch

TS3: an Improved Version of the Bilingual Concordancer TransSearch TS3: an Improved Version of the Bilingual Concordancer TransSearch Stéphane HUET, Julien BOURDAILLET and Philippe LANGLAIS EAMT 2009 - Barcelona June 14, 2009 Computer assisted translation Preferred by

More information

Adapting General Models to Novel Project Ideas

Adapting General Models to Novel Project Ideas The KIT Translation Systems for IWSLT 2013 Thanh-Le Ha, Teresa Herrmann, Jan Niehues, Mohammed Mediani, Eunah Cho, Yuqi Zhang, Isabel Slawik and Alex Waibel Institute for Anthropomatics KIT - Karlsruhe

More information

Semantic parsing with Structured SVM Ensemble Classification Models

Semantic parsing with Structured SVM Ensemble Classification Models Semantic parsing with Structured SVM Ensemble Classification Models Le-Minh Nguyen, Akira Shimazu, and Xuan-Hieu Phan Japan Advanced Institute of Science and Technology (JAIST) Asahidai 1-1, Nomi, Ishikawa,

More information

Data-Driven Bitext Dependency Parsing and Alignment

Data-Driven Bitext Dependency Parsing and Alignment copenhagen business school handelshøjskolen solbjerg plads 3 dk-2000 frederiksberg danmark www.cbs.dk Data-Driven Bitext Dependency Parsing and Alignment Data-Driven Bitext Dependency Parsing and Alignment

More information

Search Engines Chapter 2 Architecture. 14.4.2011 Felix Naumann

Search Engines Chapter 2 Architecture. 14.4.2011 Felix Naumann Search Engines Chapter 2 Architecture 14.4.2011 Felix Naumann Overview 2 Basic Building Blocks Indexing Text Acquisition Text Transformation Index Creation Querying User Interaction Ranking Evaluation

More information

Natural Language to Relational Query by Using Parsing Compiler

Natural Language to Relational Query by Using Parsing Compiler Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information

Special Topics in Computer Science

Special Topics in Computer Science Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS

More information

NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR

NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR 1 Gauri Rao, 2 Chanchal Agarwal, 3 Snehal Chaudhry, 4 Nikita Kulkarni,, 5 Dr. S.H. Patil 1 Lecturer department o f Computer Engineering BVUCOE,

More information

Question Answering and Multilingual CLEF 2008

Question Answering and Multilingual CLEF 2008 Dublin City University at QA@CLEF 2008 Sisay Fissaha Adafre Josef van Genabith National Center for Language Technology School of Computing, DCU IBM CAS Dublin sadafre,josef@computing.dcu.ie Abstract We

More information

INF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning jtl@ifi.uio.no

INF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning jtl@ifi.uio.no INF5820 Natural Language Processing - NLP H2009 Jan Tore Lønning jtl@ifi.uio.no Semantic Role Labeling INF5830 Lecture 13 Nov 4, 2009 Today Some words about semantics Thematic/semantic roles PropBank &

More information

BILINGUAL TRANSLATION SYSTEM

BILINGUAL TRANSLATION SYSTEM BILINGUAL TRANSLATION SYSTEM (FOR ENGLISH AND TAMIL) Dr. S. Saraswathi Associate Professor M. Anusiya P. Kanivadhana S. Sathiya Abstract--- The project aims in developing Bilingual Translation System for

More information

Effective Self-Training for Parsing

Effective Self-Training for Parsing Effective Self-Training for Parsing David McClosky dmcc@cs.brown.edu Brown Laboratory for Linguistic Information Processing (BLLIP) Joint work with Eugene Charniak and Mark Johnson David McClosky - dmcc@cs.brown.edu

More information

Coupling an annotated corpus and a morphosyntactic lexicon for state-of-the-art POS tagging with less human effort

Coupling an annotated corpus and a morphosyntactic lexicon for state-of-the-art POS tagging with less human effort Coupling an annotated corpus and a morphosyntactic lexicon for state-of-the-art POS tagging with less human effort Pascal Denis and Benoît Sagot Equipe-project ALPAGE INRIA and Université Paris 7 30, rue

More information

Bootstrapping Term Extractors for Multiple Languages

Bootstrapping Term Extractors for Multiple Languages Bootstrapping Term Extractors for Multiple Languages Ahmet Aker, Monica Lestari Paramita, Emma Barker, Robert Gaizauskas Department of Computer Science, University of Sheffield Sheffield, S1 4DP, United

More information

Customizing an English-Korean Machine Translation System for Patent Translation *

Customizing an English-Korean Machine Translation System for Patent Translation * Customizing an English-Korean Machine Translation System for Patent Translation * Sung-Kwon Choi, Young-Gil Kim Natural Language Processing Team, Electronics and Telecommunications Research Institute,

More information

DEPENDENCY PARSING JOAKIM NIVRE

DEPENDENCY PARSING JOAKIM NIVRE DEPENDENCY PARSING JOAKIM NIVRE Contents 1. Dependency Trees 1 2. Arc-Factored Models 3 3. Online Learning 3 4. Eisner s Algorithm 4 5. Spanning Tree Parsing 6 References 7 A dependency parser analyzes

More information

SERVICES PRICE LIST - COMMERCIAL Sysorex Government Services, Inc.

SERVICES PRICE LIST - COMMERCIAL Sysorex Government Services, Inc. SERVICES - COMMERCIAL Sysorex Government Services, Inc. ITEM NUMBER LABOR TYPE DESCRIPTION PT00201 PT00202 Junior System Staff System equivalent working knowledge of System ing $ 109.63 experience or equivalent,

More information

Comprendium Translator System Overview

Comprendium Translator System Overview Comprendium System Overview May 2004 Table of Contents 1. INTRODUCTION...3 2. WHAT IS MACHINE TRANSLATION?...3 3. THE COMPRENDIUM MACHINE TRANSLATION TECHNOLOGY...4 3.1 THE BEST MT TECHNOLOGY IN THE MARKET...4

More information

Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium

Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium Sharing resources between free/open-source rule-based machine translation systems: Grammatical Framework and Apertium Grégoire Détrez, Víctor M. Sánchez-Cartagena, Aarne Ranta gregoire.detrez@gu.se, vmsanchez@prompsit.com,

More information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University sekine@cs.nyu.edu Kapil Dalwani Computer Science Department

More information

A Chart Parsing implementation in Answer Set Programming

A Chart Parsing implementation in Answer Set Programming A Chart Parsing implementation in Answer Set Programming Ismael Sandoval Cervantes Ingenieria en Sistemas Computacionales ITESM, Campus Guadalajara elcoraz@gmail.com Rogelio Dávila Pérez Departamento de

More information

Activities. but I will require that groups present research papers

Activities. but I will require that groups present research papers CS-498 Signals AI Themes Much of AI occurs at the signal level Processing data and making inferences rather than logical reasoning Areas such as vision, speech, NLP, robotics methods bleed into other areas

More information

Parsing Software Requirements with an Ontology-based Semantic Role Labeler

Parsing Software Requirements with an Ontology-based Semantic Role Labeler Parsing Software Requirements with an Ontology-based Semantic Role Labeler Michael Roth University of Edinburgh mroth@inf.ed.ac.uk Ewan Klein University of Edinburgh ewan@inf.ed.ac.uk Abstract Software

More information

Example-Based Treebank Querying. Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde

Example-Based Treebank Querying. Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde Example-Based Treebank Querying Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde LREC 2012, Istanbul May 25, 2012 NEDERBOOMS Exploitation of Dutch treebanks for research in linguistics September

More information

Tanja Wissik. COTSOES Terminology and Documentation Working Group Meeting, 13 May 2013, Stockholm

Tanja Wissik. COTSOES Terminology and Documentation Working Group Meeting, 13 May 2013, Stockholm Tanja Wissik COTSOES Terminology and Documentation Working Group Meeting, 13 May 2013, Stockholm What is LISE? Main pillars of the LISE Project LISE Service Version and IATE use case Workflow Research

More information

The PALAVRAS parser and its Linguateca applications - a mutually productive relationship

The PALAVRAS parser and its Linguateca applications - a mutually productive relationship The PALAVRAS parser and its Linguateca applications - a mutually productive relationship Eckhard Bick University of Southern Denmark eckhard.bick@mail.dk Outline Flow chart Linguateca Palavras History

More information

Domain Adaptation for Dependency Parsing via Self-training

Domain Adaptation for Dependency Parsing via Self-training Domain Adaptation for Dependency Parsing via Self-training Juntao Yu 1, Mohab Elkaref 1, Bernd Bohnet 2 1 University of Birmingham, Birmingham, UK 2 Google, London, UK 1 {j.yu.1, m.e.a.r.elkaref}@cs.bham.ac.uk,

More information

Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features

Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features Carnegie Mellon University Research Showcase @ CMU Language Technologies Institute School of Computer Science 6-2014 Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features Kevin

More information

Surface Realisation using Tree Adjoining Grammar. Application to Computer Aided Language Learning

Surface Realisation using Tree Adjoining Grammar. Application to Computer Aided Language Learning Surface Realisation using Tree Adjoining Grammar. Application to Computer Aided Language Learning Claire Gardent CNRS / LORIA, Nancy, France (Joint work with Eric Kow and Laura Perez-Beltrachini) March

More information

How to make Ontologies self-building from Wiki-Texts

How to make Ontologies self-building from Wiki-Texts How to make Ontologies self-building from Wiki-Texts Bastian HAARMANN, Frederike GOTTSMANN, and Ulrich SCHADE Fraunhofer Institute for Communication, Information Processing & Ergonomics Neuenahrer Str.

More information

Marie Dupuch, Frédérique Segond, André Bittar, Luca Dini, Lina Soualmia, Stefan Darmoni, Quentin Gicquel, Marie-Hélène Metzger

Marie Dupuch, Frédérique Segond, André Bittar, Luca Dini, Lina Soualmia, Stefan Darmoni, Quentin Gicquel, Marie-Hélène Metzger Separate the grain from the chaff: designing a system to make the best use of language and knowledge technologies to model textual medical data extracted from electronic health records Marie Dupuch, Frédérique

More information

Syntactic Theory on Swedish

Syntactic Theory on Swedish Syntactic Theory on Swedish Mats Uddenfeldt Pernilla Näsfors June 13, 2003 Report for Introductory course in NLP Department of Linguistics Uppsala University Sweden Abstract Using the grammar presented

More information

INPUTLOG 6.0 a research tool for logging and analyzing writing process data. Linguistic analysis. Linguistic analysis

INPUTLOG 6.0 a research tool for logging and analyzing writing process data. Linguistic analysis. Linguistic analysis INPUTLOG 6.0 a research tool for logging and analyzing writing process data Linguistic analysis From character level analyses to word level analyses Linguistic analysis 2 Linguistic Analyses The concept

More information

Using Knowledge Extraction and Maintenance Techniques To Enhance Analytical Performance

Using Knowledge Extraction and Maintenance Techniques To Enhance Analytical Performance Using Knowledge Extraction and Maintenance Techniques To Enhance Analytical Performance David Bixler, Dan Moldovan and Abraham Fowler Language Computer Corporation 1701 N. Collins Blvd #2000 Richardson,

More information

Circle the right answer for each question. 1) These words all have the same root word. Which is the root word?

Circle the right answer for each question. 1) These words all have the same root word. Which is the root word? Quiz Root words Level A Circle the right answer for each question. 1) These words all have the same root word. Which is the root word? A) takes B) taken C) mistaken D) take 2) These words all have the

More information

Joint Ensemble Model for POS Tagging and Dependency Parsing

Joint Ensemble Model for POS Tagging and Dependency Parsing Joint Ensemble Model for POS Tagging and Dependency Parsing Iliana Simova Dimitar Vasilev Alexander Popov Kiril Simov Petya Osenova Linguistic Modelling Laboratory, IICT-BAS Sofia, Bulgaria {iliana dvasilev

More information

Self-Training for Parsing Learner Text

Self-Training for Parsing Learner Text elf-training for Parsing Learner Text Aoife Cahill, Binod Gyawali and James V. Bruno Educational Testing ervice, 660 Rosedale Road, Princeton, NJ 0854, UA {acahill, bgyawali, jbruno}@ets.org Abstract We

More information

Semantic analysis of text and speech

Semantic analysis of text and speech Semantic analysis of text and speech SGN-9206 Signal processing graduate seminar II, Fall 2007 Anssi Klapuri Institute of Signal Processing, Tampere University of Technology, Finland Outline What is semantic

More information

A Survey on Product Aspect Ranking

A Survey on Product Aspect Ranking A Survey on Product Aspect Ranking Charushila Patil 1, Prof. P. M. Chawan 2, Priyamvada Chauhan 3, Sonali Wankhede 4 M. Tech Student, Department of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra,

More information

How To Identify And Represent Multiword Expressions (Mwe) In A Multiword Expression (Irme)

How To Identify And Represent Multiword Expressions (Mwe) In A Multiword Expression (Irme) The STEVIN IRME Project Jan Odijk STEVIN Midterm Workshop Rotterdam, June 27, 2008 IRME Identification and lexical Representation of Multiword Expressions (MWEs) Participants: Uil-OTS, Utrecht Nicole Grégoire,

More information

Semantic Analysis of Natural Language Queries Using Domain Ontology for Information Access from Database

Semantic Analysis of Natural Language Queries Using Domain Ontology for Information Access from Database I.J. Intelligent Systems and Applications, 2013, 12, 81-90 Published Online November 2013 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijisa.2013.12.07 Semantic Analysis of Natural Language Queries

More information

Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen

Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen Artificial Intelligence Exam DT2001 / DT2006 Ordinarie tentamen Date: 2010-01-11 Time: 08:15-11:15 Teacher: Mathias Broxvall Phone: 301438 Aids: Calculator and/or a Swedish-English dictionary Points: The

More information

PyCantonese: Cantonese linguistic research in the age of big data

PyCantonese: Cantonese linguistic research in the age of big data PyCantonese: Cantonese linguistic research in the age of big data Jackson L. Lee University of Chicago http://jacksonllee.com Childhood Bilingualism Research Center, CUHK September 15, 2015 Grammar versus

More information

Plantronics Explorer 50. User Guide

Plantronics Explorer 50. User Guide Plantronics Explorer 50 User Guide Contents Welcome 3 What's in the box 4 Headset Overview 5 Be safe 5 Pair and Charge 6 Get Paired 6 Activate pair mode 6 Use two phones 6 Reconnect 6 Charge 6 Fit 7 The

More information

HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN

HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN Yu Chen, Andreas Eisele DFKI GmbH, Saarbrücken, Germany May 28, 2010 OUTLINE INTRODUCTION ARCHITECTURE EXPERIMENTS CONCLUSION SMT VS. RBMT [K.

More information

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems cation systems. For example, NLP could be used in Question Answering (QA) systems to understand users natural

More information

Parsing Swedish. Atro Voutilainen Conexor oy atro.voutilainen@conexor.fi. CG and FDG

Parsing Swedish. Atro Voutilainen Conexor oy atro.voutilainen@conexor.fi. CG and FDG Parsing Swedish Atro Voutilainen Conexor oy atro.voutilainen@conexor.fi This paper presents two new systems for analysing Swedish texts: a light parser and a functional dependency grammar parser. Their

More information

Topological Field Chunking in German

Topological Field Chunking in German Topological Field Chunking in German Jorn Veenstra, Frank H. Müller, Tylman Ule [veenstra,fhm,ule]@sfs.uni-tuebingen.de ESSLLI Summerschool Workshop on Machine Learning Approaches in Computational Linguistics

More information

Open Domain Information Extraction. Günter Neumann, DFKI, 2012

Open Domain Information Extraction. Günter Neumann, DFKI, 2012 Open Domain Information Extraction Günter Neumann, DFKI, 2012 Improving TextRunner Wu and Weld (2010) Open Information Extraction using Wikipedia, ACL 2010 Fader et al. (2011) Identifying Relations for

More information

NEDERBOOMS Treebank Mining for Data- based Linguistics. Liesbeth Augustinus Vincent Vandeghinste Ineke Schuurman Frank Van Eynde

NEDERBOOMS Treebank Mining for Data- based Linguistics. Liesbeth Augustinus Vincent Vandeghinste Ineke Schuurman Frank Van Eynde NEDERBOOMS Treebank Mining for Data- based Linguistics Liesbeth Augustinus Vincent Vandeghinste Ineke Schuurman Frank Van Eynde LOT Summer School - June, 2014 NEDERBOOMS Exploita)on of Dutch treebanks

More information

Multilingual and mixed-lingual TTS applications

Multilingual and mixed-lingual TTS applications Multilingual and mixed-lingual TTS applications LangTech 2003 November 24, 2003 Simona Fina, Manager Linguistics Real-life texts need mixed-lingual analysis Agenda Short presentation of SVOX Challenges

More information

Introduction to Data-Driven Dependency Parsing

Introduction to Data-Driven Dependency Parsing Introduction to Data-Driven Dependency Parsing Introductory Course, ESSLLI 2007 Ryan McDonald 1 Joakim Nivre 2 1 Google Inc., New York, USA E-mail: ryanmcd@google.com 2 Uppsala University and Växjö University,

More information