Towards a Universal Grammar for Natural Language Processing

1 Towards a Universal Grammar for Natural Language Processing
  Joakim Nivre
  Uppsala University, Department of Linguistics and Philology
  Based on collaborative work with Filip Ginter, Yoav Goldberg, Jan Hajič, Chris Manning, Ryan McDonald, Natalia Silveira, Marie de Marneffe, Slav Petrov, Sampo Pyysalo, Reut Tsarfaty, Daniel Zeman and many others

2 In its substance, grammar is one and the same in all languages, even if it accidentally varies.

8 Universal Grammar
  All human languages are species of a common genus
  Language structure is constrained by a universal cause
  There is order in the chaos of linguistic variation

12 Natural Language Processing
  Linguistic diversity makes our life harder
    Why 90% parsing accuracy for English but only 80% for Finnish?
    Can we even compare the numbers?
  Current NLP relies heavily on linguistic annotation:
    In its substance, grammar is the same in all languages, even if the annotation accidentally varies.
  We need to bring some order into the chaos

14-21 [Figure: dependency trees for three parallel sentences, each drawn with a different treatment of the coordination (arc labels dobj, cc, conj). The sentences are first presented as Language X, Language Y and Language Z, and revealed on the last build as Swedish, Danish and English. Pairwise comparison labels shown next to the trees: 1/5, 2/5, 2/5.]
  Language X (Swedish): En katt jagar råttor och möss
  Language Y (Danish):  En kat jager rotter og mus
  Language Z (English): A cat chases rats and mice
  Which languages are most closely related?

26 Why is this a problem?
  Hard to compare empirical results across languages
  Hard to evaluate cross-lingual learning
  Hard to build and maintain multilingual systems
  Hard to make progress towards a universal parser

27-30 Universal Dependencies?
[Figure: the Swedish, Danish and English example sentences with dependency arcs (dobj, cc, conj), followed by a French sentence annotated on three layers (arc labels advmod, dobj).]
  En katt jagar råttor och möss
  En kat jager rotter og mus
  A cat chases rats and mice
  Toutefois , les filles adorent les desserts .
  ADV PUNCT DET NOUN VERB DET NOUN PUNCT
  Annotation layers:
    Part-of-speech tags
    Morphological features (Definite=Def, Gender=Fem, Gender=Masc, Number=Plur, Person=3, Tense=Pres)
    Dependency relations

37 Universal Dependencies
[Diagram relating Universal Dependencies to the following projects:]
  Stanford Dependencies
  Google UD
  Stanford UD
  HamleDT
  Interset
  Google universal tags

39 Universal Dependencies
  Milestones:
    Kick-off meeting at EACL in Gothenburg, April 2014
    Release of annotation guidelines, Version 1, October 2014
    Release of treebanks for 10 languages, January 2015
    Release of treebanks for 18 languages, May 2015
    Release of treebanks for 33 languages, November 2015
  Open community effort: anyone can contribute!

43 Goals and Requirements
  Cross-linguistically consistent grammatical annotation
  Support multilingual research and development in NLP
  Based on common usage and existing de facto standards

47 Design Principles
  Dependency
    Widely used in practical NLP systems
    Available in treebanks for many languages
  Lexicalism
    Basic annotation units are words (syntactic words)
    Words have morphological properties
    Words enter into syntactic relations
  Recoverability
    Transparent mapping from input text to word segmentation

50 Golden Rules
  Maximize parallelism
    Don't annotate the same thing in different ways
    Don't make different things look the same
  But don't overdo it
    Don't annotate things that are not there
    Languages select from a universal pool of categories
    Allow language-specific extensions

51-54 Morphology
[Figure: the French example sentence annotated word by word; the arc labels advmod and dobj are also shown.]

  Form       Lemma      UPOS   Features
  Toutefois  toutefois  ADV
  ,          ,          PUNCT
  les        le         DET    Definite=Def, Number=Plur
  filles     fille      NOUN   Gender=Fem, Number=Plur
  adorent    adorer     VERB   Number=Plur, Person=3, Tense=Pres
  les        le         DET    Definite=Def, Number=Plur
  desserts   dessert    NOUN   Gender=Masc, Number=Plur
  .          .          PUNCT

  Lemma representing the semantic content of the word
  Part-of-speech tag representing the abstract lexical category associated with the word
  Features representing lexical and grammatical properties associated with the lemma or the particular word form

55 Part-of-Speech Tags
  Open:   ADJ, ADV, INTJ, NOUN, PROPN, VERB
  Closed: ADP, AUX, CONJ, DET, NUM, PART, PRON, SCONJ
  Other:  PUNCT, SYM, X
  Taxonomy of 17 universal part-of-speech tags, based on the Google Universal Tagset (Petrov et al., 2012)
  All languages use the same inventory, but not all tags have to be used by all languages
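The tag inventory on this slide can be written down as a small lookup table. The following is only an illustrative sketch: the names `UPOS_CLASSES`, `upos_class` and `unused_tags` are made up for this example and are not part of any UD tool.

```python
# The 17 universal POS tags, grouped into the three classes named on the slide.
UPOS_CLASSES = {
    "open": ["ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB"],
    "closed": ["ADP", "AUX", "CONJ", "DET", "NUM", "PART", "PRON", "SCONJ"],
    "other": ["PUNCT", "SYM", "X"],
}

ALL_UPOS = {tag for tags in UPOS_CLASSES.values() for tag in tags}


def upos_class(tag):
    """Return 'open', 'closed' or 'other'; reject tags outside the inventory."""
    for name, tags in UPOS_CLASSES.items():
        if tag in tags:
            return name
    raise ValueError(f"not a universal POS tag: {tag}")


def unused_tags(tagged_tokens):
    """Tags of the shared inventory that a given sample never uses."""
    return sorted(ALL_UPOS - {tag for _, tag in tagged_tokens})
```

A language is free to leave some tags unused, which `unused_tags` makes visible, but a tag outside the inventory is an error.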

56 Features
  Lexical:              PronType, NumType, Poss, Reflex
  Inflectional Nominal: Gender, Animacy, Number, Case, Definite, Degree
  Inflectional Verbal:  VerbForm, Mood, Tense, Aspect, Voice, Person, Negative
  Standardized inventory of morphological features, based on the Interset system (Zeman, 2008)
  Languages select relevant features and can add language-specific features or values with documentation
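Features are attribute-value pairs such as Number=Plur. A minimal sketch of reading and writing them follows, assuming the CoNLL-U convention of joining pairs with `|` and using `_` for an empty set; the function names are hypothetical.

```python
def parse_feats(feats):
    """Turn 'Definite=Def|Number=Plur' into {'Definite': 'Def', 'Number': 'Plur'}."""
    if feats in ("", "_"):
        return {}
    pairs = (item.split("=", 1) for item in feats.split("|"))
    return {name: value for name, value in pairs}


def format_feats(features):
    """Serialize a feature dict, sorted by feature name (case-insensitive)."""
    if not features:
        return "_"
    ordered = sorted(features.items(), key=lambda kv: kv[0].lower())
    return "|".join(f"{name}={value}" for name, value in ordered)
```

For example, `parse_feats("Number=Plur|Person=3|Tense=Pres")` gives the three features of a verb form like "adorent" as a dictionary.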

57-60 Syntax
[Figure: dependency tree over the example sentence; the arc labels that survive in this transcript are aux, aux, dobj, case and nmod.]
  The  cat   could  have  chased  all  the  dogs  down  the  street  .
  DET  NOUN  AUX    AUX   VERB    DET  DET  NOUN  ADP   DET  NOUN    PUNCT
  Content words are related by dependency relations
  Function words attach to the content word they modify
  Punctuation attaches to the head of the phrase or clause
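The three principles can be checked on the example sentence once the tree is written out as data. In the sketch below the heads and the labels det, nsubj, root and punct are my reconstruction following the UD version 1 guidelines, since the transcript only preserves aux, dobj, case and nmod; the helper names are hypothetical.

```python
# (id, form, upos, head, deprel); head 0 marks the root. Reconstructed tree (assumption).
TOKENS = [
    (1, "The", "DET", 2, "det"),
    (2, "cat", "NOUN", 5, "nsubj"),
    (3, "could", "AUX", 5, "aux"),
    (4, "have", "AUX", 5, "aux"),
    (5, "chased", "VERB", 0, "root"),
    (6, "all", "DET", 8, "det"),
    (7, "the", "DET", 8, "det"),
    (8, "dogs", "NOUN", 5, "dobj"),
    (9, "down", "ADP", 11, "case"),
    (10, "the", "DET", 11, "det"),
    (11, "street", "NOUN", 5, "nmod"),
    (12, ".", "PUNCT", 5, "punct"),
]

CONTENT_TAGS = {"ADJ", "ADV", "NOUN", "PROPN", "VERB"}


def content_skeleton(tokens):
    """Arcs (head form, relation, dependent form) that link content words."""
    by_id = {tok[0]: tok for tok in tokens}
    return [
        (by_id[head][1], rel, form)
        for _, form, upos, head, rel in tokens
        if upos in CONTENT_TAGS and head != 0
    ]


def function_words_have_content_heads(tokens):
    """True if every non-content token attaches to a content word."""
    by_id = {tok[0]: tok for tok in tokens}
    return all(
        by_id[head][2] in CONTENT_TAGS
        for _, _, upos, head, _ in tokens
        if upos not in CONTENT_TAGS and head != 0
    )
```

Stripping the function words leaves only the arcs from "chased" to "cat", "dogs" and "street", which is the part of the analysis expected to stay stable across languages.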

61-64 [Figure: a passive clause in Swedish and in English with parallel dependency analyses; arc labels nsubjpass, auxpass, nmod and case.]
  Hunden  jagades  av   katten  .
  NOUN    VERB     ADP  NOUN    PUNCT
  Features: Hunden Definite=Def; jagades Voice=Pass; katten Definite=Def
  Relations: nsubjpass, nmod, case

  The  dog   was  chased  by   the  cat   .
  DET  NOUN  AUX  VERB    ADP  DET  NOUN  PUNCT
  Relations: nsubjpass, auxpass, nmod, case

67 Dependency Relations
  Taxonomy of 40 universal grammatical relations, broadly attested in language typology (de Marneffe et al., 2014)
  Language-specific subtypes may be added
  Organizing principles
    Three types of structures: nominals, clauses, modifiers
    Core arguments vs. other dependents (not complements vs. adjuncts)

68 Dependents of Clausal Predicates
            Nominal                          Clausal                         Other
  Core      nsubj, nsubjpass, dobj, iobj     csubj, csubjpass, ccomp, xcomp
  Non-Core  nmod, vocative, discourse, expl  advcl                           advmod, neg, aux, auxpass, cop, mark

70-72 [Figure: three example sentences with part-of-speech tags and dependency trees; the arc labels that survive in this transcript are listed under each sentence.]
  Mary   was  quietly  reading  a    book  in   the  garden  .
  PROPN  AUX  ADV      VERB     DET  NOUN  ADP  DET  NOUN    PUNCT
  Relations: aux, advmod, dobj, nmod, case

  If     you   are  sick  ,      you   should  not  exercise  .
  SCONJ  PRON  AUX  ADJ   PUNCT  PRON  AUX     ADV  VERB      PUNCT
  Relations: mark, cop, advcl, aux, neg

  Peter  thought  that   he    should  stop  smoking  .
  PROPN  VERB     SCONJ  PRON  AUX     VERB  VERB     PUNCT
  Relations: ccomp, mark, aux, xcomp

73 Dependents of Nominals
  Nominal: nummod, appos, nmod
  Clausal: acl
  Other:   amod, det, case
  Example:
    Cairo  ,      the  lovely  capital  of   Egypt
    PROPN  PUNCT  DET  ADJ     NOUN     ADP  PROPN
    Relations: appos, amod, nmod, case

74 Coordination
  Relations: conj, cc (punct)
  Example:
    Huey   ,      Dewey  and   Louie
    PROPN  PUNCT  PROPN  CONJ  PROPN
  Coordinate structures are headed by the first conjunct
    Subsequent conjuncts depend on it via the conj relation
    Conjunctions depend on it via the cc relation
    Punctuation marks depend on it via the punct relation
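The rule that everything in a coordination hangs off the first conjunct is simple enough to state as code. This is a sketch with a hypothetical helper `coordinate`; token positions are 1-based and the caller is assumed to know which tokens are conjuncts, conjunctions and punctuation.

```python
def coordinate(conjunct_ids, cc_ids, punct_ids):
    """Return {dependent: (head, relation)} for one coordinate structure."""
    head = conjunct_ids[0]  # the first conjunct heads the whole structure
    arcs = {}
    for dep in conjunct_ids[1:]:
        arcs[dep] = (head, "conj")
    for dep in cc_ids:
        arcs[dep] = (head, "cc")
    for dep in punct_ids:
        arcs[dep] = (head, "punct")
    return arcs


# "Huey , Dewey and Louie": Huey=1, ","=2, Dewey=3, and=4, Louie=5
ARCS = coordinate(conjunct_ids=[1, 3, 5], cc_ids=[4], punct_ids=[2])
```

Here every arc in `ARCS` points back to token 1 (Huey), so the structure is flat rather than chained from conjunct to conjunct.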

75 Multiword Expressions
  Relation  Examples
  mwe       in spite of, as well as, ad hoc
  name      Roger Bacon, Carl XVI Gustaf, New York
  compound  phone book, four thousand, dress up
  goeswith  notwith standing, with out
  UD annotation does not permit words with spaces
  Multiword expressions are analysed using special relations
    The mwe, name and goeswith relations are always head-initial
    The compound relation reflects the internal structure

76 Other Relations
  Relation    Explanation
  parataxis   Loosely linked clauses of same rank
  list        Lists without syntactic structure
  remnant     Orphans in ellipsis linked to parallel elements
  reparandum  Disfluency linked to (speech) repair
  foreign     Elements within opaque stretches of code switching
  dep         Unspecified dependency
  root        Syntactically independent element of clause/phrase

77 Language-Specific Relations
  Language-specific relations are subtypes of universal relations added to capture important phenomena
  Subtyping permits us to back off to universal relations
  Relation      Explanation
  acl:relcl     Relative clause
  compound:prt  Verb particle (dress up)
  nmod:poss     Genitive nominal (Mary's book)
  nmod:agent    Agent in passive (saved by the bell)
  cc:preconj    Preconjunction (both ... and)
  det:predet    Predeterminer (all those ...)
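Because a subtype is written as the universal relation plus a suffix after a colon, backing off is a single string operation. A minimal sketch, with hypothetical function names:

```python
from collections import Counter


def universal_relation(deprel):
    """Strip a language-specific subtype: 'nmod:poss' -> 'nmod'."""
    return deprel.split(":", 1)[0]


def backoff_counts(deprels):
    """Count relations after backing off to the universal inventory."""
    return Counter(universal_relation(rel) for rel in deprels)
```

This is what makes treebanks comparable even when only some of them use subtypes: counting or evaluating on `universal_relation(rel)` puts nmod:poss, nmod:agent and plain nmod in the same bucket.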

81 Word Segmentation
  How do we segment sentences into words?
    Dependent on language and writing system, often non-trivial
    Segmentation must be reproducible on new data
  Two options provided:
    Only include words in treebank, but document segmentation
    Include mapping from low-level tokenisation to words in treebank
  Example:
    Tokens:  Vámonos  al  mar   .
             ?        ?   NOUN  PUNCT
    Words:   Vamos  nos   a    el   mar   .
             VERB   PRON  ADP  DET  NOUN  PUNCT

82-92 CoNLL-U Format
  Revised version of the CoNLL-X format
  Two-level segmentation and secondary dependencies

  ID   FORM     LEMMA     UPOSTAG  XPOSTAG  FEATS                               HEAD  DEPREL  DEPS  MISC
  1-2  Vámonos
  1    Vamos    ir        VERB              Mood=Imp|Number=Plur|Person=1       0     root
  2    nos      nosotros  PRON              PronType=Per|Number=Plur|Person=1   1     expl
  3-4  al
  3    a        a         ADP                                                   5     case
  4    el       el        DET               Definite=Def|Gender=Masc|Number=Sing  5   det
  5    mar      mar       NOUN              Gender=Masc|Number=Sing             1     nmod
  6    ...
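The two-level segmentation can be read back with a few lines of code. The sketch below assumes tab-separated fields and `_` for empty values, as in the CoNLL-U format; the sample rows follow the slide, but the HEAD values for "a", "el" and "mar", the labels root, det and punct, and the final punctuation row are my reconstruction, and the function names are hypothetical.

```python
FIELDS = ["id", "form", "lemma", "upostag", "xpostag",
          "feats", "head", "deprel", "deps", "misc"]

ROWS = [
    ["1-2", "Vámonos", "_", "_", "_", "_", "_", "_", "_", "_"],
    ["1", "Vamos", "ir", "VERB", "_", "Mood=Imp|Number=Plur|Person=1", "0", "root", "_", "_"],
    ["2", "nos", "nosotros", "PRON", "_", "Number=Plur|Person=1|PronType=Per", "1", "expl", "_", "_"],
    ["3-4", "al", "_", "_", "_", "_", "_", "_", "_", "_"],
    ["3", "a", "a", "ADP", "_", "_", "5", "case", "_", "_"],
    ["4", "el", "el", "DET", "_", "Definite=Def|Gender=Masc|Number=Sing", "5", "det", "_", "_"],
    ["5", "mar", "mar", "NOUN", "_", "Gender=Masc|Number=Sing", "1", "nmod", "_", "_"],
    ["6", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def read_sentence(text):
    """Split one sentence into syntactic words and multiword-token ranges."""
    words, ranges = [], []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        row = dict(zip(FIELDS, line.split("\t")))
        if "-" in row["id"]:
            start, end = (int(part) for part in row["id"].split("-"))
            ranges.append((start, end, row["form"]))
        else:
            row["id"] = int(row["id"])
            row["head"] = int(row["head"])
            words.append(row)
    return words, ranges


def surface_tokens(words, ranges):
    """Recover the original tokens: a range replaces the words it covers."""
    covered = {i: (start, form)
               for start, end, form in ranges
               for i in range(start, end + 1)}
    tokens = []
    for word in words:
        if word["id"] in covered:
            start, form = covered[word["id"]]
            if word["id"] == start:
                tokens.append(form)
        else:
            tokens.append(word["form"])
    return tokens
```

Calling `surface_tokens(*read_sentence(SAMPLE))` walks from the six syntactic words back to the four tokens of the input text, which is the recoverability requirement from the design principles.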

96 Where are we now?
  Universal Dependencies, Version 1
    Guidelines released October 2014
    Latest treebank release November 2015 (v1.2): Ancient Greek, Arabic, Basque, Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Gothic, Greek, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Norwegian, Old Church Slavonic, Persian, Polish, Portuguese, Romanian, Slovenian, Spanish, Swedish, Tamil
  Future plans:
    New releases every six months (May, November)
    Revision of guidelines as needed
  Have a look at

102 So what exactly is UD?
  A new linguistic theory?
    Not at all, but we like to think it is informed by linguistic theory and potentially useful also for linguistic studies
  A better parsing framework?
    Probably not, since parsers seem to prefer function words as heads, so we may have to tweak the representations for parsing
  The ultimate annotation scheme?
    Not quite, more like a lingua franca for treebank developers and definitely useful for some annotation projects
  A universal grammar? Well, who knows?
    Not in the Chomskyan sense, but hopefully in the more practical sense of facilitating multilingual NLP by bringing a little order into the chaos

103 Acknowledgments
  Core UD group: Filip Ginter, Yoav Goldberg, Jan Hajič, Chris Manning, Ryan McDonald, Natalia Silveira, Marie de Marneffe, Slav Petrov, Sampo Pyysalo, Reut Tsarfaty, Dan Zeman
  UD contributors: Željko Agić, Riyaz Ahmad, Maria Jesus Aranzabe, Masayuki Asahara, Aitziber Atutxa, Cristina Bosco, Giuseppe G. A. Celano, Jinho Choi, Çağrı Çöltekin, Kaja Dobrovoljc, Timothy Dozat, Binyam Ephrem, Tomaž Erjavec, Richárd Farkas, Jennifer Foster, Koldo Gojenola, Iakes Goenaga, Bruno Guillaume, Nizar Habash, Dag Haug, Anders Trærup Johannsen, Hiroshi Kanayama, Jenna Kanerva, Simon Krek, Juha Kuokkala, Veronika Laippala, Alessandro Lenci, Krister Lindén, Nikola Ljubešić, Olga Lyashevskaya, Teresa Lynn, Aibek Makazhanov, Catalina Maranduc, Héctor Martínez Alonso, Anna Missilä, Verginica Mititelu, Yusuke Miyao, Simonetta Montemagni, Shinsuke Mori, Hanna Nurmi, Petya Osenova, Lilja Øvrelid, Elena Pascual, Marco Passarotti, Jussi Piitulainen, Barbara Plank, Prokopis Prokopidis, Loganathan Ramasamy, Wolfgang Seeker, Mojgan Seraji, Maria Simi, Kiril Simov, Arne Skjæerholt, Aaron Smith, Jan Štěpánek, Takaaki Tanaka, Francis Tyers, Sumire Uematsu, Veronika Vincze, Rob Voigt, Jonathan Washington


More information

CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING

CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING Mary-Elizabeth ( M-E ) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc. Is there valuable

More information

CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test

CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed

More information

Speaking your language...

Speaking your language... 1 About us: Cuttingedge Translation Services Pvt. Ltd. (Cuttingedge) has its corporate headquarters in Noida, India and an office in Glasgow, UK. Over the time we have serviced clients from various backgrounds

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

INTERC O MBASE. Global Language Solution WWW.INTERCOMBASE.COM

INTERC O MBASE. Global Language Solution WWW.INTERCOMBASE.COM INTERC O MBASE Global Language Solution Tel.: (UK) +44 20 360 86157 E-mail: [email protected] Skype ID: intercombase.translations WWW.INTERCOMBASE.COM Services Credentials Expertise Document Translation

More information

Table 1: TSQM Version 1.4 Available Translations

Table 1: TSQM Version 1.4 Available Translations Quintiles, Inc. 1 Tables 1, 2, & 3 below list the existing and available translations for the TSQM v1.4, TSQM vii, TSQM v9. If Quintiles does not have a translation that your Company needs, the Company

More information

RESEARCH ASSISTANCE. The Portal is also accessible to the general public but restricted to the free case law databases.

RESEARCH ASSISTANCE. The Portal is also accessible to the general public but restricted to the free case law databases. RESEARCH ASSISTANCE I. Introduction The Common Portal of National Case Law is a meta-search engine which enables users to simultaneously research almost all the case law databases of the Supreme Courts

More information

Translution Price List GBP

Translution Price List GBP Translution Price List GBP TABLE OF CONTENTS Services AD HOC MACHINE TRANSLATION... LIGHT POST EDITED TRANSLATION... PROFESSIONAL TRANSLATION... 3 TRANSLATE, EDIT, REVIEW TRANSLATION (TWICE TRANSLATED)...3

More information

Tel: +971 4 266 3517 Fax: +971 4 268 9615 P.O. Box: 22392, Dubai - UAE [email protected] [email protected] www.communicationdubai.

Tel: +971 4 266 3517 Fax: +971 4 268 9615 P.O. Box: 22392, Dubai - UAE info@communicationdubai.com comm123@emirates.net.ae www.communicationdubai. Tel: +971 4 266 3517 Fax: +971 4 268 9615 P.O. Box: 22392, Dubai - UAE [email protected] [email protected] www.communicationdubai.com ALL ABOUT TRANSLATION Arabic English Online Human Translation

More information

Annotation Guidelines for Dutch-English Word Alignment

Annotation Guidelines for Dutch-English Word Alignment Annotation Guidelines for Dutch-English Word Alignment version 1.0 LT3 Technical Report LT3 10-01 Lieve Macken LT3 Language and Translation Technology Team Faculty of Translation Studies University College

More information

Who We Are. Services We Offer

Who We Are. Services We Offer Who We Are Atkins Translation Services is a professional language agency providing cost effective and rapid language services. Our network of over 70 native language professionals ensures we are able to

More information

Overview of admission requirements for the master s degree programs of the Faculty of Arts

Overview of admission requirements for the master s degree programs of the Faculty of Arts Overview of admission requirements for the master s degree programs of the Faculty of Arts Subjects Studies amounting to 78 credits Studies amounting to 42 credits Egyptology and Coptic Studies General

More information

We Answer To All Your Localization Needs!

We Answer To All Your Localization Needs! We Answer To All Your Localization Needs! Str. Traian Nr. 2, Bucharest, Romania 8950 W Olympic Blvd, California, U.S.A (RO) +40.740.182.777 / (US) +1.213.248.2367 www.i-t-local.com; [email protected]

More information

Introductory Guide to the Common European Framework of Reference (CEFR) for English Language Teachers

Introductory Guide to the Common European Framework of Reference (CEFR) for English Language Teachers Introductory Guide to the Common European Framework of Reference (CEFR) for English Language Teachers What is the Common European Framework of Reference? The Common European Framework of Reference gives

More information

Linking the world through professional language services

Linking the world through professional language services ProLINK Linking the world through professional language services ProLINK is strategically located in Hong Kong, Asia world city and gateway to China, where the East meets the West. The economy of China

More information

List of Higher School Certificate Board Developed Courses

List of Higher School Certificate Board Developed Courses List of Higher School Certificate Board Developed Courses ACE 6002 Last Updated: 27 February 2013 Subjects Courses Extension Courses Aboriginal Studies Aboriginal Studies Agriculture Agriculture Ancient

More information

Morphology. Morphology is the study of word formation, of the structure of words. 1. some words can be divided into parts which still have meaning

Morphology. Morphology is the study of word formation, of the structure of words. 1. some words can be divided into parts which still have meaning Morphology Morphology is the study of word formation, of the structure of words. Some observations about words and their structure: 1. some words can be divided into parts which still have meaning 2. many

More information

We Answer All Your Localization Needs!

We Answer All Your Localization Needs! partner We Answer All Your Localization Needs! Version: 2.0 23.05.2014 California, U.S.A Bucharest, Romania (US) +1.714.408.8094 (RO) +40.740.182.777 www.i-t-local.com [email protected] 1 of 13 Our Company

More information

Survey of University of Michigan Graduate-level Area Studies Alumni/ae & FLAS Recipients from 1996-2006: Selected Findings

Survey of University of Michigan Graduate-level Area Studies Alumni/ae & FLAS Recipients from 1996-2006: Selected Findings Survey of University of Michigan Graduate-level Area Studies Alumni/ae & FLAS Recipients from 1996-2006: Selected Findings Azumi Ann Takata, Center for Japanese Studies, International Institute Donna Parmelee,

More information

POS Tagsets and POS Tagging. Definition. Tokenization. Tagset Design. Automatic POS Tagging Bigram tagging. Maximum Likelihood Estimation 1 / 23

POS Tagsets and POS Tagging. Definition. Tokenization. Tagset Design. Automatic POS Tagging Bigram tagging. Maximum Likelihood Estimation 1 / 23 POS Def. Part of Speech POS POS L645 POS = Assigning word class information to words Dept. of Linguistics, Indiana University Fall 2009 ex: the man bought a book determiner noun verb determiner noun 1

More information

Quality Data for Your Information Infrastructure

Quality Data for Your Information Infrastructure SAP Product Brief SAP s for Small Businesses and Midsize Companies SAP Data Quality Management, Edge Edition Objectives Quality Data for Your Information Infrastructure Data quality management for confident

More information

Less Grammar, More Features

Less Grammar, More Features Less Grammar, More Features David Hall Greg Durrett Dan Klein Computer Science Division University of California, Berkeley {dlwh,gdurrett,klein}@cs.berkeley.edu Abstract We present a parser that relies

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information

Paraphrasing controlled English texts

Paraphrasing controlled English texts Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich [email protected] Abstract. We discuss paraphrasing controlled English texts, by defining

More information

Yandex.Translate API Developer's guide

Yandex.Translate API Developer's guide 5.08.2015 .. Version 1.5 Document build date: 5.08.2015. This volume is a part of Yandex technical documentation. Yandex helpdesk site: http://help.yandex.ru 2008 2015 Yandex LLC. All rights reserved.

More information

Parsing Swedish. Atro Voutilainen Conexor oy [email protected]. CG and FDG

Parsing Swedish. Atro Voutilainen Conexor oy atro.voutilainen@conexor.fi. CG and FDG Parsing Swedish Atro Voutilainen Conexor oy [email protected] This paper presents two new systems for analysing Swedish texts: a light parser and a functional dependency grammar parser. Their

More information

Knowledge of Foreign Languages in the Czech Republic

Knowledge of Foreign Languages in the Czech Republic Knowledge of Foreign Languages in the Czech Republic Presentation of the results of a sociological survey Prepared by STEM for CzechInvest, Prague, 3 June 214 Survey Specification This presentation details

More information

MT Search Elastic Search for Magento

MT Search Elastic Search for Magento Web Site: If you have any questions, please contact us. MT Search Elastic Search for Magento Version 1.0.0 for Magento 1.9.x Download: http:///elasticsearch 2014 1 Table of Contents 1. Introduction...

More information

Formatting Custom List Information

Formatting Custom List Information Hello. MailChimp has a lot of great merge tags that can help you customize your email campaigns. You can use these merge tags to dynamically add content to your email. With merge tags, you can include

More information

Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles

Why language is hard. And what Linguistics has to say about it. Natalia Silveira Participation code: eagles Why language is hard And what Linguistics has to say about it Natalia Silveira Participation code: eagles Christopher Natalia Silveira Manning Language processing is so easy for humans that it is like

More information

Luxembourg-Luxembourg: FL/SCIENT15 Translation services 2015/S 039-065697. Contract notice. Services

Luxembourg-Luxembourg: FL/SCIENT15 Translation services 2015/S 039-065697. Contract notice. Services 1/12 This notice in TED website: http://ted.europa.eu/udl?uri=ted:notice:65697-2015:text:en:html Luxembourg-Luxembourg: FL/SCIENT15 Translation services 2015/S 039-065697 Contract notice Services Directive

More information

European Economic and Social Committee

European Economic and Social Committee European Economic and Social Committee INTRODUCTION The European Economic and Social Committee (EESC) is organising a workshop that will allow secondary school children from the 28 EU Member States to

More information

Building gold-standard treebanks for Norwegian

Building gold-standard treebanks for Norwegian Building gold-standard treebanks for Norwegian Per Erik Solberg National Library of Norway, P.O.Box 2674 Solli, NO-0203 Oslo, Norway [email protected] ABSTRACT Språkbanken at the National Library of Norway

More information

Brasshouse Languages Course programme September to December 2016

Brasshouse Languages Course programme September to December 2016 Brasshouse Languages Course programme September to December 2016 Course Fees Venue Code Title Start End Sessions Hours Day BRASSHOUSE 16BH001071S Ancient Greek - Beginners 28-Sep 14-Dec 11 22 BRASSHOUSE

More information

Special Topics in Computer Science

Special Topics in Computer Science Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS

More information

Microsoft stores badge guidelines. February 2016

Microsoft stores badge guidelines. February 2016 Microsoft stores badge guidelines February 2016 Welcome Together we can do amazing things. Millions of fans, thousands of partners and developers across the world empower people and organizations do great

More information

Phase 2 of the D4 Project. Helmut Schmid and Sabine Schulte im Walde

Phase 2 of the D4 Project. Helmut Schmid and Sabine Schulte im Walde Statistical Verb-Clustering Model soft clustering: Verbs may belong to several clusters trained on verb-argument tuples clusters together verbs with similar subcategorization and selectional restriction

More information

Multi language e Discovery Three Critical Steps for Litigating in a Global Economy

Multi language e Discovery Three Critical Steps for Litigating in a Global Economy Multi language e Discovery Three Critical Steps for Litigating in a Global Economy 2 3 5 6 7 Introduction e Discovery has become a pressure point in many boardrooms. Companies with international operations

More information

External Candidate Online Application

External Candidate Online Application External Candidate Online Application Candidates wishing to sit examinations in a Department of Education Secondary School Before applying as an External Candidate you must first download the School request

More information

Luxembourg-Luxembourg: FL/TERM15 Translation services 2015/S 253-462303. Contract notice. Services

Luxembourg-Luxembourg: FL/TERM15 Translation services 2015/S 253-462303. Contract notice. Services 1 / 12 This notice in TED website: http://ted.europa.eu/udl?uri=ted:notice:462303-2015:text:en:html Luxembourg-Luxembourg: FL/TERM15 Translation services 2015/S 253-462303 Contract notice Services Directive

More information

Syntactic Transfer Using a Bilingual Lexicon

Syntactic Transfer Using a Bilingual Lexicon Syntactic Transfer Using a Bilingual Lexicon Greg Durrett, Adam Pauls, and Dan Klein UC Berkeley Parsing a New Language Parsing a New Language Mozambique hope on trade with other members Parsing a New

More information

About CRC? What is Email Link?

About CRC? What is Email Link? About CRC? The Community Relations Commission for a multicultural NSW (CRC) was established by Parliament to implement a new approach to protecting and promoting community harmony in our unique culturally

More information

Cross-Language Instant Messaging with Automatic Translation

Cross-Language Instant Messaging with Automatic Translation Cross-Language Instant Messaging with Automatic Translation Che-Yu Yang Department of Information Management China University of Technology Taipei, Taiwan e-mail: [email protected] Abstract Along with

More information

Evalita 09 Parsing Task: constituency parsers and the Penn format for Italian

Evalita 09 Parsing Task: constituency parsers and the Penn format for Italian Evalita 09 Parsing Task: constituency parsers and the Penn format for Italian Cristina Bosco, Alessandro Mazzei, and Vincenzo Lombardo Dipartimento di Informatica, Università di Torino, Corso Svizzera

More information

Translating for a Multilingual European Union: Putting Multilingualism into Context Dr Angeliki PETRITS Language Officer European Commission, UK

Translating for a Multilingual European Union: Putting Multilingualism into Context Dr Angeliki PETRITS Language Officer European Commission, UK Translating for a Multilingual European Union: Putting Multilingualism into Context Dr Angeliki PETRITS Language Officer European Commission, UK [email protected] What is multilingualism?

More information

Product Globalization Service. A Partner You Can Trust

Product Globalization Service. A Partner You Can Trust Product Globalization Service A Partner You Can Trust About WistronITS WistronITS an industry-leading information service company founded in June 1992 under the Wistron Business Group. Our range of services

More information

LocaTran Translations Ltd. Professional Translation, Localization and DTP Solutions. www.locatran.com [email protected]

LocaTran Translations Ltd. Professional Translation, Localization and DTP Solutions. www.locatran.com info@locatran.com LocaTran Translations Ltd. Professional Translation, Localization and DTP Solutions About Us Founded in 2004, LocaTran Translations is an ISO 9001:2008 certified translation and localization service provider

More information

EAP 1161 1660 Grammar Competencies Levels 1 6

EAP 1161 1660 Grammar Competencies Levels 1 6 EAP 1161 1660 Grammar Competencies Levels 1 6 Grammar Committee Representatives: Marcia Captan, Maria Fallon, Ira Fernandez, Myra Redman, Geraldine Walker Developmental Editor: Cynthia M. Schuemann Approved:

More information

Luxembourg-Luxembourg: FL/RAIL16 Translation services 2016/S 054-089888. Contract notice. Services

Luxembourg-Luxembourg: FL/RAIL16 Translation services 2016/S 054-089888. Contract notice. Services 1 / 11 This notice in TED website: http://ted.europa.eu/udl?uri=ted:notice:89888-2016:text:en:html Luxembourg-Luxembourg: FL/RAIL16 Translation services 2016/S 054-089888 Contract notice Services Directive

More information

Automatic Detection and Correction of Errors in Dependency Treebanks

Automatic Detection and Correction of Errors in Dependency Treebanks Automatic Detection and Correction of Errors in Dependency Treebanks Alexander Volokh DFKI Stuhlsatzenhausweg 3 66123 Saarbrücken, Germany [email protected] Günter Neumann DFKI Stuhlsatzenhausweg

More information

Context Grammar and POS Tagging

Context Grammar and POS Tagging Context Grammar and POS Tagging Shian-jung Dick Chen Don Loritz New Technology and Research New Technology and Research LexisNexis LexisNexis Ohio, 45342 Ohio, 45342 [email protected] [email protected]

More information

IBM Content Analytics with Enterprise Search, Version 3.0

IBM Content Analytics with Enterprise Search, Version 3.0 IBM Content Analytics with Enterprise Search, Version 3.0 Highlights Enables greater accuracy and control over information with sophisticated natural language processing capabilities to deliver the right

More information

Veronika VINCZE, PhD. PERSONAL DATA Date of birth: 1 July 1981 Nationality: Hungarian

Veronika VINCZE, PhD. PERSONAL DATA Date of birth: 1 July 1981 Nationality: Hungarian Veronika VINCZE, PhD CONTACT INFORMATION Hungarian Academy of Sciences Research Group on Artificial Intelligence Tisza Lajos krt. 103., 6720 Szeged, Hungary Phone: +36 62 54 41 40 Mobile: +36 70 22 99

More information

Syntactic Theory. Background and Transformational Grammar. Dr. Dan Flickinger & PD Dr. Valia Kordoni

Syntactic Theory. Background and Transformational Grammar. Dr. Dan Flickinger & PD Dr. Valia Kordoni Syntactic Theory Background and Transformational Grammar Dr. Dan Flickinger & PD Dr. Valia Kordoni Department of Computational Linguistics Saarland University October 28, 2011 Early work on grammar There

More information

Outline of today s lecture

Outline of today s lecture Outline of today s lecture Generative grammar Simple context free grammars Probabilistic CFGs Formalism power requirements Parsing Modelling syntactic structure of phrases and sentences. Why is it useful?

More information

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic by Sigrún Helgadóttir Abstract This paper gives the results of an experiment concerned with training three different taggers on tagged

More information

Rule based Sentence Simplification for English to Tamil Machine Translation System

Rule based Sentence Simplification for English to Tamil Machine Translation System Volume 25 No8, July 2011 Rule based Sentence Simplification for English to Tamil Machine Translation System Poornima C, Dhanalakshmi V Computational Engineering and Networking Amrita Vishwa Vidyapeetham

More information

A global leader in document translations

A global leader in document translations Since 1993, Northwest Translations has been a global leader in providing exceptional high quality document translations with emphasis in the MEDICAL/LIFE SCIENCES, LEGAL, ENGINEERING, MARKETING/ADVERTISING

More information

31 Case Studies: Java Natural Language Tools Available on the Web

31 Case Studies: Java Natural Language Tools Available on the Web 31 Case Studies: Java Natural Language Tools Available on the Web Chapter Objectives Chapter Contents This chapter provides a number of sources for open source and free atural language understanding software

More information

THE ETHICS HELPLINE Worldwide Dialing Instructions April 2012

THE ETHICS HELPLINE Worldwide Dialing Instructions April 2012 COUNTRY DIALING INSTRUCTIONS US, Canada and Virgin Islands The Ethics Helpline is always available, 24/7/365 888 478 6858 (Dialing instructions for other jurisdictions follow) Coming soon internet reporting

More information

Internet sites for machine translation available language-pairs ** Part 1 direct translation sites

Internet sites for machine translation available language-pairs ** Part 1 direct translation sites Internet sites for machine translation available -pairs ** Part 1 direct translation sites Al Misbar http://www.almisbar.com/salam.html ATA Software (www.ataso ft.com) English Arabic N Alta Vista http://babelfish.altavista.com/translate.dyn

More information

Structure of Clauses. March 9, 2004

Structure of Clauses. March 9, 2004 Structure of Clauses March 9, 2004 Preview Comments on HW 6 Schedule review session Finite and non-finite clauses Constituent structure of clauses Structure of Main Clauses Discuss HW #7 Course Evals Comments

More information

Research Portfolio. Beáta B. Megyesi January 8, 2007

Research Portfolio. Beáta B. Megyesi January 8, 2007 Research Portfolio Beáta B. Megyesi January 8, 2007 Research Activities Research activities focus on mainly four areas: Natural language processing During the last ten years, since I started my academic

More information

Chinese Open Relation Extraction for Knowledge Acquisition

Chinese Open Relation Extraction for Knowledge Acquisition Chinese Open Relation Extraction for Knowledge Acquisition Yuen-Hsien Tseng 1, Lung-Hao Lee 1,2, Shu-Yen Lin 1, Bo-Shun Liao 1, Mei-Jun Liu 1, Hsin-Hsi Chen 2, Oren Etzioni 3, Anthony Fader 4 1 Information

More information

webcertain Recruitment pack Ceri Wright [Pick the date]

webcertain Recruitment pack Ceri Wright [Pick the date] Recruitment pack Ceri Wright [Pick the date] SEO Executive Have you recently caught the SEO bug and looking to develop your skills and career in a rapidly growing agency? If your answer is YES then Webcertain

More information

IPCC translation and interpretation policy. February 2015

IPCC translation and interpretation policy. February 2015 IPCC translation and interpretation policy February 2015 September 2013 1 Contents 1. Introduction 1.1 Scope 1.2 Definitions 1.3 Aims of this policy 1.4 Contact for queries on this policy 2. Background

More information

Why are Organizations Interested?

Why are Organizations Interested? SAS Text Analytics Mary-Elizabeth ( M-E ) Eddlestone SAS Customer Loyalty [email protected] +1 (607) 256-7929 Why are Organizations Interested? Text Analytics 2009: User Perspectives on Solutions

More information

HP Business Notebook Password Localization Guidelines V1.0

HP Business Notebook Password Localization Guidelines V1.0 HP Business Notebook Password Localization Guidelines V1.0 November 2009 Table of Contents: 1. Introduction..2 2. Supported Platforms...2 3. Overview of Design...3 4. Supported Keyboard Layouts in Preboot

More information

COMPUTATIONAL DATA ANALYSIS FOR SYNTAX

COMPUTATIONAL DATA ANALYSIS FOR SYNTAX COLING 82, J. Horeck~ (ed.j North-Holland Publishing Compa~y Academia, 1982 COMPUTATIONAL DATA ANALYSIS FOR SYNTAX Ludmila UhliFova - Zva Nebeska - Jan Kralik Czech Language Institute Czechoslovak Academy

More information

According to the Argentine writer Jorge Luis Borges, in the Celestial Emporium of Benevolent Knowledge, animals are divided

According to the Argentine writer Jorge Luis Borges, in the Celestial Emporium of Benevolent Knowledge, animals are divided Categories Categories According to the Argentine writer Jorge Luis Borges, in the Celestial Emporium of Benevolent Knowledge, animals are divided into 1 2 Categories those that belong to the Emperor embalmed

More information

placing people first SALARY REPORT Summary of 2014 Bratislava

placing people first SALARY REPORT Summary of 2014 Bratislava placing people first SALARY REPORT Summary of 2014 1 / Cpl Jobs Salary Report Table of content 3 / Summary of 2014 / About Cpl Jobs 4 / IT 7 / Finance 9 / BPO/SSC 2 / Cpl Jobs Salary Report Summary of

More information

Cyclope Internet Filtering Proxy. - User Guide -

Cyclope Internet Filtering Proxy. - User Guide - Cyclope Internet Filtering Proxy - User Guide - 1. Overview 3 2. Cyclope Internet Filtering Proxy User Interface 4 2.1 Login 4 2.2 Logout 4 3. Administration 5 3.1 IP Management 5 3.2 Proxy Forwarding

More information

Towards a RB-SMT Hybrid System for Translating Patent Claims Results and Perspectives

Towards a RB-SMT Hybrid System for Translating Patent Claims Results and Perspectives Towards a RB-SMT Hybrid System for Translating Patent Claims Results and Perspectives Ramona Enache and Adam Slaski Department of Computer Science and Engineering Chalmers University of Technology and

More information