Sense-Tagging Verbs in English and Chinese

Hoa Trang Dang
Department of Computer and Information Sciences, University of Pennsylvania
htd@linc.cis.upenn.edu
October 30, 2003
Outline
- English sense-tagging
  - Senseval-1 verbs
  - Senseval-2 verbs
  - WordNet verb sense groupings
- Chinese sense-tagging
  - Penn Chinese Treebank
  - People's Daily News
- Sense-tagging in PropBank II
Local Contextual Predicates for English WSD
- Collocational (Ratnaparkhi POS tagger): target verb w; POS of w; POS of words at positions -1, +1 w.r.t. w; words at positions -2, -1, +1, +2 w.r.t. w
- Syntactic (Collins parser): whether the sentence containing w is passive; whether there is a sentential complement, subject, direct object, or indirect object; the words (if any) in the positions of subject, direct object, indirect object, particle, prepositional complement (and its object)
- Semantic (Nymble; Bikel et al.): Named Entity tag (PERSON, ORGANIZATION, LOCATION) for proper nouns, and WordNet synsets and hypernyms for all nouns in the above syntactic relations to w
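The collocational part of this feature template can be sketched as follows. This is a minimal illustration, not the authors' implementation; the feature names and the `<pad>` boundary token are assumptions.

```python
def collocational_features(tokens, pos_tags, i):
    """Collocational features for the target verb at index i, following the
    template on this slide: the target word and its POS, POS of the words
    at -1/+1, and the words at -2/-1/+1/+2 relative to the target.
    """
    def word(j):
        # Pad at sentence boundaries (an assumed convention).
        return tokens[j] if 0 <= j < len(tokens) else "<pad>"

    def pos(j):
        return pos_tags[j] if 0 <= j < len(pos_tags) else "<pad>"

    return {
        "w": word(i), "pos": pos(i),
        "pos-1": pos(i - 1), "pos+1": pos(i + 1),
        "w-2": word(i - 2), "w-1": word(i - 1),
        "w+1": word(i + 1), "w+2": word(i + 2),
    }
```

Each key/value pair would become one binary predicate in a maximum-entropy model.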
Topical Contextual Keywords
- Generate a list of keywords from the training set for each verb:
  - Sort all words k by the entropy H(p(s|k)) of the sense distribution conditioned on k, where k appears anywhere in the context, provided that k appears in more than M (= 2) instances in the corpus
  - Select the 200-300 words k with the lowest entropy (most informative)
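The selection step above can be sketched as below, assuming the training data for one verb is available as (sense, context words) pairs; the function and variable names are illustrative, not from the original system.

```python
import math
from collections import Counter, defaultdict

def select_topical_keywords(instances, min_count=2, max_keywords=200):
    """Select low-entropy context words as topical keywords for one verb.

    instances: list of (sense, context_words) pairs for a single target verb
    (a hypothetical representation of the training data).
    """
    # Count how often each context word co-occurs with each sense.
    sense_counts = defaultdict(Counter)
    word_counts = Counter()
    for sense, context in instances:
        for w in set(context):
            sense_counts[w][sense] += 1
            word_counts[w] += 1

    # Entropy H(p(s|k)) of the conditional sense distribution given word k.
    def entropy(word):
        total = word_counts[word]
        return -sum((c / total) * math.log2(c / total)
                    for c in sense_counts[word].values())

    # Keep only words seen in more than min_count instances, as on the slide.
    candidates = [w for w, n in word_counts.items() if n > min_count]
    candidates.sort(key=entropy)  # lowest entropy = most informative
    return candidates[:max_keywords]
```

A word that nearly always co-occurs with one sense has entropy near zero and is kept; a word spread evenly across senses has high entropy and is dropped.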
Senseval-1 Lexical Sample Task
- Lexicon: Hector lexical database; senses are organized in hierarchies
- Corpus: British National Corpus
- High average inter-annotator agreement (95.5%)
- 13 verbs (12 senses/verb in corpus)
- Avg training set size: 215 instances/verb
- Baseline (most frequent sense): 57%
Senseval-1 Verb Results

  System                      Accuracy   p-value
  Avg. System                 66.4       0.001
  ETS (Naive Bayes)           71.0       0.005
  MaxEnt (lex+trans+topic)    72.3       0.100
  MaxEnt (best variants)      73.7       0.400
  JHU-final (Decision List)   74.3       -
Senseval-2 English Verb Lexical Sample Task
- Lexicon: WordNet 1.7; senses are also grouped
- Corpus: Penn Treebank WSJ, supplemented with the British National Corpus
- Inter-annotator agreement: 71%
- 29 verbs, mostly highly polysemous (16 senses/verb in corpus)
- Avg training set size: 110 instances/verb
- Baseline (most frequent sense): 40%
- Best system performance: 60%
System Accuracy and Feature Types (English)

  Feature (local)   Accuracy    Feature (local, topic)   Accuracy
  collocation       48.3        collocation              52.9
  +syn              53.9        +syn                     54.2
  +syn+sem          59.0        +syn+sem                 60.2

Linguistically richer features improve system accuracy.
Senseval-2 Verb Results

  System        Accuracy   p-value
  Avg. System   38.2       0.001
  SMU           56.3       0.010
  JHU           56.6       0.020
  KUNLP         57.6       0.100
  MaxEnt        60.2       -
  (Human)       71.3       0.001
Senseval-2 Verb Groupings Methodology
- Groupings of senses done after sense-tagging for Senseval-2
- Double-blind grouping of each verb by two people
- Discussion of the criteria used for groupings, both syntactic and semantic
- Adjudication of groupings by a third person using the agreed-upon criteria
Groupings Improve Performance
- Well-defined groupings improve human inter-annotator agreement (71% to 82%)
- Random groupings produced an insignificant improvement in inter-annotator agreement (71% to 73%)
- Similar improvement in system score (60% to 70%)
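Scoring at the group level means a prediction is credited whenever it falls in the same sense group as the gold tag. A minimal sketch, with hypothetical sense and group labels:

```python
def grouped_accuracy(gold, predicted, group_of):
    """Accuracy at the sense-group level: a prediction is correct if it is
    in the same group as the gold sense.  group_of maps each fine-grained
    sense to its group id (labels here are illustrative).
    """
    correct = sum(group_of[g] == group_of[p] for g, p in zip(gold, predicted))
    return correct / len(gold)
```

Under this scoring, confusions between closely related senses within a group (the kind annotators also disagree on) no longer count as errors, which is why both human agreement and system scores rise.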
Chinese WSD (CTB)
- Lexicon: CETA (Chinese-English Translation Assistance) Dictionary
- Corpus: Penn Chinese Treebank (100K words)
  - Manual segmentation, POS tagging, parsing
- 28 words (multiple verb senses, possibly other POS), most polysemous in a 5K-word sample of the corpus
- 3.5 senses/word in corpus
- Baseline (most frequent sense): 77%
Contextual Predicates (Chinese)
- Local features:
  - Collocational features: same as for English, plus a follows-verb feature
  - Syntactic features: hassubj, subj, hasobj, obj-p, obj, hasinobj, Comp-VP, VP-Comp, Comp-IP, hasprd
  - Semantic features (for verbs only): HowNet noun category for each subject and object
- Topical features: same as for English
System Accuracy and Feature Types (CTB)

  Feature type                       Accuracy   Std. Dev.
  collocation                        86.8       1.0
  collocation (+ pos)                93.4       0.5
  collocation + syntax               94.3       0.4
  collocation + syntax + semantics   94.4       0.6
  baseline                           76.7       -
Chinese WSD (PDN)
- Five words with low accuracy and low counts in CTB were subsequently sense-tagged in People's Daily News (1M words)
- PDN corpus has manual segmentation and POS tagging, but no parses
- About 200 sentences/word in PDN
- 8.2 senses/verb in corpus
- Baseline (most frequent sense): 58%
- Automatic segmentation, POS tagging, and parsing also applied
System Accuracy and Feature Types (PDN, automatic)

  Feature type                       Accuracy   Std. Dev.
  collocation                        72.3       2.2
  collocation (+ pos)                70.3       2.9
  collocation + syntax               71.7       3.0
  collocation + syntax + semantics   72.7       3.1
  baseline                           57.6       -
System Accuracy and Feature Types (PDN, manual)

  Feature type          Accuracy   Std. Dev.
  collocation           71.4       4.3
  collocation (+ pos)   74.7       2.3
  collocation + topic   72.1       3.1
Differences between English and Chinese
- Higher number of verbs in Chinese than in English
- Lower polysemy per verb in Chinese
- Many multi-character Chinese verbs
- Much of the ambiguity in Chinese is at the level of word segmentation
- Lexical collocational information may be sufficient for Chinese
PropBank II Sense-Tagging
- Feasibility study: tag a reasonable set of polysemous words in English and in the Chinese Treebank (CTB)
- Determine realistic, concrete sense-tagging goals for the next two years:
  - Which sense distinctions will be most relevant to IE and MT? How fine-grained do we really need to be?
  - What is the most efficient/accurate way to produce the data? Hierarchical tagging? Active learning? Does hand-correcting automatic tagging bias the results?