LINKING VERB PATTERN DICTIONARIES OF ENGLISH AND SPANISH

1 LINKING VERB PATTERN DICTIONARIES OF ENGLISH AND SPANISH
Vít Baisa, Masaryk University, Brno, Czech Republic
Sara Može, University of Wolverhampton, United Kingdom
Irene Renau, Pontificia Universidad Católica de Valparaíso, Chile

2 INTRODUCTION
Verbs are complex.
AIM: methodology and tools for the creation of a multilingual corpus-driven lexical resource for verbs, using manual and automatic procedures
CPA-based monolingual pattern dictionaries: what are they?
New multilingual resource for researchers and language professionals?
Preliminary study:
  I. Manual linking task: gold standard dataset
  II. Automatic linking task: algorithm, evaluated against the gold standard

3 CORPUS PATTERN ANALYSIS (CPA)
Corpus Pattern Analysis (CPA) (Hanks, 2004): an empirical technique in corpus linguistics and lexicography
Maps word meaning onto word use through lexical analysis of phraseological patterns and collocations
Basis: Theory of Norms and Exploitations (TNE) (Hanks, 2013)
  "Double helix": patterns of normal usage ("norms") vs. their exploitations
Pattern: a semantically motivated syntagmatic pattern
  Syntax: SPOCA (Halliday, 1994)
  Semantics: typical nominal slot fillers, represented by Semantic Types (STs), i.e. mnemonic semantic labels
CPA shallow ontology (Ježek and Hanks, 2010): approx. 250 STs; shared by several projects

4 WHAT IS A PATTERN?
Example: PDEV entry for harvest
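As a rough illustration of the notation used in the examples on the later slides, a pattern string such as [[Human]] admire [[Anything]] can be split into a verb and its slots. The sketch below is only one convenient in-memory form, not part of PDEV or the CPA tools: the function name parse_pattern, the dictionary layout, and the use of "|" to separate alternative semantic types inside a slot are all assumptions made for illustration.

```python
import re

# A slot is anything between double square brackets, e.g. [[Human]].
SLOT_RE = re.compile(r"\[\[(.*?)\]\]")


def parse_pattern(text):
    """Split a pattern string into its verb and a list of slots.

    Each slot becomes a list of semantic type labels. Alternatives are
    assumed (hypothetically) to be separated by "|".
    """
    slots = [[t.strip() for t in raw.split("|")] for raw in SLOT_RE.findall(text)]
    # Whatever is left once the slots are removed is treated as the verb.
    verb = " ".join(SLOT_RE.sub(" ", text).split())
    return {"verb": verb, "slots": slots}


example = parse_pattern("[[Human]] admire [[Anything]]")
print(example)
# {'verb': 'admire', 'slots': [['Human'], ['Anything']]}
```

A structure like this keeps the syntactic position of each slot (by list index) next to its semantic types, which is the information the linking steps on the following slides compare.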

5 CPA PATTERN DICTIONARIES
Pattern Dictionary of Italian Verbs (PDIV)
  Elisabetta Ježek, Pavia
Pattern Dictionary of English Verbs (PDEV)
  Public website
  Prof. Hanks, University of Wolverhampton; over 1,700 English verbs completed
  Procedure: corpus samples (250/500/1000 lines) from the BNC (Leech, 1992); Sketch Engine word sketches (Kilgarriff et al., 2014), CPA Editor (Baisa et al., 2015) and CPA shallow ontology (Ježek and Hanks, 2010)
  Implicatures; register, domain, idiom/phrasal verb labels; links to FrameNet (Ruppenhofer et al., 2010)
  Percentages for each pattern
Pattern Dictionary of Spanish Verbs (PDSV)
  Public website: Verbario
  Irene Renau, Pontificia Universidad Católica de Valparaíso
  300 high-frequency Spanish verbs (currently only 100 publicly available online)
  Same methodology (CPA), guidelines, ontology and tools (SkE), but based on the Spanish Web Corpus

6 MANUAL LINKING: SP-EN PATTERN PAIRS
Gold standard: 87 SP verbs with one or more EN equivalents (total: 126 EN verbs)
  Medium-frequency verbs, up to 15 patterns
  Manual cross-linguistic links between pattern pairs
  Semanto-syntactic similarity = tertium comparationis
  Linking procedure developed; dataset used in algorithm evaluation
Issues: practical, theoretical
  Coverage: PDEV/PDSV are work-in-progress resources; different coverage; limited overlap
  Zero equivalence: cultural, social, cognitive, pragmatic reasons; idioms

7 INPUT: POTENTIALLY MATCHING EN PATTERN
Q1. Does it have the same basic syntactic structure as the SP pattern (i.e. SVO or SV [+no obj])?
  YES: go to Q2
  NO: OUTPUT: NO MATCH
Q2. Do all semantic types in all obligatory syntactic slots match? E.g.:
  EN: [[Human]] admire [[Anything]]
  SP: [[Human]] admirar [[Anything]]
  YES: OUTPUT: PERFECT MATCH
  NO: go to Q3
Q3. Do the two patterns share at least ONE semantic type in the same obligatory syntactic slot? For example:
  EN: [[Eventuality 1 | Human | Institution]] occasion [[Eventuality 2]]
  SP: [[Eventuality 1]] motivar [[Eventuality 2]]
  YES: OUTPUT: PARTIAL MATCH
  NO: go to Q4
Q4. Are the two semantic types in the same obligatory syntactic position related to each other in terms of inheritance in the CPA ontology (up to two nodes), e.g. [[Eventuality]] (supertype) vs. [[Activity]] and [[Plan]] (subtypes)?
  EN: [[Eventuality 1 | Human]] spoil [[Eventuality 2]]
  SP: [[Eventuality | Human]] estropear [[Activity | Plan]]
  YES: OUTPUT: PARTIAL MATCH
  NO: OUTPUT: NO MATCH
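The manual decision procedure on this slide can be sketched in code. This is a minimal sketch of one reading of the flowchart, not the annotators' actual tooling. It assumes that a structure mismatch ends in "no match", that a shared semantic type in any obligatory slot is enough for a partial match, and that numeric indices such as "Eventuality 1" can be ignored. The names classify, is_related and PARENT, the slot labels "S" and "O", and the toy hierarchy are all hypothetical.

```python
# Toy fragment of a type hierarchy (child -> parent). Only the Eventuality
# subtypes named on the slide are included; this is not the real CPA ontology.
PARENT = {"Activity": "Eventuality", "Plan": "Eventuality"}


def is_related(a, b, max_nodes=2):
    """True if one type is an ancestor of the other within max_nodes steps."""
    def ancestors(st):
        chain = []
        while st in PARENT and len(chain) < max_nodes:
            st = PARENT[st]
            chain.append(st)
        return chain
    return a in ancestors(b) or b in ancestors(a)


def classify(en, sp):
    """en and sp map obligatory slot names to sets of semantic types."""
    # Q1: same basic syntactic structure (same set of obligatory slots)?
    if set(en) != set(sp):
        return "no match"
    slots = sorted(en)
    # Q2: all semantic types in all obligatory slots match?
    if all(en[s] == sp[s] for s in slots):
        return "perfect match"
    # Q3: at least one shared semantic type in the same slot?
    if any(en[s] & sp[s] for s in slots):
        return "partial match"
    # Q4: types in the same slot related by inheritance (up to two nodes)?
    if any(is_related(a, b) for s in slots for a in en[s] for b in sp[s]):
        return "partial match"
    return "no match"


# Examples taken from the slide (indices dropped).
admire_en = {"S": {"Human"}, "O": {"Anything"}}
admirar_sp = {"S": {"Human"}, "O": {"Anything"}}
occasion_en = {"S": {"Eventuality", "Human", "Institution"}, "O": {"Eventuality"}}
motivar_sp = {"S": {"Eventuality"}, "O": {"Eventuality"}}
# Hypothetical pair (not from the slide) that only passes the ontology check.
toy_en = {"S": {"Institution"}, "O": {"Eventuality"}}
toy_sp = {"S": {"Human"}, "O": {"Plan"}}

print(classify(admire_en, admirar_sp))   # perfect match
print(classify(occasion_en, motivar_sp)) # partial match
print(classify(toy_en, toy_sp))          # partial match
```

Under this reading the spoil/estropear pair would already be a partial match at Q3, because the subject slots share types; a stricter per-slot interpretation is equally possible and would need a different loop.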

8 AUTOMATIC PATTERN LINKING: ALGORITHM
Heuristic-based algorithm: automatic linking suggestions
Similarity score
  490 SP patterns and their translations into EN (statistical EN-SP dictionary derived from a parallel corpus)
  S, DO, IO: comparison of STs
  Full match: 1 score pt (*Human = 0.5 pt); matching empty slots (e.g. DO): 0.5 pts
  CPA ontology: similarity score = 0.5^N, calculated based on the distance (N) in the CPA ontology tree
  Scores summed up, final score assigned to the pair; top-ranking EN pattern = most likely candidate
Evaluation
  50 SP-EN verb pairs
  Excluded: SP patterns that cannot be matched against an EN pattern in the sample
  Final no. of candidate pattern pairs: 50 (gold standard)
  40/50 suggested candidate pairs were correct: 80% precision
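The scoring heuristic on this slide can be approximated as follows. This is a sketch under stated assumptions, not the authors' implementation: the ontology score is read as 0.5 raised to the power N, a slot with several semantic types takes the best-scoring pair, unrelated types score 0, and the hierarchy is a toy fragment. All function names, the PARENT table and the pattern labels "admire 1" and "spoil 1" are hypothetical.

```python
SLOTS = ("S", "DO", "IO")

# Toy fragment of a type hierarchy (child -> parent); not the real CPA ontology.
PARENT = {"Activity": "Eventuality", "Plan": "Eventuality"}


def path_to_root(st):
    """Return the type followed by its ancestors in the toy hierarchy."""
    path = [st]
    while st in PARENT:
        st = PARENT[st]
        path.append(st)
    return path


def distance(a, b):
    """Number of edges between two types, or None if they are unconnected."""
    pa, pb = path_to_root(a), path_to_root(b)
    for i, node in enumerate(pa):
        if node in pb:
            return i + pb.index(node)
    return None


def type_score(a, b):
    if a == b:
        # Full match: 1 pt, except Human, which counts 0.5 pt.
        return 0.5 if a == "Human" else 1.0
    n = distance(a, b)
    # Assumed reading of the slide: 0.5 ** N for ontology distance N.
    return 0.0 if n is None else 0.5 ** n


def slot_score(en_types, sp_types):
    if not en_types and not sp_types:
        return 0.5  # matching empty slots
    if not en_types or not sp_types:
        return 0.0
    return max(type_score(a, b) for a in en_types for b in sp_types)


def pattern_score(en, sp):
    """Sum the slot scores over S, DO and IO."""
    return sum(slot_score(en.get(s, set()), sp.get(s, set())) for s in SLOTS)


def best_candidate(sp, en_patterns):
    """Return the name of the top-ranking EN pattern for one SP pattern."""
    return max(en_patterns, key=lambda name: pattern_score(en_patterns[name], sp))


admirar = {"S": {"Human"}, "DO": {"Anything"}}
en_patterns = {
    "admire 1": {"S": {"Human"}, "DO": {"Anything"}},
    "spoil 1": {"S": {"Eventuality", "Human"}, "DO": {"Eventuality"}},
}
print(pattern_score(en_patterns["admire 1"], admirar))  # 2.0
print(best_candidate(admirar, en_patterns))             # admire 1
```

With the evaluation figures from the slide, precision is simply the share of correct suggestions, 40 / 50 = 0.8.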

9 CONCLUSION
Future activities:
  Gold standard: more annotated data; refine the linking procedure (fine-grained distinctions?; intralingual links)
  Algorithm: train, improve precision
  Software adaptation: feature for adding cross-linguistic links to the dictionaries/databases

10 REFERENCES
Baisa, V., El Maarouf, I., Rychlý, P. & Rambousek, A. (2015). Software and data for Corpus Pattern Analysis. In Horák, A., Rychlý, P. & Rambousek, A. (eds.), Ninth Workshop on Recent Advances in Slavonic Natural Language Processing. Brno: Tribun EU.
Buyse, K. & Verlinde, S. (2013). Possible effects of free on line data driven lexicographic instruments on foreign language learning: The case of Linguee and the interactive language toolbox. In Procedia: Social and Behavioral Sciences, volume 95. Elsevier BV.
Fillmore, C. J. & Baker, C. (2010). A frames approach to semantic analysis. The Oxford Handbook of Linguistic Analysis.
Halliday, M. A. K. (1994). An Introduction to Functional Grammar. Edward Arnold.
Hanks, P. (2004). Corpus Pattern Analysis. In G. Williams & S. Vessier (eds.), 11th Euralex International Congress. Proceedings. Lorient: Université de Bretagne-Sud.
Hanks, P. (2013). Lexical Analysis: Norms and Exploitations. Cambridge, MA: MIT Press.
Hlaváčková, D. & Horák, A. (2005). VerbaLex: new comprehensive lexicon of verb valencies for Czech. In Proceedings of the Slovko Conference.
Ježek, E. & Hanks, P. (2010). What lexical sets tell us about conceptual categories. Lexis: E-journal in English lexicology. 4: Corpus Linguistics and the Lexicon. Université Lumière, Lyon.
Ježek, E., Magnini, B., Feltracco, A., Bianchini, A. & Popescu, O. (2014). T-PAS: A resource of corpus-derived typed predicate-argument structures for linguistic analysis and semantic processing. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 14).
Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P. & Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography 1(1).
Leech, G. (1992). 100 million words of English: the British National Corpus (BNC). Language Research 28(1): 1-13.
Maarouf, I. E., Bradbury, J. & Hanks, P. (2014). PDEVlemon: a Linked Data implementation of the Pattern Dictionary of English Verbs based on the Lemon model. In Proceedings of the 3rd Workshop on Linked Data in Linguistics (LDL): Multilingual Knowledge Resources and Natural Language Processing at the Ninth International Conference on Language Resources and Evaluation (LREC 14), Reykjavik, Iceland.
Navigli, R. & Ponzetto, S. (2012). BabelNet: The Automatic Construction, Evaluation and Application of a Wide-Coverage Multilingual Semantic Network. Artificial Intelligence 193.
Nazar, R. & Renau, I. (2015). Ontology Population Using Corpus Statistics. In O. Papini, S. Benferhat, L. Garcia et al. (eds.), Proceedings of the Joint Ontology Workshops 2015 co-located with the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015). Buenos Aires, Argentina, July 25-27.
Ruppenhofer, J., Ellsworth, M., Petruck, M. R., Johnson, C. R. & Scheffczyk, J. (2010). FrameNet II: Extended Theory and Practice. Berkeley, CA: ICSI.
Vossen, P. (2002). WordNet, EuroWordNet and Global WordNet. Revue Française de Linguistique Appliquée 7(1).
Yong, H. & Peng, J. (1997). Bilingual lexicography from a communicative perspective. Amsterdam: John Benjamins.

11 USEFUL LINKS
Pattern Dictionary of English Verbs
VERBARIO (Pattern Dictionary of Spanish Verbs)
PDEV-LEMON

SemEval-2015 Task 15: A Corpus Pattern Analysis Dictionary-Entry-Building Task

SemEval-2015 Task 15: A Corpus Pattern Analysis Dictionary-Entry-Building Task SemEval-2015 Task 15: A Corpus Pattern Analysis Dictionary-Entry-Building Task Vít Baisa Masaryk University xbaisa@fi.muni.cz Ismaïl El Maarouf University of Wolverhampton i.el-maarouf@wlv.ac.uk Jane Bradbury

More information

Overview of MT techniques. Malek Boualem (FT)

Overview of MT techniques. Malek Boualem (FT) Overview of MT techniques Malek Boualem (FT) This section presents an standard overview of general aspects related to machine translation with a description of different techniques: bilingual, transfer,

More information

Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources

Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Michelle

More information

Simple maths for keywords

Simple maths for keywords Simple maths for keywords Adam Kilgarriff Lexical Computing Ltd adam@lexmasterclass.com Abstract We present a simple method for identifying keywords of one corpus vs. another. There is no one-sizefits-all

More information

Capturing Syntactico-semantic Regularities among Terms: An Application of the FrameNet Methodology to Terminology

Capturing Syntactico-semantic Regularities among Terms: An Application of the FrameNet Methodology to Terminology Capturing Syntactico-semantic Regularities among Terms: An Application of the FrameNet Methodology to Terminology Marie-Claude L Homme, Janine Pimentel Observatoire de linguistique Sens-Texte (OLST) Université

More information

Search Result Diversification Methods to Assist Lexicographers

Search Result Diversification Methods to Assist Lexicographers Search Result Diversification Methods to Assist Lexicographers Lars Borin Markus Forsberg Karin Friberg Heppin Richard Johansson Annika Kjellandsson Språkbanken, Department of Swedish, University of Gothenburg

More information

Difficulties that Arab Students Face in Learning English and the Importance of the Writing Skill Acquisition Key Words:

Difficulties that Arab Students Face in Learning English and the Importance of the Writing Skill Acquisition Key Words: Difficulties that Arab Students Face in Learning English and the Importance of the Writing Skill Acquisition Key Words: Lexical field academic proficiency syntactic repertoire context lexical categories

More information

Construction of Thai WordNet Lexical Database from Machine Readable Dictionaries

Construction of Thai WordNet Lexical Database from Machine Readable Dictionaries Construction of Thai WordNet Lexical Database from Machine Readable Dictionaries Patanakul Sathapornrungkij Department of Computer Science Faculty of Science, Mahidol University Rama6 Road, Ratchathewi

More information

What Makes a Good Online Dictionary? Empirical Insights from an Interdisciplinary Research Project

What Makes a Good Online Dictionary? Empirical Insights from an Interdisciplinary Research Project Proceedings of elex 2011, pp. 203-208 What Makes a Good Online Dictionary? Empirical Insights from an Interdisciplinary Research Project Carolin Müller-Spitzer, Alexander Koplenig, Antje Töpel Institute

More information

Building the Multilingual Web of Data: A Hands-on tutorial (ISWC 2014, Riva del Garda - Italy)

Building the Multilingual Web of Data: A Hands-on tutorial (ISWC 2014, Riva del Garda - Italy) Building the Multilingual Web of Data: A Hands-on tutorial (ISWC 2014, Riva del Garda - Italy) Multilingual Word Sense Disambiguation and Entity Linking on the Web based on BabelNet Roberto Navigli, Tiziano

More information

Semantic analysis of text and speech

Semantic analysis of text and speech Semantic analysis of text and speech SGN-9206 Signal processing graduate seminar II, Fall 2007 Anssi Klapuri Institute of Signal Processing, Tampere University of Technology, Finland Outline What is semantic

More information

Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features

Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features , pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, MinzuUniversity of

More information

Supporting FrameNet Project with Semantic Web technologies

Supporting FrameNet Project with Semantic Web technologies Supporting FrameNet Project with Semantic Web technologies Paulo Hauck 1, Regina Braga 1, Fernanda Campos 1, Tiago Torrent 2, Ely Matos 2, José Maria N. David 1 1 Pós Graduação em Ciência da Computação

More information

The Oxford Learner s Dictionary of Academic English

The Oxford Learner s Dictionary of Academic English ISEJ Advertorial The Oxford Learner s Dictionary of Academic English Oxford University Press The Oxford Learner s Dictionary of Academic English (OLDAE) is a brand new learner s dictionary aimed at students

More information

Master of Arts in Linguistics Syllabus

Master of Arts in Linguistics Syllabus Master of Arts in Linguistics Syllabus Applicants shall hold a Bachelor s degree with Honours of this University or another qualification of equivalent standard from this University or from another university

More information

Czech Verbs of Communication and the Extraction of their Frames

Czech Verbs of Communication and the Extraction of their Frames Czech Verbs of Communication and the Extraction of their Frames Václava Benešová and Ondřej Bojar Institute of Formal and Applied Linguistics ÚFAL MFF UK, Malostranské náměstí 25, 11800 Praha, Czech Republic

More information

ASSOCIATING COLLOCATIONS WITH DICTIONARY SENSES

ASSOCIATING COLLOCATIONS WITH DICTIONARY SENSES ASSOCIATING COLLOCATIONS WITH DICTIONARY SENSES Abhilash Inumella Adam Kilgarriff Vojtěch Kovář IIIT Hyderabad, India Lexical Computing Ltd., UK Masaryk Uni., Brno, Cz abhilashi@students.iiit.ac.in adam@lexmasterclass.com

More information

Using DEB Services for Knowledge Representation within the KYOTO Project

Using DEB Services for Knowledge Representation within the KYOTO Project Using DEB Services for Knowledge Representation within the KYOTO Project Aleš Horák and Adam Rambousek Faculty of Informatics, Masaryk University Botanická 68a, 602 00 Brno, Czech Republic {hales,xrambous}@fi.muni.cz

More information

DiCE in the web: An online Spanish collocation dictionary

DiCE in the web: An online Spanish collocation dictionary GRANGER, S.; PAQUOT, M. (EDS.). 2010. ELEXICOGRAPHY IN THE 21ST CENTURY: NEW CHALLENGES, NEW APPLICATIONS. PROCEEDINGS OF ELEX2009, LOUVAIN-LA-NEUVE, 22-24 OCTOBER 2009. CAHIERS DU CENTAL 7. LOUVAIN-LA-NEUVE,

More information

Identifying Focus, Techniques and Domain of Scientific Papers

Identifying Focus, Techniques and Domain of Scientific Papers Identifying Focus, Techniques and Domain of Scientific Papers Sonal Gupta Department of Computer Science Stanford University Stanford, CA 94305 sonal@cs.stanford.edu Christopher D. Manning Department of

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information

D2.4: Two trained semantic decoders for the Appointment Scheduling task

D2.4: Two trained semantic decoders for the Appointment Scheduling task D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive

More information

2014/02/13 Sphinx Lunch

2014/02/13 Sphinx Lunch 2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue

More information

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval Overview of iclef 2008: search log analysis for Multilingual Image Retrieval Julio Gonzalo Paul Clough Jussi Karlgren UNED U. Sheffield SICS Spain United Kingdom Sweden julio@lsi.uned.es p.d.clough@sheffield.ac.uk

More information

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 CWMT2011 技 术 报 告

SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 CWMT2011 技 术 报 告 SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 Jin Yang and Satoshi Enoue SYSTRAN Software, Inc. 4444 Eastgate Mall, Suite 310 San Diego, CA 92121, USA E-mail:

More information

EFL Learners Synonymous Errors: A Case Study of Glad and Happy

EFL Learners Synonymous Errors: A Case Study of Glad and Happy ISSN 1798-4769 Journal of Language Teaching and Research, Vol. 1, No. 1, pp. 1-7, January 2010 Manufactured in Finland. doi:10.4304/jltr.1.1.1-7 EFL Learners Synonymous Errors: A Case Study of Glad and

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,

More information

1. Introduction. 2. The Lexical Constellation Model. Abstract

1. Introduction. 2. The Lexical Constellation Model. Abstract Towards a Dynamic Combinatorial Dictionary: A Proposal for Introducing Interactions between Collocations in an Electronic Dictionary of English Word Combinations Moisés Almela, Pascual Cantos, Aquilino

More information

Approaches of Using a Word-Image Ontology and an Annotated Image Corpus as Intermedia for Cross-Language Image Retrieval

Approaches of Using a Word-Image Ontology and an Annotated Image Corpus as Intermedia for Cross-Language Image Retrieval Approaches of Using a Word-Image Ontology and an Annotated Image Corpus as Intermedia for Cross-Language Image Retrieval Yih-Chen Chang and Hsin-Hsi Chen Department of Computer Science and Information

More information

PROMT Technologies for Translation and Big Data

PROMT Technologies for Translation and Big Data PROMT Technologies for Translation and Big Data Overview and Use Cases Julia Epiphantseva PROMT About PROMT EXPIRIENCED Founded in 1991. One of the world leading machine translation provider DIVERSIFIED

More information

Teaching Framework. Framework components

Teaching Framework. Framework components Teaching Framework Framework components CE/3007b/4Y09 UCLES 2014 Framework components Each category and sub-category of the framework is made up of components. The explanations below set out what is meant

More information

Collecting Polish German Parallel Corpora in the Internet

Collecting Polish German Parallel Corpora in the Internet Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska

More information

Kybots, knowledge yielding robots German Rigau IXA group, UPV/EHU http://ixa.si.ehu.es

Kybots, knowledge yielding robots German Rigau IXA group, UPV/EHU http://ixa.si.ehu.es KYOTO () Intelligent Content and Semantics Knowledge Yielding Ontologies for Transition-Based Organization http://www.kyoto-project.eu/ Kybots, knowledge yielding robots German Rigau IXA group, UPV/EHU

More information

An Efficient Database Design for IndoWordNet Development Using Hybrid Approach

An Efficient Database Design for IndoWordNet Development Using Hybrid Approach An Efficient Database Design for IndoWordNet Development Using Hybrid Approach Venkatesh P rabhu 2 Shilpa Desai 1 Hanumant Redkar 1 N eha P rabhugaonkar 1 Apur va N agvenkar 1 Ramdas Karmali 1 (1) GOA

More information

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic

Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic by Sigrún Helgadóttir Abstract This paper gives the results of an experiment concerned with training three different taggers on tagged

More information

A comparison analysis of modal auxiliary verbs in Technical and General English

A comparison analysis of modal auxiliary verbs in Technical and General English Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 212 ( 2015 ) 292 297 MULTIMODAL COMMUNICATION IN THE 21ST CENTURY: PROFESSIONAL AND ACADEMIC CHALLENGES.

More information

AN OPEN KNOWLEDGE BASE FOR ITALIAN LANGUAGE IN A COLLABORATIVE PERSPECTIVE

AN OPEN KNOWLEDGE BASE FOR ITALIAN LANGUAGE IN A COLLABORATIVE PERSPECTIVE AN OPEN KNOWLEDGE BASE FOR ITALIAN LANGUAGE IN A COLLABORATIVE PERSPECTIVE Chiari I, A. Gangemi, E. Jezek, A. Oltramari, G. Vetere, L. Vieu http://www.sensocomune.it/ Sapienza Università di Roma Université

More information

Natural Language Dialogue in a Virtual Assistant Interface

Natural Language Dialogue in a Virtual Assistant Interface Natural Language Dialogue in a Virtual Assistant Interface Ana M. García-Serrano, Luis Rodrigo-Aguado, Javier Calle Intelligent Systems Research Group Facultad de Informática Universidad Politécnica de

More information

Multilingual and Localization Support for Ontologies

Multilingual and Localization Support for Ontologies Multilingual and Localization Support for Ontologies Mauricio Espinoza, Asunción Gómez-Pérez and Elena Montiel-Ponsoda UPM, Laboratorio de Inteligencia Artificial, 28660 Boadilla del Monte, Spain {jespinoza,

More information

Comprendium Translator System Overview

Comprendium Translator System Overview Comprendium System Overview May 2004 Table of Contents 1. INTRODUCTION...3 2. WHAT IS MACHINE TRANSLATION?...3 3. THE COMPRENDIUM MACHINE TRANSLATION TECHNOLOGY...4 3.1 THE BEST MT TECHNOLOGY IN THE MARKET...4

More information

Jornada de Seguimiento de Proyectos, 2004 Programa Nacional de Tecnologías Informáticas

Jornada de Seguimiento de Proyectos, 2004 Programa Nacional de Tecnologías Informáticas Jornada de Seguimiento de Proyectos, 2004 Programa Nacional de Tecnologías Informáticas Automatic processing of textual information in Spanish (Tratamiento automático de la información textual en español:

More information

Visualizing WordNet Structure

Visualizing WordNet Structure Visualizing WordNet Structure Jaap Kamps Abstract Representations in WordNet are not on the level of individual words or word forms, but on the level of word meanings (lexemes). A word meaning, in turn,

More information

Building the MetaNet metaphor repository: The natural symbiosis of metaphor analysis and construction grammar

Building the MetaNet metaphor repository: The natural symbiosis of metaphor analysis and construction grammar Building the MetaNet metaphor repository: The natural symbiosis of metaphor analysis and construction grammar Oana David, oanadavid@berkeley.edu Elise Stickles, elstickles@berkeley.edu Ellen Dodge, edodge@icsi.berkeley.edu

More information

The PALAVRAS parser and its Linguateca applications - a mutually productive relationship

The PALAVRAS parser and its Linguateca applications - a mutually productive relationship The PALAVRAS parser and its Linguateca applications - a mutually productive relationship Eckhard Bick University of Southern Denmark eckhard.bick@mail.dk Outline Flow chart Linguateca Palavras History

More information

I. INTRODUCTION NOESIS ONTOLOGIES SEMANTICS AND ANNOTATION

I. INTRODUCTION NOESIS ONTOLOGIES SEMANTICS AND ANNOTATION Noesis: A Semantic Search Engine and Resource Aggregator for Atmospheric Science Sunil Movva, Rahul Ramachandran, Xiang Li, Phani Cherukuri, Sara Graves Information Technology and Systems Center University

More information

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR

NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati

More information

Assessing speaking in the revised FCE Nick Saville and Peter Hargreaves

Assessing speaking in the revised FCE Nick Saville and Peter Hargreaves Assessing speaking in the revised FCE Nick Saville and Peter Hargreaves This paper describes the Speaking Test which forms part of the revised First Certificate of English (FCE) examination produced by

More information

Parole sintagmatiche in italiano

Parole sintagmatiche in italiano UNIVERSITÀ DEGLI STUDI ROMA TRE FACOLTÀ DI LETTERE E FILOSOFIA DIPARTIMENTO DI LINGUISTICA DOTTORATO DI RICERCA IN LINGUISTICA SINCRONICA, DIACRONICA E APPLICATA XIX CICLO A.A. 2005/2006 Parole sintagmatiche

More information

Computer Assisted Language Learning (CALL): Room for CompLing? Scott, Stella, Stacia

Computer Assisted Language Learning (CALL): Room for CompLing? Scott, Stella, Stacia Computer Assisted Language Learning (CALL): Room for CompLing? Scott, Stella, Stacia Outline I What is CALL? (scott) II Popular language learning sites (stella) Livemocha.com (stacia) III IV Specific sites

More information

The Role of Sentence Structure in Recognizing Textual Entailment

The Role of Sentence Structure in Recognizing Textual Entailment Blake,C. (In Press) The Role of Sentence Structure in Recognizing Textual Entailment. ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, Prague, Czech Republic. The Role of Sentence Structure

More information

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that

More information

Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata

Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive

More information

Application of Natural Language Interface to a Machine Translation Problem

Application of Natural Language Interface to a Machine Translation Problem Application of Natural Language Interface to a Machine Translation Problem Heidi M. Johnson Yukiko Sekine John S. White Martin Marietta Corporation Gil C. Kim Korean Advanced Institute of Science and Technology

More information

Varieties of lexical variation

Varieties of lexical variation Dirk Geeraerts University of Leuven Varieties of lexical Abstract This paper presents the theoretical backgr ound of a large-scale lexicological research project on lexical that was carried out at the

More information

Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery

Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery Jan Paralic, Peter Smatana Technical University of Kosice, Slovakia Center for

More information

Register Differences between Prefabs in Native and EFL English

Register Differences between Prefabs in Native and EFL English Register Differences between Prefabs in Native and EFL English MARIA WIKTORSSON 1 Introduction In the later stages of EFL (English as a Foreign Language) learning, and foreign language learning in general,

More information

INF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning jtl@ifi.uio.no

INF5820 Natural Language Processing - NLP. H2009 Jan Tore Lønning jtl@ifi.uio.no INF5820 Natural Language Processing - NLP H2009 Jan Tore Lønning jtl@ifi.uio.no Semantic Role Labeling INF5830 Lecture 13 Nov 4, 2009 Today Some words about semantics Thematic/semantic roles PropBank &

More information

2 P age. www.deafeducation.vic.edu.au

2 P age. www.deafeducation.vic.edu.au Building Connections Between the Signed and Written Language of Signing Deaf Children Michelle Baker & Michelle Stark In research relating to writing and deaf students there is a larger body of work that

More information

Empirical Machine Translation and its Evaluation

Empirical Machine Translation and its Evaluation Empirical Machine Translation and its Evaluation EAMT Best Thesis Award 2008 Jesús Giménez (Advisor, Lluís Màrquez) Universitat Politècnica de Catalunya May 28, 2010 Empirical Machine Translation Empirical

More information

An Overview of Applied Linguistics
