DanNet From Dictionary to Wordnet

1 DanNet From Dictionary to Wordnet Jörg Asmussen Society for Danish Language and Literature, DSL, Copenhagen Bolette Sandford Pedersen Centre for Language Technology, CST, University of Copenhagen Lars Trap-Jensen Society for Danish Language and Literature, DSL, Copenhagen

2 Outline 1. Introduction LTJ, 2 min. 2. Characteristics of the DDO LTJ, 5 min. 3. Building DanNet BSP, 8 min. 4. Extraction of differentia info JA, 7 min. 5. Conclusions JA, 2 min

3 DanNet Lexical-semantic wordnet for Danish Joint project Society for Danish Language and Literature Centre for Language Technology, University of Copenhagen 4 years ( ), ~ 400,000

4 Limited resources Adapt an existing wordnet? Or reuse other lexical-semantic resources: SIMPLE-DK, Den Danske Ordbog (DDO)

5 Outline 1. Introduction 2. Characteristics of the DDO 3. Building DanNet 4. Extraction of differentia info from definitions 5. Conclusions

6 Den Danske Ordbog Published by DSL Corpus-based, DDOC 60,000 entries Spelling, morphology, pronunciation, meaning, collocations, fixed phrases, syntax, usage, word formation, etymology

7 Den Danske Ordbog Words edited in related groups Machine readable Fine-grained microstructure 100,000 definitions

8 Semantic description

9 Semantic description Systematic domain info concerns relation

10 Semantic description Sense definition relevant info manually extracted

11 Semantic description Hyperonym

12 Semantic description Sense relations, i.e. synonyms

13 Semantic description Collocational information

14 Semantic description Authentic example

15 Semantic description

16 Definitions in the DDO Definition scheme: Genus proximum (closest hyperonym), e.g. apparat 'technical device'. Differentia specifica (distinctive feature): the remaining part of the definition

17 Outline 1. Introduction 2. Characteristics of the DDO 3. Building DanNet 4. Extraction of differentia info from definitions 5. Conclusions

18 Building DanNet Extract definitions and genus specifications Include them in the DanNet tool Use it for domain-wise development of data: 1. Homonymy and polysemy 2. Establishing synsets 3. Adjusting the hierarchical structure

19 Homonymy & polysemy celle 'cell' is genus proximum of gærcelle 'yeast cell' and fængselscelle 'prison cell'. Convert lexical expressions into concepts: celle-1 'part of living organism', celle-2 'small room'

20 Establishing synsets lære studies fag subject videnskab science informatik informatics bromatologi nutrition science samfundsfag social studies datalogi computer science

21 Establishing synsets One synset lære studies fag subject videnskab science informatik informatics bromatologi nutrition science samfundsfag social studies datalogi computer science

22 Building the hierarchy Hyponymy is generally defined as 'X is a Y'. Taxonymy is a subtype of this: 'X is a kind/type of Y'. Cf. Cruse, 1991 and 2002

23 Example: Hyponymy? træ tree kirsebærtræ cherry tree birketræ birch vejtræ roadside tree

24 Example: Hyponymy? træ tree vejtræ roadside tree kirsebærtræ cherry tree birketræ birch Orthogonal Hyponymy

26 Building the hierarchy TOP genstand object møbel furniture siddemøbel sitting furniture stol chair

27 Building the hierarchy TOP genstand object møbel furniture indbo/bohave household effects siddemøbel sitting furniture stol chair

29 Definition composition Genus selection: a conscious process. Differentia: no editorial specifications, i.e. no fixed definition vocabulary or syntax. Consequences for DanNet: complicates computational exploitation; semantic relations are coded manually

30 Coding relations What is done manually: No semantic info other than that of DDO Reduction of semantic info What is done automatically: Inheritance of relations from hyperonyms
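The automatic step named on this slide, inheritance of relations from hyperonyms, can be sketched roughly as follows. This is a minimal illustration, not the DanNet tool itself: the hyperonym chain follows the furniture example from the hierarchy slides, while the relation values and the `located_in` relation name are invented for the example.

```python
# Hyperonym links taken from the slides' furniture example (child -> parent).
HYPERONYM = {
    "stol": "siddemøbel",
    "siddemøbel": "møbel",
    "møbel": "genstand",
}

# Manually coded relations per concept. All values here are hypothetical
# placeholders; "located_in" is an assumed relation name.
OWN_RELATIONS = {
    "møbel": {"for_purpose_of": "furnish", "located_in": "home"},
    "siddemøbel": {"for_purpose_of": "sit"},
}


def inherited_relations(concept):
    """Merge relations along the hyperonym chain; nearer concepts override."""
    chain = []
    seen = set()
    node = concept
    # Walk upwards to the top concept, guarding against accidental cycles.
    while node is not None and node not in seen:
        seen.add(node)
        chain.append(node)
        node = HYPERONYM.get(node)
    relations = {}
    # Apply from the most general concept down to the most specific one.
    for node in reversed(chain):
        relations.update(OWN_RELATIONS.get(node, {}))
    return relations
```

With this toy data, stol receives `for_purpose_of` from siddemøbel and `located_in` from møbel without either being coded on stol itself, so the editor only records what differs from the hyperonym.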

31 Outline 1. Introduction 2. Characteristics of the DDO 3. Building DanNet 4. Extraction of differentia info from definitions 5. Conclusions

32 Extraction of telic role fjernsyn tv set box-shaped device that can receive tv signals and transform them into animated pictures on a screen and accompanying sound in the speakers of the device

33 Extraction of telic role fjernsyn tv set genus expression box-shaped device that can receive tv signals and transform them into animated pictures on a screen and accompanying sound in the speakers of the device

34 Extraction of telic role fjernsyn tv set genus expression box-shaped device that can receive tv signals and transform them into animated pictures on a screen and accompanying sound in the speakers of the device Telic role: VPs headed by can

36 Hypothesis

37 Hypothesis VPs in a relative clause which are headed by kan can specify the telic role (i.e. the for_purpose_of relation) of the definiendum

38 Hypothesis VPs in a relative clause which are headed by kan can specify the telic role (i.e. the for_purpose_of relation) of the definiendum Corpus query Find all definitions with genus apparat followed by der or som followed by kan followed by a word ending in e
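The corpus query on this slide can be approximated with a regular expression. The sketch below is an assumption about how such a query might look, not the query language actually used on the DDO data; the function name `telic_head` and the sample definitions in the usage note are made up for illustration and are not quoted from the dictionary.

```python
import re

# Genus "apparat", then "der" or "som", then "kan", then a word ending in "e"
# (the typical ending of a Danish infinitive). Strict adjacency is assumed.
TELIC_PATTERN = re.compile(r"\bapparat\s+(?:der|som)\s+kan\s+(\w+e)\b")


def telic_head(definition):
    """Return the candidate VP head denoting the telic role, or None."""
    match = TELIC_PATTERN.search(definition)
    return match.group(1) if match else None
```

For an invented definition such as "apparat der kan måle hastighed" the function returns måle. A definition with interposed material, for instance "apparat der bl.a. kan optage lyd", returns None because the pattern demands that the words follow each other directly.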

39 Results of corpus query

40 Results of corpus query [screenshot labels: query, VP heads denoting telic role, dictionary entries]

41 Results of corpus query Only 26 occurrences of this pattern, but 203 apparat definitions [screenshot labels: query, VP heads denoting telic role, dictionary entries]

42 Why this bad coverage?

43 Why this bad coverage? 1. Definitions where the pattern contains interposed material are not captured

44 Why this bad coverage? 1. Definitions where the pattern contains interposed material are not captured 2. Structural patterns other than the one given in our hypothesis also indicate a for_purpose_of relation

45 Further patterns 1. GE that can VP-inf 2. GE that is used for to VP-inf with 3. GE for to VP-inf with/on/in 4. GE that VP-fin 5. GE for NP 6. GE that is specially designed for to VP-inf

46 Further patterns head for_purpose_of 1. GE that can VP-inf 2. GE that is used for to VP-inf with 3. GE for to VP-inf with/on/in 4. GE that VP-fin 5. GE for NP 6. GE that is specially designed for to VP-inf

47 Further patterns head for_purpose_of 1. GE that can VP-inf 2. GE that is used for to VP-inf with 3. GE for to VP-inf with/on/in 4. GE that VP-fin 5. GE for NP 6. GE that is specially designed for to VP-inf These patterns capture 70% of the apparat definitions

48 A statistical approach

49 A statistical approach Frequency list of types in definitions with genus apparat

50 A statistical approach Frequency list of types in definitions with genus apparat compared with

51 A statistical approach Frequency list of types in definitions with genus apparat compared with frequency list of types in all definitions

52 A statistical approach Frequency list of types in definitions with genus apparat compared with frequency list of types in all definitions using a statistical test (e.g. log likelihood)

53 A statistical approach Frequency list of types in definitions with genus apparat compared with frequency list of types in all definitions using a statistical test (e.g. log likelihood) Salient types are listed for investigation and may give hints on semantic relations
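The comparison of the two frequency lists can be sketched with the log-likelihood statistic as it is commonly used for keyword extraction. The slides only name the test, so the formula choice, the function names and the decision to keep only over-represented types are assumptions. Treating the two lists as separate samples is also a simplification, since the apparat definitions are themselves part of "all definitions"; a caller may want to exclude them from the reference list first.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood for a type occurring a times in a target sample of
    c tokens and b times in a reference sample of d tokens."""
    total = c + d
    expected_target = c * (a + b) / total
    expected_reference = d * (a + b) / total
    score = 0.0
    # Zero counts contribute nothing to the sum.
    if a > 0:
        score += a * math.log(a / expected_target)
    if b > 0:
        score += b * math.log(b / expected_reference)
    return 2 * score


def salient_types(target_tokens, reference_tokens, top=10):
    """List the types most over-represented in the target definitions."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c = sum(target.values())
    d = sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        # Keep only types relatively more frequent in the target sample.
        if a / c > b / d:
            scored.append((log_likelihood(a, b, c, d), word))
    scored.sort(reverse=True)
    return [word for _, word in scored[:top]]
```

Here `target_tokens` would hold the tokens of all definitions with genus apparat and `reference_tokens` the tokens of the comparison set. The ranked output is a list for the editor to inspect, not a set of extracted relations.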

54 Some salient types afspille 'to play back', afspilning 'playback', måle 'measure', måling 'gauging', måler 'measuring tool', målinger 'measurements'

55 Some salient types afspille 'to play back', afspilning 'playback', måle 'measure', måling 'gauging', måler 'measuring tool', målinger 'measurements'. grammofon, cd-afspiller, afspiller, sequencer, diktafon, kassettespiller, hjemmevideo, kassettebåndoptager, båndoptager, stroboskop, måler, timer, løgnedetektor, ekkolod, gasmåler, speedometer, omdrejningstæller, benzinmåler, fotofælde, elmåler, trykmåler, luxmeter, spirometer, gyrometer, alkometer, newtonmeter, magnetometer, instrument, måleinstrument, kalorimeter, radiosonde, satellit, fartskriver

56 Automatic extraction?

57 Automatic extraction? Basically NO... Developing reliable methods is too expensive!

58 Automatic extraction? Structural and lexical properties of definitions differ considerably

59 Automatic extraction? Structural and lexical properties of definitions differ considerably Difficult to automatically extract semantic relations from definitions

60 Automatic extraction? Structural and lexical properties of definitions differ considerably Difficult to automatically extract semantic relations from definitions Concordances and lists of salient definition types may help the editor

61 Automatic extraction? Structural and lexical properties of definitions differ considerably Difficult to automatically extract semantic relations from definitions Concordances and lists of salient definition types may help the editor But the DanNet editor still has to do the core job of analysing dictionary definitions

62 Outline 1. Introduction 2. Characteristics of the DDO 3. Building DanNet 4. Extraction of differentia info from definitions 5. Conclusions

63 Conclusion Reusing the DDO

64 Conclusion Reusing the DDO Cheap Expensive

65 Conclusion Reusing the DDO Cheap: semi-automatic exploitation of the dictionary structure (hyponymy structure, synonym/antonym info) Expensive

66 Conclusion Reusing the DDO Cheap: semi-automatic exploitation of the dictionary structure (hyponymy structure, synonym/antonym info) Expensive: automatic exploitation of definitions proper to find other semantic relations

68 Conclusion The DanNet approach

69 Conclusion The DanNet approach Cheap Expensive

70 Conclusion The DanNet approach Cheap Expensive Translation/expansion of existing WNs? Better coherence with other WNs. Linguistic bias.

71 Conclusion The DanNet approach Cheap Expensive Translation/expansion of existing WNs? Better coherence with other WNs. Linguistic bias. Reusing/merging language resources? More loyal to the specific language. Expensive, unless based on an existing resource, i.e. a dictionary


More information
