LEXUS: a web based lexicon tool
1 LEXUS: a web based lexicon tool Jacquelijn Ringersma Max Planck Institute for Psycholinguistics Nijmegen, The Netherlands
2 Content Max Planck Institute Archive of linguistic resources Tool support (archiving software and enrichment software) LEXUS and ViCoS Interdisciplinary software development challenges and problems
3 Max Planck Institute for Psycholinguistics. Max Planck Gesellschaft: 78 research institutes (Germany); 3 outside Germany: 2 in Italy (art), 1 in The Netherlands (psycholinguistics). The study of mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture.
4 Max Planck Institute for Psycholinguistics: Archive for linguistic resources. Different types of linguistic material: Endangered languages archive, the European second learner corpus, the National Corpus of Spoken Dutch, gesture corpora, acquisition corpora and language documentation corpora. More than objects, 25 TB data: digitized audio and video, images, annotations. Included formats: among others XML, HTML, Chat, Toolbox, PDF, Wav, Mpeg1,2,4. Organization: metadata descriptions, database. Access via the Internet: metadata search & content search; access to these resources is limited and can be made available upon request.
5 Documentation of endangered languages. DoBeS = Dokumentation Bedrohter Sprachen. DoBeS has two major pillars: language documentation by experienced teams, to preserve part of cultural heritage and to help in revitalization where possible; and creating an organized, accessible and persistent archive.
6 Archive Content: Yélî Dnye (Rossel Island). Multimedia Lexicon; Described Corpus; Typed Relations within the Lexicon; Photos; Annotated Media.
7 Tool Support Archiving: IMDI, LAMUS, AMS Data enrichment: ELAN, Synpathy, ADDIT, ANNEX, LEXUS More Language archiving tools:
8 From documentation to exploitation. So: now what? Languages have been documented. Video and audio are stored in the archive. (Part of) the material is annotated. Regional archives have been installed at some 10 locations to return the material to the speech communities. So, now: exploitation. Language is more than video, audio, annotations and lexica. Language represents worlds of concepts.
9 LEXUS - Lexicon tool. LEXUS: Web-based lexicon tool. Based on the ISO recommendations for linguistic resources: LMF: Lexical Markup Framework (lexicon structure); DCR: Data Category Registries (concept naming). LMF/DCR: a modular structure for content interoperability between all aspects of lexical resources. ViCoS in LEXUS: accessible conceptual spaces.
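The lexicon structure that LMF describes can be pictured as nested containers: a lexicon holds lexical entries, and each entry holds senses that carry named data categories. The following is only a minimal sketch under that assumption. The class names, field names and the placeholder entry are hypothetical, chosen for illustration, and do not reflect the actual LEXUS data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    # Data categories as name -> value pairs. In an LMF/DCR setting the
    # names would be taken from a data category registry (assumption).
    data_categories: dict[str, str] = field(default_factory=dict)


@dataclass
class LexicalEntry:
    lemma: str
    part_of_speech: str
    senses: list[Sense] = field(default_factory=list)


@dataclass
class Lexicon:
    language: str
    entries: list[LexicalEntry] = field(default_factory=list)

    def find(self, lemma: str) -> list[LexicalEntry]:
        """Return all entries whose lemma matches exactly."""
        return [entry for entry in self.entries if entry.lemma == lemma]


# Placeholder content, not taken from any real lexicon.
lexicon = Lexicon(language="placeholder-language")
lexicon.entries.append(
    LexicalEntry("placeholder-word", "noun", [Sense({"gloss": "example gloss"})])
)
```

Keeping data categories as plain name/value pairs is one way to let each lexicon define its own fields while still sharing category names across lexica.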
10 LEXUS - Lexicon tool Creation of lexica from scratch, import lexica from other formats
11 LEXUS - Lexicon tool Creation of lexica from scratch, import lexica from other formats User defined view of the information in the lexical entries
12 LEXUS - Lexicon tool Creation of lexica from scratch, import lexica from other formats User defined view of the information in the lexical entries Linking multi-media fragments to lexical entries
13 LEXUS - Lexicon tool Creation of lexica from scratch, import lexica from other formats User defined view of the information in the lexical entries Linking multi-media fragments to lexical entries Creation of links in images
14 LEXUS - Lexicon tool Link to: kauo e mei terminal bud (female)
15 LEXUS - Lexicon tool. Creation of lexica from scratch, import lexica from other formats. User defined view of the information in the lexical entries. Linking multi-media fragments to lexical entries. Creation of links in images. Link to resources within the digital archive (or other external web-based resources). Interaction with other archiving tools.
16 LEXUS - further developments. Towards a multi-media dictionary of the Marquesan and Tuamotuan languages of French Polynesia. Building a digital multi-media encyclopedic dictionary with LEXUS. Improving basic LEXUS functionalities: conceptual spaces, improved user interface. Project team: Linguist team (Cablitz, Mosel); Developers (Kemps, Zinn, Alcock); Speech community (Kape, Guillome, Tetahiotupa, Tahia, Mataiki, Bruneau Pati).
17 LEXUS - further developments. Towards a multi-media dictionary of the Marquesan and Tuamotuan languages of French Polynesia. Building a digital multi-media encyclopedic dictionary with LEXUS. Improving basic LEXUS functionalities: conceptual spaces, improved user interface. Aim: speech community input and extensions; community based instance of the lexicon.
18 LEXUS - further developments. Project workflow. Roles: joint action of linguist and speech community; all; developers. Steps: field work; lexicon creation; data archiving and annotation; definition of SW requirements; creation of MM lexicon; LEXUS basic functionalities; lexicon import; further developments of LEXUS.
19 LEXUS - further developments. Issues that came up: user interface; conceptual spaces in multi-media encyclopedia; collaborative workspaces.
20 LEXUS - further developments. User Interface: the user wants to enter the lexicon through the lexical entries, either from the listed lexicon or by search.
21 LEXUS - further developments. Conceptual spaces in multi-media encyclopedia. Conventional paper dictionaries: network of meanings less visible. Paper dictionaries have limited usefulness in language maintenance and language revival (Manning et al., 2000). Members of the speech community prefer following semantic links of different semantic types (synonyms, antonyms, lexical, taxonomies).
22 LEXUS - further developments Conceptual spaces in multi media encyclopedia
23 ViCoS: Visualizing Conceptual Spaces. Complement lexical spaces with ontological spaces. Allow users to construct a space of culturally relevant concepts. Concepts as centres for all sorts of information: relations to other concepts; anchored in the language to express them; linked to the multimedia archive to describe them.
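A conceptual space of this kind can be thought of as a graph in which concepts are nodes and each link carries a relation type. The sketch below illustrates that idea only. The class, its methods, the relation names and the sample concepts are all hypothetical and are not the ViCoS implementation.

```python
from collections import defaultdict


class ConceptSpace:
    """A small directed graph of concepts joined by typed relations."""

    def __init__(self):
        # concept -> list of (relation_type, target_concept)
        self._relations = defaultdict(list)

    def relate(self, source, relation_type, target):
        """Record a typed relation from source to target."""
        self._relations[source].append((relation_type, target))

    def neighbours(self, concept, relation_type=None):
        """Return linked concepts, optionally restricted to one relation type."""
        return [
            target
            for rel, target in self._relations.get(concept, [])
            if relation_type is None or rel == relation_type
        ]


# Illustrative concepts and relation names only.
space = ConceptSpace()
space.relate("tree", "is_a", "plant")
space.relate("tree", "has_part", "leaf")
```

Filtering by relation type is what lets a user follow, say, only taxonomic links or only part-whole links from a given concept.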
24 ViCoS
25 ViCoS Show ViCoS demo
26 Interdisciplinary software development challenges and problems. Our challenge: design a product that fits the needs of the SC and thus contributes to maintaining and possibly revitalizing a documented language, and consequently to presenting and preserving the cultural heritage. More practical: a simple user interface for a complex tool: is it possible? Collaborative workspaces to work in a Wiki-like manner.
27 Interdisciplinary software development challenges and problems. So, what do we encounter? An interesting project and collaboration, but NOT easy: need to bridge the concept gap; communication over distances; different expectations, different (sub)-goals; software limitations of an online tool; IPR between developer team and linguist team; IPR between speech community and linguist team.
28 Interdisciplinary software development challenges and problems. Is there a positive conclusion? Interaction opens worlds. First reactions on concept UI and ViCoS from SC are positive. First experience of SC and LS is useful for the development of ViCoS. More DoBeS projects are interested in using LEXUS as an exploitation tool. We invite documentation teams to discuss their options in using LEXUS and ViCoS. Acknowledgements: Thanks to Gaby Cablitz, Jean Kape, Guillome Taimana for their contributions.
