LEXUS: a web-based lexicon tool. Jacquelijn Ringersma, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Content: Max Planck Institute; archive of linguistic resources; tool support (archiving software and enrichment software); LEXUS and ViCoS; interdisciplinary software development: challenges and problems
Max Planck Institute for Psycholinguistics. Max Planck Gesellschaft: 78 research institutes in Germany and 3 outside Germany: 2 in Italy (art) and 1 in The Netherlands (psycholinguistics). The study of the mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture
Max Planck Institute for Psycholinguistics: Archive for linguistic resources. Different types of linguistic material: Endangered languages archive, the European second learner corpus, the National Corpus of Spoken Dutch, gesture corpora, acquisition corpora and language documentation corpora. More than 230,000 objects, 25 TB of data: digitized audio and video, images, annotations. Included formats, among others: XML, HTML, Chat, Toolbox, PDF, Wav, Mpeg1,2,4. Organization: metadata descriptions, database. Access via the Internet: metadata search and content search; access to these resources is limited and can be made available upon request
Documentation of endangered languages: DoBeS = Dokumentation Bedrohter Sprachen. DoBeS has two major pillars: (1) language documentation by experienced teams, to preserve part of the cultural heritage and to help in revitalization where possible; (2) creating an organized, accessible and persistent archive
Archive Content: Yélî Dnye (Rossel Island): multimedia lexicon; described corpus; typed relations within the lexicon; photos; annotated media
Tool Support. Archiving: IMDI, LAMUS, AMS. Data enrichment: ELAN, Synpathy, ADDIT, ANNEX, LEXUS. More language archiving tools: www.lat-mpi.eu
From documentation to exploitation. So: now what? Languages have been documented; video and audio are stored in the archive; (part of) the material is annotated; regional archives have been installed at some 10 locations to return the material to the speech communities. So, now: exploitation. Language is more than video, audio, annotations and lexica: language represents worlds of concepts
LEXUS - Lexicon tool. LEXUS: web-based lexicon tool, based on the ISO recommendations for linguistic resources. LMF: Lexical Markup Framework (lexicon structure). DCR: Data Category Registries (concept naming). LMF/DCR: a modular structure for content interoperability between all aspects of lexical resources. ViCoS in LEXUS: accessible conceptual spaces
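The division of labour between LMF (structure) and DCR (naming) can be illustrated with a small sketch. The Python below is not LEXUS code and not the ISO schema: the class names only loosely follow LMF's core terms (lexicon, lexical entry, sense), and the data category names "partOfSpeech" and "gloss", as well as the placeholder lemma, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Feature:
    """An attribute-value pair whose attribute names a data category."""
    category: str  # hypothetical data category name, e.g. "partOfSpeech"
    value: str


@dataclass
class Sense:
    features: List[Feature] = field(default_factory=list)


@dataclass
class LexicalEntry:
    lemma: str
    features: List[Feature] = field(default_factory=list)
    senses: List[Sense] = field(default_factory=list)


@dataclass
class Lexicon:
    name: str
    entries: List[LexicalEntry] = field(default_factory=list)


def categories_used(lexicon: Lexicon) -> Set[str]:
    """Collect every data category name that occurs anywhere in the lexicon."""
    names: Set[str] = set()
    for entry in lexicon.entries:
        names.update(f.category for f in entry.features)
        for sense in entry.senses:
            names.update(f.category for f in sense.features)
    return names


# Placeholder entry: lemma, part of speech and gloss are invented values.
entry = LexicalEntry(
    lemma="lemma-1",
    features=[Feature("partOfSpeech", "noun")],
    senses=[Sense([Feature("gloss", "gloss of lemma-1")])],
)
demo = Lexicon(name="demo", entries=[entry])
print(sorted(categories_used(demo)))  # ['gloss', 'partOfSpeech']
```

Because every value is tagged with a category name rather than a lexicon-specific field name, two lexica that draw their names from the same registry can be compared or merged field by field, which is the interoperability argument made on the slide.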
LEXUS - Lexicon tool. Creation of lexica from scratch, import of lexica from other formats; user-defined view of the information in the lexical entries; linking multi-media fragments to lexical entries; creation of links in images
LEXUS - Lexicon tool Link to: kauo e mei terminal bud (female)
LEXUS - Lexicon tool. Creation of lexica from scratch, import of lexica from other formats; user-defined view of the information in the lexical entries; linking multi-media fragments to lexical entries; creation of links in images; links to resources within the digital archive (or other external web-based resources); interaction with other archiving tools
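Linking media fragments and image regions to lexical entries can be sketched as follows. This is an illustrative data model under stated assumptions, not the LEXUS implementation: the class names, the rectangular region shape, the example URL and the placeholder lemma are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MediaFragment:
    """A time slice of an audio or video resource attached to an entry."""
    url: str        # hypothetical resource location
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


@dataclass(frozen=True)
class ImageRegion:
    """A rectangular area of an image that links to a lexical entry."""
    x: int
    y: int
    width: int
    height: int
    target_lemma: str

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def entry_at(regions: List[ImageRegion], px: int, py: int) -> Optional[str]:
    """Return the lemma linked at pixel (px, py), or None if nothing is linked."""
    for region in regions:
        if region.contains(px, py):
            return region.target_lemma
    return None


# Placeholder data: one clickable region and one two-second video fragment.
regions = [ImageRegion(x=10, y=10, width=50, height=40, target_lemma="lemma-1")]
clip = MediaFragment("https://example.org/archive/clip.mpg", 1500, 3500)

print(entry_at(regions, 20, 20))    # lemma-1
print(entry_at(regions, 200, 200))  # None
print(clip.duration_ms)             # 2000
```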
LEXUS - further developments. Towards a multi-media dictionary of the Marquesan and Tuamotuan languages of French Polynesia: building a digital multi-media encyclopedic dictionary with LEXUS; improving basic LEXUS functionalities; conceptual spaces; improved user interface. Project team: linguist team (Cablitz, Mosel); developers (Kemps, Zinn, Alcock); speech community (Kape, Guillome, Tetahiotupa, Tahia, Mataiki, Bruneau Pati)
LEXUS - further developments. Aim: speech community input and extensions; community-based instance of the lexicon
LEXUS - further developments. Project workflow. Joint action of linguists and speech community: field work; lexicon creation; data archiving and annotation. Definition of software requirements and creation of the multi-media lexicon: all. Developers: LEXUS basic functionalities; lexicon import; further developments of LEXUS
LEXUS - further developments. Issues that came up: user interface; conceptual spaces in the multi-media encyclopedia; collaborative workspaces
LEXUS - further developments: User Interface. The user wants to enter the lexicon through the lexical entries, either from the listed lexicon or by search
LEXUS - further developments: Conceptual spaces in the multi-media encyclopedia. Conventional paper dictionaries: the network of meanings is less visible. Paper dictionaries have limited usefulness in language maintenance and language revival (Manning et al., 2000). Members of the speech community prefer following semantic links of different semantic types (synonyms, antonyms, lexical, taxonomies)
ViCoS: Visualizing Conceptual Spaces. Complement lexical spaces with ontological spaces. Allow users to construct a space of culturally relevant concepts. Concepts as centres for all sorts of information: relations to other concepts; anchored in the language to express them; linked to the multimedia archive to describe them
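A conceptual space of this kind can be thought of as a graph whose edges carry a relation type. The sketch below is a minimal illustration under that assumption, not the ViCoS implementation; the class and method names, the relation types "part_of" and "is_a", and the English concept labels are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


class ConceptSpace:
    """A directed graph of concepts connected by typed relations."""

    def __init__(self) -> None:
        self._relations: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def relate(self, source: str, relation_type: str, target: str) -> None:
        """Record that `source` stands in `relation_type` to `target`."""
        self._relations[source].append((relation_type, target))

    def neighbours(self, concept: str,
                   relation_type: Optional[str] = None) -> List[str]:
        """Concepts directly linked from `concept`, optionally by one type."""
        return [target
                for rel, target in self._relations.get(concept, [])
                if relation_type is None or rel == relation_type]

    def reachable(self, start: str, relation_type: str) -> Set[str]:
        """All concepts reached by following one relation type repeatedly."""
        seen: Set[str] = set()
        stack = [start]
        while stack:
            current = stack.pop()
            for nxt in self.neighbours(current, relation_type):
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


# Invented example concepts and relations.
space = ConceptSpace()
space.relate("terminal bud", "part_of", "branch")
space.relate("branch", "part_of", "tree")
space.relate("tree", "is_a", "plant")

print(space.neighbours("terminal bud"))                  # ['branch']
print(sorted(space.reachable("terminal bud", "part_of")))  # ['branch', 'tree']
```

Keeping the relation type on each edge is what lets a user follow only one kind of link at a time, for example only part-whole links, instead of seeing every connection at once.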
ViCoS Show ViCoS demo
Interdisciplinary software development: challenges and problems. Our challenge: design a product that fits the needs of the speech community (SC) and thus contributes to maintaining and possibly revitalizing a documented language, and consequently to presenting and preserving the cultural heritage. More practical: a simple user interface for a complex tool: is it possible? Collaborative workspaces to work in a Wiki-like manner
Interdisciplinary software development: challenges and problems. So, what do we encounter? An interesting project and collaboration, but NOT easy: need to bridge the concept gap; communication over distances; different expectations and different (sub-)goals; software limitations of an online tool; intellectual property rights (IPR) between developer team and linguist team; IPR between speech community and linguist team
Interdisciplinary software development: challenges and problems. Is there a positive conclusion? Interaction opens worlds. First reactions to the concept UI and ViCoS from the SC are positive. First experience of SC and LS is useful for the development of ViCoS. More DoBeS projects are interested in using LEXUS as an exploitation tool. We invite documentation teams to discuss their options in using LEXUS and ViCoS. Acknowledgements: thanks to Gaby Cablitz, Jean Kape and Guillome Taimana for their contributions