Dictionary Building with the Jibiki Platform: Software Demonstration

Similar documents
Mobility management and vertical handover decision making in heterogeneous wireless networks

ibalance-abf: a Smartphone-Based Audio-Biofeedback Balance System

A graph based framework for the definition of tools dealing with sparse and irregular distributed data-structures

A usage coverage based approach for assessing product family design

Papillon Lexical Database Project: Monolingual Dictionaries and Interlingual Links

Multilingual Aligned Corpora From Movie Subtitles

Faut-il des cyberarchivistes, et quel doit être leur profil professionnel?

Online Generic Editing of Heterogeneous Dictionary Entries in Papillon Project

QASM: a Q&A Social Media System Based on Social Semantics

Study on Cloud Service Mode of Agricultural Information Institutions

ANIMATED PHASE PORTRAITS OF NONLINEAR AND CHAOTIC DYNAMICAL SYSTEMS

Expanding Renewable Energy by Implementing Demand Response

Managing Risks at Runtime in VoIP Networks and Services

Overview of model-building strategies in population PK/PD analyses: literature survey.

Discussion on the paper Hypotheses testing by convex optimization by A. Goldenschluger, A. Juditsky and A. Nemirovski.

Online vehicle routing and scheduling with continuous vehicle tracking

Towards Unified Tag Data Translation for the Internet of Things

CLIR-Based Collaborative Construction of a Multilingual Terminological Dictionary for Cultural Resources

FP-Hadoop: Efficient Execution of Parallel Jobs Over Skewed Data

Use of tabletop exercise in industrial training disaster.

An update on acoustics designs for HVAC (Engineering)

Testing Web Services for Robustness: A Tool Demo

Global Identity Management of Virtual Machines Based on Remote Secure Elements

VR4D: An Immersive and Collaborative Experience to Improve the Interior Design Process

Papillon Lexical Database Project

Minkowski Sum of Polytopes Defined by Their Vertices

A model driven approach for bridging ILOG Rule Language and RIF

Aligning subjective tests using a low cost common set

ANALYSIS OF SNOEK-KOSTER (H) RELAXATION IN IRON

Additional mechanisms for rewriting on-the-fly SPARQL queries proxy

Towards Collaborative Learning via Shared Artefacts over the Grid

Information Technology Education in the Sri Lankan School System: Challenges and Perspectives

Model2Roo : Web Application Development based on the Eclipse Modeling Framework and Spring Roo

IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs

What Development for Bioenergy in Asia: A Long-term Analysis of the Effects of Policy Instruments using TIAM-FR model

Digital libraries: Comparison of 10 software

Novel Client Booking System in KLCC Twin Tower Bridge

E-commerce and Network Marketing Strategy

An integrated planning-simulation-architecture approach for logistics sharing management: A case study in Northern Thailand and Southern China

GDS Resource Record: Generalization of the Delegation Signer Model

Cracks detection by a moving photothermal probe

SELECTIVELY ABSORBING COATINGS

Business intelligence systems and user s parameters: an application to a documents database

ControVol: A Framework for Controlled Schema Evolution in NoSQL Application Development

Cobi: Communitysourcing Large-Scale Conference Scheduling

Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation

A few elements in software development engineering education

DEM modeling of penetration test in static and dynamic conditions

An In-Context and Collaborative Software Localisation Model: Demonstration

Dhydro: a generic environment developed to edit and access multilingual terminological data on the Internet

Ontology-based Tailoring of Software Process Models

Territorial Intelligence and Innovation for the Socio-Ecological Transition

Performance Evaluation of Encryption Algorithms Key Length Size on Web Browsers

Proposal for the configuration of multi-domain network monitoring architecture

Terminology Extraction from Log Files

Conceptual Workflow for Complex Data Integration using AXML

The Effectiveness of non-focal exposure to web banner ads

Software Designing a Data Warehouse Using Java

Application-Aware Protection in DWDM Optical Networks

Crowdsourcing and digitization

The truck scheduling problem at cross-docking terminals

Introduction to the papers of TWG18: Mathematics teacher education and professional development.

Florin Paun. To cite this version: HAL Id: halshs

Leveraging ambient applications interactions with their environment to improve services selection relevancy

Training Ircam s Score Follower

Advantages and disadvantages of e-learning at the technical university

BILAL W S SHAFEI. An-Najah National University LRC / French department An-Najah Street. P.O. Box 07 Nablus Palestine

Glossary of translation tool types

Studying (in) a French grande école

CNC Kompetenzzenter mit der multimedialen Software

Hinky: Defending Against Text-based Message Spam on Smartphones

Virtual plants in high school informatics L-systems

ISO9001 Certification in UK Organisations A comparative study of motivations and impacts.

Heterogeneous PLC-RF networking for LLNs

Optimization results for a generalized coupon collector problem

A modeling approach for locating logistics platforms for fast parcels delivery in urban areas

An Automatic Reversible Transformation from Composite to Visitor in Java

Methodology of organizational learning in risk management : Development of a collective memory for sanitary alerts

Running an HCI Experiment in Multiple Parallel Universes

SIMS AND SCANNING ION MICROSCOPY

Vegetable Supply Chain Knowledge Retrieval and Ontology

Physicians balance billing, supplemental insurance and access to health care

Diversity against accidental and deliberate faults

ONLINE TRANSLATION SERVICES FOR THE LAO LANGUAGE

New implementions of predictive alternate analog/rf test with augmented model redundancy

Innovative training for increasing the knowledge base of the European polymer industry in relation to REACH

Flauncher and DVMS Deploying and Scheduling Thousands of Virtual Machines on Hundreds of Nodes Distributed Geographically

Stored Documents and the FileCabinet

RAMAN SCATTERING INDUCED BY UNDOPED AND DOPED POLYPARAPHENYLENE

Focus Group on Computer Tools Used for Professional Writing and Preliminary Evaluation of LinguisTech

Hassle-free POS-Tagging for the Alsatian Dialects

ASPECT LRE Technical and Semantic Interoperability Workshop

Movida studio: a modeling environment to create viewpoints and manage variability in views

A Framework for Data Management for the Online Volunteer Translators' Aid System QRLex

Lifelong Learning and Competence Management for the Furniture Industry

Distance education engineering

Evaluating grapheme-to-phoneme converters in automatic speech recognition context

P2Prec: A Social-Based P2P Recommendation System

In a paperless world a new role for academic libraries: Providing Open Access

Multilateral Privacy in Clouds: Requirements for Use in Industry

Transcription:

Dictionary Building with the Jibiki Platform: Software Demonstration Mathieu Mangeot To cite this version: Mathieu Mangeot. Dictionary Building with the Jibiki Platform: Software Demonstration. EURALEX 2006, May 2006, Torino, Italy. pp.121-126, 2006. <hal-00968619> HAL Id: hal-00968619 https://hal.archives-ouvertes.fr/hal-00968619 Submitted on 1 Apr 2014 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

2 DICTIONARY BUILDING WITH THE JIBIKI PLATFORM Mathieu MANGEOT Condillac - LISTIC - Université de Savoie Campus Scientifique F-73376 LE BOURGET DU LAC CEDEX Email : Mathieu.Mangeot@univ-savoie.fr ABSTRACT The Jibiki platform is an online generic environment for writing and querying all kinds of dictionaries: terminological glossaries, bilingual dictionaries, multilingual lexical databases, etc. It has been developed mainly by Mathieu Mangeot (Université de Savoie, France) and Gilles Sérasset (Université de Grenoble 1, France), thanks to research driven by the GETA team of the CLIPS laboratory in Grenoble, France. The platform allows one to lookup all the dictionaries available on the server and to display the results in the same window. The advanced query interface offers a combination of multiple search criteria. The writing of the entries is done directly online on the platform via a web browser. The writing interface is generated automatically from the description of the structure of the entries (an XML schema), thus allowing the edition of (almost) any type of dictionary entry.

2 1. Overview of the platform Dictionary Building with the Jibiki Platform The Jibiki platform is an online generic environment for writing and querying all kinds of dictionaries: terminological glossaries, bilingual dictionaries, multilingual lexical databases, etc. It has been developed mainly by Mathieu Mangeot (Université de Savoie, France) and Gilles Sérasset (Université de Grenoble 1, France), now helped by Francis Brunet-Manquat, thanks to research driven by the GETA team of the CLIPS laboratory in Grenoble, France. The platform is implemented in Java, exclusively with open source tools. It is based on Enhydra, a web server of dynamic java objects and Postgres, a relational database. The interface is available in English, Estonian, French, German and Japanese. New languages can be easily added. Annex tools have been added on various instances of the platform. Some facilitate the communication between communities of users (forums, distribution lists) and others, the work of the lexicographers (tool for managing aligned bilingual corpora). 2. Comparison with existing software In this section, we will briefly compare our software with two other well known dictionary building software: TshwaneLex 1 and IDM s Dictionary Publishing Software. Jibiki platform TshwaneLex IDM DPS Price Not for commercial 1900 for commercial? Free for academic 150 for academic Offline editing With an XML editor Yes Yes Online searching Yes Yes Unclear Online editing Yes No? Unicode handling Yes Yes Yes XML structure Yes Unclear Yes Importing existing Yes Yes No XML dictionary Corpus handling Not included Not included Included 1 http://tshwanedje.com/tshwanelex/ 2

3. Projects currently using the platform The platform is currently used by three lexicographical or terminological projects. 3.1. Papillon Project This project 2, launched in 2001, is at the origin of the building of the platform. Its main goal is the construction of a multilingual lexical database with a pivot structure covering among others the following languages: Chinese, English, French, German, Japanese, Lao, Malay, Thai and Vietnamese. The resulting resources are publicly available and free of rights. The projects is open to all those who are interested in these languages. 3.2. GDEF Project The GDEF 3 (Great Estonian-French Dictionary) started in 2003. Its goal is to build a bilingual Estonian-French dictionary of about 80,000 entries, by a team of 8 people made of linguists, as well as Estonian and French translators. 3.3. LexALP Project The European project LexALP, launched in 2005, use the Papillon platform in order to develop a terminological database of legal and administrative terms in the main Alpine languages (French, German, Italian and Slovene). 4. Dictionary lookup The platform allows one to lookup all the dictionaries available on the server and to display the results in the same window. The advanced query interface offers a combination of multiple search criteria on: 2 http://www.papillon-dictionary.org 3 http://estfra.ee 3

the languages: source, targets, available resources; the character string: prefix, suffix, substring; the content of the entries: headword, variants, pronunciation, domain, gloss, part-of-speech, translations, examples, etc. It is even possible to define new search criteria when a new resource is added by defining common pointers on searchable information parts. In the case where a normal search returned no results, a reverse lookup is also executed. 5. Entries Writing The writing of the entries is done directly online on the platform via a web browser. The writing interface is generated automatically from the description of the structure of the entries (an XML schema), thus allowing the edition of (almost) any type of dictionary entry. The interface, built upon an HTML form can also deal with relatively complex structures thanks to more complex interactors that combine the basic HTML ones (text boxes, radio buttons, pop-up menus). Such example is the list management one that allows the writers to add, delete or reorder elements in a list by simply clicking on a button. These elements can be themselves complex objects containing lists of other objects, etc. A specific module allows the writer to establish links to entries in other resources available on the server. This technique is mainly used for linking an entry to its translation in another language when the translation already exist as a separate entry. The writing process is divided in several steps depending on the project. The GDEF is the most complete with three steps: 1) A contributor writes an entry; 2) It is next revised by a reviewer; 3) It is then validated by a validator; When the entry is validated, it is integrated into the dictionary and all the users can search it. 4

6. Tasks Management In order to manage the different tasks and roles, the platform gives the possibility to define groups and access rights. There are several groups with predefined rights: If the user is not logged, it can lookup the public resources available on the platform. When the user is registered and logged, s/he is included de facto in the contributors group and can contribute through the entry writing interface. The users members of the reviewers group can revise the contributions written by users members of their working group. The users members of the validators group can validate the previously revised contributions. Then, the server administrators manage users and their groups, add new resources, etc. In order to facilitate the construction work of the dictionary and possibly the remuneration of the writers, it is possible to obtain a summary of all the contributions in a given period of time. Then, the dictionaries can be exported as a whole or by parts, in several formats (text, HTML, XML, PDF, etc.) and printed. References Antoine Chalvin & Mathieu Mangeot (2006) Méthodes et outils pour la lexicographie bilingue en ligne : le cas du grand dictionnaire estonien-français In proc. EURALEX 2006, à paraître, Turin, Italie, 6-9 septembre. Mangeot, Mathieu (2001) Environnements centralisés et distribués pour lexicographes et lexicologues en contexte multilingue. Thèse de doctorat, Spécialité Informatique, Université Joseph Fourier Grenoble I, 2001, 280 p. Mangeot, Mathieu ; Sérasset, Gilles ; Lafourcade, Mathieu (2003) Construction collaborative de données lexicales multilingues, le projet Papillon. Les dictionnaires électroniques : pour les personnes, les machines ou pour les deux? ed. M. Zock and J. Carroll, TAL Traitement Automatique des Langues, Vol. 44 : 2/2003, 151-176. Mathieu Mangeot and David Thevenin (2004) Online generic editing of heterogeneous dictionary entries in Papillon project In Proc. of the COLING 2004 conference, vol. 2, pages 1029--1035, Geneva, Switzerland, 26 August. Gilles Sérasset (2004) A generic collaborative platform for multilingual lexical database development In Gilles Sérasset, editor, COLING 2004 Multilingual Linguistic Resources Workshop, pages 73--79, Geneva, Switzerland, 28 August. Gilles Sérasset (2005) Multilingual legal terminology on the jibiki platform: The lexalp project In Mathieu Lafourcade, editor, Proc. of Papillon 2005 Workshop, pages 64--73,Chiang Rai,Thailand, 11-13 December. 5