Towards automatic terminology extraction for Norwegian based on parallel corpora
1 Towards automatic terminology extraction for Norwegian based on parallel corpora
Gisle Andersen
LSP Conference, Vienna, 8 July 2015
2 Background and contents
- NHH is developing a national infrastructure that integrates terminological language resources
- Termportalen; WP7 in CLARINO project (NFR)
- Many specialist fields lacking systematic terminology
- Case: Sjøfartsdirektoratet (Norwegian Maritime Authority)
Contents
- Introduction: aim
- Data and methods
- Pattern matching
- Conclusion
3 Aim of work
- Purpose: providing aid to field experts where systematic terminology work is lacking
- A generic system meant to enhance terminology for various domains
- Maximising value of existing tools and language resources
- Setting up an infrastructure, a production line for term extraction (TE)
- Using a variety of techniques for parallel corpus-based TE
4 The necessary disclaimer
- Whatever is extracted through computational methods is always a term candidate.
- Need for subsequent manual check by field experts
- Need to supply additional information about concepts (definitions, structure, term variation); cf. e.g. Heylen & De Hertog (2015)
5 DATA AND METHODS
6 The corpus (1/2)
- Sjøfartsdirektoratet (NMA)
- Parallel corpus of translated texts (EN-NO)
- Policies and legislation relating to shipping
  - navigation, communication, safety, etc.
- Current version: translated regulations from International Maritime Organization (IMO)
  - small; currently 9 items
- To be extended to include
  - Skipssikkerhetsloven / The Ship Safety and Security Act
  - NMA's own regulations
  - Etc.
7 The corpus (2/2)
Title of regulation | Navn på forskrift | TCA2 fil | TCA2 fil
Regulations of 1 July 2014 No on the construction of ships | Forskrift 01. juli 2014 om bygging av skip | RCS_E | RCS_N
Regulations of 1 July 2014 No. 944 on dangerous goods on Norwegian ships | Forskrift 1. juli 2014 om farlig last på norske skip | RDG_E | RDG_N
Regulations of 1 July 2014 No on fire protection on ships | Forskrift 1. juli 2014 om brannsikring på skip | RFP_E | RFP_N
Regulations of 1 July 2014 on life-saving appliances on ships | Forskrift 1. juli 2014 om redningsredskaper på skip | RLS_E | RLS_N
Regulations of 5 June 2014 No. 805 on medical examination of employees on Norwegian ships and mobile offshore units | Forskrift nr. xxxx om helseundersøkelse av arbeidstakere på norske skip og flyttbare innretninger | RME_E | RME_N
Regulations of 5 September 2014 No on navigation and navigational aids for ships and mobile offshore units | Forskrift om navigasjon og navigasjonshjelpemidler for skip og flyttbare innretninger | RNN_E | RNN_N
Regulations of 1 July 2014 No. 955 concerning radiocommunication equipment for Norwegian ships and mobile offshore units | Forskrift 1. juli 2014 om radiokommunikasjonsutstyr for norske skip og flyttbare innretninger | RRR_E | RRE_N
Regulations of 5 January 2014 No on a safety management system for Norwegian ships and mobile offshore units | Forskrift om sikkerhetsstyringssystem for norske skip, og flyttbare innretninger | RSM_E | RSM_N
IMO standard marine communication phrases (SMCPs) | IMOs standarduttrykk for maritim kommunikasjon | SMCP_E | SMCP_N
8 System architecture for TE
9 Step 1: Text conversion: doc to html to xml
10 Step 2: Alignment of parallel corpus texts
11 Step 3A: Pattern matching
- Premise: recognisable patterns in sentence and paragraph structure, punctuation, etc. suggesting termhood
- Extraction based on regular expressions (Perl)
English:
<s>b) barges;</s>
<s>the spooling device shall:</s>
<s>a) initial certification upon changes in use;</s>
<s>e) handrails, corridors and passageways, doorways, doors, lifts, vehicle decks, passenger lounges, accommodation and washrooms shall be
Wire/chain stoppers shall be dimensioned for a safe working load
Norwegian:
<s>d) lektere</s>
<s>spoleapparatet skal:</s>
<s>a) førstegangssertifisering ved endret bruk</s>
<s>e) Håndlister, korridorer og ganger, døråpninger, dører, heiser, bildekk, passasjersalonger, innredning og toaletter skal være
En wire- og kjettingstopper skal være dimensjonert for en sikker arbeidsbelastning
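The pattern-matching premise can be illustrated with a small sketch. The slides say the extraction uses Perl regular expressions; the version below uses Python instead, and the pattern `ITEM`, the helper names and the pairing-by-position rule are all assumptions made for illustration, not the project's actual rules. It treats short enumerated list items inside `<s>` segments as term candidates and pairs them across the two languages.

```python
import re

# Hypothetical pattern: a sentence segment that starts with a list label
# such as "a)" and optionally ends with ";" or "." before the closing tag.
ITEM = re.compile(r"<s>\s*([a-z])\)\s*([^<]+?)\s*[;.]?\s*</s>")


def list_items(segment_text):
    """Return the text of each enumerated item, in document order."""
    return [match.group(2) for match in ITEM.finditer(segment_text)]


def pair_candidates(en_text, no_text):
    """Pair English and Norwegian list items by position.

    Assumption: aligned segments enumerate the same items in the same
    order. Labels are ignored because they can differ between versions.
    """
    en_items = list_items(en_text)
    no_items = list_items(no_text)
    if len(en_items) != len(no_items):
        return []  # alignment looks doubtful, so propose nothing
    return list(zip(en_items, no_items))


if __name__ == "__main__":
    print(pair_candidates("<s>b) barges;</s>", "<s>d) lektere</s>"))
```

Pairing by position rather than by label is a deliberate choice here, since the slide's own example pairs `b) barges` with `d) lektere`. Every pair produced this way is still only a term candidate for manual checking.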
12 Step 3B: Check of terminological inventory
- Premise: if word/sequence of words is already registered as term in other component of Termportalen, it has high termhood (it is likely to constitute a term in current context also)
- Question 1: same or different translation relation
- Question 2: same or different domain
- Methodological issue: inflected forms in texts; base form in term base
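The methodological issue noted above (inflected forms in the texts, base forms in the term base) can be sketched as follows. The suffix list is a crude, hypothetical stand-in for a proper Norwegian lemmatiser, the function names are invented for illustration, and the sketch only handles single-word candidates.

```python
# Hypothetical and deliberately crude list of Norwegian noun endings.
# A real system would use a lemmatiser or a full-form lexicon instead.
SUFFIXES = ("ene", "ane", "en", "et", "er", "a", "e")


def base_form_guesses(word):
    """Return the lower-cased word plus guesses with one ending removed."""
    word = word.lower()
    guesses = {word}
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            guesses.add(word[: -len(suffix)])
    return guesses


def registered_term(candidate, term_base):
    """Return the term-base entry matched by the candidate, or None."""
    # Try the longest guess first so an exact match wins over a stripped one.
    for guess in sorted(base_form_guesses(candidate), key=len, reverse=True):
        if guess in term_base:
            return guess
    return None


if __name__ == "__main__":
    term_base = {"lekter", "oppdriftssenter"}  # toy inventory
    for form in ("lektere", "Oppdriftssenteret", "korridor"):
        print(form, "->", registered_term(form, term_base))
```

A match only indicates that the form is registered somewhere in the inventory. The two questions on the slide (same translation relation, same domain) would still need to be answered for each hit.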
13 Step 3C: Neology detection
- Premise: if word/sequence of words can be shown to be a neologism (domain-specific vocabulary), it has high termhood (is likely to be a term)
- Check against inventory of words in large general language corpus (GLC); Norsk aviskorpus (Norwegian Newspaper Corpus, NNC; cf. Andersen 2012; Andersen & Hofland 2012)
- Check among neologisms registered in NNC's neology database
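The check against a general-language inventory amounts to a set-membership test. The sketch below assumes the general-language vocabulary is available as a plain set of lower-cased word forms; how the NNC word inventory or its neology database is actually queried is not described in the slides, so the function name and the toy data are hypothetical.

```python
def neologism_candidates(domain_tokens, general_vocabulary):
    """Return domain word forms that are absent from the general vocabulary.

    Forms are lower-cased, de-duplicated and returned in order of first
    occurrence. Tokens containing digits or punctuation are skipped.
    """
    seen = set()
    candidates = []
    for token in domain_tokens:
        form = token.lower()
        if not form.isalpha():
            continue
        if form in general_vocabulary or form in seen:
            continue
        seen.add(form)
        candidates.append(form)
    return candidates


if __name__ == "__main__":
    # Toy stand-in for a general language corpus word list.
    general = {"skal", "for", "skip", "og"}
    tokens = ["Spoleapparatet", "skal", "for", "skip", "spoleapparatet", "30"]
    print(neologism_candidates(tokens, general))
```

Absence from a general-language word list is only weak evidence of termhood, since misspellings and rare ordinary words pass the same test.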
14 Step 3D: Monolingual/bilingual lexicon lookup
- Premise: if word/sequence of words is found among the lexical inventory in a mono/bilingual technical or specialised dictionary, it has high termhood
- Agreement with Kunnskapsforlaget to reuse some of their manuscripts
15 Step 3E: Association measures (AMs)
- Premise: terms are often constituted as collocations, i.e. words with a strong tendency to co-occur, so strong collocations may be seen as indicators of termhood
- Association measures, statistical measures of unithood/termhood (Heylen & De Hertog 2015)
- Important to select adequate AM for TE, e.g. Pointwise Mutual Information, Chi-square (cf. Lyse & Andersen 2012)
- Collocation patterns should be compared with GLC data (NNC)
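As an illustration of one of the association measures named above, the sketch below computes Pointwise Mutual Information for adjacent word pairs, using the standard definition PMI(x, y) = log2(P(x, y) / (P(x) P(y))) with probabilities estimated from raw counts. The function name and the toy token list are assumptions; the slides do not say how the project estimates these probabilities or which measure it finally selects.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return a dict mapping each adjacent word pair to its PMI score."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    scores = {}
    for (first, second), count in bigrams.items():
        p_pair = count / n_bigrams
        p_first = unigrams[first] / n_unigrams
        p_second = unigrams[second] / n_unigrams
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


if __name__ == "__main__":
    toy = ["safe", "working", "load", "safe", "working", "load", "the", "load"]
    ranked = sorted(bigram_pmi(toy).items(), key=lambda item: item[1], reverse=True)
    for pair, score in ranked:
        print(pair, round(score, 2))
```

PMI is known to favour rare pairs, which is one reason the choice of measure matters. Scores from the domain corpus would also need to be compared with scores for the same pairs in general-language data, as the last bullet indicates.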
16 Step 3F: Parsing techniques
- Premise: terminological units are typically constituted as (complex) noun phrases; output from syntactic parsing may give good guidance towards terminological units
- Parsers for Norwegian and English: INESS project (UiB; cf. Rosén 2012)
17 A CLOSER LOOK AT PATTERN MATCHING
18 Term extraction based on pattern matching
English | Norwegian
final loading conditions | Endelige lastetilstander
Hydrostatics containing the following parameters as a function of the draught with a specified reference point | Hydrostatikk som inneholder følgende parametere som funksjon av dypgang med spesifisert referansepunkt ??
displacement | deplasement
KB | KB
centre of buoyancy | oppdriftssenter
If warranted by the ferry's size or type | Når fergens størrelse eller type tilsier det
the Norwegian Maritime Authority may require the mooring arrangement to be dimensioned for a mooring force higher than 30 tonnes | kan Sjøfartsdirektoratet kreve at fortøyningsarrangementet blir dimensjonert for høyere fortøyningskraft enn 30 vekttonn
KM transverse metacentre above the baseline | KM tverrskips metasenter over basis
AwT waterline area | AwT vannlinjeareal
TP1 tonnes per unit submersion | TP1 enhets neddykking
MT1 moment to change trim | MT1 enhets trimmoment
LCF longitudinal centre of flotation | LCF langskips flotasjonssenter
LCB longitudinal centre of buoyancy | LCB langskips oppdriftssenter
19 Output of procedure: term database file (tsv)
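A tab-separated term file of this kind could be produced along the following lines. The column names (`term_en`, `term_no`, `method`) and the function name are hypothetical, since the slide only states that the output is a TSV file and does not show its layout.

```python
import csv
import io

# Hypothetical column layout; the actual file format is not shown.
COLUMNS = ["term_en", "term_no", "method"]


def write_term_candidates(rows, handle):
    """Write bilingual term candidates as tab-separated lines with a header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_term_candidates(
        [("centre of buoyancy", "oppdriftssenter", "pattern_matching")],
        buffer,
    )
    print(buffer.getvalue(), end="")
```

Recording which module proposed each candidate, as the `method` column does here, would make it easier to assess each module's contribution separately later on.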
20 The next stage: manual editing in Termportalen
21 CONCLUSION
22 Other remaining tasks
- The output of each processing procedure: a bilingual list of term candidates
- Precision and recall need to be checked against a gold standard
- The gold standard will be developed via manual term extraction performed by field experts/research assistant
- The performance/contribution of each module will be checked separately
- Degree of overlap also needs to be checked
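The evaluation tasks listed above can be sketched with set arithmetic. Precision and recall follow their standard definitions. Measuring the degree of overlap between two modules with the Jaccard coefficient is an assumption made here, as the slides do not name a measure, and the term pairs in the example are toy data rather than results.

```python
def precision_recall(candidates, gold):
    """Return (precision, recall) of extracted candidates against a gold set."""
    candidates, gold = set(candidates), set(gold)
    true_positives = len(candidates & gold)
    precision = true_positives / len(candidates) if candidates else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def overlap(module_a, module_b):
    """Return the Jaccard overlap between the outputs of two modules."""
    a, b = set(module_a), set(module_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Toy gold standard and module output, for illustration only.
    gold = {
        ("displacement", "deplasement"),
        ("waterline area", "vannlinjeareal"),
        ("centre of buoyancy", "oppdriftssenter"),
    }
    extracted = {
        ("displacement", "deplasement"),
        ("waterline area", "vannlinjeareal"),
        ("barges", "lektere"),
        ("KB", "KB"),
    }
    print(precision_recall(extracted, gold))
    print(overlap(extracted, gold))
```

Running the same two functions on the output of each module separately, and on pairs of modules, would cover the last two bullets.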
23 Summary
- A hybrid approach, using a combination of linguistic and statistical approaches to (bilingual) TE
- combining the strengths of both approaches
- at the same time attempting to utilise and maximise the value of old, existing language resources
- although based on data drawn specifically from the maritime sector, the production line and infrastructure proposed here are meant to be generic and applicable in (all) other domains
24 References
Andersen, Gisle, ed. 2012. Exploring Newspaper Language - Using the web to create and investigate a large corpus of modern Norwegian. Amsterdam: John Benjamins.
Andersen, Gisle, and Knut Hofland. 2012. Building a large monitor corpus based on newspapers on the web. In Exploring Newspaper Language - Using the web to create and investigate a large corpus of modern Norwegian, edited by G. Andersen. Amsterdam: John Benjamins.
Heylen, Kris, and Dirk De Hertog. 2015. Automatic term extraction. In Handbook of Terminology, edited by H. J. Kockaert and F. Steurs. Amsterdam: John Benjamins.
Lyse, Gunn Inger, and Gisle Andersen. 2012. Collocations and statistical analysis of n-grams. In Exploring Newspaper Language - Using the web to create and investigate a large corpus of modern Norwegian, edited by G. Andersen. Amsterdam: John Benjamins.
Rosén, Victoria. 2012. Exploring corpora through syntactic annotation. In Exploring Newspaper Language - Using the web to create and investigate a large corpus of modern Norwegian, edited by G. Andersen. Amsterdam: John Benjamins.
Comprendium System Overview May 2004 Table of Contents 1. INTRODUCTION...3 2. WHAT IS MACHINE TRANSLATION?...3 3. THE COMPRENDIUM MACHINE TRANSLATION TECHNOLOGY...4 3.1 THE BEST MT TECHNOLOGY IN THE MARKET...4
More informationGUIDE to completion of the Excel spreadsheet
GUIDE to completion of the Excel spreadsheet 1 S i d e 1 TABLE OF CONTENTS 1 TABLE OF CONTENTS... 2 1.1 Data Entry Generalities... 3 1.2 Prepare data... 5 1.3 Simple Data Entry (horizontal direction)...
More informationTHE knowledge needed by software developers
SUBMITTED TO IEEE TRANSACTIONS ON SOFTWARE ENGINEERING 1 Extracting Development Tasks to Navigate Software Documentation Christoph Treude, Martin P. Robillard and Barthélémy Dagenais Abstract Knowledge
More informationWord Completion and Prediction in Hebrew
Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology
More informationON GETTING THE MOST OUT OF INTERNET RESOURCES TO RAISE TRANSLATION QUALITY OF PROFESSIONAL DOCUMENTATION
General and Professional Education 3/2013 pp. 21-27 ISSN 2084-1469 ON GETTING THE MOST OUT OF INTERNET RESOURCES TO RAISE TRANSLATION QUALITY OF PROFESSIONAL DOCUMENTATION Svetlana Sheremetyeva Department
More informationGlossary of translation tool types
Glossary of translation tool types Tool type Description French equivalent Active terminology recognition tools Bilingual concordancers Active terminology recognition (ATR) tools automatically analyze
More informationCar Passenger Ferry Portugal
Car Passenger Ferry Portugal Type Ref. ID Living Area Total Area Pris Car & Passenger Vessels HQA-709503 0 sq. m 0 sq. m Be om pris Lugarer Senger Etasjer Furnished Annonsert Dato 52 112 0 Nei July 7,
More informationReal-Time Identification of MWE Candidates in Databases from the BNC and the Web
Real-Time Identification of MWE Candidates in Databases from the BNC and the Web Identifying and Researching Multi-Word Units British Association for Applied Linguistics Corpus Linguistics SIG Oxford Text
More informationPontifícia Universidade Católica do Rio Grande do Sul Faculdade de Informática. Building Domain Specific Corpora in Portuguese Language
Pontifícia Universidade Católica do Rio Grande do Sul Faculdade de Informática Programa de Pós-Graduação em Ciência da Computação Building Domain Specific Corpora in Portuguese Language Lucelene Lopes,
More informationUsing the BNC to create and develop educational materials and a website for learners of English
Using the BNC to create and develop educational materials and a website for learners of English Danny Minn a, Hiroshi Sano b, Marie Ino b and Takahiro Nakamura c a Kitakyushu University b Tokyo University
More informationTerm extraction for user profiling: evaluation by the user
Term extraction for user profiling: evaluation by the user Suzan Verberne 1, Maya Sappelli 1,2, Wessel Kraaij 1,2 1 Institute for Computing and Information Sciences, Radboud University Nijmegen 2 TNO,
More informationNATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING PROBABILISTIC CONTEXT FREE GRAMMAR Arati K. Deshpande 1 and Prakash. R. Devale 2 1 Student and 2 Professor & Head, Department of Information Technology, Bharati
More informationParsing Software Requirements with an Ontology-based Semantic Role Labeler
Parsing Software Requirements with an Ontology-based Semantic Role Labeler Michael Roth University of Edinburgh mroth@inf.ed.ac.uk Ewan Klein University of Edinburgh ewan@inf.ed.ac.uk Abstract Software
More informationRegulations regarding health requirements for persons working on installations in petroleum activities offshore
Unauthorized translation of the Norwegian FOR 2010-12-20 nr 1780: Forskrift om helsekrav for personer I arbeid på innretninger I petroleumsvirksomheten til havs Regulations regarding health requirements
More informationNorwegian hospital planning tools
Norwegian hospital planning tools Knut Bergsland SINTEF Health Research Espoo, Dec.1.2006 1 SINTEF Health Research Research for improved health and a better quality of life 2 SINTEF Health Research Organisational