The CroCo Translation Archive
|
|
|
- Lesley Neal
- 10 years ago
- Views:
Transcription
1 LINGUISTIC PROPERTIES OF TRANSLATIONS A CORPUS-BASED INVESTIGATION FOR THE LANGUAGE PAIR ENGLISH-GERMAN The CroCo Translation Archive Language Archives: Standards, Creation and Access Mihaela Vela & Silvia Hansen-Schirra Universität des Saarlandes Overview CroCo Corpus Representation Linguistic Annotation Alignment Outlook
2 Corpus-based comparison of translations with originals in source AND target language Specific properties of translations: e.g. simplification, normalisation, explicitation Blum-Kulka 1986, Baker 1996, Olohan & Baker /27/2006 Language Archives: Standards, Creation and Access 3 The CroCo Corpus English texts Reference Corpus ER Registercontrolled Corpus EO 8 registers, at least 10 texts each, 3125 words (av.) 17 registers, 2,000 word samples each Translation Corpus GTrans Translation Corpus ETrans Reference Corpus GR German texts Registercontrolled Corpus GO
3 File headers (Text Encoding Initiative) Information about: Author, publication, register information (text type) Translator, translation process Text body (multi-layer stand-off XML) Linguistic annotation and alignment 4/27/2006 Language Archives: Standards, Creation and Access 5 The Header <teiheader> <filedesc> <filename>go_fiction_001.txt</filename> <subcorpus>fiction_go</subcorpus> <language>german</language> <titlestmt> <title>mein Jahr als Mörder</title> <author>delius, Friedrich Christian</author> </titlestmt> <translation></translation> <publicationstmt> <publisher>rowohlt Berlin Verlag</publisher> <date>2004</date> <distributor> <availability>local</availability> </publicationstmt> <registeranalysis> </registeranalysis>
4 Morphology (MPro), part-of-speech (TnT), phrases (MPro), grammatical functions Representation format: XML Annotation Multi-layer: each annotation different layer Stand-off annotation: annotation layers separate Connection within a language by Xlink, Xpointer, xml:base attributes 4/27/2006 Language Archives: Standards, Creation and Access 7 XML Annotation name="go.tok.xml" xml:lang="de" doctype="ori"> <header xlink:href="go.header.xml"/> <tokens> <token id="t64" strg="ich"/> <token id="t65" strg="spielte"/> <token id="t66" strg="viele"/> <token id="t67 strg="möglichkeiten"/> <token id="t68" strg="durch"/> <token id="t69" strg=","/> </tokens> name="go.tag.xml"> <tokens xml:base="go.tok.xml"> <token pos="pper" xlink:href="#t64"/> <token pos="vvfin" xlink:href="#t65"/> <token pos="pidat" xlink:href="#t66"/> <token pos="nn" xlink:href="#t67"/> <token pos="ptkvz" xlink:href="#68"/> <token pos="yc" xlink:href="#t69"/> </tokens> name="go.chunk.xml"> <chunks xml:base="go.tok.xml"> <chunk id="ch13"> <tok xlink:href="#t66"/> <tok xlink:href="#t67"/> <chunk id="ch14"> <tok xlink:href="#t70"/> <chunk id="ch15"> <tok xlink:href="#t71"/>
5 name="go.chunk.xml"> <chunks xml:base="go.tok.xml"> <chunk id="ch13"> <tok xlink:href="#t66"/> <tok xlink:href="#t67"/> <chunk id="ch14"> <tok xlink:href="#t70"/> <chunk id="ch15"> <tok xlink:href="#t71"/> name="go.ps.xml"> <chunks xml:base="go.chunk.xml"> <chunk ps="np" xlink:href="#ch13"/> <chunk ps="vpfin" xlink:href="#ch14"/> <chunk ps="np" xlink:href="#ch15"/> <chunk ps="np" xlink:href="#ch16"/> <chunk ps="pp" xlink:href="#ch17"/> <chunk ps="np" xlink:href="#ch18"/> <chunk ps="vppred" xlink:href="#ch19"/> name="go.gf.xml"> <chunks xml:base="go.chunk.xml"> chunk gf="dobj" xlink:href="#ch13"/> <chunk gf="fin" link:href="#ch14"/> <chunk gf="iobj" xlink:href="#ch15"/> <chunk gf="dobj" xlink:href="#ch16"/> <chunk gf="adv" xlink:href="#ch17"/> <chunk gf="pred" xlink:href="#ch19"/> 4/27/2006 Language Archives: Standards, Creation and Access 9 Alignment Sentences (WinAlign, Trados), Clauses (MMAX II), Phrases (MMAX II), Words (GIZA++) Representation format: XML Alignment Multi-layer: each alignment different layer Stand-off annotation: alignment layers separate Connection between source and target language by Xlink and Xpointer attributes plus <translations> element
6 German source He Ihre Hände ließen ihn leise English target wimmern whimpered softly under her hands EMPTY LINK name= GO2Etrans.tokenAlign.xml"> <translations xml:base="/corpus/"> <translation trans.loc= GO.tok.xml xml:lang="ge" n="1"/> <translation trans.loc= Etrans.tok.xml" xml:lang="en" n="2"/> </translations> <tokens> <token> <align xlink:href="#t3"/> </token> <token> <align xlink:href="#t4"/> </token> <token> <align xlink:href="#t4"/> <align xlink:href="#t1"/> </token> 4/27/2006 Language Archives: Standards, Creation and Access 11 XML Chunk Alignemnt German source SUBJ FIN DOBJ ADV PRED Ihre Hände ließen ihn leise wimmern He whimpered softly under her hands SUBJ FIN ADV ADV English target CROSSING LINE name= GO2ETrans.gfAlign.xml"> <translations xml:base="/corpus/"> <translation trans.loc= GO.chunk.xml" xml:lang="ge" n="1"/> <translation trans.loc= ETrans.chunk.xml" xml:lang="en" n="2"/> </translations> <chunks> <chunk> <align xlink:href="#ch1"/> <align xlink:href="#ch1"/> <chunk> <align xlink:href="#ch3"/> <chunk> <align xlink:href="#ch4"/>
7 Corpus exploitation with XQuery and XSLT Corpus access via Internet Graphical query interface Empirical analysis of explicitation (and other translation properties) Definition of the translation unit? 4/27/2006 Language Archives: Standards, Creation and Access 13
Die Vielfalt vereinen: Die CLARIN-Eingangsformate CMDI und TCF
Die Vielfalt vereinen: Die CLARIN-Eingangsformate CMDI und TCF Susanne Haaf & Bryan Jurish Deutsches Textarchiv 1. The Metadata Format CMDI Metadata? Metadata Format? and more Metadata? Metadata Format?
CINTIL-PropBank. CINTIL-PropBank Sub-corpus id Sentences Tokens Domain Sentences for regression atsts 779 5,654 Test
CINTIL-PropBank I. Basic Information 1.1. Corpus information The CINTIL-PropBank (Branco et al., 2012) is a set of sentences annotated with their constituency structure and semantic role tags, composed
FoLiA: Format for Linguistic Annotation
Maarten van Gompel Radboud University Nijmegen 20-01-2012 Introduction Introduction What is FoLiA? Generalised XML-based format for a wide variety of linguistic annotation Characteristics Generalised paradigm
Text-Driven Ontology Generation and Extension in the Finance Domain. Mihaela Vela Language Technology Lab DFKI Saarbrücken
Text-Driven Ontology Generation and Extension in the Finance Domain Mihaela Vela Language Technology Lab DFKI Saarbrücken European MUSING project Development of Business Intelligence tools and modules
Extraction and Visualization of Protein-Protein Interactions from PubMed
Extraction and Visualization of Protein-Protein Interactions from PubMed Ulf Leser Knowledge Management in Bioinformatics Humboldt-Universität Berlin Finding Relevant Knowledge Find information about Much
Metadata in Translation Tools: Importance, Usage, Storage, Transfer. Angelika Zerfaß & Richard Sikes
Metadata in Translation Tools: Importance, Usage, Storage, Transfer Angelika Zerfaß & Richard Sikes Metadata is... Data that describes other data. It provides information about a certain item's content.
Dutch Parallel Corpus
Dutch Parallel Corpus Lieve Macken [email protected] LT 3, Language and Translation Technology Team Faculty of Applied Language Studies University College Ghent November 29th 2011 Lieve Macken (LT
Statistical Machine Translation
Statistical Machine Translation Some of the content of this lecture is taken from previous lectures and presentations given by Philipp Koehn and Andy Way. Dr. Jennifer Foster National Centre for Language
Schema documentation for types1.2.xsd
Generated with oxygen XML Editor Take care of the environment, print only if necessary! 8 february 2011 Table of Contents : ""...........................................................................................................
Presentation / Interface 1.3
W3C Recommendations Mobile Web Best Practices 1.0 Canonical XML Version 1.1 Cascading Style Sheets, level 2 (CSS2) SPARQL Query Results XML Format SPARQL Protocol for RDF SPARQL Query Language for RDF
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
Convergence of Translation Memory and Statistical Machine Translation
Convergence of Translation Memory and Statistical Machine Translation Philipp Koehn and Jean Senellart 4 November 2010 Progress in Translation Automation 1 Translation Memory (TM) translators store past
Annotation in Language Documentation
Annotation in Language Documentation Univ. Hamburg Workshop Annotation SEBASTIAN DRUDE 2015-10-29 Topics 1. Language Documentation 2. Data and Annotation (theory) 3. Types and interdependencies of Annotations
ANALEC: a New Tool for the Dynamic Annotation of Textual Data
ANALEC: a New Tool for the Dynamic Annotation of Textual Data Frédéric Landragin, Thierry Poibeau and Bernard Victorri LATTICE-CNRS École Normale Supérieure & Université Paris 3-Sorbonne Nouvelle 1 rue
Overview of MT techniques. Malek Boualem (FT)
Overview of MT techniques Malek Boualem (FT) This section presents an standard overview of general aspects related to machine translation with a description of different techniques: bilingual, transfer,
Berlin-Brandenburg Academy of sciences and humanities (BBAW) resources / services
Berlin-Brandenburg Academy of sciences and humanities (BBAW) resources / services speakers: Kai Zimmer and Jörg Didakowski Clarin Workshop WP2 February 2009 BBAW/DWDS The BBAW and its 40 longterm projects
Terminology Extraction from Log Files
Terminology Extraction from Log Files Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet, Mathieu Roche To cite this version: Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet,
CLARIN project DiscAn :
CLARIN project DiscAn : Towards a Discourse Annotation system for Dutch language corpora Ted Sanders Kirsten Vis Utrecht Institute of Linguistics Utrecht University Daan Broeder TLA Max-Planck Institute
WebLicht: Web-based LRT services for German
WebLicht: Web-based LRT services for German Erhard Hinrichs, Marie Hinrichs, Thomas Zastrow Seminar für Sprachwissenschaft, University of Tübingen [email protected] Abstract This software
SWIFT Aligner, A Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer
SWIFT Aligner, A Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer Timur Gilmanov, Olga Scrivner, Sandra Kübler Indiana University
Empirical Machine Translation and its Evaluation
Empirical Machine Translation and its Evaluation EAMT Best Thesis Award 2008 Jesús Giménez (Advisor, Lluís Màrquez) Universitat Politècnica de Catalunya May 28, 2010 Empirical Machine Translation Empirical
2. Distributed Handwriting Recognition. Abstract. 1. Introduction
XPEN: An XML Based Format for Distributed Online Handwriting Recognition A.P.Lenaghan, R.R.Malyan, School of Computing and Information Systems, Kingston University, UK {a.lenaghan,r.malyan}@kingston.ac.uk
TS3: an Improved Version of the Bilingual Concordancer TransSearch
TS3: an Improved Version of the Bilingual Concordancer TransSearch Stéphane HUET, Julien BOURDAILLET and Philippe LANGLAIS EAMT 2009 - Barcelona June 14, 2009 Computer assisted translation Preferred by
Kybots, knowledge yielding robots German Rigau IXA group, UPV/EHU http://ixa.si.ehu.es
KYOTO () Intelligent Content and Semantics Knowledge Yielding Ontologies for Transition-Based Organization http://www.kyoto-project.eu/ Kybots, knowledge yielding robots German Rigau IXA group, UPV/EHU
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive
<EAD> ENCODED ARCHIVAL DESCRIPTION
ENCODED ARCHIVAL DESCRIPTION EAD as used within the Archives Portal Europe for finding aids and holdings guides Description and best practice guide May 2013 Document history Date Version Description
Using the BNC to create and develop educational materials and a website for learners of English
Using the BNC to create and develop educational materials and a website for learners of English Danny Minn a, Hiroshi Sano b, Marie Ino b and Takahiro Nakamura c a Kitakyushu University b Tokyo University
Interactive Dynamic Information Extraction
Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken
Shallow Parsing with Apache UIMA
Shallow Parsing with Apache UIMA Graham Wilcock University of Helsinki Finland [email protected] Abstract Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic
A GrAF-compliant Indonesian Speech Recognition Web Service on the Language Grid for Transcription Crowdsourcing
A GrAF-compliant Indonesian Speech Recognition Web Service on the Language Grid for Transcription Crowdsourcing LAW VI JEJU 2012 Bayu Distiawan Trisedya & Ruli Manurung Faculty of Computer Science Universitas
a Chinese-to-Spanish rule-based machine translation
Chinese-to-Spanish rule-based machine translation system Jordi Centelles 1 and Marta R. Costa-jussà 2 1 Centre de Tecnologies i Aplicacions del llenguatge i la Parla (TALP), Universitat Politècnica de
The Language Archive at the Max Planck Institute for Psycholinguistics. Alexander König (with thanks to J. Ringersma)
The Language Archive at the Max Planck Institute for Psycholinguistics Alexander König (with thanks to J. Ringersma) Fourth SLCN Workshop, Berlin, December 2010 Content 1.The Language Archive Why Archiving?
An Online Service for SUbtitling by MAchine Translation
SUMAT CIP-ICT-PSP-270919 An Online Service for SUbtitling by MAchine Translation Annual Public Report 2011 Editor(s): Contributor(s): Reviewer(s): Status-Version: Volha Petukhova, Arantza del Pozo Mirjam
Chapter 8. Final Results on Dutch Senseval-2 Test Data
Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised
EDAK: A CORPUS TO ANALYSE LINGUISTIC VARIATION
EDAK: A CORPUS TO ANALYSE LINGUISTIC VARIATION Gotzon Aurrekoetxea 1, Jon Sánchez, Igor Odriozola, Urkixo Zumarkalea z/g Universidad del País Vasco-Euskal Herriko Unibertsitatea (UPV-EHU) ABSTRACT This
Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov
Search and Data Mining: Techniques Text Mining Anya Yarygina Boris Novikov Introduction Generally used to denote any system that analyzes large quantities of natural language text and detects lexical or
Terminology Extraction from Log Files
Terminology Extraction from Log Files Hassan Saneifar 1,2, Stéphane Bonniol 2, Anne Laurent 1, Pascal Poncelet 1, and Mathieu Roche 1 1 LIRMM - Université Montpellier 2 - CNRS 161 rue Ada, 34392 Montpellier
Special Topics in Computer Science
Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS
Comprendium Translator System Overview
Comprendium System Overview May 2004 Table of Contents 1. INTRODUCTION...3 2. WHAT IS MACHINE TRANSLATION?...3 3. THE COMPRENDIUM MACHINE TRANSLATION TECHNOLOGY...4 3.1 THE BEST MT TECHNOLOGY IN THE MARKET...4
Declarative Parsing and Annotation of Electronic Dictionaries
Declarative Parsing and Annotation of Electronic Dictionaries Christian Schneiker 1, Dietmar Seipel 1, Werner Wegstein 2, and Klaus Prätor 3 1 Department of Computer Science {schneiker seipel}@informatik.uni-wuerzburg.de
HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN
HIERARCHICAL HYBRID TRANSLATION BETWEEN ENGLISH AND GERMAN Yu Chen, Andreas Eisele DFKI GmbH, Saarbrücken, Germany May 28, 2010 OUTLINE INTRODUCTION ARCHITECTURE EXPERIMENTS CONCLUSION SMT VS. RBMT [K.
Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic
Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic by Sigrún Helgadóttir Abstract This paper gives the results of an experiment concerned with training three different taggers on tagged
SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 SYSTRAN 混 合 策 略 汉 英 和 英 汉 机 器 翻 译 系 CWMT2011 技 术 报 告
SYSTRAN Chinese-English and English-Chinese Hybrid Machine Translation Systems for CWMT2011 Jin Yang and Satoshi Enoue SYSTRAN Software, Inc. 4444 Eastgate Mall, Suite 310 San Diego, CA 92121, USA E-mail:
Clustering Connectionist and Statistical Language Processing
Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised
Natural Language Processing in the EHR Lifecycle
Insight Driven Health Natural Language Processing in the EHR Lifecycle Cecil O. Lynch, MD, MS [email protected] Health & Public Service Outline Medical Data Landscape Value Proposition of NLP
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Annotation Guidelines for Dutch-English Word Alignment
Annotation Guidelines for Dutch-English Word Alignment version 1.0 LT3 Technical Report LT3 10-01 Lieve Macken LT3 Language and Translation Technology Team Faculty of Translation Studies University College
Language Interface for an XML. Constructing a Generic Natural. Database. Rohit Paravastu
Constructing a Generic Natural Language Interface for an XML Database Rohit Paravastu Motivation Ability to communicate with a database in natural language regarded as the ultimate goal for DB query interfaces
Intelligent Analysis of User Interactions in a Collaborative Software Engineering Context
Intelligent Analysis of User Interactions in a Collaborative Software Engineering Context Alejandro Corbellini 1,2, Silvia Schiaffino 1,2, Daniela Godoy 1,2 1 ISISTAN Research Institute, UNICEN University,
Markus Dickinson. Dept. of Linguistics, Indiana University Catapult Workshop Series; February 1, 2013
Markus Dickinson Dept. of Linguistics, Indiana University Catapult Workshop Series; February 1, 2013 1 / 34 Basic text analysis Before any sophisticated analysis, we want ways to get a sense of text data
Language and Computation
Language and Computation week 13, Thursday, April 24 Tamás Biró Yale University [email protected] http://www.birot.hu/courses/2014-lc/ Tamás Biró, Yale U., Language and Computation p. 1 Practical matters
POSBIOTM-NER: A Machine Learning Approach for. Bio-Named Entity Recognition
POSBIOTM-NER: A Machine Learning Approach for Bio-Named Entity Recognition Yu Song, Eunji Yi, Eunju Kim, Gary Geunbae Lee, Department of CSE, POSTECH, Pohang, Korea 790-784 Soo-Jun Park Bioinformatics
CorA: A web-based annotation tool for historical and other non-standard language data
CorA: A web-based annotation tool for historical and other non-standard language data Marcel Bollmann, Florian Petran, Stefanie Dipper, Julia Krasselt Department of Linguistics Ruhr-University Bochum,
Chapter 5. Phrase-based models. Statistical Machine Translation
Chapter 5 Phrase-based models Statistical Machine Translation Motivation Word-Based Models translate words as atomic units Phrase-Based Models translate phrases as atomic units Advantages: many-to-many
Lou Burnard Consulting 2014-06-21
Getting started with oxygen Lou Burnard Consulting 2014-06-21 1 Introducing oxygen In this first exercise we will use oxygen to : create a new XML document gradually add markup to the document carry out
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering
Duplication in Corpora
Duplication in Corpora Nadjet Bouayad-Agha and Adam Kilgarriff Information Technology Research Institute University of Brighton Lewes Road Brighton BN2 4GJ, UK email: [email protected]
Automated Extraction of Security Policies from Natural-Language Software Documents
Automated Extraction of Security Policies from Natural-Language Software Documents Xusheng Xiao 1 Amit Paradkar 2 Suresh Thummalapenta 3 Tao Xie 1 1 Dept. of Computer Science, North Carolina State University,
Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features
, pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, MinzuUniversity of
An Interface from YAWL to OpenERP
An Interface from YAWL to OpenERP Joerg Evermann Faculty of Business Administration, Memorial University of Newfoundland, Canada [email protected] Abstract. The paper describes an interface from the YAWL
From Databases to Natural Language: The Unusual Direction
From Databases to Natural Language: The Unusual Direction Yannis Ioannidis Dept. of Informatics & Telecommunications, MaDgIK Lab University of Athens, Hellas (Greece) [email protected] http://www.di.uoa.gr/
EXMARaLDA and the FOLK tools two toolsets for transcribing and annotating spoken language
EXMARaLDA and the FOLK tools two toolsets for transcribing and annotating spoken language Thomas Schmidt Institut für Deutsche Sprache, Mannheim R 5, 6-13 D-68161 Mannheim [email protected]
The SYSTRAN Linguistics Platform: A Software Solution to Manage Multilingual Corporate Knowledge
The SYSTRAN Linguistics Platform: A Software Solution to Manage Multilingual Corporate Knowledge White Paper October 2002 I. Translation and Localization New Challenges Businesses are beginning to encounter
Learning Mathematics with
Deutsches Forschungszentrum für f r Künstliche K Intelligenz Learning Mathematics with Jörg Siekmann German Research Centre for Artificial Intelligence DFKI Universität des Saarlandes e-learning: Systems
NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR
NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR 1 Gauri Rao, 2 Chanchal Agarwal, 3 Snehal Chaudhry, 4 Nikita Kulkarni,, 5 Dr. S.H. Patil 1 Lecturer department o f Computer Engineering BVUCOE,
Factored Translation Models
Factored Translation s Philipp Koehn and Hieu Hoang [email protected], [email protected] School of Informatics University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW Scotland, United Kingdom
Agents and Web Services
Agents and Web Services ------SENG609.22 Tutorial 1 Dong Liu Abstract: The basics of web services are reviewed in this tutorial. Agents are compared to web services in many aspects, and the impacts of
Computer Aided Translation
Computer Aided Translation Philipp Koehn 30 April 2015 Why Machine Translation? 1 Assimilation reader initiates translation, wants to know content user is tolerant of inferior quality focus of majority
Annotated Corpora in the Cloud: Free Storage and Free Delivery
Annotated Corpora in the Cloud: Free Storage and Free Delivery Graham Wilcock University of Helsinki [email protected] Abstract The paper describes a technical strategy for implementing natural
Building gold-standard treebanks for Norwegian
Building gold-standard treebanks for Norwegian Per Erik Solberg National Library of Norway, P.O.Box 2674 Solli, NO-0203 Oslo, Norway [email protected] ABSTRACT Språkbanken at the National Library of Norway
Towards a Cross-Linguistic Production Data Archive: Structure and Exploration*
Towards a Cross-Linguistic Production Data Archive: Structure and Exploration* Michael Götze 1, Stavros Skopeteas 1, Torsten Roloff 1, and Ruben Stoel 2 1 SFB 632 Information Structure, Institut für Linguistik,
The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems
The Transition of Phrase based to Factored based Translation for Tamil language in SMT Systems Dr. Ananthi Sheshasaayee 1, Angela Deepa. V.R 2 1 Research Supervisior, Department of Computer Science & Application,
LKMS/Mnemosyne: Semantic Web technologies and Markup techniques for a Legal Knowledge Management System.
LKMS/Mnemosyne: Semantic Web technologies and Markup techniques for a Legal Knowledge Management System. L. Gilardoni, C.Biasuzzi, M.Ferraro, R.Fonti, P.Slavazza Quinary with F. Toffoletto, V.F. Giglio,
