SakkamMT White Paper
- Domenic Morrison
Sakkam K.K., Wakamatsu Building 7F, Nihonbashi Honcho, Chuo-ku, Tokyo
Introduction

Ever since the emergence of machine translation there has been a debate about if and when machines will replace humans for everyday translation tasks. With Sakkam Machine Translation (SakkamMT) we do not attempt to replace human translators; instead we focus on translation activities that are impractical or impossible with a traditional approach to translation. These activities are characterized by one or more of the following elements.

- Extreme time sensitivity. The value of some categories of information decreases rapidly from the moment it is made available. The announcement of US employment data has the potential to move currency markets in the seconds following its release but is old news one minute later. To have value, the translation of such releases needs to be sub-second.
- Massive volume. The volume of user-generated content is exploding, whether through social networking sites, online auctions or game networks. There are simply not enough human translators available to translate, monitor or extract meaning from tens of millions of messages per day.
- Limited availability of domain expertise. Many translation tasks require a high degree of domain expertise in addition to fluency in the source and target languages. It is often difficult to identify translators with the requisite skill set, and this problem is exacerbated if the translation task is not a discrete project but an ongoing, around-the-clock activity.

We have also taken a fundamentally different approach in developing the technology that underpins SakkamMT. From the outset, SakkamMT was designed to provide native-speaker-quality translations within limited domains. As such, both its targeted applications and its usefulness are very different from those of existing attempts at machine translation, which seek to provide general capabilities, but of low quality. As anyone who has ever used existing Internet and PC-based translation systems will know, their output is unusable within a business context.
This paper provides an overview of the SakkamMT architecture and of how SakkamMT is being used where human translation is either impractical or impossible. Readers wishing to learn more about the technical foundations of SakkamMT are invited to review the bibliography.
Architecture

The Interim Representation Model

SakkamMT works by parsing the source language text using domain-specific rules to create a language-independent Interim Representation Model (IRM). The IRM can then be used to drive output to another language. This approach affords a number of benefits:

1. As more output languages are added, the system scales linearly with the number of languages (O(N)) rather than with the number of language pairs (O(N^2)).
2. The IRM may be used not just to output other human languages, but also to provide a computer-readable API for entity extraction. For example, dates could be extracted from an email to populate a calendar.
3. The same content may be translated in different styles. For example, a news headline could be output in a highly abbreviated news style and/or a more grammatically correct prose style, depending upon the requirement.
4. The IRM facilitates a common-sense check of understanding and prevents some of the more egregious errors MT systems often make. Once the IRM has been populated, it is possible to judge whether the source text has been understood or not. At this stage a potentially wrong translation can be suppressed. This is in marked contrast to generalized translation systems, which will always output something regardless of quality.

Cognitive Categories

The design of the IRM draws heavily on research in Cognitive Psychology into semantic categorizations, which differ significantly from mathematically formal categories. For example, cognitive categories do not in general support transitive closure, which is a characteristic of mathematically formal categories. Hence, the categorization system used in the IRM understands that although a car seat is a chair and a chair is an item of furniture, a car seat is not an item of furniture. Cognitive categories also display a degree of typicality: a goldfish is not as good an example of a pet as a dog. Sometimes, category membership can be so ill-defined as to need legal action to clarify.
The question of whether a tomato is a vegetable or a fruit ended up being decided by the US Supreme Court in response to a new tax on vegetables but not fruit. Perhaps not surprisingly, it chose to classify the tomato as a vegetable, bringing it within scope of the tax. Botanists would class it as a fruit. Cognitive categories are also crucially ad hoc and can be determined dynamically, set by context. For example, a tomato can be a missile. Traditional hierarchically structured schemas and taxonomies are unsuited to this, but Sakkam's IRM has been designed from the outset with this flexibility in mind.

Populating the IRM

Populating the IRM requires parsing the source language to extract meaning. This is done using a series of linguistic rules which operate atomically upon the text until the structure of the IRM has been built. These rules differ from those used by many MT systems in that they are primarily based upon the linguistic field of pragmatics, rather than on the more typical syntax-based grammatical rules. SakkamMT uses little syntactic grammar in its parsing for the simple reason that in many cases (such as headlines, emails, etc.) the grammar of the source text is quite often wrong. Instead, SakkamMT takes what is closer to a construction grammar approach, using pragmatic criteria to determine the logical units. Note that here, as in the rest of this paper, we take pragmatics to refer to the sub-division of linguistics, not the everyday English meaning of the word.

Figure 1 shows how alternate expressions using different phrasings result in the same pragmatics-based model, despite having very different syntactic structures. While in some cases syntax is crucial (for example, in "A killed B", only syntax can tell you which is the subject and which is the object), in many cases it has little to add. In this particular example, the part of speech in which "approximate" appears can be a noun, verb, adjective or adverb, with little effect on meaning.

Source                                              Syntactic Structure
Approximately, the distance to London is 200 km     ADV NP PP VP(V NP)
The approximate distance to London is 200 km        NP(ADJ N) PP VP(V NP)
The distance approximately to London is 200 km      NP ADV PP VP(V NP)
The distance to London approximation is 200 km      NP(NP PP N) VP(V NP)
The distance to London approximately is 200 km      NP PP VP(ADV V NP)
The distance to London approximates to 200 km       NP PP VP(V NP)
The distance to London is approximately 200 km      NP PP VP(V ADV NP)
The distance to London is 200 km approximately      NP PP VP(V NP ADV)
It is approximately 200 km distance to London       NP VP(ADV NP(NP NP) PP)

Interim Representation Model (IRM), shared by all of the above:

    SENTENCE[
      ITEM[
        TYPE[London(LOCATION)]
        ATTRIBUTE[
          MEASURE[
            TYPE[distance(LENGTH)]
            RELATION[to]
            ATTRIBUTE[
              VALUE[
                NUMBER[200]
                UNITS[Km]
                SIGN[DEFAULT-POSITIVE]
                PRECISION[approximate]
              ]
            ]
          ]
        ]
      ]
    ]

Figure 1. Syntactic Structure and Pragmatic Structure

Once the IRM has been populated, a set of output rules can be brought to bear to express the IRM in the target language. Variations of style can be introduced at this stage.
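The idea behind Figure 1 can be illustrated with a deliberately small sketch. It is not SakkamMT's rule set, which this paper does not publish: the `DistanceIRM` class, the `populate` and `render` functions, the keyword patterns, and the "exact" default for precision are all assumptions made for illustration. The sketch spots meaning-bearing units regardless of word order, returns `None` when the text is not understood (so that output can be suppressed), and renders the same model in two styles.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DistanceIRM:
    """Toy stand-in for the IRM of Figure 1 (field names are assumptions)."""
    item: str        # e.g. "London"
    item_type: str   # e.g. "LOCATION"
    measure: str     # e.g. "distance(LENGTH)"
    relation: str    # e.g. "to"
    number: float    # e.g. 200.0
    units: str       # e.g. "km"
    sign: str        # e.g. "DEFAULT-POSITIVE"
    precision: str   # "approximate", or "exact" (assumed default)


def populate(text: str) -> Optional[DistanceIRM]:
    """Build the model from order-independent cues; None means 'not understood'."""
    value = re.search(r"(\d+(?:\.\d+)?)\s*km\b", text, re.IGNORECASE)
    place = re.search(r"\bto\s+([A-Z][a-z]+)", text)
    if "distance" not in text.lower() or value is None or place is None:
        return None  # suppress rather than guess
    # Any word form of "approximate" (noun, verb, adjective, adverb) counts.
    approx = re.search(r"\bapproximat", text, re.IGNORECASE) is not None
    return DistanceIRM(
        item=place.group(1),
        item_type="LOCATION",
        measure="distance(LENGTH)",
        relation="to",
        number=float(value.group(1)),
        units="km",
        sign="DEFAULT-POSITIVE",
        precision="approximate" if approx else "exact",
    )


def render(irm: DistanceIRM, style: str = "prose") -> str:
    """Express the model in English, in a prose or a terse headline style."""
    number = f"{irm.number:g}"
    if style == "headline":
        approx = "approx. " if irm.precision == "approximate" else ""
        return f"{irm.item}: {approx}{number} {irm.units}"
    approx = "approximately " if irm.precision == "approximate" else ""
    return f"The distance {irm.relation} {irm.item} is {approx}{number} {irm.units}."


PHRASINGS = [
    "Approximately, the distance to London is 200 km",
    "The approximate distance to London is 200 km",
    "The distance approximately to London is 200 km",
    "The distance to London approximation is 200 km",
    "The distance to London approximately is 200 km",
    "The distance to London approximates to 200 km",
    "The distance to London is approximately 200 km",
    "The distance to London is 200 km approximately",
    "It is approximately 200 km distance to London",
]
```

Calling `populate` on each entry of `PHRASINGS` yields one and the same `DistanceIRM`, which `render` can then express in either style. Adding a target language would mean adding another renderer over the same model, not another parser per language pair.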
It should be noted that for some very terse styles, such as those common in news headlines, we may generate grammatically incorrect text that is nevertheless more appropriate for the audience. Looking at the example in Figure 1 again, we also note that, for any particular representation, both the variety and the acceptability of grammatical phrasings will be highly dependent upon the target language.

Existing machine translation systems will often instead try to translate the gross syntactic structure of the sentence, and then fill in the slots with translations of the individual noun and verb phrases. This, however, can lead to combinations that at best seem unnatural and at worst appear nonsensical. As an example, one Internet-based machine translation system translates:

Approximately, the distance to London is 200 km

as [Japanese text not preserved; only "200" remains]. Literally this would be:

Approximation, it is 200km to London.

The English is deliberately stilted to reflect how it would sound in Japanese. The sentence pattern corresponds exactly to the English source, but the result is an incorrect Japanese sentence. Translating the other nine English phrasings results in nine different translations of varying degrees of accuracy. By contrast, SakkamMT uses the same IRM for all ten phrasings and produces the more acceptable and correct Japanese form [Japanese text not preserved; only "200" remains], again the same for all ten variations of the input.

Another example illustrates how pragmatics can enable correct disambiguation between the different meanings of a word. The following Japanese text is an extract from an item that was listed on Yahoo! Auctions.

[Japanese text not preserved]

The meaning of the first part of the text (shown in blue) is "Transformers Bumblebee Replica Mask 1:1 scale, genuine goods". Most translation engines will translate the second part (shown in red) as "I have a rash!". Although this is linguistically correct, it is pragmatically wrong. In the context of an auction for a mask, the correct translation, and the one that is made by SakkamMT, is "You can wear it!"
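The mask example can be read as choosing between the senses of an ambiguous word by looking at the context of the listing. The following is only a sketch of that idea under stated assumptions: the sense inventory, the cue words, the `<ambiguous-verb>` placeholder (standing in for the Japanese word, which is not reproduced here) and the `choose_gloss` function are all hypothetical and say nothing about how SakkamMT's pragmatic rules are actually written.

```python
from typing import Optional, Set

# Hypothetical sense inventory: each sense lists context cues that support it.
SENSES = {
    "<ambiguous-verb>": [
        {"gloss": "I have a rash!", "cues": {"skin", "allergy", "medicine"}},
        {"gloss": "You can wear it!", "cues": {"mask", "helmet", "hat", "costume"}},
    ],
}


def choose_gloss(word: str, context: Set[str]) -> Optional[str]:
    """Return the gloss whose cues overlap most with the context.

    Returns None when no sense is supported by the context, so that an
    uncertain translation can be suppressed instead of guessed.
    """
    best_gloss: Optional[str] = None
    best_score = 0
    for sense in SENSES.get(word, []):
        score = len(sense["cues"] & context)
        if score > best_score:
            best_gloss, best_score = sense["gloss"], score
    return best_gloss
```

With a context such as `{"auction", "mask", "replica"}` the second sense is selected; with a context that supports neither sense the function returns `None`, mirroring the suppression of translations that have not been understood.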
Business Implementation

A SakkamMT implementation consists of the following activities:

- Project definition. Agree the scope, scale and timeframes for the project. The project may include a pilot phase, which provides an opportunity to demonstrate the effectiveness of the SakkamMT approach in the client's own environment.
- Enhancement of the Interim Representation Model (IRM). Most deployments require modifications to our standard IRM to reflect the precise nature of the communication for the client's application.
- Compilation of project-specific dictionaries and named entity databases. Existing client dictionaries and translation memories are used where available and, where necessary, additional material is developed through a combination of manual compilation and automated text analysis. All new content is then categorized to ensure consistency with the SakkamMT categorization model.
- Lifecycle planning. Typical SakkamMT deployments involve the continual translation of a feed of information over a period of months or years. Over the life of the project, the source content may change as new terminology, and even new concepts, are introduced. In some cases SakkamMT will be able to adapt automatically to these changes, but there may also be an ongoing requirement to ensure that new terminology is being used correctly and that named entity databases are up to date.
- Technical integration. Sakkam provides a simple, secure Web API for integration with client systems. The SakkamMT infrastructure is hosted on Amazon EC2, which enables us to scale our servers smoothly to many millions of translations per day and, by deploying production servers on different continents, to offer continuity of service even if an entire datacenter becomes unavailable.
- Pilot phase. End-to-end deployment of SakkamMT within the client environment for a limited but representative subset of the overall project scope.
- Live deployment. Full deployment of SakkamMT within the client environment.
Case Study: Financial News Feed

The foreign currency exchange markets dwarf the equity markets: on a typical day something of the order of $1.5 trillion is traded, and Yen/Dollar is one of the major currency pairs. The Yen/Dollar rate is highly sensitive to US economic news, with billions of dollars changing hands within seconds of the release of a key economic indicator. Often, by the time Japanese translations are available, the market has already moved to factor in the news and the opportunity to profit has been lost.

SakkamMT is being used by Intisar Technology to provide business-critical translations of US financial news headlines from a leading financial news vendor. Translations are delivered with a sub-second delay, giving Intisar's customers valuable time to profit. Clearly, accuracy of translation as well as immediacy of output is essential in this environment. Intisar has integrated this feed of translated stories into its real-time market data platform, which supports traders' workstations such as Tradesignal.

Figure 2. SakkamMT translated news stories in Tradesignal

Existing general-purpose machine translation systems are incapable of providing even the gist of these news releases, and human translators without domain expertise are liable to make mistakes. Competing Japanese news vendors that rely on human translation are tens of seconds to several minutes behind, and carry a high cost base (skilled bilingual financial domain experts available 24x7) that must ultimately be passed on to the consumer.

For details of this news feed and other Intisar services, please contact
Case Study: Internet Auctions

Increasing globalization has driven cross-border interest in collectables, and this now acts as a major revenue driver for Internet auctions and marketplaces. As a specific example, there is significant worldwide interest in manga and anime, many of which have limited availability outside of Japan. Figure 3 illustrates the price difference between a given item's English-language listing on eBay and an identical item's Japanese-language listing on Yahoo! Auctions. The difference in price represents the language premium that US collectors need to pay to search for items and bid in English.

Figure 3: US-Japan Price Differentials

Some collectors do attempt to use web-based translators, but these general services are too inaccurate to allow a collector to bid with any confidence. Examples proliferate of inaccurate translations and even of the reversal of meaning, as in the example below.

Original listing: [Japanese text not preserved]
Internet translation: Home for longer storage is a new unused item, in a box outside the gall, and a GE. For a completely new, please bid.

Unfortunately this is wrong. In fact the translation should say "do not bid", as the SakkamMT translation below correctly shows:

Original listing: [Japanese text not preserved]
SakkamMT translation: This is a new, unused product. However, since it has been stored at home for a long time there are a number of marks and scratches on the outer box. Please do not bid if you are looking for a completely new item.

SakkamMT is being used to provide immediate translations of anime-related memorabilia auctioned on Internet sites. For a live demonstration of SakkamMT translating auction content from Japan's Yahoo! Auctions, please on Twitter (
Bibliography: Technical Foundations

Conceptual Modelling

- Margolis & Laurence (eds.), Concepts: Core Readings, MIT Press, 1991.
- Talmy, Leonard, Toward a Cognitive Semantics, Vol. 1, MIT Press, 2000.
- Vosniadou & Ortony (eds.), Similarity and Analogical Reasoning, Cambridge University Press, 1989.

Linguistics

- Croft, William, Radical Construction Grammar: Syntactic Theory in Typological Perspective, Oxford University Press, 2001.
- Levin, Beth, English Verb Classes and Alternations: A Preliminary Investigation, University of Chicago Press, Chicago, IL.
- Sperber, Dan and Wilson, Deirdre, Relevance: Communication and Cognition, Blackwell, Oxford, 1986/1995.
- Matsui, Tomoko, Bridging and Relevance, John Benjamins Publishing Co.
- Wilson, Deirdre & Carston, Robyn, A unitary approach to lexical pragmatics: Relevance, inference and ad hoc concepts. In N. Burton-Roberts (ed.), Pragmatics, Palgrave, London.

Computational Architecture

- Mitchell, Melanie, Analogy-Making as Perception: A Computer Model, MIT Press, Cambridge, MA, 1993.
- Mitchell, Melanie, Analogy-making as a complex adaptive system. In L. Segel and I. Cohen (eds.), Design Principles for the Immune System and Other Distributed Autonomous Systems, Oxford University Press, New York, 2001.
FOR IMMEDIATE RELEASE Hitachi Developed Basic Artificial Intelligence Technology that Enables Logical Dialogue Analyzes huge volumes of text data on issues under debate, and presents reasons and grounds
More informationCompetencies for Secondary Teachers: Computer Science, Grades 4-12
1. Computational Thinking CSTA: Comp. Thinking 1.1 The ability to use the basic steps in algorithmic problemsolving to design solutions (e.g., problem statement and exploration, examination of sample instances,
More informationEmpirical Machine Translation and its Evaluation
Empirical Machine Translation and its Evaluation EAMT Best Thesis Award 2008 Jesús Giménez (Advisor, Lluís Màrquez) Universitat Politècnica de Catalunya May 28, 2010 Empirical Machine Translation Empirical
More informationWhy SBVR? Donald Chapin. Chair, OMG SBVR Revision Task Force Business Semantics Ltd Donald.Chapin@BusinessSemantics.com
Why SBVR? Towards a Business Natural Language (BNL) for Financial Services Panel Demystifying Financial Services Semantics Conference New York,13 March 2012 Donald Chapin Chair, OMG SBVR Revision Task
More informationThe Harvard style. Reference with confidence. (2012 Edition)
Reference with confidence: The Harvard style 1 Reference with confidence The Harvard style (2012 Edition) As used in: Archaeology Biochemistry (as well as Vancouver) Biology (as well as Vancouver) Economics
More informationCurriculum Vitae JEFF LOUCKS
Curriculum Vitae JEFF LOUCKS Department of Psychology University of Regina 3737 Wascana Parkway Regina, SK, S4S 0A2 Email: Jeff.Loucks@uregina.ca Phone: (306) 585-4033 Web Page: uregina.ca/~loucks5j Education
More informationStudy Plan for Master of Arts in Applied Linguistics
Study Plan for Master of Arts in Applied Linguistics Master of Arts in Applied Linguistics is awarded by the Faculty of Graduate Studies at Jordan University of Science and Technology (JUST) upon the fulfillment
More informationTranslation Solution for
Translation Solution for Case Study Contents PROMT Translation Solution for PayPal Case Study 1 Contents 1 Summary 1 Background for Using MT at PayPal 1 PayPal s Initial Requirements for MT Vendor 2 Business
More informationWeek 3. COM1030. Requirements Elicitation techniques. 1. Researching the business background
Aims of the lecture: 1. Introduce the issue of a systems requirements. 2. Discuss problems in establishing requirements of a system. 3. Consider some practical methods of doing this. 4. Relate the material
More informationA framing effect is usually said to occur when equivalent descriptions of a
FRAMING EFFECTS A framing effect is usually said to occur when equivalent descriptions of a decision problem lead to systematically different decisions. Framing has been a major topic of research in the
More informationApplication Architectures
Software Engineering Application Architectures Based on Software Engineering, 7 th Edition by Ian Sommerville Objectives To explain the organization of two fundamental models of business systems - batch
More informationIntroduction to Intercultural Communication 1.1. The Scope of Intercultural Communication
1 An Introduction to Intercultural Communication 1.1 The Scope of Intercultural Communication Sometimes intercultural conversations go very smoothly and are extremely intriguing; think of a walk at sunset
More informationKPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore
CASE STUDY KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore Sponsored by: IDC David Schubmehl July 2014 IDC OPINION Dan Vesset Big data in all its forms and associated technologies,
More informationExtracted Templates. Postgres database: results
Natural Language Processing and Expert System Techniques for Equity Derivatives Trading: the IE-Expert System Marco Costantino Laboratory for Natural Language Engineering Department of Computer Science
More informationApplication of Natural Language Interface to a Machine Translation Problem
Application of Natural Language Interface to a Machine Translation Problem Heidi M. Johnson Yukiko Sekine John S. White Martin Marietta Corporation Gil C. Kim Korean Advanced Institute of Science and Technology
More informationOpen-Source, Cross-Platform Java Tools Working Together on a Dialogue System
Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System Oana NICOLAE Faculty of Mathematics and Computer Science, Department of Computer Science, University of Craiova, Romania oananicolae1981@yahoo.com
More informationDEFINING, TEACHING AND ASSESSING LIFELONG LEARNING SKILLS
DEFINING, TEACHING AND ASSESSING LIFELONG LEARNING SKILLS Nikos J. Mourtos Abstract - Lifelong learning skills have always been important in any education and work setting. However, ABET EC recently put
More informationIntroduction. Philipp Koehn. 28 January 2016
Introduction Philipp Koehn 28 January 2016 Administrativa 1 Class web site: http://www.mt-class.org/jhu/ Tuesdays and Thursdays, 1:30-2:45, Hodson 313 Instructor: Philipp Koehn (with help from Matt Post)
More information3. What is Knowledge Management
3. What is Knowledge Management ETL525 Knowledge Management Tutorial One 5 December 2008 K.T. Lam lblkt@ust.hk Last updated: 4 December 2008 KM History The subject of KM was originally arisen in the field
More informationKeywords academic writing phraseology dissertations online support international students
Phrasebank: a University-wide Online Writing Resource John Morley, Director of Academic Support Programmes, School of Languages, Linguistics and Cultures, The University of Manchester Summary A salient
More informationFEAWEB ASP Issue: 1.0 Stakeholder Needs Issue Date: 03/29/2000. 04/07/2000 1.0 Initial Description Marco Bittencourt
)($:(%$63 6WDNHKROGHU1HHGV,VVXH 5HYLVLRQ+LVWRU\ 'DWH,VVXH 'HVFULSWLRQ $XWKRU 04/07/2000 1.0 Initial Description Marco Bittencourt &RQILGHQWLDO DPM-FEM-UNICAMP, 2000 Page 2 7DEOHRI&RQWHQWV 1. Objectives
More informationSYNTACTIC PATTERNS IN ADVERTISEMENT SLOGANS Vindi Karsita and Aulia Apriana State University of Malang Email: vindikarsita@gmail.
SYNTACTIC PATTERNS IN ADVERTISEMENT SLOGANS Vindi Karsita and Aulia Apriana State University of Malang Email: vindikarsita@gmail.com ABSTRACT: This study aims at investigating the syntactic patterns of
More informationstress, intonation and pauses and pronounce English sounds correctly. (b) To speak accurately to the listener(s) about one s thoughts and feelings,
Section 9 Foreign Languages I. OVERALL OBJECTIVE To develop students basic communication abilities such as listening, speaking, reading and writing, deepening their understanding of language and culture
More informationUsing In-Memory Computing to Simplify Big Data Analytics
SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed
More informationWriting Goals and Objectives If you re not sure where you are going, you re liable to end up some place else. ~ Robert Mager, 1997
Writing Goals and Objectives If you re not sure where you are going, you re liable to end up some place else. ~ Robert Mager, 1997 Instructional goals and objectives are the heart of instruction. When
More informationD2.4: Two trained semantic decoders for the Appointment Scheduling task
D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive
More informationINPUTLOG 6.0 a research tool for logging and analyzing writing process data. Linguistic analysis. Linguistic analysis
INPUTLOG 6.0 a research tool for logging and analyzing writing process data Linguistic analysis From character level analyses to word level analyses Linguistic analysis 2 Linguistic Analyses The concept
More informationICAME Journal No. 24. Reviews
ICAME Journal No. 24 Reviews Collins COBUILD Grammar Patterns 2: Nouns and Adjectives, edited by Gill Francis, Susan Hunston, andelizabeth Manning, withjohn Sinclair as the founding editor-in-chief of
More informationUser choice as an evaluation metric for web translation services in cross language instant messaging applications
User choice as an evaluation metric for web translation services in cross language instant messaging applications William Ogden, Ron Zacharski Sieun An, and Yuki Ishikawa New Mexico State University University
More informationHow To Write The English Language Learner Can Do Booklet
WORLD-CLASS INSTRUCTIONAL DESIGN AND ASSESSMENT The English Language Learner CAN DO Booklet Grades 9-12 Includes: Performance Definitions CAN DO Descriptors For use in conjunction with the WIDA English
More informationText Mining - Scope and Applications
Journal of Computer Science and Applications. ISSN 2231-1270 Volume 5, Number 2 (2013), pp. 51-55 International Research Publication House http://www.irphouse.com Text Mining - Scope and Applications Miss
More informationDiscourse Markers in English Writing
Discourse Markers in English Writing Li FENG Abstract Many devices, such as reference, substitution, ellipsis, and discourse marker, contribute to a discourse s cohesion and coherence. This paper focuses
More informationAuto-Classification for Document Archiving and Records Declaration
Auto-Classification for Document Archiving and Records Declaration Josemina Magdalen, Architect, IBM November 15, 2013 Agenda IBM / ECM/ Content Classification for Document Archiving and Records Management
More informationEvolution of Forex the Active Trader s Market
Evolution of Forex the Active Trader s Market The practice of trading currencies online has increased threefold from 2002 to 2005, and the growth curve is expected to continue. Forex, an abbreviation for
More informationDIFFERENT TECHNIQUES FOR DEVELOPING COMMUNICATION SKILLS
DIFFERENT TECHNIQUES FOR DEVELOPING COMMUNICATION SKILLS MALA JAIN Department of Humanities Truba Institute of Engineering and Information Technology, Bhopal (M.P.), India Abstract Communication is an
More informationEnglish Grammar Checker
International l Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-3 E-ISSN: 2347-2693 English Grammar Checker Pratik Ghosalkar 1*, Sarvesh Malagi 2, Vatsal Nagda 3,
More informationCELTA. Syllabus and Assessment Guidelines. Fourth Edition. Certificate in Teaching English to Speakers of Other Languages
CELTA Certificate in Teaching English to Speakers of Other Languages Syllabus and Assessment Guidelines Fourth Edition CELTA (Certificate in Teaching English to Speakers of Other Languages) is regulated
More information