SakkamMT White Paper


Sakkam K.K., Wakamatsu Building 7F, Nihonbashi Honcho, Chuo-ku, Tokyo

Introduction

Ever since the emergence of machine translation there has been a debate about whether and when machines will replace humans for everyday translation tasks. With Sakkam Machine Translation (SakkamMT) we do not attempt to replace human translators; instead we focus on translation activities that are impractical or impossible with a traditional approach to translation. These activities are characterized by one or more of the following elements.

Extreme time sensitivity. The value of some categories of information decreases rapidly from the moment it is made available. The announcement of US employment data can move currency markets in the seconds following its release but is old news one minute later. To have value, the translation of such releases needs to be subsecond.

Massive volume. The volume of user-generated content is exploding, whether through social networking sites, online auctions or game networks. There are simply not enough human translators available to translate, monitor or extract meaning from tens of millions of messages per day.

Limited availability of domain expertise. Many translation tasks require a high degree of domain expertise in addition to fluency in the source and target languages. It is often difficult to identify translators with the requisite skill set, and this problem is exacerbated if the translation task is not a discrete project but an ongoing, around-the-clock activity.

We have also taken a fundamentally different approach in developing the technology that underpins SakkamMT. From the outset, SakkamMT was designed to provide native-speaker-quality translations within limited domains. As such, both its targeted applications and its usefulness are very different from those of existing attempts at machine translation, which seek to provide general capabilities but at low quality. As anyone who has used existing Internet and PC-based translation systems will know, their output is unusable within a business context.
This paper provides an overview of the SakkamMT architecture and of how SakkamMT is being used where human translation is either impractical or impossible. Readers wishing to learn more about the technical foundations of SakkamMT are invited to review the bibliography.

Architecture

The Interim Representation Model

SakkamMT works by parsing the source language text using domain-specific rules to create a language-independent Interim Representation Model (IRM). The IRM can then be used to drive output to another language. This approach affords a number of benefits:

1. As more output languages are added, the system scales linearly with the number of languages (O(N)) rather than with the number of language pairs (O(N^2)).
2. The IRM may be used not just to output other human languages, but also to provide a computer-readable API for entity extraction. For example, dates could be extracted from an email to populate a calendar.
3. The same content may be translated in different styles. For example, a news headline could be output in a highly abbreviated news style, in a more grammatically correct prose style, or in both, depending upon the requirement.
4. The IRM facilitates a common-sense check of understanding and prevents some of the more egregious errors MT systems often make. Once the IRM has been populated, it is possible to judge whether the source text has been understood or not. At this stage a potentially wrong translation can be suppressed. This is in marked contrast to generalized translation systems, which will always output something regardless of quality.

Cognitive Categories

The design of the IRM draws heavily on research in Cognitive Psychology into semantic categorizations, which differ significantly from mathematically formal categories. For example, cognitive categories do not in general support transitive closure, which is a characteristic of mathematically formal categories. Hence, the categorization system used in the IRM understands that although a car seat is a chair and a chair is an item of furniture, a car seat is not an item of furniture. Cognitive categories also display a degree of typicality: a goldfish is not as good an example of a pet as a dog.

Sometimes, category membership can be so ill-defined as to need legal action to clarify. The question of whether a tomato is a vegetable or a fruit ended up being decided by the US Supreme Court in response to a new tax on vegetables but not fruit. Perhaps not surprisingly, the court chose to classify it as a vegetable, bringing it within the scope of the tax. Botanists would class it as a fruit.

Crucially, cognitive categories are also ad hoc and can be determined dynamically, set by context. For example, a tomato can be a missile. Traditional hierarchically structured schemas and taxonomies are unsuited to this, but Sakkam's IRM has been designed from the outset with this flexibility in mind.

Populating the IRM

Populating the IRM requires parsing the source language to extract meaning. This is done using a series of linguistic rules which operate atomically upon the text until the structure of the IRM has been built. These rules differ from those used by many MT systems in that they are based primarily upon the linguistic field of pragmatics, rather than upon more typical syntax-based grammatical rules. SakkamMT uses little syntactical grammar in its parsing for the simple reason that in many cases (such as headlines, emails etc.) the grammar can quite often be wrong. Instead, SakkamMT takes what is closer to a construction grammar approach, using pragmatic criteria to determine the logical units. Note that here, as in the rest of this paper, we are taking pragmatics to refer to the sub-division of linguistics, not the everyday English meaning of the word.

Figure 1 shows how alternate expressions using different phrasings result in the same pragmatics-based model, despite having very different syntactic structures. While in some cases syntax is crucial (for example, in "A killed B" only syntax can tell you which is the subject and which is the object), in many cases it has little to add. In this particular example, the word "approximate" can appear as a noun, verb, adjective or adverb, with little effect on the meaning.

Source                                              Syntactic Structure
Approximately, the distance to London is 200 km     ADV NP PP VP(V NP)
The approximate distance to London is 200 km        NP(ADJ N) PP VP(V NP)
The distance approximately to London is 200 km      NP ADV PP VP(V NP)
The distance to London approximation is 200 km      NP(NP PP N) VP(V NP)
The distance to London approximately is 200 km      NP PP VP(ADV V NP)
The distance to London approximates to 200 km       NP PP VP(V NP)
The distance to London is approximately 200 km      NP PP VP(V ADV NP)
The distance to London is 200 km approximately      NP PP VP(V NP ADV)
It is approximately 200 km distance to London       NP VP(ADV NP(NP NP) PP)

Interim Representation Model (IRM), the same for every source sentence:

SENTENCE[
  ITEM[
    TYPE[London(LOCATION)]
    ATTRIBUTE[
      MEASURE[
        TYPE[distance(LENGTH)]
        RELATION[to]
        ATTRIBUTE[
          VALUE[
            NUMBER[200]
            UNITS[Km]
            SIGN[DEFAULT-POSITIVE]
            PRECISION[approximate]
          ]
        ]
      ]
    ]
  ]
]

Figure 1. Syntactic Structure and Pragmatic Structure

Once the IRM has been populated, a set of output rules can be brought to bear to express the IRM in the target language. Variations of style can be introduced at this stage.
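The mapping in Figure 1, from differently phrased sentences to a single model and then out to more than one style, can be sketched in Python as follows. This is a minimal illustration and not SakkamMT's implementation: the `Measure` record, the three regular expressions and the `populate` and `render` functions are all assumptions made for this example, and the flat record simplifies the nested IRM. The point it shows is that the rules key on meaning-bearing cues (a quantity with units, a destination, any word built on "approximat-") rather than on part of speech or word order, and that text which is not understood yields no model at all, so nothing is output.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measure:
    """Hypothetical, flattened stand-in for the IRM shown in Figure 1."""
    item: str       # e.g. "London" (LOCATION)
    kind: str       # e.g. "distance" (LENGTH)
    relation: str   # e.g. "to"
    number: float
    units: str
    precision: str  # "approximate" or "exact"


# Cue patterns (assumptions for this sketch). None of them depends on
# where the cue sits in the sentence or on its part of speech.
APPROX = re.compile(r"\bapproximat\w*", re.IGNORECASE)
VALUE = re.compile(r"(\d+(?:\.\d+)?)\s*(km)\b", re.IGNORECASE)
PLACE = re.compile(r"\bto\s+([A-Z][a-z]+)")


def populate(text: str) -> Optional[Measure]:
    """Build the model, or return None when the text is not understood."""
    value = VALUE.search(text)
    place = PLACE.search(text)
    if "distance" not in text.lower() or value is None or place is None:
        return None  # suppress rather than guess
    return Measure(
        item=place.group(1),
        kind="distance",
        relation="to",
        number=float(value.group(1)),
        units=value.group(2).lower(),
        precision="approximate" if APPROX.search(text) else "exact",
    )


def render(m: Measure, style: str = "prose") -> str:
    """Express the same model in a terse headline style or in full prose."""
    approx = m.precision == "approximate"
    if style == "headline":
        return f"{m.item}: {'~' if approx else ''}{m.number:g} {m.units}"
    qualifier = "approximately " if approx else ""
    return f"The {m.kind} {m.relation} {m.item} is {qualifier}{m.number:g} {m.units}."


PHRASINGS = [
    "Approximately, the distance to London is 200 km",
    "The approximate distance to London is 200 km",
    "The distance approximately to London is 200 km",
    "The distance to London approximation is 200 km",
    "The distance to London approximately is 200 km",
    "The distance to London approximates to 200 km",
    "The distance to London is approximately 200 km",
    "The distance to London is 200 km approximately",
    "It is approximately 200 km distance to London",
]

models = {populate(s) for s in PHRASINGS}  # one distinct model for all phrasings
```

Every sentence in `PHRASINGS` produces an equal `Measure`, so a single call to `render` serves all of them. A real rule set would of course cover far more than one sentence pattern; the sketch only fixes the idea of parsing once into a shared model and generating per style.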
It should be noted that for some very terse styles, such as are common in news headlines, we may generate grammatically incorrect text that is nevertheless more appropriate for the audience. Looking at the example in Figure 1 again, we also note that for any particular representation, both the variety and the acceptability of grammatical phrasings will be highly dependent upon the target language.

Existing machine translation systems will often instead try to translate the gross syntactic structure of the sentence, and then fill in the slots with translations for the individual noun and verb phrases. This, however, can lead to combinations that at best seem unnatural and at worst appear nonsensical. As an example, one Internet-based machine translation system translates:

Approximately, the distance to London is 200 km

as

[Japanese text]

Literally this would be: "Approximation, it is 200km to London." The English is deliberately stilted to reflect how it would sound in Japanese. The sentence pattern corresponds exactly to the English source, but the result is an incorrect Japanese sentence. Translating the other nine English phrasings results in nine different translations of varying degrees of accuracy. By contrast, SakkamMT uses the same IRM for all ten phrasings and outputs the more acceptable and correct Japanese form, [Japanese text], again the same for all ten variations of the input.

Another example illustrates how pragmatics can enable correct disambiguation between the different meanings of a word. The following Japanese text is an extract from an item that was listed on Yahoo! Auctions.

[Japanese text] / [Japanese text]

The meaning of the first part of the text is "Transformers Bumblebee Replica Mask 1:1 scale, genuine goods". Most translation engines will translate the second part as "I have a rash!". Although this is linguistically correct, it is pragmatically wrong. In the context of an auction for a mask, the correct translation, and the one that is made by SakkamMT, is "You can wear it!"
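The auction example, together with the earlier point that a doubtful translation can be suppressed, can be illustrated with a small Python sketch of context-driven sense selection. Everything here is an assumption made for illustration: the `SENSES` inventory, its context categories and the `choose_sense` function are hypothetical and do not describe SakkamMT's actual rules. Each candidate reading lists the contexts in which it is plausible; the reading that best matches the context of the listing wins, and when no reading matches, or two readings tie, nothing is output.

```python
from typing import Dict, Optional, Set

# Hypothetical sense inventory for one ambiguous source expression.
# Each candidate reading maps to the context categories that support it.
SENSES: Dict[str, Set[str]] = {
    "I have a rash!": {"medical", "skincare"},
    "You can wear it!": {"mask", "costume", "clothing"},
}


def choose_sense(senses: Dict[str, Set[str]], context: Set[str]) -> Optional[str]:
    """Pick the reading best supported by the context, or None to suppress."""
    if not senses:
        return None
    # Score each reading by how many of its supporting categories are present.
    scored = sorted(
        ((len(cues & context), reading) for reading, cues in senses.items()),
        reverse=True,
    )
    best_score, best_reading = scored[0]
    if best_score == 0:
        return None  # nothing in the context supports any reading
    if len(scored) > 1 and scored[1][0] == best_score:
        return None  # two readings are equally supported: still ambiguous
    return best_reading


# Context categories derived from the first part of the listing (assumed).
auction_context = {"auction", "mask", "replica"}
translation = choose_sense(SENSES, auction_context)
```

With the mask auction as context, the sketch selects "You can wear it!"; with an empty or unrelated context it returns `None`, which corresponds to withholding output rather than emitting the linguistically possible but pragmatically wrong reading.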

Business Implementation

A SakkamMT implementation consists of the following activities:

Project definition. Agree the scope, scale and timeframes for the project. The project may include a pilot phase, which provides an opportunity to demonstrate the effectiveness of the SakkamMT approach in the client's own environment.

Enhancement of the Interim Representation Model (IRM). Most deployments require modifications to our standard IRM to reflect the precise nature of the communication for the client's application.

Compilation of project-specific dictionaries and named entity databases. Existing client dictionaries and translation memories are used as available and, where necessary, additional material is developed through a combination of manual compilation and automated text analysis. All new content is then categorized to ensure consistency with the SakkamMT categorization model.

Lifecycle planning. Typical SakkamMT deployments involve the continual translation of a feed of information over a period of months or years. Over the life of the project, the source content may change as new terminology, and even new concepts, are introduced. In some cases SakkamMT will be able to adapt automatically to these changes, but there may also be an ongoing requirement to ensure that new terminology is being used correctly and that named entity databases are up to date.

Technical integration. Sakkam provides a simple, secure Web API for integration with client systems. The SakkamMT infrastructure is hosted at Amazon EC2, which enables us to scale our servers smoothly to many millions of translations per day and, by deploying production servers on different continents, to offer continuity of service even if an entire datacenter becomes unavailable.

Pilot phase. End-to-end deployment of SakkamMT within the client environment for a limited but representative subset of the overall project scope.

Live deployment. Full deployment of SakkamMT within the client environment.

Case Study: Financial News Feed

The foreign currency exchange markets dwarf those of equities, with something of the order of $1.5 trillion traded on a typical day and the Yen/Dollar rate among the major currency pairs. The Yen/Dollar rate is highly sensitive to US economic news, with billions of dollars changing hands within seconds of the release of a key economic indicator. Often, by the time Japanese translations are available, the market has already moved to factor in the news and the opportunity to profit has been lost.

SakkamMT is being used by Intisar Technology to provide business-critical translations of US financial news headlines from a leading financial news vendor. Translations are made with no more than a sub-second delay, giving Intisar's customers valuable time to profit. Clearly, accuracy of translation as well as immediacy of output is essential in this environment. Intisar has integrated this feed of translated stories into its realtime market data platform, which supports traders' workstations such as Tradesignal.

Figure 2. SakkamMT translated news stories in Tradesignal

Existing general-purpose machine translation systems are incapable of providing even the gist of these news releases, and human translators, if they have no domain expertise, are liable to make mistakes. Competing Japanese news vendors that rely on human translation are tens of seconds to several minutes behind, and they carry a high cost base (skilled bilingual financial domain experts available 24x7) that must ultimately be passed on to the consumer.

For details of this news feed and other Intisar services, please contact

Case Study: Internet Auctions

Increasing globalization has driven cross-border interest in collectables, and this now acts as a major revenue driver for Internet auctions and marketplaces. As a specific example, there is significant worldwide interest in manga and anime, many of which have limited availability outside Japan. Figure 3 illustrates the price difference between a given item's English-language listing on eBay and an identical item's Japanese-language listing on Yahoo! Auctions. The difference in price represents the language premium that US collectors pay in order to search for items and bid in English.

Figure 3: US-Japan Price Differentials

Some collectors do attempt to use web-based translators, but these general services are too inaccurate to allow a collector to bid with any confidence. Examples proliferate of inaccurate translations and even of reversals of meaning, as in the example below.

Original listing: [Japanese text]
Internet translation: Home for longer storage is a new unused item, in a box outside the gall, and a GE. For a completely new, please bid.

Unfortunately this is wrong. In fact the translation should be "do not bid", as is correctly shown in the SakkamMT translation below:

Original listing: [Japanese text]
SakkamMT translation: This is a new, unused product. However, since it has been stored at home for a long time there are a number of marks and scratches on the outer box. Please do not bid if you are looking for a completely new item.

SakkamMT is being used to provide immediate translations of listings for anime-related memorabilia auctioned on Internet sites. For a live demonstration of SakkamMT translating auction content from Japan's Yahoo! Auctions, please on Twitter (

Bibliography: Technical Foundations

Conceptual Modelling

Margolis & Laurence (eds.), Concepts: Core Readings, MIT Press, 1991
Talmy, Leonard, Toward a Cognitive Semantics, Vol. 1, MIT Press, 2000
Vosniadou & Ortony (eds.), Similarity and Analogical Reasoning, Cambridge University Press, 1989

Linguistics

Croft, William, Radical Construction Grammar: Syntactic Theory in Typological Perspective, Oxford University Press, 2001
Levin, Beth, English Verb Classes and Alternations: A Preliminary Investigation, University of Chicago Press, Chicago, IL
Sperber, Dan and Wilson, Deirdre, Relevance: Communication and Cognition, Blackwell, Oxford, 1986/1995
Matsui, Tomoko, Bridging and Relevance, John Benjamins Publishing Co
Wilson, Deirdre & Carston, Robyn, A unitary approach to lexical pragmatics: Relevance, inference and ad hoc concepts. In N. Burton-Roberts (ed.), Pragmatics, Palgrave, London

Computational Architecture

Mitchell, Melanie, Analogy-Making as Perception: A Computer Model, MIT Press, Cambridge, MA, 1993
Mitchell, Melanie, Analogy-making as a complex adaptive system. In L. Segel and I. Cohen (eds.), Design Principles for the Immune System and Other Distributed Autonomous Systems, Oxford University Press, New York, 2001


Natural Language Database Interface for the Community Based Monitoring System * Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University

More information

Online Multilingual Translation of Technical Service Reports over the World Wide Web

Online Multilingual Translation of Technical Service Reports over the World Wide Web Online Multilingual Translation of Technical Service Reports over the World Wide Web S. Liu, S.C. Hui, S. Foo and P.C. Leong School of Applied Science, Nanyang Technological University Nanyang Avenue,

More information

DATA QUALITY AND SCALE IN CONTEXT OF EUROPEAN SPATIAL DATA HARMONISATION

DATA QUALITY AND SCALE IN CONTEXT OF EUROPEAN SPATIAL DATA HARMONISATION DATA QUALITY AND SCALE IN CONTEXT OF EUROPEAN SPATIAL DATA HARMONISATION Katalin Tóth, Vanda Nunes de Lima European Commission Joint Research Centre, Ispra, Italy ABSTRACT The proposal for the INSPIRE

More information

Writing learning objectives

Writing learning objectives Writing learning objectives This material was excerpted and adapted from the following web site: http://www.utexas.edu/academic/diia/assessment/iar/students/plan/objectives/ What is a learning objective?

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

Flattening Enterprise Knowledge

Flattening Enterprise Knowledge Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it

More information

MAP for Language & International Communication Spanish Language Learning Outcomes by Level

MAP for Language & International Communication Spanish Language Learning Outcomes by Level Novice Abroad I This course is designed for students with little or no prior knowledge of the language. By the end of the course, the successful student will develop a basic foundation in the five skills:

More information

Protecting Data with a Unified Platform

Protecting Data with a Unified Platform Protecting Data with a Unified Platform The Essentials Series sponsored by Introduction to Realtime Publishers by Don Jones, Series Editor For several years now, Realtime has produced dozens and dozens

More information

Introduction: Reading and writing; talking and thinking

Introduction: Reading and writing; talking and thinking Introduction: Reading and writing; talking and thinking We begin, not with reading, writing or reasoning, but with talk, which is a more complicated business than most people realize. Of course, being

More information

TEN RULES OF GRAMMAR AND USAGE THAT YOU SHOULD KNOW

TEN RULES OF GRAMMAR AND USAGE THAT YOU SHOULD KNOW TEN RULES OF GRAMMAR AND USAGE THAT YOU SHOULD KNOW 2003 The Writing Center at GULC. All rights reserved. The following are ten of the most common grammar and usage errors that law students make in their

More information

Some Implications of Controlling Contextual Constraint: Exploring Word Meaning Inference by Using a Cloze Task

Some Implications of Controlling Contextual Constraint: Exploring Word Meaning Inference by Using a Cloze Task Some Implications of Controlling Contextual Constraint: Exploring Word Meaning Inference by Using a Cloze Task Abstract 20 vs Keywords: Lexical Inference, Contextual Constraint, Cloze Task 1. Introduction

More information

The Principle of Translation Management Systems

The Principle of Translation Management Systems The Principle of Translation Management Systems Computer-aided translations with the help of translation memory technology deliver numerous advantages. Nevertheless, many enterprises have not yet or only

More information

FOR IMMEDIATE RELEASE

FOR IMMEDIATE RELEASE FOR IMMEDIATE RELEASE Hitachi Developed Basic Artificial Intelligence Technology that Enables Logical Dialogue Analyzes huge volumes of text data on issues under debate, and presents reasons and grounds

More information

Competencies for Secondary Teachers: Computer Science, Grades 4-12

Competencies for Secondary Teachers: Computer Science, Grades 4-12 1. Computational Thinking CSTA: Comp. Thinking 1.1 The ability to use the basic steps in algorithmic problemsolving to design solutions (e.g., problem statement and exploration, examination of sample instances,

More information

Empirical Machine Translation and its Evaluation

Empirical Machine Translation and its Evaluation Empirical Machine Translation and its Evaluation EAMT Best Thesis Award 2008 Jesús Giménez (Advisor, Lluís Màrquez) Universitat Politècnica de Catalunya May 28, 2010 Empirical Machine Translation Empirical

More information

Why SBVR? Donald Chapin. Chair, OMG SBVR Revision Task Force Business Semantics Ltd Donald.Chapin@BusinessSemantics.com

Why SBVR? Donald Chapin. Chair, OMG SBVR Revision Task Force Business Semantics Ltd Donald.Chapin@BusinessSemantics.com Why SBVR? Towards a Business Natural Language (BNL) for Financial Services Panel Demystifying Financial Services Semantics Conference New York,13 March 2012 Donald Chapin Chair, OMG SBVR Revision Task

More information

The Harvard style. Reference with confidence. (2012 Edition)

The Harvard style. Reference with confidence. (2012 Edition) Reference with confidence: The Harvard style 1 Reference with confidence The Harvard style (2012 Edition) As used in: Archaeology Biochemistry (as well as Vancouver) Biology (as well as Vancouver) Economics

More information

Curriculum Vitae JEFF LOUCKS

Curriculum Vitae JEFF LOUCKS Curriculum Vitae JEFF LOUCKS Department of Psychology University of Regina 3737 Wascana Parkway Regina, SK, S4S 0A2 Email: Jeff.Loucks@uregina.ca Phone: (306) 585-4033 Web Page: uregina.ca/~loucks5j Education

More information

Study Plan for Master of Arts in Applied Linguistics

Study Plan for Master of Arts in Applied Linguistics Study Plan for Master of Arts in Applied Linguistics Master of Arts in Applied Linguistics is awarded by the Faculty of Graduate Studies at Jordan University of Science and Technology (JUST) upon the fulfillment

More information

Translation Solution for

Translation Solution for Translation Solution for Case Study Contents PROMT Translation Solution for PayPal Case Study 1 Contents 1 Summary 1 Background for Using MT at PayPal 1 PayPal s Initial Requirements for MT Vendor 2 Business

More information

Week 3. COM1030. Requirements Elicitation techniques. 1. Researching the business background

Week 3. COM1030. Requirements Elicitation techniques. 1. Researching the business background Aims of the lecture: 1. Introduce the issue of a systems requirements. 2. Discuss problems in establishing requirements of a system. 3. Consider some practical methods of doing this. 4. Relate the material

More information

A framing effect is usually said to occur when equivalent descriptions of a

A framing effect is usually said to occur when equivalent descriptions of a FRAMING EFFECTS A framing effect is usually said to occur when equivalent descriptions of a decision problem lead to systematically different decisions. Framing has been a major topic of research in the

More information

Application Architectures

Application Architectures Software Engineering Application Architectures Based on Software Engineering, 7 th Edition by Ian Sommerville Objectives To explain the organization of two fundamental models of business systems - batch

More information

Introduction to Intercultural Communication 1.1. The Scope of Intercultural Communication

Introduction to Intercultural Communication 1.1. The Scope of Intercultural Communication 1 An Introduction to Intercultural Communication 1.1 The Scope of Intercultural Communication Sometimes intercultural conversations go very smoothly and are extremely intriguing; think of a walk at sunset

More information

KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore

KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore CASE STUDY KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore Sponsored by: IDC David Schubmehl July 2014 IDC OPINION Dan Vesset Big data in all its forms and associated technologies,

More information

Extracted Templates. Postgres database: results

Extracted Templates. Postgres database: results Natural Language Processing and Expert System Techniques for Equity Derivatives Trading: the IE-Expert System Marco Costantino Laboratory for Natural Language Engineering Department of Computer Science

More information

Application of Natural Language Interface to a Machine Translation Problem

Application of Natural Language Interface to a Machine Translation Problem Application of Natural Language Interface to a Machine Translation Problem Heidi M. Johnson Yukiko Sekine John S. White Martin Marietta Corporation Gil C. Kim Korean Advanced Institute of Science and Technology

More information

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System Oana NICOLAE Faculty of Mathematics and Computer Science, Department of Computer Science, University of Craiova, Romania oananicolae1981@yahoo.com

More information

DEFINING, TEACHING AND ASSESSING LIFELONG LEARNING SKILLS

DEFINING, TEACHING AND ASSESSING LIFELONG LEARNING SKILLS DEFINING, TEACHING AND ASSESSING LIFELONG LEARNING SKILLS Nikos J. Mourtos Abstract - Lifelong learning skills have always been important in any education and work setting. However, ABET EC recently put

More information

Introduction. Philipp Koehn. 28 January 2016

Introduction. Philipp Koehn. 28 January 2016 Introduction Philipp Koehn 28 January 2016 Administrativa 1 Class web site: http://www.mt-class.org/jhu/ Tuesdays and Thursdays, 1:30-2:45, Hodson 313 Instructor: Philipp Koehn (with help from Matt Post)

More information

3. What is Knowledge Management

3. What is Knowledge Management 3. What is Knowledge Management ETL525 Knowledge Management Tutorial One 5 December 2008 K.T. Lam lblkt@ust.hk Last updated: 4 December 2008 KM History The subject of KM was originally arisen in the field

More information

Keywords academic writing phraseology dissertations online support international students

Keywords academic writing phraseology dissertations online support international students Phrasebank: a University-wide Online Writing Resource John Morley, Director of Academic Support Programmes, School of Languages, Linguistics and Cultures, The University of Manchester Summary A salient

More information

FEAWEB ASP Issue: 1.0 Stakeholder Needs Issue Date: 03/29/2000. 04/07/2000 1.0 Initial Description Marco Bittencourt

FEAWEB ASP Issue: 1.0 Stakeholder Needs Issue Date: 03/29/2000. 04/07/2000 1.0 Initial Description Marco Bittencourt )($:(%$63 6WDNHKROGHU1HHGV,VVXH 5HYLVLRQ+LVWRU\ 'DWH,VVXH 'HVFULSWLRQ $XWKRU 04/07/2000 1.0 Initial Description Marco Bittencourt &RQILGHQWLDO DPM-FEM-UNICAMP, 2000 Page 2 7DEOHRI&RQWHQWV 1. Objectives

More information

SYNTACTIC PATTERNS IN ADVERTISEMENT SLOGANS Vindi Karsita and Aulia Apriana State University of Malang Email: vindikarsita@gmail.

SYNTACTIC PATTERNS IN ADVERTISEMENT SLOGANS Vindi Karsita and Aulia Apriana State University of Malang Email: vindikarsita@gmail. SYNTACTIC PATTERNS IN ADVERTISEMENT SLOGANS Vindi Karsita and Aulia Apriana State University of Malang Email: vindikarsita@gmail.com ABSTRACT: This study aims at investigating the syntactic patterns of

More information

stress, intonation and pauses and pronounce English sounds correctly. (b) To speak accurately to the listener(s) about one s thoughts and feelings,

stress, intonation and pauses and pronounce English sounds correctly. (b) To speak accurately to the listener(s) about one s thoughts and feelings, Section 9 Foreign Languages I. OVERALL OBJECTIVE To develop students basic communication abilities such as listening, speaking, reading and writing, deepening their understanding of language and culture

More information

Using In-Memory Computing to Simplify Big Data Analytics

Using In-Memory Computing to Simplify Big Data Analytics SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed

More information

Writing Goals and Objectives If you re not sure where you are going, you re liable to end up some place else. ~ Robert Mager, 1997

Writing Goals and Objectives If you re not sure where you are going, you re liable to end up some place else. ~ Robert Mager, 1997 Writing Goals and Objectives If you re not sure where you are going, you re liable to end up some place else. ~ Robert Mager, 1997 Instructional goals and objectives are the heart of instruction. When

More information

D2.4: Two trained semantic decoders for the Appointment Scheduling task

D2.4: Two trained semantic decoders for the Appointment Scheduling task D2.4: Two trained semantic decoders for the Appointment Scheduling task James Henderson, François Mairesse, Lonneke van der Plas, Paola Merlo Distribution: Public CLASSiC Computational Learning in Adaptive

More information

INPUTLOG 6.0 a research tool for logging and analyzing writing process data. Linguistic analysis. Linguistic analysis

INPUTLOG 6.0 a research tool for logging and analyzing writing process data. Linguistic analysis. Linguistic analysis INPUTLOG 6.0 a research tool for logging and analyzing writing process data Linguistic analysis From character level analyses to word level analyses Linguistic analysis 2 Linguistic Analyses The concept

More information

ICAME Journal No. 24. Reviews

ICAME Journal No. 24. Reviews ICAME Journal No. 24 Reviews Collins COBUILD Grammar Patterns 2: Nouns and Adjectives, edited by Gill Francis, Susan Hunston, andelizabeth Manning, withjohn Sinclair as the founding editor-in-chief of

More information

User choice as an evaluation metric for web translation services in cross language instant messaging applications

User choice as an evaluation metric for web translation services in cross language instant messaging applications User choice as an evaluation metric for web translation services in cross language instant messaging applications William Ogden, Ron Zacharski Sieun An, and Yuki Ishikawa New Mexico State University University

More information

How To Write The English Language Learner Can Do Booklet

How To Write The English Language Learner Can Do Booklet WORLD-CLASS INSTRUCTIONAL DESIGN AND ASSESSMENT The English Language Learner CAN DO Booklet Grades 9-12 Includes: Performance Definitions CAN DO Descriptors For use in conjunction with the WIDA English

More information

Text Mining - Scope and Applications

Text Mining - Scope and Applications Journal of Computer Science and Applications. ISSN 2231-1270 Volume 5, Number 2 (2013), pp. 51-55 International Research Publication House http://www.irphouse.com Text Mining - Scope and Applications Miss

More information

Discourse Markers in English Writing

Discourse Markers in English Writing Discourse Markers in English Writing Li FENG Abstract Many devices, such as reference, substitution, ellipsis, and discourse marker, contribute to a discourse s cohesion and coherence. This paper focuses

More information

Auto-Classification for Document Archiving and Records Declaration

Auto-Classification for Document Archiving and Records Declaration Auto-Classification for Document Archiving and Records Declaration Josemina Magdalen, Architect, IBM November 15, 2013 Agenda IBM / ECM/ Content Classification for Document Archiving and Records Management

More information

Evolution of Forex the Active Trader s Market

Evolution of Forex the Active Trader s Market Evolution of Forex the Active Trader s Market The practice of trading currencies online has increased threefold from 2002 to 2005, and the growth curve is expected to continue. Forex, an abbreviation for

More information

DIFFERENT TECHNIQUES FOR DEVELOPING COMMUNICATION SKILLS

DIFFERENT TECHNIQUES FOR DEVELOPING COMMUNICATION SKILLS DIFFERENT TECHNIQUES FOR DEVELOPING COMMUNICATION SKILLS MALA JAIN Department of Humanities Truba Institute of Engineering and Information Technology, Bhopal (M.P.), India Abstract Communication is an

More information

English Grammar Checker

English Grammar Checker International l Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-3 E-ISSN: 2347-2693 English Grammar Checker Pratik Ghosalkar 1*, Sarvesh Malagi 2, Vatsal Nagda 3,

More information

CELTA. Syllabus and Assessment Guidelines. Fourth Edition. Certificate in Teaching English to Speakers of Other Languages

CELTA. Syllabus and Assessment Guidelines. Fourth Edition. Certificate in Teaching English to Speakers of Other Languages CELTA Certificate in Teaching English to Speakers of Other Languages Syllabus and Assessment Guidelines Fourth Edition CELTA (Certificate in Teaching English to Speakers of Other Languages) is regulated

More information