SakkamMT White Paper


Sakkam K.K., Wakamatsu Building 7F, Nihonbashi Honcho, Chuo-ku, Tokyo

Introduction

Ever since the emergence of machine translation there has been a debate about if and when machines will replace humans for everyday translation tasks. With Sakkam Machine Translation (SakkamMT) we do not attempt to replace human translators; instead we focus on translation activities that are impractical or impossible with a traditional approach to translation. These activities are characterized by one or more of the following elements.

Extreme time sensitivity. The value of some categories of information decreases rapidly from the moment it is made available. The announcement of US employment data has the potential to move currency markets in the seconds following its release but is old news one minute later. To have value, the translation of such releases must be delivered in under a second.

Massive volume. The volume of user-generated content is exploding, whether through social networking sites, online auctions or game networks. There are simply not enough human translators available to translate, monitor or extract meaning from tens of millions of messages per day.

Limited availability of domain expertise. Many translation tasks require a high degree of domain expertise in addition to fluency in the source and target languages. It is often difficult to identify translators with the requisite skill set, and this problem is exacerbated if the translation task is not a discrete project but an ongoing, around-the-clock activity.

We have also taken a fundamentally different approach in developing the technology that underpins SakkamMT. From the outset, SakkamMT was designed to provide native-speaker-quality translations within limited domains. As such, both its targeted applications and its usefulness are very different from those of existing attempts at machine translation, which seek to provide general capabilities but at low quality. As anyone who has ever used existing Internet and PC-based translation systems will know, their output is unusable within a business context.

This paper provides an overview of the SakkamMT architecture and of how SakkamMT is being used where human translation is either impractical or impossible. Readers wishing to learn more about the technical foundations of SakkamMT are invited to review the bibliography.

Architecture

The Interim Representation Model

SakkamMT works by parsing the source language text using domain-specific rules to create a language-independent Interim Representation Model (IRM). The IRM can then be used to drive output to another language. This approach affords a number of benefits:

1. As more output languages are added, the system scales linearly (O(N)) rather than with the number of language pairs (O(N^2)).
2. The IRM may be used not just to output other human languages, but also to provide a computer-readable API for entity extraction. For example, dates could be extracted from an email to populate a calendar.
3. The same content may be translated in different styles. For example, a news headline could be output in a highly abbreviated news style, in a more grammatically correct prose style, or both, depending upon the requirement.
4. The IRM facilitates a common-sense check of understanding and prevents some of the more egregious errors MT systems often make. Once the IRM has been populated, it is possible to judge whether the source text has been understood or not. At this stage a potentially wrong translation can be suppressed. This is in marked contrast to generalized translation systems, which will always output something regardless of quality.

Cognitive Categories

The design of the IRM draws heavily on research in Cognitive Psychology into semantic categorizations, which differ significantly from mathematically formal categories. For example, cognitive categories do not in general support transitive closure, which is a characteristic of mathematically formal categories. Hence, the categorization system used in the IRM understands that although a car seat is a chair and a chair is an item of furniture, a car seat is not an item of furniture. Cognitive categories also display a degree of typicality: a goldfish is not as good an example of a pet as a dog. Sometimes, category membership can be so ill-defined as to need legal action to clarify.
The question of whether a tomato is a vegetable or a fruit ended up being decided by the US Supreme Court in response to a new tax on vegetables but not fruit. Perhaps not surprisingly, the Court chose to classify it as a vegetable, bringing it within scope of the tax. Botanists would class it as a fruit. Cognitive categories are also, crucially, ad hoc: they can be determined dynamically and set by context. For example, a tomato can be a missile. Traditional hierarchically structured schemas and taxonomies are unsuited to this, but Sakkam's IRM has been designed from the outset with this flexibility in mind.

Populating the IRM

Populating the IRM requires parsing the source language to extract meaning. This is done using a series of linguistic rules which operate atomically upon the text until the structure of the IRM has been built. These rules differ from those used by many MT systems in that they are primarily based upon the linguistic field of pragmatics, rather than on more typical syntax-based grammatical rules. SakkamMT uses little syntactic grammar in its parsing for the simple reason that in many cases (such as headlines, emails etc.) the grammar of the source text is quite often wrong. Instead, SakkamMT takes what is closer to a construction grammar approach, using pragmatic criteria to determine the logical units. Note that here, as in the rest of this paper, we take pragmatics to refer to the sub-division of linguistics, not the everyday English meaning of the word.

Figure 1 shows how alternate expressions using different phrasings result in the same pragmatics-based model, despite having very different syntactic structures. In some cases syntax is crucial (for example, in "A killed B" only syntax can tell you which is the subject and which is the object), but in many cases it has little to add. In this particular example, the word "approximate" can surface as a noun, verb, adjective or adverb with little effect on the meaning.

Source                                               Syntactic structure
Approximately, the distance to London is 200 km      ADV NP PP VP(V NP)
The approximate distance to London is 200 km         NP(ADJ N) PP VP(V NP)
The distance approximately to London is 200 km       NP ADV PP VP(V NP)
The distance to London approximation is 200 km       NP(NP PP N) VP(V NP)
The distance to London approximately is 200 km       NP PP VP(ADV V NP)
The distance to London approximates to 200 km        NP PP VP(V NP)
The distance to London is approximately 200 km       NP PP VP(V ADV NP)
The distance to London is 200 km approximately       NP PP VP(V NP ADV)
It is approximately 200 km distance to London        NP VP(ADV NP(NP NP) PP)

Interim Representation Model (IRM), shared by all of the phrasings above:

SENTENCE [
  ITEM [
    TYPE [London(LOCATION)
    ATTRIBUTE [
      MEASURE [
        TYPE [distance(LENGTH)
        RELATION [to
        ATTRIBUTE [
          VALUE [
            NUMBER [200
            UNITS [Km
            SIGN [DEFAULT-POSITIVE
            PRECISION [approximate

Figure 1. Syntactic Structure and Pragmatic Structure

Once the IRM has been populated, a set of output rules can be brought to bear to express the IRM in the target language. Variations of style can be introduced at this stage.
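As an illustration only, the idea of Figure 1 can be sketched in a few lines of Python. This is not SakkamMT's implementation: the Measure class, the regular-expression cues and the function names parse and render are all assumptions made for this example, and a real rule set would be far richer. The sketch fills a small model from cues in the text regardless of word order, returns nothing when the text is not understood (so that output can be suppressed), and renders the same model in two styles.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, heavily simplified stand-in for the IRM of Figure 1.
@dataclass(frozen=True)
class Measure:
    item: str       # e.g. "London" (LOCATION)
    measure: str    # e.g. "distance" (LENGTH)
    relation: str   # e.g. "to"
    number: float
    units: str
    precision: str  # "approximate" or "exact"

# Assumed cues; they look for meaning-bearing units, not sentence syntax.
APPROX = re.compile(r"\bapproximat\w*", re.IGNORECASE)
NUMBER_UNITS = re.compile(r"(\d+(?:\.\d+)?)\s*(km)\b", re.IGNORECASE)
RELATION_PLACE = re.compile(r"\b(to)\s+([A-Z][a-z]+)")

def parse(text: str) -> Optional[Measure]:
    """Populate the model from cues in the text; None means 'not understood'."""
    quantity = NUMBER_UNITS.search(text)
    place = RELATION_PLACE.search(text)
    if "distance" not in text.lower() or quantity is None or place is None:
        return None  # caller can suppress output instead of guessing
    return Measure(
        item=place.group(2),
        measure="distance",
        relation=place.group(1),
        number=float(quantity.group(1)),
        units=quantity.group(2).lower(),
        precision="approximate" if APPROX.search(text) else "exact",
    )

def render(irm: Measure, style: str = "prose") -> str:
    """Express the model in English in one of two assumed output styles."""
    approx = irm.precision == "approximate"
    if style == "headline":
        short = "approx. " if approx else ""
        return f"{irm.item} {irm.measure}: {short}{irm.number:g} {irm.units}"
    qualifier = "approximately " if approx else ""
    return (f"The {irm.measure} {irm.relation} {irm.item} is "
            f"{qualifier}{irm.number:g} {irm.units}.")

PHRASINGS = [
    "Approximately, the distance to London is 200 km",
    "The approximate distance to London is 200 km",
    "The distance approximately to London is 200 km",
    "The distance to London approximation is 200 km",
    "The distance to London approximately is 200 km",
    "The distance to London approximates to 200 km",
    "The distance to London is approximately 200 km",
    "The distance to London is 200 km approximately",
    "It is approximately 200 km distance to London",
]

if __name__ == "__main__":
    models = {parse(p) for p in PHRASINGS}
    print(len(models))                       # one shared model
    print(render(parse(PHRASINGS[0])))
    print(render(parse(PHRASINGS[0]), "headline"))
```

Because the cues ignore where the word "approximate" sits in the sentence, every phrasing in Figure 1 yields the same model, and the choice of output style is deferred until rendering.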
It should be noted that for some very terse styles, such as are common in news headlines, we may generate text that is grammatically incorrect but nevertheless more appropriate for the audience. Looking at the example in Figure 1 again, we also note that, for any particular representation, both the variety and the acceptability of grammatical phrasings depend heavily upon the target language.

Existing machine translation systems will often instead try to translate the gross syntactic structure of the sentence, and then fill in the slots with translations of the individual noun and verb phrases. This, however, can lead to combinations that at best seem unnatural and at worst appear nonsensical. As an example, one Internet-based machine translation system translates:

Approximately, the distance to London is 200 km

as 200. Literally this would be: "Approximation, it is 200km to London." The English is deliberately stilted to reflect how it would sound in Japanese. The sentence pattern corresponds exactly to the English source, but the result is an incorrect Japanese sentence. Translating the other nine English phrasings results in nine different translations of varying degrees of accuracy. By contrast, SakkamMT uses the same IRM for all ten phrasings and uses the more acceptable and correct Japanese form, 200, again the same for all ten variations of the input.

Another example illustrates how pragmatics can enable correct disambiguation between the different meanings of a word. The following Japanese text is an extract from an item that was listed on Yahoo! Auctions.

/

The meaning of the first part of the text (shown in blue) is "Transformers Bumblebee Replica Mask 1:1 scale, genuine goods". Most translation engines will translate the second part (shown in red) as "I have a rash!". Although this is linguistically correct, it is pragmatically wrong. In the context of an auction for a mask, the correct translation, and the one that is made by SakkamMT, is "You can wear it!"
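The mask example can be illustrated with a small Python sketch of context-driven sense selection. It is a toy under stated assumptions, not SakkamMT's method: the placeholder key "ambiguous-phrase" stands in for the Japanese expression from the listing, and the tables SENSES and CONTEXT_CATEGORIES, the category names and the function disambiguate are all hypothetical. The sketch picks the reading whose category is supported by the surrounding words, and declines to answer when the context gives no support or is evenly split.

```python
from typing import Optional, Sequence

# Placeholder for the ambiguous Japanese expression in the listing (assumption).
AMBIGUOUS = "ambiguous-phrase"

# Hypothetical sense inventory: expression -> {category: English rendering}.
SENSES = {
    AMBIGUOUS: {
        "WEARABLE": "You can wear it!",
        "SKIN_CONDITION": "I have a rash!",
    },
}

# Hypothetical categories evoked by words already understood in the context.
CONTEXT_CATEGORIES = {
    "mask": "WEARABLE",
    "helmet": "WEARABLE",
    "ointment": "SKIN_CONDITION",
}

def disambiguate(expression: str, context_words: Sequence[str]) -> Optional[str]:
    """Return the rendering best supported by context, or None to suppress."""
    senses = SENSES.get(expression, {})
    votes = {}
    for word in context_words:
        category = CONTEXT_CATEGORIES.get(word.lower())
        if category in senses:
            votes[category] = votes.get(category, 0) + 1
    if not votes:
        return None  # no contextual evidence: do not guess
    ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # evidence is evenly split: do not guess
    return senses[ranked[0][0]]

if __name__ == "__main__":
    listing = ["Transformers", "Bumblebee", "replica", "mask"]
    print(disambiguate(AMBIGUOUS, listing))
```

Returning None rather than a default reading mirrors the point made earlier about the IRM: when the text has not been understood, a potentially wrong translation can be suppressed.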

Business Implementation

A SakkamMT implementation consists of the following activities.

Project definition. Agree the scope, scale and timeframes for the project. The project may include a pilot phase, which provides an opportunity to demonstrate the effectiveness of the SakkamMT approach in the client's own environment.

Enhancement of the Interim Representation Model (IRM). Most deployments require modifications to our standard IRM to reflect the precise nature of the communication for the client's application.

Compilation of project-specific dictionaries and named entity databases. Existing client dictionaries and translation memories are used as available and, where necessary, additional material is developed through a combination of manual compilation and automated text analysis. All new content is then categorized to ensure consistency with the SakkamMT categorization model.

Lifecycle planning. Typical SakkamMT deployments involve the continual translation of a feed of information over a period of months or years. Over the life of the project, the source content may change as new terminology, and even new concepts, are introduced. In some cases SakkamMT will be able to adapt automatically to these changes, but there may also be an ongoing requirement to ensure that new terminology is being used correctly and that named entity databases are up to date.

Technical integration. Sakkam provides a simple, secure Web API for integration with client systems. The SakkamMT infrastructure is hosted at Amazon EC2, which enables us to scale our servers smoothly to many millions of translations per day and, by deploying production servers on different continents, to offer continuity of service even if an entire datacenter becomes unavailable.

Pilot phase. End-to-end deployment of SakkamMT within the client environment for a limited but representative subset of the overall project scope.

Live deployment. Full deployment of SakkamMT within the client environment.

Case Study: Financial News Feed

The foreign currency exchange markets dwarf those of equities, with something of the order of $1.5 trillion traded on a typical day, and the Yen/Dollar rate is one of the major currency pairs. The Yen/Dollar rate is highly sensitive to US economic news, with billions of dollars changing hands within seconds of the release of a key economic indicator. Often, by the time Japanese translations are available, the market has already moved to factor in the news and the opportunity to profit has been lost.

SakkamMT is being used by Intisar Technology to provide business-critical translations of US financial news headlines from a leading financial news vendor. Translations are made with no more than a sub-second delay, giving Intisar's customers valuable time in which to profit. Clearly, accuracy of translation as well as immediacy of output is essential in this environment. Intisar has integrated this feed of translated stories into its realtime market data platform, which supports traders' workstations such as Tradesignal.

Figure 2. SakkamMT translated news stories in Tradesignal

Existing general-purpose machine translation systems are incapable of providing even the gist of these news releases, and human translators without domain expertise are liable to make mistakes. Competing Japanese news vendors that rely on human translation are tens of seconds to several minutes behind, and carry a high cost base (skilled bilingual financial domain experts available 24x7) that must ultimately be passed on to the consumer.

For details of this news feed and other Intisar services, please contact Intisar Technology.

Case Study: Internet Auctions

Increasing globalization has driven cross-border interest in collectables, and this now acts as a major revenue driver for Internet auctions and marketplaces. As a specific example, there is significant worldwide interest in manga and anime, many of which have limited availability outside Japan. Figure 3 illustrates the price difference between a given item's English language listing on eBay and an identical item's Japanese language listing on Yahoo! Auctions. The difference in price represents the language premium that US collectors need to pay to search for items and bid in English.

Figure 3: US-Japan Price Differentials

Some collectors do attempt to use web-based translators, but these general services are too inaccurate to allow a collector to bid with any confidence. Examples proliferate of inaccurate translations and even of the reversal of meaning, as in the example below.

Internet translation of the original listing: "Home for longer storage is a new unused item, in a box outside the gall, and a GE. For a completely new, please bid."

Unfortunately this is wrong. In fact the translation should say "do not bid", as is correctly shown in the SakkamMT translation below.

SakkamMT translation of the original listing: "This is a new, unused product. However, since it has been stored at home for a long time there are a number of marks and scratches on the outer box. Please do not bid if you are looking for a completely new item."

SakkamMT is being used to provide immediate translations of listings for anime-related memorabilia auctioned on Internet sites. For a live demonstration of SakkamMT translating auction content from Japan's Yahoo! Auctions, please see our Twitter feed (http://twitter.com/animeauctions).

Bibliography: Technical Foundations

Conceptual Modelling

Margolis & Laurence (ed.), Concepts: Core Readings, MIT Press, 1999
Talmy, Leonard, Toward a Cognitive Semantics Vol 1, MIT Press, 2000
Vosniadou & Ortony (ed.), Similarity and Analogical Reasoning, Cambridge University Press, 1989

Linguistics

Croft, William, Radical Construction Grammar: Syntactic Theory in Typological Perspective, Oxford University Press, 2001
Levin, Beth, English Verb Classes and Alternations: A Preliminary Investigation, University of Chicago Press, Chicago, IL
Sperber, Dan and Wilson, Deirdre, Relevance: Communication and Cognition, Blackwell, Oxford, 1986/1995
Matsui, Tomoko, Bridging and Relevance, John Benjamins Publishing Co
Wilson, Deirdre & Carston, Robyn, A unitary approach to lexical pragmatics: Relevance, inference and ad hoc concepts. In N. Burton-Roberts (ed.), Pragmatics, Palgrave, London

Computational Architecture

Mitchell, Melanie, Analogy-Making as Perception: A Computer Model, MIT Press, Cambridge, MA, 1993
Mitchell, Melanie, Analogy-making as a complex adaptive system. In L. Segel and I. Cohen (eds.), Design Principles for the Immune System and Other Distributed Autonomous Systems, Oxford University Press, New York, 2001


More information

Basic Spelling Rules: Learn the four basic spelling rules and techniques for studying hard-to-spell words. Practice spelling from dictation.

Basic Spelling Rules: Learn the four basic spelling rules and techniques for studying hard-to-spell words. Practice spelling from dictation. CLAD Grammar & Writing Workshops Adjective Clauses This workshop includes: review of the rules for choice of adjective pronouns oral practice sentence combining practice practice correcting errors in adjective

More information

Concept Formation. Robert Goldstone. Thomas T. Hills. Samuel B. Day. Indiana University. Department of Psychology. Indiana University

Concept Formation. Robert Goldstone. Thomas T. Hills. Samuel B. Day. Indiana University. Department of Psychology. Indiana University 1 Concept Formation Robert L. Goldstone Thomas T. Hills Samuel B. Day Indiana University Correspondence Address: Robert Goldstone Department of Psychology Indiana University Bloomington, IN. 47408 Other

More information

Paraphrasing controlled English texts

Paraphrasing controlled English texts Paraphrasing controlled English texts Kaarel Kaljurand Institute of Computational Linguistics, University of Zurich kaljurand@gmail.com Abstract. We discuss paraphrasing controlled English texts, by defining

More information

Semantic analysis of text and speech

Semantic analysis of text and speech Semantic analysis of text and speech SGN-9206 Signal processing graduate seminar II, Fall 2007 Anssi Klapuri Institute of Signal Processing, Tampere University of Technology, Finland Outline What is semantic

More information

Language Meaning and Use

Language Meaning and Use Language Meaning and Use Raymond Hickey, English Linguistics Website: www.uni-due.de/ele Types of meaning There are four recognisable types of meaning: lexical meaning, grammatical meaning, sentence meaning

More information

Natural Language Database Interface for the Community Based Monitoring System *

Natural Language Database Interface for the Community Based Monitoring System * Natural Language Database Interface for the Community Based Monitoring System * Krissanne Kaye Garcia, Ma. Angelica Lumain, Jose Antonio Wong, Jhovee Gerard Yap, Charibeth Cheng De La Salle University

More information

Brit. J. Phil. Sci. 50 (1999), REVIEW. JERRY A. FODOR Concepts: Where Cognitive Science Went Wrong

Brit. J. Phil. Sci. 50 (1999), REVIEW. JERRY A. FODOR Concepts: Where Cognitive Science Went Wrong Brit. J. Phil. Sci. 50 (1999), 487 491 REVIEW JERRY A. FODOR Concepts: Where Cognitive Science Went Wrong Oxford: Oxford University Press, 1998, cloth 30.00/US$55.00 ISBN: 0 19 823637 9 (cloth), 0 19 823636

More information

Introduction to Software Paradigms & Procedural Programming Paradigm

Introduction to Software Paradigms & Procedural Programming Paradigm Introduction & Procedural Programming Sample Courseware Introduction to Software Paradigms & Procedural Programming Paradigm This Lesson introduces main terminology to be used in the whole course. Thus,

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore

KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore CASE STUDY KPMG Unlocks Hidden Value in Client Information with Smartlogic Semaphore Sponsored by: IDC David Schubmehl July 2014 IDC OPINION Dan Vesset Big data in all its forms and associated technologies,

More information

Hybrid Strategies. for better products and shorter time-to-market

Hybrid Strategies. for better products and shorter time-to-market Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,

More information

A terminology model approach for defining and managing statistical metadata

A terminology model approach for defining and managing statistical metadata A terminology model approach for defining and managing statistical metadata Comments to : R. Karge (49) 30-6576 2791 mail reinhard.karge@run-software.com Content 1 Introduction... 4 2 Knowledge presentation...

More information

Developing Fluency in English Speaking. For Japanese English Learners

Developing Fluency in English Speaking. For Japanese English Learners ACADEMIC REPORTS Fac. Eng. Tokyo Polytech. Univ. Vol. 35 No.2 (2012) 52 Developing Fluency in English Speaking For Japanese English Learners Hanae Hoshino * Abstract The purpose of this paper is to show

More information

Evolution of Forex the Active Trader s Market

Evolution of Forex the Active Trader s Market Evolution of Forex the Active Trader s Market The practice of trading currencies online has increased threefold from 2002 to 2005, and the growth curve is expected to continue. Forex, an abbreviation for

More information

American. English File. Starter. and the Common European Framework of Reference. Karen Ludlow

American. English File. Starter. and the Common European Framework of Reference. Karen Ludlow American English File and the Common European Framework of Reference Karen Ludlow Starter 2 Int r o d u c t i o n What is this booklet for? The aim of this booklet is to give a clear and simple introduction

More information

Syntax II: Issues in Syntax Spring Semester 2013

Syntax II: Issues in Syntax Spring Semester 2013 Syntax II: Issues in Syntax Spring Semester 2013 ENGL 627S / LING 522 T-TH 1:30-2:45pm, Heav 110 Instructor Dr. Elaine Francis Email: ejfranci@purdue.edu Office: Heav 408 Office hours: Tues-Thurs 9:45-10:15am

More information

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System

Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System Open-Source, Cross-Platform Java Tools Working Together on a Dialogue System Oana NICOLAE Faculty of Mathematics and Computer Science, Department of Computer Science, University of Craiova, Romania oananicolae1981@yahoo.com

More information

From Logic to Montague Grammar: Some Formal and Conceptual Foundations of Semantic Theory

From Logic to Montague Grammar: Some Formal and Conceptual Foundations of Semantic Theory From Logic to Montague Grammar: Some Formal and Conceptual Foundations of Semantic Theory Syllabus Linguistics 720 Tuesday, Thursday 2:30 3:45 Room: Dickinson 110 Course Instructor: Seth Cable Course Mentor:

More information

FOR IMMEDIATE RELEASE

FOR IMMEDIATE RELEASE FOR IMMEDIATE RELEASE Hitachi Developed Basic Artificial Intelligence Technology that Enables Logical Dialogue Analyzes huge volumes of text data on issues under debate, and presents reasons and grounds

More information

The Principle of Translation Management Systems

The Principle of Translation Management Systems The Principle of Translation Management Systems Computer-aided translations with the help of translation memory technology deliver numerous advantages. Nevertheless, many enterprises have not yet or only

More information

Protecting Data with a Unified Platform

Protecting Data with a Unified Platform Protecting Data with a Unified Platform The Essentials Series sponsored by Introduction to Realtime Publishers by Don Jones, Series Editor For several years now, Realtime has produced dozens and dozens

More information

Empirical Machine Translation and its Evaluation

Empirical Machine Translation and its Evaluation Empirical Machine Translation and its Evaluation EAMT Best Thesis Award 2008 Jesús Giménez (Advisor, Lluís Màrquez) Universitat Politècnica de Catalunya May 28, 2010 Empirical Machine Translation Empirical

More information

Comprendium Translator System Overview

Comprendium Translator System Overview Comprendium System Overview May 2004 Table of Contents 1. INTRODUCTION...3 2. WHAT IS MACHINE TRANSLATION?...3 3. THE COMPRENDIUM MACHINE TRANSLATION TECHNOLOGY...4 3.1 THE BEST MT TECHNOLOGY IN THE MARKET...4

More information

Oral Language. Section II

Oral Language. Section II Section II Oral Language Rationale The development of literacy begins through the use of spoken language. Oral language provides a means to observe children as they learn to construct conceptual meanings

More information

Text Mining - Scope and Applications

Text Mining - Scope and Applications Journal of Computer Science and Applications. ISSN 2231-1270 Volume 5, Number 2 (2013), pp. 51-55 International Research Publication House http://www.irphouse.com Text Mining - Scope and Applications Miss

More information

Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Clustering Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

More information

KNOWLEDGE-BASED IN MEDICAL DECISION SUPPORT SYSTEM BASED ON SUBJECTIVE INTELLIGENCE

KNOWLEDGE-BASED IN MEDICAL DECISION SUPPORT SYSTEM BASED ON SUBJECTIVE INTELLIGENCE JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 22/2013, ISSN 1642-6037 medical diagnosis, ontology, subjective intelligence, reasoning, fuzzy rules Hamido FUJITA 1 KNOWLEDGE-BASED IN MEDICAL DECISION

More information

Spike Trading: Spot FX Vs Futures

Spike Trading: Spot FX Vs Futures You should be aware of all the risks associated with foreign exchange and futures trading. There is a substantial risk of loss in foreign exchange and futures trading. Past performance is not indicative

More information

Flattening Enterprise Knowledge

Flattening Enterprise Knowledge Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it

More information

Models of Dissertation in Design Introduction Taking a practical but formal position built on a general theoretical research framework (Love, 2000) th

Models of Dissertation in Design Introduction Taking a practical but formal position built on a general theoretical research framework (Love, 2000) th Presented at the 3rd Doctoral Education in Design Conference, Tsukuba, Japan, Ocotber 2003 Models of Dissertation in Design S. Poggenpohl Illinois Institute of Technology, USA K. Sato Illinois Institute

More information

Academic Standards for Reading, Writing, Speaking, and Listening June 1, 2009 FINAL Elementary Standards Grades 3-8

Academic Standards for Reading, Writing, Speaking, and Listening June 1, 2009 FINAL Elementary Standards Grades 3-8 Academic Standards for Reading, Writing, Speaking, and Listening June 1, 2009 FINAL Elementary Standards Grades 3-8 Pennsylvania Department of Education These standards are offered as a voluntary resource

More information

DATA QUALITY AND SCALE IN CONTEXT OF EUROPEAN SPATIAL DATA HARMONISATION

DATA QUALITY AND SCALE IN CONTEXT OF EUROPEAN SPATIAL DATA HARMONISATION DATA QUALITY AND SCALE IN CONTEXT OF EUROPEAN SPATIAL DATA HARMONISATION Katalin Tóth, Vanda Nunes de Lima European Commission Joint Research Centre, Ispra, Italy ABSTRACT The proposal for the INSPIRE

More information

An Introduction to Translation & Localization for the Busy Executive

An Introduction to Translation & Localization for the Busy Executive An Introduction to Translation & Localization for the Busy Executive This whitepaper contains copyrighted material. Additional copies may be obtained at http://www.acclaro.com/whitepapers 2009 Acclaro

More information

Why SBVR? Donald Chapin. Chair, OMG SBVR Revision Task Force Business Semantics Ltd Donald.Chapin@BusinessSemantics.com

Why SBVR? Donald Chapin. Chair, OMG SBVR Revision Task Force Business Semantics Ltd Donald.Chapin@BusinessSemantics.com Why SBVR? Towards a Business Natural Language (BNL) for Financial Services Panel Demystifying Financial Services Semantics Conference New York,13 March 2012 Donald Chapin Chair, OMG SBVR Revision Task

More information

Working with fractions, decimals and percentages 2 EDEXCEL FUNCTIONAL SKILLS PILOT. English Level 2. Teacher s Notes. Section D

Working with fractions, decimals and percentages 2 EDEXCEL FUNCTIONAL SKILLS PILOT. English Level 2. Teacher s Notes. Section D Working with fractions, decimals and percentages 2 EDEXCEL FUNCTIONAL SKILLS PILOT English Level 2 Teacher s Notes Section D Understanding and writing texts D2 Presenting information and ideas logically

More information

CSE4213 Lecture Notes

CSE4213 Lecture Notes CSE4213 Lecture Notes Introduction to B Tools Computer Science and Software Engineering Monash University 20070226 / Lecture 1 ajh 1/15 1 Outline 2 3 4 5 ajh 2/15 In this course we will be introducing

More information

Mapping into meaning: What s behind language? What s a mapping? Cognitive linguistics. Do mappings allow symbols? A lexically entrenched mapping

Mapping into meaning: What s behind language? What s a mapping? Cognitive linguistics. Do mappings allow symbols? A lexically entrenched mapping Mapping into meaning: What s behind language? Meaning construction involves the apprehension of a novel experience as a kind of memory, through the active mapping of new experiences onto readymade models.

More information

Application Architectures

Application Architectures Software Engineering Application Architectures Based on Software Engineering, 7 th Edition by Ian Sommerville Objectives To explain the organization of two fundamental models of business systems - batch

More information

FEAWEB ASP Issue: 1.0 Stakeholder Needs Issue Date: 03/29/2000. 04/07/2000 1.0 Initial Description Marco Bittencourt

FEAWEB ASP Issue: 1.0 Stakeholder Needs Issue Date: 03/29/2000. 04/07/2000 1.0 Initial Description Marco Bittencourt )($:(%$63 6WDNHKROGHU1HHGV,VVXH 5HYLVLRQ+LVWRU\ 'DWH,VVXH 'HVFULSWLRQ $XWKRU 04/07/2000 1.0 Initial Description Marco Bittencourt &RQILGHQWLDO DPM-FEM-UNICAMP, 2000 Page 2 7DEOHRI&RQWHQWV 1. Objectives

More information

Extracted Templates. Postgres database: results

Extracted Templates. Postgres database: results Natural Language Processing and Expert System Techniques for Equity Derivatives Trading: the IE-Expert System Marco Costantino Laboratory for Natural Language Engineering Department of Computer Science

More information

Localizing Your Mobile App is Good for Business

Localizing Your Mobile App is Good for Business Global Insight Localizing Your Mobile App is Good for Business Simply put, the more people who can find and use your mobile application in their native language, the larger your potential market. But launching

More information

TEN RULES OF GRAMMAR AND USAGE THAT YOU SHOULD KNOW

TEN RULES OF GRAMMAR AND USAGE THAT YOU SHOULD KNOW TEN RULES OF GRAMMAR AND USAGE THAT YOU SHOULD KNOW 2003 The Writing Center at GULC. All rights reserved. The following are ten of the most common grammar and usage errors that law students make in their

More information

Discourse Markers in English Writing

Discourse Markers in English Writing Discourse Markers in English Writing Li FENG Abstract Many devices, such as reference, substitution, ellipsis, and discourse marker, contribute to a discourse s cohesion and coherence. This paper focuses

More information

IACBE Advancing Academic Quality in Business Education Worldwide

IACBE Advancing Academic Quality in Business Education Worldwide IACBE Advancing Academic Quality in Business Education Worldwide Bloom s Taxonomy of Educational Objectives and Writing Intended Learning Outcomes Statements International Assembly for Collegiate Business

More information

MAP for Language & International Communication Spanish Language Learning Outcomes by Level

MAP for Language & International Communication Spanish Language Learning Outcomes by Level Novice Abroad I This course is designed for students with little or no prior knowledge of the language. By the end of the course, the successful student will develop a basic foundation in the five skills:

More information

B. Tech. Project Report

B. Tech. Project Report B. Tech. Project Report Utkarsh Upadhyay (Y5488) Mentor: Prof. R.M.K. Sinha November 6, 2008 Contents 1 Introduction & motivation 2 1.1 Interactive Broker s Algorithmic Trading Olympiad........ 2 2 The

More information

Business School. Is grammar only a problem for non-english speaking background students?

Business School. Is grammar only a problem for non-english speaking background students? Business School Editing your writing for grammar mistakes Editing Your Writing for Grammar Mistakes Does grammar matter? In most assignment guidelines given in the Business School, assessment criteria

More information

3. What is Knowledge Management

3. What is Knowledge Management 3. What is Knowledge Management ETL525 Knowledge Management Tutorial One 5 December 2008 K.T. Lam lblkt@ust.hk Last updated: 4 December 2008 KM History The subject of KM was originally arisen in the field

More information