Hybrid Machine Translation For English to Marathi: A Research Evaluation In Machine Translation


International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)

Pramod Salunkhe, PhD Scholar, Dept. Comp., BVDUCOEP
Aniket D. Kadam, Research Scholar, Dept. Info. Tech (Hybrid Translator), BVDUCOEP
Prof. Shashank Joshi, Ph.D. Research Guide, Dept. Computer, BVDUCOEP
Prof. Suhas Patil, Ph.D. Research Guide, Dept. Computer, BVDUCOEP
Dr. Devendrasingh Thakore, Ph.D. Research Guide, Dept. Computer, BVDUCOEP
Shrikant Jadhav, Aspiring MS Computer Student, Dept. Comp, Ghrce SSPU

Abstract: Information presented in different languages and structures makes language a barrier to information retrieval. Documents on Queen Elizabeth are mostly written in English, which makes it difficult for a Marathi reader to study the history of England; similarly, literature on Shivaji is mostly documented in Marathi, which makes it difficult for foreign historians to learn from it. In both cases the user remains unaware of facts because of language and may lose interest in the information. Vital information on current happenings at village and taluka level is published in local-language newspapers, which hinders its spread among the wider public. Government documents and forms are often presented in English, so a layperson from a Marathi-language background finds the information difficult to understand and may even avoid such procedures. There is therefore a strong need for an automated, software-based translation system that assists cross-domain information retrieval. Machine translation translates information presented in one language into another language. Information can be present in the form of text, speech or images; translating it helps information sharing and ultimately information gain.
A lot of work has been done on translation from English to Hindi, Tamil, Bangla and other languages. Machine translation is a challenging research area with numerous issues arising from language ambiguity in grammar, structure and even fluency of use. Numerous methodologies have been proposed and developed, each with strengths and weaknesses; at their core they are statistical or rule based, and each has limitations. Rule-based systems produce accurately mapped translations and are trainable, but they are costly, whereas statistical systems produce fluent translations but lack accuracy and sense. The hybrid approach integrates the two and helps optimize the translation output. In this manuscript we present a hybrid machine translator for English to Marathi that translates web pages and text documents on agriculture (crops and fruits, for farmers), medical reports, and tourism-related information. The proposed system consists of parallel multi-engines that run statistical and rule-based translation on the same input document and produce an optimized result by applying the statistical output over the rule-based output, which gives fluent output that preserves sense. A mapper algorithm is used in rule-based translation, with agriculture, medical and tourism corpora for statistical evaluation. Marathi WordNet has been implemented to enhance the dictionary and improve translation results. The system is currently proposed for text documents and can be extended to speech and voice. A comparative analysis with Google Translator is done along one dimension only, on a limited set of queries, and it supports the hybrid approach as the better methodology. A systematic survey of the 10 key articles used in this research has been done. This article is an extension of our previous research surveys and partial implementations. An innovative S-measure is proposed as a new evaluation parameter and evaluated by our research team.
Keywords: Machine Translation, Rule-based Translation, Statistical Translation, Mapping Rules, Hybrid Translation, Google Translator

I. INTRODUCTION

Machine-enabled translation is core research in Natural Language (NL) processing, aimed at removing language as an obstacle to communication and information access with the help of bilingual machine translation. [9] Research work in machine translation has been done from English to Hindi, from English to Urdu, and to other languages such as Telugu, many native languages, and foreign languages such as Arabic, Chinese and Spanish. The research problem addressed here concerns the community of Marathi speakers. Marathi, spoken and used by more than 80 million people, is derived from Sanskrit. Word order is a major problem in translating from source language to target language. Marathi is the most widely spoken language in the state of Maharashtra. [4] The language is written from left to right and from the top to the bottom of the document. Marathi terms are derived from Sanskrit: for example, Nava is derived from Navin, and Maas ("month" in English) is likewise of Sanskrit origin.

Individuals from different cultures and language backgrounds cannot communicate easily, and a translation system would help close the gap. [5] This research work first summarizes and then translates, which is useful to a Marathi scholar studying the work of an English writer. Any research work has numerous issues and problems to address; this work formulates and builds a hybrid machine translation system T = [context {assertive sentences, interrogative sentences}] with summarization to eliminate irrelevant paragraphs from a document. [7][3][8] As the research is still progressing, this is the first article covering the literature examination and a short introduction to the proposed work. Machine translation is translation obtained by machine, on a large scale, from a source to a target language. India is a nation with large diversity in culture and in spoken language; five major divisions cover the Indian languages. [7][1]

English to Marathi Language Translator (EMLT) belongs to the field of computer science and linguistics concerned with the interactions between computers and human (natural) language. EMLT [11] systems convert information from computer databases into readable human language. Natural language understanding systems convert samples of human language into more formal representations, such as parse trees or first-order logic structures, that are easier for computer programs to manipulate.

Fig 1: Basic EMLT Concept

The rule-based English to Marathi translator converts simple English affirmative sentences to Marathi sentences. [7] This is basically machine translation. We have chosen the transfer-based approach, which lies between the semantic and the direct approach.
For that we have designed a parser that maps the English sentence to the rules, after which it is converted into the target language [2]. Hindi is the official language of India and English is an adopted foreign language. English is by far the most used language all over the world, and research is required to translate English documents into native languages for knowledge gain. The world has accepted English as the major language of communication [1,2]. Marathi is the state language of Maharashtra. Many articles and web articles are written in English, while a rich amount of information on particular topics is written by language experts in the regional language. The problem lies in understanding these documents and articles across languages, for which computer-assisted translation is a faster and better solution than human translation. Government needs a translation system for assistance and for communicating orders. Large research efforts have been undertaken by major organizations such as IIT Bombay, C-DAC and IIT Hyderabad toward better, fully automated machine translation systems. [9] Evaluation of their work shows major progress in software development. [2] They have undertaken diverse approaches with unique methodologies to solve various research issues and problems in machine translation. Although major projects have addressed English to Hindi, Tamil, Bangla, Urdu, etc., a smaller amount of work exists on English to Marathi translation. The work of Pushpak Bhattacharyya at IIT Mumbai is notable for the formation of a WordNet, a parallel corpus for Marathi, English and Hindi, and an interlingua approach to translation, which is also a hybrid approach in machine translation [6,2]. This research article is organized in 5 subdivisions.
Subdivision I gives an introduction to the subject; Subdivision II covers related work and a survey of systems; Subdivision III presents the core technique of our work; Subdivision IV gives implementation details; and Subdivision V presents the comparative study and evaluation parameters, followed by concluding remarks.

Research Contribution:
i. Crisp literature survey of 11 key articles for implementation
ii. Background knowledge of machine translation
iii. Research scope in the Marathi language
iv. Hybrid machine translation demonstration
v. Comparison with Google Translator
vi. S-measure, an innovative evaluation parameter for MT systems

II. BACKGROUND

A. Levels Of Language Processing

The most explanatory method for presenting what actually happens within a Natural Language Processing system is by means of the levels of language approach. This is also referred to as the synchronic model of language and is distinguished from the earlier sequential model, which hypothesizes that the levels of human language processing follow one another in a strictly sequential manner. Psycholinguistic research suggests that language processing is much more dynamic, as the levels can interact in a variety of orders. Introspection reveals that we frequently use information we gain from what is typically thought of as a higher level of processing to assist in a lower level of analysis [7].

1.) Morphology: This level deals with the componential nature of words, which are composed of morphemes, the smallest units of meaning.
2.) Lexical: At this level, humans, as well as NLP systems, interpret the meaning of individual words. Several types of processing contribute to word-level understanding, the first of these being the assignment of a single part-of-speech tag to each word. In this processing, words that can function as more than one part of speech are assigned the most probable part-of-speech tag based on the context in which they occur.
3.) Syntactic: This level focuses on analyzing the words in a sentence so as to uncover the grammatical structure of the sentence. This requires both a grammar and a parser. The output of this level of processing is a (possibly delinearized) representation of the sentence that reveals the structural dependency relationships between the words.
4.) Pragmatic: This level is concerned with the purposeful use of language in situations and utilizes context over and above the contents of the text for understanding. The goal is to explain how extra meaning is read into texts without actually being encoded in them. This requires much world knowledge, including the understanding of intentions, plans, and goals. Some NLP applications may utilize knowledge bases and inference modules.

B. Types of Machine Translation Scheme

1.) Transfer-based machine translation: Both transfer-based and interlingua-based machine translation have the same idea: to make a translation it is necessary to have an intermediate representation that captures the "meaning" of the original sentence in order to generate the correct translation. In interlingua-based MT this intermediate representation must be independent of the languages in question, whereas in transfer-based MT it has some dependence on the language pair involved.
2.) Example-based machine translation (EBMT): A method of machine translation often characterized by its use of a bilingual corpus with parallel texts as its main knowledge base at run-time. It is essentially translation by analogy and can be viewed as an implementation of the case-based reasoning approach of machine learning.
3.) Rule-Based Machine Translation (RBMT): RBMT (also known as Knowledge-Based Machine Translation, the classical approach to MT) is a general term that denotes machine translation systems based on linguistic information about source and target languages, basically retrieved from (unilingual, bilingual or multilingual) dictionaries and grammars covering the main semantic, morphological, and syntactic regularities of each language respectively. Given input sentences (in some source language), an RBMT system transforms them into output sentences (in some target language) on the basis of morphological, syntactic, and semantic analysis of both the source and the target languages involved in a concrete translation task.
4.) Statistical Machine Translation: Bilingual text analysis is the core technique of a statistical system, which is trained over a bilingual corpus. The ideas behind statistical machine translation were introduced by Warren Weaver. Nowadays it is by far the most widely studied machine translation method. Because its translations are fluent, this technique is an active research area in machine translation, although corpus creation is costly, errors are hard to correct, and it does not work well for unrelated (non-parent) languages. Its subcategories are word based, phrase based, syntax based, and hierarchical phrase based.

III. LITERATURE SURVEY

A. Survey Investigation

[3] Abstract: In machine translation, text is transformed from one language into another, and pattern divergence has been a major challenge. Investigating and highlighting this divergence is vital for better MT (machine translation), since it helps in devising techniques to overcome it.
Divergence: lexico-semantic divergence (thematic divergence, structural divergence, conflational and inflectional divergence, categorical divergence). Syntactic divergence in English-Marathi translation: constituent order divergence, pleonastic divergence. Common divergence: replicative words, determiner system, morphological gaps, honorific differences.
Research scope: Eliminating all of the above divergences requires structured techniques and is within the scope of implementation.

[2] Abstract: Research evaluation is a major step in achieving success, and it is carried out by comparing methodologies or techniques, i.e. algorithms. Machine translation is a research domain with numerous techniques and methodologies, each having an advantage over the others on some evaluation parameter. The statistical approach achieves fluency in translation but lacks accuracy, whereas a rule-based system achieves accurate translation but requires a long development time; a statistical system requires a shorter development time. The novel approach is to combine the two techniques to achieve better results on the evaluation parameters, that is, better accuracy and fluency. In the hybrid approach the statistical methodology runs over the rule-based methodology to achieve a better result: the output of the statistical system is corrected by comparison with the rule-based system. In that manuscript the implementation is presented together with evaluation parameters that compare the system against the proposed research hypothesis and what was achieved.
Methodology: Hybrid MT is presented with statistical and rule-based approaches, which yields a good set of outputs. A comparative analysis of the research work is presented, together with how evaluation parameters should be considered in evaluating machine translation. Parameters other than just precision and recall, such as adequacy and fluency, need to be incorporated.
Research scope: Evaluating the system with WER, BLEU and METEOR is research scope left unaddressed.

[4] Abstract: A mapping methodology is presented for building better machine translation. The structural properties of English, Hindi and Marathi are studied. With Hindi spoken by 14% and Marathi by 5%, a world of concepts arises that supports knowledge building; along similar lines to synsets and EuroWordNet, a WordNet is required for Marathi and English with Hindi, or a tri-WordNet for bilingual use.
Research scope: Developing a WordNet is a challenge, but a WordNet would largely facilitate the development of better language translation.

[1] Abstract: Hybrid machine translation is presented with the hypothesis that the limitations of rule-based and statistical translation can be overcome with an integrated approach, i.e. hybrid translation, with translation evaluated using new parameters such as adequacy and fluency.
Research scope: Combinations of further translation methodologies with statistical and rule-based translation, such as transfer based, interlingua and others, remain to be tested; this is a research area unaddressed in English to Marathi.

[5] Abstract: A comparison of SMT and RBMT is presented with good examples.
The authors have highlighted the comparison of the two techniques and the limitations of both, and urged that a hybrid approach would enhance machine translation.
Research scope: Implementations of hybrid translation schemes have some limitations, which need to be addressed with better programming and optimized algorithms using new multi-core technology and parallel processing.

[11] Rule-based translation is presented for MT and found to be very good for assertive sentences.

Concluding Remark on Survey: The above literature review mainly focuses on Marathi translation schemes. In [8] the authors propose translation of UI tags for web pages with a hybrid process that builds a bilingual dictionary on the RBI portal, with a parser built in C. In [9] the research scholars built a rule-based scheme for E-M translation, extended with grammar correction, sentiment examination and spell check. In [11] the writers constructed an E-M scheme on a rule basis for assertive sentences, with OpenNLP tools integrated; the rule-based scheme is compared with Google Translator, and a large grammar lexicon is built for better morphological output. In [10] a hybrid approach is built with intermediate language selection; the machine translation system is developed on parallel corpora, and match mapping is the process used. Overall, a hybrid approach is suited to E-M translation and can yield a better system in terms of accuracy, so this research work focuses on a hybrid system. The summarization work is incorporated from the authors' previous work [7][11].

B. Research Analysis Questions

Machine-assisted translation is a topic of worldwide scope. Translation schemes have been developed but lack accuracy: even though translators such as Google achieve fluency, they lack correctness and precision. [RAQ1]. A multilingual translation scheme that is also an effective translation scheme is a further challenge. [RAQ2].
Rule-based systems require expert hand-written rules, which is a huge challenge and consumes time and money; writing a bilingual dictionary is a hard and most difficult task; and web-based translation schemes face the big data challenge. [RAQ3]. Precise matching of corpus similarity is promising, but how can the number of assessments in a translation scheme be reduced? [RAQ4]. In what way can machine inaccuracy be reduced with an innovative methodology in a particular scheme? [RAQ5]. A better design pattern or architecture for the system. [RAQ6]. Better evaluation parameters to evaluate the working of rule-based and statistical translation schemes. [RAQ7]. Linguistic divergence patterns in English to Marathi translation. [RAQ8]. Inflection rules for English to Marathi translation. [RAQ9].

IV. CORE METHODOLOGY

The core module in accurate information presentation is translation. Hence the core methodology maps each rule in English one-to-one to the appropriate rule in Marathi; these rules are handcrafted from a detailed study of structure and of OpenNLP parsing. The summarization methodology is simple and centroid based, for topical or subject summarization [11][12], where the research implements context in translation. The core methodology in part B of the system is shown in Fig 2. The core technique is a lexical analyzer that detects morphological structure, which is matched to the English lexicon and then applied to the English grammar rules. English words are then mapped to Marathi, and a set of rules is written to generate the exact translated Marathi sentence. A Marathi research writer studying an English document is thus presented with the information in Marathi and can carry on the work without any information gap. The built-in OpenNLP packages are used to program the system. A dictionary set is generated to store proper nouns, pronouns, verbs and adverbs, and a healthy dataset of words, terms and phrases has been developed. The OpenNLP package provides Sentence detect (), Tokenization (), Parser () and Chunk () as built-in functions. A dataset of Marathi rules has been developed, mapped to the English rules generated by the OpenNLP package. The proposed six-phase architecture is the core design that facilitates accurate information presentation from diverse languages into Marathi. The system is web deployed and web based; it extracts information from URLs, web pages and various online content. The output of summarization is the input to the translation module, and the translated output is the cross-lingual information retrieved.

Architecture of the cross-lingual information retrieval system

The architecture of the system consists of six phases.
Layer 1: Summarization input module, which takes in pdf, doc, txt and web page information from the web.
Layer 2: Core summarization module, which summarizes based on keyword and percentage, or with the centroid-based method, as the user selects.
Layer 3: This module updates the dataset with words and their meanings in Marathi, with rules mapping from one language to the other.
Layer 4: Local and web dataset, an information repository that stores data for faster access.
Layer 5: Core translation module, where a selector chooses the translation scheme based on user input: context, example based, rule based, or a combination of all processes.
Layer 6: WordNet implementation.

Fig 2: Architecture of Hybrid Translator.

A. Algorithm of Hybrid Machine Translation

1. An input text document or web page is submitted to the HMT.
2. Multi-engine processing is used. One machine translation engine performs rule-based translation and generates the rule-based output, reducing language divergence as in research question [RAQ1] and generating a set of answers for the k-divergence rule set in the engine.
3. The same text or web page is submitted to the statistical engine, which is based on a large corpus of different domains and generates a statistical output.
4. A hybridized engine takes input from the above two engines and maps the rule-based results onto the statistical results and vice versa; this engine is dynamic and trainable.
5. A finalized machine translation output is generated, giving the best machine translation with the hybrid approach.

B. Mathematical Underpinning of the Algorithm

The model is divided into 5 conditions:
1. Input set
2. Output set
3. Processing set, i.e. processing on data
4. Success condition
5. Failure condition
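The multi-engine flow of Section A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy dictionary, the toy parallel corpus (romanized Marathi), and the function names `rule_engine`, `statistical_engine`, `hybrid_engine` and `translate` are all hypothetical, and the hybridizer is reduced to a simple fallback policy.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy resources (romanized Marathi); the paper's real
# dictionary, rule set and parallel corpus are not reproduced here.
RULE_DICTIONARY = {"farmer": "shetkari", "water": "pani"}
PARALLEL_CORPUS = {"the farmer needs water": "shetkaryala pani have ahe"}


def rule_engine(sentence):
    """Rule-based stand-in: word-by-word dictionary lookup.

    Words missing from the dictionary are passed through unchanged.
    """
    words = sentence.lower().split()
    return " ".join(RULE_DICTIONARY.get(word, word) for word in words)


def statistical_engine(sentence):
    """Statistical stand-in: exact lookup in a tiny parallel corpus.

    Returns None when the sentence is not covered by the corpus.
    """
    return PARALLEL_CORPUS.get(sentence.lower())


def hybrid_engine(rule_output, statistical_output):
    """Assumed policy: prefer the corpus-based output, else fall back."""
    if statistical_output is not None:
        return statistical_output
    return rule_output


def translate(sentence):
    """Submit the same input to both engines in parallel, then hybridize."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        rule_future = pool.submit(rule_engine, sentence)
        stat_future = pool.submit(statistical_engine, sentence)
        return hybrid_engine(rule_future.result(), stat_future.result())
```

Keeping the two engines behind plain functions means either one could be swapped for a real component without changing `translate`; the fallback policy in `hybrid_engine` is only one of many possible ways to combine the outputs.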

[1] Input set: machine translation dataset creation:
i. HMT accepts a file or folder as input in order to create the dataset for translation, say set M = {file1, file2, file3, ..., filem}.
ii. Parallel corpus for statistical translation, from Pushpak Bhattacharyya [forum, with permission].
iii. Set of rules for the Marathi language in the form Rule = {English-word, Marathi-word, conversion rule}.
iv. The hybrid machine translation system consists of the files and the parallel corpus, which together create the dataset for translation, say set D = <{file1, file2, file3, ..., filem}, parallel corpus>.
v. Web data is incorporated from the web at the online processing stage only.

[2] Output set:
i. The rule engine maps input English words to Marathi words from the database and performs sentence-to-sentence translation by replacing English words with equivalent Marathi words from the dictionary, using the sentence structure stored in the database rule.
ii. It applies Map(eword, mword, rule). Let P be the complete paragraph; then the complete process is
$\text{Rule Translation} = \mathrm{Map}(ES, MS, rs_1, rs_2, \ldots, rs_n)$ (1)
which translates the complete paragraph.
iii. In statistical translation, sentence-to-sentence mapping of the input English sentence is done with the bilingual corpus to generate the Marathi translation:
$\text{Statistical Translation} = \mathrm{Map}(ES, MS, \text{corpus})$ (2)
iv. The generated results are stored in an offline dataset, the Marathi bank, to upgrade the database; hence the translation is termed "learn as you process" translation, in which the dataset keeps growing.

V. RESULTS AND EVALUATION

Research evaluation is vital in judging the success or failure of research work. Most authors today evaluate their research only on precision and recall, or on time delay and number of queries. This has limitations: precision and recall are merely mathematical calculations and at times may lead to unstructured research evaluation. We therefore build our evaluation on new parameters such as adequacy and fluency, which help evaluate an MT system in a better way.

A. Results Of Hybrid Translation

A comparison with Google Translator is presented. We do not claim to have built a system comparable to Google, but rather a domain-restricted system that produces better translation with a hybrid approach, an approach that Google or similar web translation systems could incorporate for optimized results. The system is evaluated against Google Translator, which primarily employs the statistical approach only, whereas our system implements both statistical and rule-based translation (the hybrid approach). The results demonstrate that the hybrid approach is better than a single approach for a morphologically divergent and large language like Marathi.

Input for translation: Plants breeding and Genetics contribute immeasurably to farm productivity. Genetics has also made a science of livestock breeding. Hydroponics is a method of soilless gardening. Computers have become an essential tool for farm management.

The statistical engine generates output in the same way as the rule-based one.

[3] Processing set (rule based and statistical), continuing the model of Section IV.B:
i. $\text{Output} = \mathrm{Map}(\text{Rule Based}, \text{Statistical})$ (3)
ii. Error elimination and word correction are performed on the statistical output by substituting the correct structure from the rule-based engine, to find the optimized output.

Fig 3: Working of rule based Engine
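The word-correction part of processing step [3] can be illustrated with a short Python sketch. It rests on an assumption that the paper does not spell out: that an "error" in the statistical output is a source-language word left untranslated, and that the rule engine's lexicon supplies the replacement. The function name `correct_with_rules` and the romanized sample lexicon are hypothetical.

```python
def correct_with_rules(statistical_tokens, rule_lexicon):
    """Return (corrected_tokens, number_of_replacements).

    Every token of the statistical output that is still a known English
    word (a key of rule_lexicon) is replaced by its rule-based equivalent;
    all other tokens are kept unchanged and in their original order.
    """
    corrected = []
    replacements = 0
    for token in statistical_tokens:
        target = rule_lexicon.get(token.lower())
        if target is None:
            corrected.append(token)
        else:
            corrected.append(target)
            replacements += 1
    return corrected, replacements


# Hypothetical usage: "Water" was left untranslated by the statistical engine.
sample_lexicon = {"water": "pani", "farmer": "shetkari"}
tokens, count = correct_with_rules(["shetkari", "Water", "have"], sample_lexicon)
```

Structural correction (reordering the sentence according to the rule-based output) is not covered by this sketch; it only shows the token-level substitution and counts how many corrections were applied.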

Table 1: Query workload. Columns: Rule based, Statistical, Hybrid; Total: 1697.

B. Precision and Recall

The graph indicates the performance of each scheme for an average set of queries: the performance of the statistical and rule-based schemes differs only by marginal values, whereas the hybrid system produces optimized outputs.

Fig 4: Working of statistical Engine

What evaluation parameters lack: they ignore the relevance of words; they operate on a local level; their scores are meaningless; and human translators score low on BLEU.

New assessment parameters: syntactic similarity; semantic equivalence or entailment; metrics targeted at reordering; trainable metrics.

Fig 5: Comparative view with Google Translator

Analysis: The Google Translator output consists of fluent Marathi words, but the structure is incorrect and the sense disappears. Our system produces a statistical translation that is near to good, and our rule-based output then corrects it; that is, the hybrid approach is better than a single approach.

C. Query Workload

The system has been tested on statistical translation of 1200 documents of size 3k, 447 translations with the rule-based system, and 50 web documents from 3 websites related to agriculture, 1697 in all. These include the agriculture domain corpus, the medical domain corpus and the tourism corpus. Rule-based translation includes medical reports, documents on ...

Fig 3: Graphical Evaluation of Hybrid System

There are manual annotation tools for manual assessment and for incorporating user feedback on MT translations.

Fig 4: False Alarm Ratio

Fig 5: Worst Case Complexity

Score   Adequacy         Fluency
5       All meaning      Flawless English
4       Most meaning     Good English
3       Much meaning     Non-native English
2       Little meaning   Disfluent English
1       None             Incomprehensible
Table 2: New Evaluation parameters.

Innovative Evaluation Parameter: We design a novel evaluation parameter for the system, the S-measure, which is a semantic similarity measure between one system's output and another's. In practice, systems evaluate performance, i.e. accuracy, by comparison with a reference translation only. At times a system may generate a translation with synonymous words that the reference translation does not contain, so the automated evaluation score falls; this is a flaw that needs to be corrected. At times the generated translation may even be better than the reference, so the evaluation needs a semantic measure parameter.

S-measure: <words> <terms> <phrases> MAP <similar words set> <similar terms> <similar phrases>

VI. CONCLUSION

The above two inputs are translation passages taken from a website and submitted both to Google and to our system. The results show that at times the output produced by Google Translator is fluent but contains meaningless sentences and misses the core sense of the passage. The proposed system produces the rule-based output with accurate sense and conservation of meaning. The statistical output is generated from the corpus incorporated in the project and is also good and comparable. The hybrid output is found to be better for the two inputs given to the system. Ultimately, the hybrid approach is better than a single approach. Future research lies in integration with future search engines [3][13].

ACKNOWLEDGMENT

Thanks to Aniket for his efforts. I sincerely acknowledge Pushpak Bhattacharyya, S. B. Kulkarni, P. D. Deshmukh, Sreelekha S., Raj Dabre, M. L. Dhore and S. X. Dixit, and all other authors whose work has been cited directly or indirectly.
REFERENCES

[1] Pramod Salunkhe, Mrunal Bewoor, Suhas Patil, Shashank Joshi, Aniket Kadam, "Summarization and Hybrid Machine Translation System for English to Marathi: A Research Effort in Information Retrieval System (H-Machine Translation)," Discovery, The International Journal (ISI Thomson Reuters indexed).
[2] Pramod Salunkhe, Mrunal Bewoor, Dr. Suhas Patil, "A Research Work on English to Marathi Hybrid Translation System," (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 6 (3), 2015.
[3] S. B. Kulkarni, P. D. Deshmukh, "Linguistic Divergence Patterns in English to Marathi Translation," International Journal of Computer Applications, Volume 87, No. 4, February.
[4] J. Ramanand, Akshay Ukey, Brahm Kiran Singh, Pushpak Bhattacharyya, "Mapping and Structural Analysis of Multilingual Wordnets," Bulletin of the IEEE Computer Society Technical Committee on Data Engineering.
[5] Sreelekha S., Raj Dabre, Pushpak Bhattacharyya, "Comparison of SMT and RBMT, The Requirement of Hybridization for Marathi Hindi MT" [online pdf].
[6] Amruta Godase, Sharvari Govilkar, "Survey of Machine Translation Development for Indian Regional Languages," International Journal of Modern Trends in Engineering and Research (IJMTER).
[7]
[8] M. L. Dhore, S. X. Dixit (2011), "English to Devnagari Translation for UI Labels of Commercial Web Based Interactive Applications," International Journal of Computer Applications, Volume 35, No. 10, December.
[9] Devika Pishartoy, Priya, Sayli Wandkar (2012), "Extending Capabilities of English to Marathi Machine Translator," IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 3, No. 3, May.
[10] Priyanka Choudhary, "An Approach for Interlingua Example Based E-M Translation," IJCSIT, 2015.
[11] Abhay A, Anita G, Paurnima T, Prajakta G (2013), "Rule Based English to Marathi Translation of Assertive Sentence," International Journal of Modern Trends in Engineering and Research.
[12] Kadam, A. D. (Dept. Inf. Tech., BVDUCOEP, Pune, India), Joshi, S. D., Medhane, S. P., "Question Answering Search Engine: Short Review and Road-map to Future QA Search Engine," Electrical, Electronics, Signals, Communication and Optimization (EESCO), 2015.
[13] Kadam Aniket, Prof. S. D. Joshi, Prof. S. P. Medhane, "QAS," International Journal of Application or Innovation in Engineering & Management (IJAIEM), Volume 3, Issue 5, May 2014.
[14] Kadam Aniket, Prof. S. D. Joshi, Prof. S. P. Medhane, "QAS," International Journal of Application or Innovation in Engineering & Management (IJAIEM), Volume 3, Issue 5, May 2014.
[15] Kadam Aniket, Prof. S. D. Joshi, Prof. S. P. Medhane, "Search Engines to QAS: Explorative Analysis," International Journal of Application or Innovation in Engineering & Management (IJAIEM), Volume 3, Issue 5, May 2014.


More information

DEVELOPMENT OF NATURAL LANGUAGE INTERFACE TO RELATIONAL DATABASES

DEVELOPMENT OF NATURAL LANGUAGE INTERFACE TO RELATIONAL DATABASES DEVELOPMENT OF NATURAL LANGUAGE INTERFACE TO RELATIONAL DATABASES C. Nancy * and Sha Sha Ali # Student of M.Tech, Bharath College Of Engineering And Technology For Women, Andhra Pradesh, India # Department

More information

Fourth generation techniques (4GT)

Fourth generation techniques (4GT) Fourth generation techniques (4GT) The term fourth generation techniques (4GT) encompasses a broad array of software tools that have one thing in common. Each enables the software engineer to specify some

More information