Data-Driven Spell Checking: The Synergy of Two Algorithms for Spelling Error Detection and Correction
Eranga Jayalatharachchi, Asanka Wasala*, Ruvan Weerasinghe
University of Colombo School of Computing, 35, Reid Avenue, Colombo 00700, Sri Lanka
*Localisation Research Centre, CSIS Department, University of Limerick, Limerick, Ireland
Contents
1. Introduction
2. Background
   Sinhala Language
   Work on Indian Languages
   Work on Sinhala
3. Methodology
   Subasa v1
   Subasa v2
4. Evaluation
5. Conclusions & Future Work
6. Demonstration
Introduction
Spell checking: the task of identifying and flagging incorrectly spelled words in a document written in a natural language.
Spell correcting: the process of replacing the misspelled words with the most likely intended ones.
Applications: word processing, optical character recognition (OCR), character recognition, speech recognition, computer-aided language learning (CALL), etc.
Introduction: Misspelled Words
Non-word errors: "It was teh wind"
Real-word errors: "My sun is a doctor"
Automatic spelling error detection and correction (Kukich 1992):
1. Non-word error detection
2. Isolated word error correction
3. Context-dependent error correction
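The difference between the two error types shows up as soon as detection is done by plain dictionary lookup. The following is a minimal sketch, assuming a tiny hypothetical English lexicon (`LEXICON`) and a helper named `find_non_words`, neither of which comes from the slides.

```python
# Toy lexicon; a hypothetical stand-in for a real word list.
LEXICON = {"it", "was", "the", "wind", "my", "son", "sun", "is", "a", "doctor"}


def find_non_words(sentence, lexicon=LEXICON):
    """Return the tokens of `sentence` that are absent from `lexicon`."""
    return [tok for tok in sentence.lower().split() if tok not in lexicon]


print(find_non_words("It was teh wind"))     # ['teh']
print(find_non_words("My sun is a doctor"))  # []
```

The non-word error "teh" is flagged, while the real-word error "sun" passes because it is a valid word; catching it requires context-dependent correction.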
Introduction
About 80% of all misspelled English words (non-word errors) in human typewritten text are due to single-error misspellings (Damerau 1964). For the word "the":
ther: insertion
teh: transposition
th: deletion
thw: substitution
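The four single-error types can be enumerated mechanically. The sketch below assumes a lowercase English alphabet and uses a hypothetical function name, `single_edit_variants`; it illustrates the error classes only and is not taken from Subasa.

```python
import string


def single_edit_variants(word, alphabet=string.ascii_lowercase):
    """Return all strings one insertion, deletion, substitution, or
    adjacent transposition away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    insertions = {a + c + b for a, b in splits for c in alphabet}
    # Substituting a letter with itself reproduces the word, so drop it.
    return (deletions | transpositions | substitutions | insertions) - {word}


variants = single_edit_variants("the")
print(all(w in variants for w in ("ther", "teh", "th", "thw")))  # True
```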
Introduction: Correction Techniques (Kukich 1992)
1. Minimum edit distance techniques
2. Similarity key techniques
3. Rule-based techniques
4. N-gram-based techniques
5. Probabilistic techniques
6. Neural nets
Introduction: Objective
To enhance Subasa, the only documented spell checker available to date for Sinhala (Wasala et al. 2010; Wasala et al. 2011).
Subasa v1: n-gram
Subasa v2: n-gram + edit distance
Introduction: N-grams
An n-gram is a sub-sequence of n items from a given sequence.
Word: intention
Letter unigrams: i n t e n t i o n
Letter bi-grams: in nt te en nt ti io on
Letter tri-grams: int nte ten ent nti tio ion
Introduction: N-gram Generating Algorithm
function get_n_grams(word, n) returns n_grams_list
    l ← length(word) - n
    n_grams_list ← empty()
    for i from 0 to l do
        n_grams_list ← append(n_grams_list, substring(word, i, n))
    return n_grams_list
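A minimal Python sketch of the same procedure, reusing the slide's function name and the example word "intention". Treating a word shorter than n as having no n-grams is an assumption made here.

```python
def get_n_grams(word, n):
    """Return every contiguous substring of length n in `word`, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


print(get_n_grams("intention", 2))
# ['in', 'nt', 'te', 'en', 'nt', 'ti', 'io', 'on']
print(get_n_grams("intention", 3))
# ['int', 'nte', 'ten', 'ent', 'nti', 'tio', 'ion']
```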
Introduction: Minimum Edit Distance
The minimum number of editing operations (insertions, deletions, substitutions) required to transform one string into another (Wagner 1974).
Introduction: Editing Operations
Example: intention → execution
Cost of edit operations: insertion = 1, deletion = 1, substitution = deletion + insertion = 1 + 1 = 2
Alignment 1: 5 substitutions; cost = 5 x 2 = 10
Alignment 2: 1 deletion, 3 substitutions, 1 insertion; cost = 1 + (3 x 2) + 1 = 8
Introduction: Minimum Edit Distance Calculation Algorithm
A dynamic programming algorithm for minimum edit-distance computation creates an edit-distance matrix M with one column for each symbol in the target sequence and one row for each symbol in the source sequence, plus one extra row and column for the empty prefix (#).
function minimum_edit_distance(source, target) returns min_distance
    m ← length(source)
    n ← length(target)
    create distance matrix M[n+1, m+1]
    M[0,0] ← 0
    for each i from 1 to n do M[i,0] ← M[i-1,0] + cost_insert(target_i)
    for each j from 1 to m do M[0,j] ← M[0,j-1] + cost_delete(source_j)
    for each column i from 1 to n do
        for each row j from 1 to m do
            M[i,j] ← min( M[i-1,j] + cost_insert(target_i),
                          M[i-1,j-1] + cost_substitute(source_j, target_i),
                          M[i,j-1] + cost_delete(source_j) )
    min_distance ← M[n,m]
Here cost_substitute(x, y) is 0 when x and y are the same symbol.
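A minimal Python sketch of this recurrence. It assumes the costs given on the Editing Operations slide (insertion 1, deletion 1, substitution 2) and zero cost when the two characters match. The parameter names `ins_cost`, `del_cost` and `sub_cost` are illustrative.

```python
def minimum_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=2):
    """Minimum edit distance via dynamic programming.

    M[i][j] is the cost of turning the first j characters of `source`
    into the first i characters of `target`.
    """
    m, n = len(source), len(target)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):      # empty source -> target prefix: insertions
        M[i][0] = M[i - 1][0] + ins_cost
    for j in range(1, m + 1):      # source prefix -> empty target: deletions
        M[0][j] = M[0][j - 1] + del_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[j - 1] == target[i - 1]
            M[i][j] = min(
                M[i - 1][j] + ins_cost,
                M[i][j - 1] + del_cost,
                M[i - 1][j - 1] + (0 if same else sub_cost),
            )
    return M[n][m]


print(minimum_edit_distance("intention", "execution"))  # 8
```

With these costs the result for "intention" and "execution" agrees with the cheaper of the two alignments, cost 8.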
Introduction: Edit Distance Matrix
[Matrix figure: rows are the source "# i n t e n t i o n", columns are the target "# e x e c u t i o n"; cell values not recovered.]
Each cell M[i,j] contains the minimum edit distance between the first i characters of the target and the first j characters of the source.
Background: Sinhala Language & Script
Majority language of Sri Lanka
Sinhala script is a derivative of the Brahmi script
Sinhala script is a syllabic script
5 pre-nasalized stops & 2 unique vowels (Nandasara, 2009)
Sinhala is a phonetic language
na-na-la-la dissention
Conjunct letters
Background: Work on Indic Languages
Non-word spelling correction for Assamese (Das et al. 2002)
  Uses similarity-key and minimum edit distance techniques
Rule-cum-dictionary-based approach for spell checking Malayalam (Santhosh et al. 2002)
Spelling correction for Tamil (Dhanabalan et al. 2003)
  Non-word error detection using simple dictionary lookups
Spell checking for Bangla (Chaudhuri 2002)
  An adaptation of a similarity-key-based technique
Background: Work on Sinhala Language
Thibus
  Commercial-grade
Mozilla Firefox Extension (addons.mozilla.org)
  Dictionary-based
OpenOffice Extension (openoffice.org)
  Uses Hunspell
Microsoft Office Word 2007 (microsoft.com)
  Via Language Interface Pack (LIP) for Sinhala
Subasa (v1) (Wasala et al. 2009; Wasala et al. 2010)
  N-gram based
  Phonetic errors
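Methodology: Subasa v1, The Process
[Process diagram, partially recovered. Worked example: the input word "kat" and the phoneme class (k, c) give the candidates "kat" and "cat".]
Methodology: Subasa v1, The Process (contd.)
[Process diagram, partially recovered. The candidates are split into letter bi-grams: kat → ka, at; cat → ca, at. With bi-gram frequencies ka = 10, ca = 20, at = 5, the scores are kat = 10 + 5 and cat = 20 + 5, and "cat" is selected.]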
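The toy example on these two slides can be traced in a few lines of Python. This is only a sketch built from the values visible on the slides (the class (k, c) and the frequencies ka = 10, ca = 20, at = 5). The names `PHONEME_CLASSES`, `BIGRAM_FREQ`, `candidates`, `score` and `best_candidate` are hypothetical, and summing bi-gram frequencies is assumed from the "10+5" and "20+5" shown; the actual Subasa v1 scoring may differ.

```python
from itertools import product

# Toy values read off the slides; a real system would derive the classes
# and the counts from a corpus.
PHONEME_CLASSES = [{"k", "c"}]
BIGRAM_FREQ = {"ka": 10, "ca": 20, "at": 5}


def candidates(word, classes=PHONEME_CLASSES):
    """Expand each letter into every member of its class (or just itself)."""
    options = []
    for ch in word:
        cls = next((c for c in classes if ch in c), {ch})
        options.append(sorted(cls))
    return ["".join(p) for p in product(*options)]


def score(word, freq=BIGRAM_FREQ):
    """Sum the frequencies of the word's letter bi-grams (unseen ones count 0)."""
    return sum(freq.get(word[i:i + 2], 0) for i in range(len(word) - 1))


def best_candidate(word):
    """Pick the highest-scoring candidate spelling."""
    return max(candidates(word), key=score)


print(candidates("kat"))           # ['cat', 'kat']
print(score("kat"), score("cat"))  # 15 25
print(best_candidate("kat"))       # cat
```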
Methodology: Subasa v1, Phoneme Classes
[Two-column table of graphemes and their phoneme classes; the Sinhala graphemes were not recovered.]
Phoneme classes listed: /k/, /g/, /tʃ/, /dʒ/, /ʈ/, /ɖ/, /t /, /d /, /p/, /b/, /n/, /l/, /s/ or /ʃ/, /ɲ/
Methodology: Subasa v1, Example
UCSC Corpus, 10 Mn words
  Word unigrams (440,021)
  Letter bi-grams (46,878)
  Letter tri-grams (166,460)
Dictionary of Sinhala Spelling (Koparahewa 2006)
Methodology: Subasa v2, The Process
[Process diagram not recovered.]
Methodology: Subasa v2, The Process: Edit Distance Module
[Diagram not recovered.]
Methodology: Subasa v2, Data
UCSC Corpus, 10 Mn words
  Word unigrams (440,021)
  Letter bi-grams (46,878)
  Letter tri-grams (166,460)
Dictionary of Sinhala Spelling (Koparahewa 2006)
Word unigrams (spell checked by Subasa v1)
Methodology: Subasa v2, New Phoneme Classes
[Table not recovered.]
Evaluation
Compared with:
  Microsoft Word 2007, with the Sinhala Language Interface Pack 2007 for Microsoft Office
  OpenOffice.org 3.2 Writer, based on Hunspell
  Subasa v1, based on n-grams from the UCSC Corpus
  Manual inspection by a linguist
Test cases:
  Test 1: Public Sinhala newspaper
  Test 2: Sinhala blog syndicator
Evaluation: Results, Test 1
6155 words from a public Sinhala newspaper
[Results table; numeric values not recovered. Columns: Incorrect Words Detected (%), Correct Words Detected (%). Rows: Word, Writer, Subasa v1, Subasa v2, Manual.]
Evaluation: Results, Test 2
4117 words extracted from a Sinhala blog syndicator
[Results table; numeric values not recovered. Columns: Incorrect Words Detected (%), Correct Words Detected (%). Rows: Word, Writer, Subasa v1, Subasa v2, Manual.]
Conclusions and Future Work: Conclusions
Subasa v2 performs much closer to manual inspection
N-gram + edit distance is better than the n-gram-only approach
Data driven
Good for languages with limited resources
Conclusions and Future Work: Future Work
Larger dictionary
Optimizations to the edit distance module
Candidate correction ranking
Word boundary analysis
Morphological analysis
Demonstration &
Improved Detections: Subasa v1 vs. Subasa v2 [screenshots not recovered]
Improved Corrections: Subasa v1 vs. Subasa v2 [screenshots not recovered]