The Development of Multimedia-Multilingual Document Storage, Retrieval and Delivery System for E-Organization (STREDEO PROJECT)


Asanee Kawtrakul, Kajornsak Julavittayanukool, Mukda Suktarachan, Patcharee Varasrai, Nathavit Buranapraphanont, Chaiwat Ketsuwan, Duangpen Jetpipattanapong, Prakorn Santiwatt, Nattakan Pengphon

Natural Language Processing and Intelligent Information Technology Research Laboratory, Department of Computer Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand

Abstract

This paper introduces a new project called STREDEO: The Development of Multimedia-Multilingual Document Storage, Retrieval and Delivery System for E-Organization. STREDEO aims to provide a system for multimedia, multilingual document management consisting of storage, retrieval and delivery. The project is divided into seven subprojects: The Development of Multimedia and Multilingual Document Storage (MUU-DOC), The Development of Document Image Processing for Indexing (DIM), The Development of Web-based Intelligent Information Retrieval (WIRE), The Development of Automatic Document Clustering and Delivery (CLUD), The Development of Multimedia Query Processing: Speech, Text and Handwriting Text (MUL-Q), The Development of Linguistic Knowledge Acquisition and Natural Language Processing Techniques (KANAL), and A Very Large Scale Multimedia Database Management Design and Integrating (INTEGRATE).

Keywords: E-Organization, Natural Language Processing, Document Image Processing for Indexing, Automatic Indexing, Automatic Document Clustering, Very Large Scale Hypermedia Storage and Delivery, Web-based Intelligent Search Engine, Linguistic Knowledge Base, Knowledge Acquisition

1. Introduction

Information technology is expanding very rapidly, particularly in the fields of communication and networking, and the volume of information is likely to continue growing exponentially. This growth creates a need to collect and manage extremely large amounts of information in different languages and media.
Table 1 shows estimates of the volume of information for different media and their growth rates.

Table 1: Worldwide production of original content, stored digitally using standard compression methods, in terabytes circa 1999 [9].

Columns: Storage Medium; Type of Content; Terabytes/Year (Upper Estimate); Terabytes/Year (Lower Estimate); Growth Rate (%).
Rows:
- Paper: books, newspapers, periodicals, office documents, total
- Film: pictures, movies, X-rays, total
- Optical information: song CDs, data CDs, DVDs, total
- Magnetic information: camcorder tape, PC disk drives, departmental servers, enterprise servers, total
- Grand total
[The numeric entries of Table 1 are not recoverable from this transcription.]

The information above can be useful to a wide range of users, from organizations to individuals. However, with so much information available, problems such as long search times or system instability can easily arise. Consequently, this information needs to be organized and managed, which includes organizing and managing the storage system, the retrieval system and the delivery system.

Information technology is already applied to storage and retrieval systems [11]. Examples include large-scale multimedia document storage [1], [6], [12], automatic indexing systems [2], [4] and automatic document clustering systems [3], [8], [10]. However, these systems were created for English and are not directly applicable to Thai, because Thai has distinctive characteristics such as the absence of spaces between words and ambiguity between a noun and a noun phrase [7], [12].

The STREDEO project aims to develop this technology and apply it to Thai storage, retrieval and delivery systems. The project will support offices that use only electronic documents and eliminate the use of paper, which helps the environment. In addition, it can make it easier to provide services and exchange information both within and outside Thailand.

2. STREDEO Overview

Figure 1 shows an overview of STREDEO. There are two types of input: text and image. Text can also be collected by a web robot. A document image is converted into text (not necessarily of high quality) before indexing, and the image itself is then kept in the data warehouse. Once information is in the document warehouse, the system performs document clustering and delivery. If a user enters a natural-language query as text, handwriting or speech, the system retrieves the relevant information or documents for the user. The STREDEO project is divided into seven subprojects, which are described in the following subsections.

Figure 1: Seven subprojects of STREDEO, including system integration. (Components shown: very large corpora; knowledge acquisition and parser; linguistic knowledge base, thesaurus and acquisition toolkit; multimedia storage with automatic indexing and storing; warehouse; intelligent search engine; web robot; conversion of scanned documents to text; clustering and delivering; speech and text query processing; electronic office; user query in natural language or speech; inputs from the Internet and from books or papers via a scanner; Divisions A to D; user.)
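The store-then-retrieve flow described in the overview, together with the earlier observation that Thai text has no spaces between words, can be illustrated with a small sketch. It is not the STREDEO implementation: the greedy longest-match segmenter, the `Warehouse` class and the tiny lexicon are assumptions made for illustration, and unsegmented Latin strings stand in for Thai text.

```python
from collections import defaultdict


def segment(text, lexicon):
    """Greedy longest-match segmentation of text that has no word spaces.

    `lexicon` is a non-empty set of known words (hypothetical here).
    Characters that match no lexicon entry are emitted one at a time.
    """
    max_len = max(len(word) for word in lexicon)
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon or size == 1:
                words.append(piece)
                i += size
                break
    return words


class Warehouse:
    """Toy document warehouse: stores texts and an inverted index over words."""

    def __init__(self, lexicon):
        self.lexicon = lexicon
        self.documents = {}
        self.index = defaultdict(set)  # word -> ids of documents containing it

    def store(self, doc_id, text):
        self.documents[doc_id] = text
        for word in segment(text, self.lexicon):
            self.index[word].add(doc_id)

    def retrieve(self, query):
        """Return ids of documents that contain every word of the query."""
        words = segment(query, self.lexicon)
        if not words:
            return set()
        hits = [self.index.get(word, set()) for word in words]
        return set.intersection(*hits)


lexicon = {"internet", "address", "network", "mail"}
warehouse = Warehouse(lexicon)
warehouse.store("d1", "internetaddress")
warehouse.store("d2", "networkmail")
```

With these two documents stored, `warehouse.retrieve("address")` returns only `d1`. A real system would replace the lexicon lookup with the morphological and noun phrase analysis that the subprojects describe.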

2.1. The Development of Multimedia and Multilingual Document Storage (MUU-DOC)

MUU-DOC is an important subsystem of the project. Its main function is to analyze the information in a document or document image for indexing and storing. Figure 2 shows the scope of MUU-DOC.

Figure 2: The Development of Multimedia and Multilingual Document Storage (MUU-DOC). (Components shown: input from electronic text and the electronic office; morphological analysis; noun phrase analysis; automatic indexing; index representation; automatic analyzing and storing; warehouse.)

2.2. The Development of Document Image Processing for Indexing (DIM)

Without automation, text from books or papers that is to be indexed and stored in the corpus must be typed manually, which is time-consuming and tedious. DIM is the part of MUU-DOC that roughly analyzes and recognizes the text in scanned document images, making it more convenient to index a large number of documents. It greatly reduces the time and human effort spent typing text, which speeds up feeding data into the Multimedia and Multilingual Document Storage. DIM has four main tasks:

- Image improvement, to correct scanning problems.
- Layout analysis, to distinguish text regions from picture regions.
- Character segmentation, to separate characters that are connected because of scanning or the character font.
- Character recognition, to convert text images into text characters.
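As a rough illustration of the segmentation tasks listed above, the sketch below splits a binarized page into text lines and then into character candidates using projection profiles. This is a common baseline technique and only an assumption here, not necessarily the method used in DIM; the function names are hypothetical, and characters that touch each other (the case the task list singles out) would need additional handling beyond this sketch.

```python
def runs(profile):
    """Return (start, end) pairs, end exclusive, for runs of non-zero values."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ended at a blank entry
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(image):
    """Find text lines: rows of a binary image (1 = ink) that contain ink."""
    row_profile = [sum(row) for row in image]
    return runs(row_profile)


def segment_columns(image, top, bottom):
    """Within rows top..bottom-1, find column spans that contain ink."""
    width = len(image[0])
    col_profile = [
        sum(image[r][c] for r in range(top, bottom)) for c in range(width)
    ]
    return runs(col_profile)


# A tiny 5x6 binary "page" with two text lines.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
lines = segment_lines(image)
```

Each `(top, bottom)` pair in `lines` can be passed to `segment_columns` to obtain candidate character boxes for the recognition step.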

The process of converting images of typed Thai characters into a text document uses syntactic methods, fuzzy logic and feature extraction. To make the system more practical, this subproject is designed to cover not only character recognition but also image processing and character segmentation. Figure 3 shows an example of such a system.

Figure 3: Document image processing system. (Stages shown: document image, image transformation, image improvement, layout analysis, line segmentation, character segmentation, character recognition, text document.)

2.3. The Development of Web-based Intelligent Information Retrieval (WIRE)

The growth of information technology and of Internet use is causing the number of electronic documents to increase exponentially. Consequently, searching for information is a nontrivial problem, and a web-based intelligent information retrieval system, called WIRE, is needed. WIRE is a prototype system capable of searching bilingual (Thai-English) text. It is divided into two parts: a query processing system and a searching system.

The query processing system transforms the user's query words into a multilevel query with levels such as the word level, the phrase level and the sentence level. For example, for the query "What is an internet address?", the query processing system generates "internet address" at the phrase level and "networking" at the conceptual level. In addition, the query processing system lets a user phrase the query in different ways, for example "address of internet" or "address on internet", and still obtain the same result.

Because the query processing system produces multilevel queries, the searching system must also be able to search at multiple levels. It does so by searching first at the phrase level, then at the word level and finally at the conceptual level.
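The multilevel query idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not WIRE's actual design: the stop-word list, the `CONCEPTS` thesaurus and the order-insensitive phrase matching (used so that "address of internet" behaves like "internet address") are all hypothetical choices.

```python
import re

# Hypothetical resources, for illustration only.
STOP_WORDS = {"what", "is", "an", "a", "the", "of", "on"}
CONCEPTS = {
    "internet": "networking",
    "address": "networking",
    "router": "networking",
    "protocol": "networking",
}


def content_words(text):
    """Lower-case the text and drop stop words and punctuation."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_query(text):
    """Turn a user query into phrase-level, word-level and concept-level parts."""
    words = content_words(text)
    return {
        "phrase": sorted(words),  # order-insensitive phrase key
        "words": set(words),
        "concepts": {CONCEPTS[w] for w in words if w in CONCEPTS},
    }


def match_level(query, doc_text):
    """Return 0 (phrase), 1 (word), 2 (concept) or None if nothing matches."""
    doc_words = content_words(doc_text)
    n = len(query["phrase"])
    windows = (doc_words[i:i + n] for i in range(len(doc_words) - n + 1))
    if n and any(sorted(window) == query["phrase"] for window in windows):
        return 0
    if query["words"] & set(doc_words):
        return 1
    if query["concepts"] & {CONCEPTS.get(w) for w in doc_words}:
        return 2
    return None


def search(text, documents):
    """Rank documents: phrase matches first, then word, then concept matches."""
    query = build_query(text)
    ranked = []
    for doc_id, doc_text in documents.items():
        level = match_level(query, doc_text)
        if level is not None:
            ranked.append((level, doc_id))
    return [doc_id for level, doc_id in sorted(ranked)]


documents = {
    "d1": "A router uses a routing protocol",
    "d2": "The address of the internet host is numeric",
    "d3": "Write your postal address on the envelope",
    "d4": "Rice is grown in Thailand",
}
results = search("What is an internet address?", documents)
```

In this toy collection the phrase-level match (`d2`) ranks ahead of the word-level match (`d3`) and the concept-level match (`d1`), and the rephrased queries from the text give the same ranking.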

2.4. The Development of Automatic Document Clustering and Delivery (CLUD)

As mentioned above, the growth of information technology and of Internet use is causing the number of electronic documents to increase exponentially, so electronic documents need to be arranged into groups. Done by hand, this task is time-consuming, ineffective and very tedious. There should therefore be a system that can cluster electronic documents automatically, effectively and accurately [10]. In addition to document clustering, the system can forward each document to the right users.

2.5. The Development of Multimedia Query Processing: Speech, Text and Handwriting Text (MUL-Q)

Today, all input queries are entered using a keyboard. To make the system friendlier, MUL-Q is proposed as a multimedia query processing system that allows users to submit speech and handwriting as input queries to STREDEO. The speech component is limited to recognizing discontinuous speech with domain-based vocabularies. Queries can also be handwritten. Handwritten character recognition (HCR) is more difficult than OCR, so this project is limited to processing neat handwriting.

2.6. The Development of Linguistic Knowledge Acquisition and Natural Language Processing Techniques (KANAL)

Research in natural language processing is important to the development of document processing because it leads to a better understanding of human language. This subproject aims to develop a linguistic knowledge acquisition system and natural language processing techniques to support document processing in indexing, clustering and querying.

2.7. A Very Large Scale Multimedia Database Management Design and Integrating (INTEGRATE)

Developing software and databases for very large-scale multimedia involves many problems, for example connecting the modules together and controlling the schedule and quality of each module. Because the STREDEO project has seven subprojects, such problems will occur unless the work is well planned. The objective of this subproject is therefore to design and develop the software architecture, plan the development direction, and provide plug-in modules, testing and maintenance services via the network by applying software engineering techniques.

3. Conclusion

Information technology today shows a clear need to store, query, search, retrieve and deliver large amounts of electronic information efficiently and accurately. This paper introduces the STREDEO project, which addresses the growing number of electronic documents. The STREDEO project consists of seven subprojects. The first subproject, MUU-DOC, focuses on multimedia and multilingual document storage. The second, DIM, focuses on a document image processing system for indexing. The third, WIRE, focuses on web-based intelligent information retrieval. The fourth, CLUD, focuses on automatic document clustering and delivery. The fifth, MUL-Q, focuses on multimedia query processing for speech, text and handwritten text. The sixth, KANAL, focuses on linguistic knowledge acquisition and natural language processing techniques. The last, INTEGRATE, focuses on very large scale multimedia database management design and on integrating STREDEO.

4. References

[1] Andres, F. 2000, Active Hypermedia Delivery and PHASME Information Engine, In Proceedings of AdInfo2000, First International Symposium on Advanced Informatics 1: pp7-44.
[2] Chengxing, Z. 1995, Evaluation of syntactic phrase indexing: CLARIT NLP Track Report, Text Retrieval Conference 4, New York, p5.
[3] Cohen, W. W. 1996, Learning rules that classify e-mail, In Proceedings of the 1996 AAAI Spring Symposium on Machine Learning in Information Access, pp18-5.
[4] Dik, L. 1997, Information storage and retrieval, 2nd ed., Prentice Hall Publishing Company, New York. 40 p.
[5] Kawtrakul, A. et al. 2000, Multi-Feature Extraction for Printed Thai Character, SNLP 2000 Symposium of Natural Language Processing.
[6] Kawtrakul, A. et al. 2000, Toward on Enhancement of Textual Database Retrieval by Using NLP Technique, NECTEC Technical Journal, Vol.11 No.7 March-June, 2000.
[7] Kawtrakul, A. and Thumkanon, C. 1997, A statistical Approach for Thai Morphological International Conference, China.
[8] Lang, K. 1995, NewsWeeder: learning to filter netnews, In Proceedings of ICML-95, 12th International Conference on Machine Learning 1, pp1-9.
[9] Lyman, P. and Varian, H. R. 1999, How much Information? [online]
[10] Sebastiani, F. 1999, A Tutorial on Automated Text Categorisation, In Analia Amandi and Alejandro Zunino (eds.), Proceedings of ASAI-99, 1st Argentinian Symposium on Artificial Intelligence, Buenos Aires, AR, pp7-5.
[11] Frakes, W. B. and Baeza-Yates, R. 1992, Information Retrieval: Data Structures & Algorithms, Prentice Hall, Englewood Cliffs, New Jersey. 504 p.
[12] Kawtrakul, A., Andres, F., Ono, K. et al. 2000, The Implementation of VLSHDS Project for Thai Retrieval, In Proceedings of the First International Symposium on Advanced Informatics, Tokyo, Japan.


More information

MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts

MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts Julio Villena-Román 1,3, Sara Lana-Serrano 2,3 1 Universidad Carlos III de Madrid 2 Universidad Politécnica de Madrid 3 DAEDALUS

More information

LONG BEACH CITY COLLEGE MEMORANDUM

LONG BEACH CITY COLLEGE MEMORANDUM LONG BEACH CITY COLLEGE MEMORANDUM DATE: May 5, 2000 TO: Academic Senate Equivalency Committee FROM: John Hugunin Department Head for CBIS SUBJECT: Equivalency statement for Computer Science Instructor

More information

Web-based Multimedia Content Management System for Effective News Personalization on Interactive Broadcasting

Web-based Multimedia Content Management System for Effective News Personalization on Interactive Broadcasting Web-based Multimedia Content Management System for Effective News Personalization on Interactive Broadcasting S.N.CHEONG AZHAR K.M. M. HANMANDLU Faculty Of Engineering, Multimedia University, Jalan Multimedia,

More information

Quality control is performed after working with text and focuses on finding and removing any technical, linguistic and editorial errors.

Quality control is performed after working with text and focuses on finding and removing any technical, linguistic and editorial errors. TRANSLATION AS A PRODUCT TRANSLATION PROJECT MANAGEMENT IN A MODERN COMPANY Each translation project is coordinated by a Project Manager who is responsible for meeting the requirements for all aspects

More information

DEVELOPMENT OF NATURAL LANGUAGE INTERFACE TO RELATIONAL DATABASES

DEVELOPMENT OF NATURAL LANGUAGE INTERFACE TO RELATIONAL DATABASES DEVELOPMENT OF NATURAL LANGUAGE INTERFACE TO RELATIONAL DATABASES C. Nancy * and Sha Sha Ali # Student of M.Tech, Bharath College Of Engineering And Technology For Women, Andhra Pradesh, India # Department

More information

NATIONAL SUN YAT-SEN UNIVERSITY

NATIONAL SUN YAT-SEN UNIVERSITY NATIONAL SUN YAT-SEN UNIVERSITY Department of Electrical Engineering (Master s Degree, Doctoral Program Course, International Master's Program in Electric Power Engineering) Course Structure Course Structures

More information

Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery

Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery Transformation of Free-text Electronic Health Records for Efficient Information Retrieval and Support of Knowledge Discovery Jan Paralic, Peter Smatana Technical University of Kosice, Slovakia Center for

More information

Figure 1: The OCR result of a text block generated by a commercial OCR system, TypeReader 3.0 from ExperVision Inc.. In the graphical user interface f

Figure 1: The OCR result of a text block generated by a commercial OCR system, TypeReader 3.0 from ExperVision Inc.. In the graphical user interface f REPRESENTING OCRED DOCUMENTS IN HTML Tao Hong and Sargur N. Srihari Center of Excellence for Document Analysis and Recognition State University of New York at Bualo Bualo, New York 14228 email: ftaohong,sriharig@cedar.buffalo.edu

More information

Machine Learning: Overview

Machine Learning: Overview Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave

More information

Master of Science in Computer Science

Master of Science in Computer Science Master of Science in Computer Science Background/Rationale The MSCS program aims to provide both breadth and depth of knowledge in the concepts and techniques related to the theory, design, implementation,

More information

Graphical Web based Tool for Generating Query from Star Schema

Graphical Web based Tool for Generating Query from Star Schema Graphical Web based Tool for Generating Query from Star Schema Mohammed Anbar a, Ku Ruhana Ku-Mahamud b a College of Arts and Sciences Universiti Utara Malaysia, 0600 Sintok, Kedah, Malaysia Tel: 604-2449604

More information

A Web Prefetching Model Based on Content Analysis

A Web Prefetching Model Based on Content Analysis A Web Prefetching Model Based on Content Analysis O Kit Hong, Fiona Robert P. Biuk-Aghai Faculty of Science and Technology University of Macau {csb.fiona fst.robert}@umac.mo Abstract Web-accessible resources

More information

How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.

How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD. Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.

More information

Using Artificial Intelligence to Manage Big Data for Litigation

Using Artificial Intelligence to Manage Big Data for Litigation FEBRUARY 3 5, 2015 / THE HILTON NEW YORK Using Artificial Intelligence to Manage Big Data for Litigation Understanding Artificial Intelligence to Make better decisions Improve the process Allay the fear

More information

Online Multilingual Translation of Technical Service Reports over the World Wide Web

Online Multilingual Translation of Technical Service Reports over the World Wide Web Online Multilingual Translation of Technical Service Reports over the World Wide Web S. Liu, S.C. Hui, S. Foo and P.C. Leong School of Applied Science, Nanyang Technological University Nanyang Avenue,

More information

Arabic E-learning and Computer Tools

Arabic E-learning and Computer Tools Arabic E-learning and Computer Tools Prof. Oleg Redkin, Dr. Olga Bernikova Department of Asian and African Studies, St. Petersburg State University, St. Petersburg, Russia Abstract This paper reviews software

More information

Bachelor Degree in Informatics Engineering Master courses

Bachelor Degree in Informatics Engineering Master courses Bachelor Degree in Informatics Engineering Master courses Donostia School of Informatics The University of the Basque Country, UPV/EHU For more information: Universidad del País Vasco / Euskal Herriko

More information

Master of Science (Electrical Engineering) MS(EE)

Master of Science (Electrical Engineering) MS(EE) Master of Science (Electrical Engineering) MS(EE) 1. Mission Statement: The mission of the Electrical Engineering Department is to provide quality education to prepare students who will play a significant

More information

Decision Support and Business Intelligence Systems. Chapter 1: Decision Support Systems and Business Intelligence

Decision Support and Business Intelligence Systems. Chapter 1: Decision Support Systems and Business Intelligence Decision Support and Business Intelligence Systems Chapter 1: Decision Support Systems and Business Intelligence Types of DSS Two major types: Model-oriented DSS Data-oriented DSS Evolution of DSS into

More information

Web Page Categorization based on Document Structure

Web Page Categorization based on Document Structure 1 Web Page Categorization based on Document Structure Arul Prakash Asirvatham arul@gdit.iiit.net Kranthi Kumar. Ravi kranthi@gdit.iiit.net Centre for Visual Information Technology International Institute

More information

Designed for expert users and translation Expert 8.0 offers all the features Professional plus:

Designed for expert users and translation Expert 8.0 offers all the features Professional plus: @promt Expert 8.0 Short description @promt Expert 8.0 Designed for expert users and translation companies, @promt Expert 8.0 offers all the features of @promt Professional plus: Integration with TM TRADOS

More information

A Matlab Project in Optical Character Recognition (OCR)

A Matlab Project in Optical Character Recognition (OCR) A Matlab Project in Optical Character Recognition (OCR) Jesse Hansen Introduction: What is OCR? The goal of Optical Character Recognition (OCR) is to classify optical patterns (often contained in a digital

More information

Opinion Mining and Preferences Mining in Mobile Search

Opinion Mining and Preferences Mining in Mobile Search Opinion Mining and Preferences Mining in Mobile Search P. Anandajayam 1, D. Ashok Kumar 2. Asst.Professor, Dept. of Computer Science, MANAKULA VINAYAGAR INSTITUTE OF TECHNOLOGY, India 1. PG Scholar, MANAKULA

More information

GLOCAL VALUE SRL DISTRIBUTORE PER L ITALIA. Sales Guide V3.3

GLOCAL VALUE SRL DISTRIBUTORE PER L ITALIA. Sales Guide V3.3 GLOCAL VALUE SRL DISTRIBUTORE PER L ITALIA Sales Guide V3.3 What is Scanshare? The document business critical data, currently locked in paper form or received digitally The MFD the on ramp to an organisation

More information

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2 Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue

More information

Unit Objectives. Input / Output Ports. Component 4: Introduction to Information and Computer Science. Unit 3: Computer Hardware & Architecture

Unit Objectives. Input / Output Ports. Component 4: Introduction to Information and Computer Science. Unit 3: Computer Hardware & Architecture Component 4: Introduction to Information and Computer Science Unit 3: Computer Hardware & Architecture Lecture 2 This material was developed by Oregon Health & Science University, funded by the Department

More information

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it

Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it Web Mining Margherita Berardi LACAM Dipartimento di Informatica Università degli Studi di Bari berardi@di.uniba.it Bari, 24 Aprile 2003 Overview Introduction Knowledge discovery from text (Web Content

More information

INFORMATION LOGISTICS VERSUS SEARCH. How context-sensitive information retrieval saves time spent reaching goals

INFORMATION LOGISTICS VERSUS SEARCH. How context-sensitive information retrieval saves time spent reaching goals INFORMATION LOGISTICS VERSUS SEARCH How context-sensitive information retrieval saves time spent reaching goals 2 Information logictics versus search Table of contents Page Topic 3 Search 3 Basic methodology

More information

Expert System and Knowledge Management for Software Developer in Software Companies

Expert System and Knowledge Management for Software Developer in Software Companies Expert System and Knowledge Management for Software Developer in Software Companies 1 M.S.Josephine, 2 V.Jeyabalaraja 1 Dept. of MCA, Dr.MGR University, Chennai. 2 Dept.of MCA, Velammal Engg.College,Chennai.

More information

DATA WAREHOUSE AND DATA MINING NECCESSITY OR USELESS INVESTMENT

DATA WAREHOUSE AND DATA MINING NECCESSITY OR USELESS INVESTMENT Scientific Bulletin Economic Sciences, Vol. 9 (15) - Information technology - DATA WAREHOUSE AND DATA MINING NECCESSITY OR USELESS INVESTMENT Associate Professor, Ph.D. Emil BURTESCU University of Pitesti,

More information

SNMP, CMIP based Distributed Heterogeneous Network Management using WBEM Gateway Enabled Integration Approach

SNMP, CMIP based Distributed Heterogeneous Network Management using WBEM Gateway Enabled Integration Approach , CMIP based Distributed Heterogeneous Network Management using WBEM Gateway Enabled Integration Approach Manvi Mishra Dept. of Information Technology, SRMSCET Bareilly (U.P.), India S.S. Bedi Dept of

More information

Fuzzy Knowledge Base System for Fault Tracing of Marine Diesel Engine

Fuzzy Knowledge Base System for Fault Tracing of Marine Diesel Engine Fuzzy Knowledge Base System for Fault Tracing of Marine Diesel Engine 99 Fuzzy Knowledge Base System for Fault Tracing of Marine Diesel Engine Faculty of Computers and Information Menufiya University-Shabin

More information

Computers in Your Future - 4th Edition, by Pfaffenberger. Publisher: Prentice Hall. ISBN:

Computers in Your Future - 4th Edition, by Pfaffenberger. Publisher: Prentice Hall. ISBN: Introduction to Information Technology ITP 101x (4 Units) Objective Introduction to computer hardware, operating systems, networks, programming. Survey of application software in business and industry.

More information

Knowledge as a Service for Agriculture Domain

Knowledge as a Service for Agriculture Domain Knowledge as a Service for Agriculture Domain Asanee Kawtrakul Abstract Three key issues for providing knowledge services are how to improve the access of unstructured and scattered information for the

More information

Design for Management Information System Based on Internet of Things

Design for Management Information System Based on Internet of Things Design for Management Information System Based on Internet of Things * School of Computer Science, Sichuan University of Science & Engineering, Zigong Sichuan 643000, PR China, 413789256@qq.com Abstract

More information

Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework

Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework Usha Nandini D 1, Anish Gracias J 2 1 ushaduraisamy@yahoo.co.in 2 anishgracias@gmail.com Abstract A vast amount of assorted

More information

Index Terms: Online Ticket Resolving System (OTRS), Network Operation Center(NOCs), Incident Management(INC),

Index Terms: Online Ticket Resolving System (OTRS), Network Operation Center(NOCs), Incident Management(INC), Survey Paper On Resolving Trouble-Ticket System Vikas Kumar Gupta, Ashwin Rajpurohit,Prakhyat Sapkale, Gajanan Chainpure. Mr Kalyan Bamne Information Technology Department, Savitribai Phule Pune University.

More information

A Novel Parallel Architecture Design of Information Retrieval System for Scientific Papers

A Novel Parallel Architecture Design of Information Retrieval System for Scientific Papers A Novel Parallel Architecture Design of Information Retrieval System for Scientific Papers Aziz Murtazaev 1, Sanggil Kang 2 and Sangyoon Oh 3 1 Samsung Electronics, Suwon, South Korea 2 Department of Computer

More information

Chapter 3 Application Software

Chapter 3 Application Software Chapter 3 Application Software Chapter 3 Objectives Identify the categories of application software Explain ways software is distributed Explain how to work with application software Identify the key features

More information

Experiments on Chinese-English Cross-language Retrieval at NTCIR-4

Experiments on Chinese-English Cross-language Retrieval at NTCIR-4 Working Notes of NTCIR-4, Tokyo, 2-4 June 2004 Experiments on Chinese-English Cross-language Retrieval at NTCIR-4 Yilu Zhou 1, Jialun Qin 1, Michael Chau 2, Hsinchun Chen 1 1 Department of Management Information

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

More information

Data Integration using Agent based Mediator-Wrapper Architecture. Tutorial Report For Agent Based Software Engineering (SENG 609.

Data Integration using Agent based Mediator-Wrapper Architecture. Tutorial Report For Agent Based Software Engineering (SENG 609. Data Integration using Agent based Mediator-Wrapper Architecture Tutorial Report For Agent Based Software Engineering (SENG 609.22) Presented by: George Shi Course Instructor: Dr. Behrouz H. Far December

More information

Special Topics in Computer Science

Special Topics in Computer Science Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS

More information

Bilingual Dialogs with a Network Operating System

Bilingual Dialogs with a Network Operating System From:MAICS-99 Proceedings. Copyright 1999, AAAI (www.aaai.org). All rights reserved. Bilingual Dialogs with a Network Operating System Emad Al-Shawakfa, Computer Science Department, Illinois Institute

More information

Cross-Cultural Communication Training for Students in Multidisciplinary Research Area of Biomedical Engineering

Cross-Cultural Communication Training for Students in Multidisciplinary Research Area of Biomedical Engineering Cross-Cultural Communication Training for Students in Multidisciplinary Research Area of Biomedical Engineering Shigehiro HASHIMOTO Biomedical Engineering, Department of Mechanical Engineering, Kogakuin

More information