STRETCH : A System for Document Storage and Retrieval by Content

E. Appiani, L. Boato, S. Bruzzo, A.M. Colla, M. Davite and D. Sciarra
RES Department, Elsag spa, Via G. Puccini, Genova (Italy)
{enrico.appiani,luisa.boato,sandra.bruzzo,annamaria.colla,marco.davite,donatella.sciarra}@elsag.it

Abstract

This paper describes a system for storing and retrieving imaged multimedia documents by content. The system is being developed within the Esprit project STRETCH (STorage and RETrieval by Content of imaged documents). The core of the STRETCH system is a powerful Archiving and Retrieval Engine, based on a structured document representation and capable of activating appropriate methods to characterise and automatically index heterogeneous documents with variable layout, and subsequently to retrieve them by answering complex queries. The resulting document base, or Docu-base, relies on an object-oriented internal representation and on related characterisation and search methods. A prototype was implemented and successfully tested, in particular in the creation of an invoice archive.

1. Introduction

STRETCH (STorage and RETrieval by Content of imaged documents), ESPRIT Project n , aims at developing a system for storing and retrieving imaged multimedia documents based on their content. STRETCH addresses the Archive Reference System (ARS) market, which covers heterogeneous applications involving mass documentary databases. Driven both by new technology developments and by the growing need to improve information diffusion and communication efficiency for enterprises, specialised user communities and the public, there is an ever-increasing demand for tools that automatically convert information held on paper into digital information (the "zero paper" option). The objective of STRETCH is twofold.

First, STRETCH aims at combining direct digitalisation, mostly based on the location of information fields and OCR, with image indexing open to multimedia, by applying advanced techniques derived from Image Analysis and Pattern Recognition. STRETCH aims at developing a common Archiving and Retrieval shell based on a structured document representation and capable of activating appropriate functions to characterise multimedia documents and subsequently retrieve them on user demand. To make such a system effective, the bottleneck of document profiling must be avoided, in particular by overcoming the existing limitations of pre-defined indexing schemes.

Second, STRETCH must overcome the main limitations of current ARS systems, offering in particular ease of use and programming, and the ability to adapt dynamically to generic multimedia documents.

The STRETCH goal of realizing a document archive according to the user's view requires the integration of innovative modeling techniques with well-established automatic indexing techniques to enable content-based retrieval. The core technology employed is document processing: image enhancement, layout analysis, field location, logo location and recognition, tag identification, and Intelligent Character Recognition (ICR). The structure of each document is derived, the document is classified according to the user specification, and the information content is extracted from the relevant fields. A document representation suited to this complex processing has been designed accordingly, along with the corresponding database representation. In this sense, STRETCH may be regarded as a document meta-engine [1] that introduces document logic into existing databases focused on document fields. The document logic, based on the STRETCH Document Internal Representation (DIR), is employed for document analysis and classification, information extraction, indexing and retrieval.
STRETCH is being tested in three different environments, which are representative examples of the application fields addressed: an accounts payable archive (invoices and related documents: bills of entry, transport documents, and so on), a document archive for the Public Administration (circular letters, statistical reports, and so on), and a medical image archive (myocardial SPECT maps, thorax radiographs). STRETCH started on December 1st. At the time of writing, STRETCH has successfully completed the user requirement capture phase and has consolidated the technical specifications of the data model and of the overall system architecture. The detailed design and implementation of the system, based on an object-oriented (OO) approach and incremental prototype production, have been progressively revised and refined during development, following an iterative assessment of the level of user satisfaction. A demonstration system, endowed with most of the planned functionality, has been produced.

This paper outlines the STRETCH architecture, functionality and achievements to date. In the following we briefly present the system architecture and components (Section 2), the data model (Section 3), and some preliminary experiments in the invoice domain (Section 4). Finally, Section 5 presents some conclusions.

2. Architecture and main components

From an architectural point of view, the project relies on a client/server solution, networked through corporate Intranets and (possibly) the Internet. The main achievement is the development of a powerful Archiving and Retrieval Engine (ARE), based on a general document representation and capable of activating appropriate methods to characterise and retrieve imaged documents. The user interface is a portable thin client. The STRETCH architecture consists of: a Client layer, a portable, user-friendly and intuitive Graphical User Interface (GUI); a Server layer, a scalable Archiving and Retrieval Engine (ARE); and a Database layer, a structured document database, or Docu-base.

Figure 1 summarizes the main functional components in the Client, Server and Docu-base layers. A standard CORBA middleware links all the components in the Client and Server layers, ensuring interoperability among heterogeneous platforms. This choice makes it possible to plug in any CORBA-compliant commercial component directly, while a CORBA wrapper may be implemented for components that are not CORBA compliant.
The STRETCH Client, also called Docu-client, consists of the STRETCH GUI with related tools. Any other External Application client (for instance, the GUI of an external application extended with STRETCH access) can use the STRETCH server components through their CORBA interfaces. The GUI provides all the ARE services in a friendly way through extensive use of windows, pop-up menus, buttons, icons and thumbnails, as well as context-sensitive menus and help windows. In particular, the GUI is used to manage and monitor user sessions and profiles; to let the user define new applications and document classes (Maintenance and Definition Tool); to acquire new documents directly from a scanner or other acquisition sources; to display archived or retrieved documents; and to query the archive.

The STRETCH Server, also called Docu-server, is centered on the STRETCH Archiving and Retrieval Engine (ARE), which uses the Document Internal Representation (DIR). Because most of the supported functions have high performance and scalability requirements, the ARE can rely on distributed configurations including, when necessary, parallel machines. The ARE may also interoperate with specialized External Archiving/Retrieval Engines and External Applications, integrated as CORBA components.

The STRETCH Database, also called Docu-base, includes the main DBMS storing the DIR instances. These may contain references to External Databases, if required by application constraints, or to external specialized Archiving/Retrieval Engines. For example, the Docu-base can archive the document description while the external databases store the document images or specific fields.

The STRETCH Maintenance and Definition Tool (MDT) consists of both Client and Server components, in charge of application definition, user profile management and system configuration. STRETCH is an open system, thanks to the standard middleware and the interoperability with external components and applications.
Examples of support or external components are a text retrieval engine or an ERP system. The basic implementation choices are: Java for Client objects and for top-level Server session objects; C++ for Server internal service objects (DIR Objects) and the ARE Manager, in particular for code efficiency; and an OODB schema based on the DIR and User Profile classes, which are the permanent objects in the system. DIR methods relevant for retrieval are defined both in the server objects and in the OODB.

3. The data model

Any document can be described from two different aspects: the physical and the logical one. The physical structure of a document, also called layout structure, is the collection of objects obtained by repeatedly partitioning the document content into increasingly smaller parts (basic objects) on the basis of the layout appearance. An object of the layout structure is also called a physical object. Similarly, the logical structure of a document is the collection of objects obtained by repeatedly dividing the document content into increasingly smaller parts on the basis of the human-perceptible meaning of the content. An object of the logical structure is called a logical object.
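As an illustration of the two views just described, the following minimal Python sketch models a physical (layout) tree and a logical tree over the same page. The class names, fields, coordinates and values are hypothetical and chosen only for this example; they are not the STRETCH DIR classes, which the paper states are implemented in C++.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PhysicalObject:
    """A rectangular region of the page image (layout structure)."""
    name: str
    box: Tuple[int, int, int, int]  # (top, left, bottom, right) in pixels
    children: List["PhysicalObject"] = field(default_factory=list)


@dataclass
class LogicalObject:
    """A unit of meaning (logical structure), optionally tied to a region."""
    label: str
    value: Optional[str] = None
    region: Optional[PhysicalObject] = None
    children: List["LogicalObject"] = field(default_factory=list)


def basic_objects(obj):
    """Return the leaves reached by repeatedly partitioning `obj`."""
    if not obj.children:
        return [obj]
    leaves = []
    for child in obj.children:
        leaves.extend(basic_objects(child))
    return leaves


# Physical view: the page is split into blocks by layout appearance.
# All coordinates below are made-up illustrative values.
logo = PhysicalObject("logo", (0, 0, 100, 300))
address = PhysicalObject("address", (0, 300, 100, 800))
total_box = PhysicalObject("total_box", (900, 500, 950, 800))
page = PhysicalObject("page", (0, 0, 1000, 800), [
    PhysicalObject("header", (0, 0, 100, 800), [logo, address]),
    total_box,
])

# Logical view: the same content, split by meaning (illustrative values).
invoice = LogicalObject("Invoice", children=[
    LogicalObject("Supplier", value="ACME", region=logo),
    LogicalObject("Total", value="1200", region=total_box),
])
```

Keeping the two trees separate, with logical objects pointing at the physical regions that carry them, is one simple way to let the same logical field sit in different places on different layouts.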

A domain of documents can be defined as a group of documents that can be clustered with respect to their subject or use according to the user's view: for example, journals, tax forms, business letters, invoices and check forms can be regarded as different domains. Since the documents of a domain share the same main subject and are used for similar or related functions, they are characterized by some logical and physical similarities. Some logical objects are common to all the documents of the domain. For example, in an invoice the logical object Total, which contains the amount due to the issuer, is always present. Similarly, a logical object that appears in different types of documents of the same domain usually keeps some physical features related to its position within the document. Documents belonging to a given domain can be further characterized by different layout or logical structures. Thus documents that share physical or contextual similarities can be clustered into classes.

The internal data structure used to represent documents, the Document Internal Representation (DIR), is defined in terms of class objects, with their relevant information and the methods that process this information. The DIR objects describe a document from the physical and logical viewpoints. For most of the applications considered by STRETCH, the DIR data structure is based on Modified X-Y trees (Sect. 4.1). The DIR is the core system data, since the Archiving and Retrieval Engine uses it during archiving (DIR generation) and retrieval (access and feature matching). The domain representation is based on a similar approach: domain objects are template structures and fields whose methods are the strategies used to process new documents and extract information from them. To implement domain knowledge relevant to specific documents and types of physical structures, we use a template structure named correlation graph (Sect. 4.2).
The correlation graph makes it possible to implement information extraction strategies based on advanced image recognition and reading technologies.

Modified X-Y tree

The Modified X-Y tree (M-X-Y tree) [2] is derived from the X-Y tree [3,4], a well-known data-driven method for page layout analysis. The M-X-Y tree is well suited to the physical representation of documents with complex layout. The basic assumption behind this approach is that the structured elements of the page (columns, paragraphs, titles, figures, lines of text, printed symbols) are generally laid out in rectangular blocks, which can almost always be divided into groups in such a way that blocks adjacent to one another within a group have one dimension in common [3].

The method uses thresholded projection profiles to split the document into successively smaller rectangular blocks [4]. A projection profile is the histogram of the number of black pixels along parallel lines through the document; depending on the direction of the lines, it is a horizontal or a vertical projection profile. A thresholded projection profile is obtained by comparing the values of a projection profile with a given threshold. The blocks are split by alternately making horizontal and vertical cuts, either along white spaces, found by using the thresholded projection profile, or along horizontal or vertical ruler lines.

The result of this segmentation can be represented as a tree, where the root stands for the whole page, the leaves stand for blocks of the page, and each level alternately represents the result of a horizontal (X-cut) or vertical (Y-cut) segmentation. To keep the data representation consistent, the ruler lines, although used as separators, are also stored as leaves in the M-X-Y tree. The tree structure is enriched with descriptions of the relationships among leaves.
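The recursive splitting on thresholded projection profiles described above can be sketched as follows. This is a minimal illustration of a plain X-Y cut on white space only, assuming a binary image given as a list of rows with 1 for a black pixel; it does not cover the ruler-line handling of the M-X-Y tree, and all names (`Node`, `xy_cut`, `threshold`, and so on) are hypothetical rather than taken from the STRETCH implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


@dataclass
class Node:
    box: Box
    cut: str = "leaf"  # "H": split by horizontal cuts, "V": vertical, "leaf"
    children: List["Node"] = field(default_factory=list)


def projection_profile(img, box, horizontal):
    """Count black pixels per row (horizontal) or per column (vertical)."""
    top, left, bottom, right = box
    if horizontal:
        return [sum(img[r][left:right]) for r in range(top, bottom)]
    return [sum(img[r][c] for r in range(top, bottom))
            for c in range(left, right)]


def runs_above(profile, threshold):
    """Index ranges (start, end) where the profile exceeds the threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def sub_box(box, run, horizontal):
    """Restrict `box` to one run along the cut direction."""
    top, left, bottom, right = box
    s, e = run
    if horizontal:
        return (top + s, left, top + e, right)
    return (top, left + s, bottom, left + e)


def xy_cut(img, box=None, horizontal=True, threshold=0, failed=False):
    """Alternate horizontal and vertical cuts until no white gap remains."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    runs = runs_above(projection_profile(img, box, horizontal), threshold)
    if len(runs) > 1:  # at least one white gap: split and switch direction
        node = Node(box, "H" if horizontal else "V")
        for run in runs:
            child = sub_box(box, run, horizontal)
            node.children.append(xy_cut(img, child, not horizontal, threshold))
        return node
    if runs:  # a single run: trim white margins in this direction
        box = sub_box(box, runs[0], horizontal)
    if failed or not runs:  # neither direction can be cut: this is a leaf
        return Node(box)
    return xy_cut(img, box, not horizontal, threshold, failed=True)


def leaves(node):
    """Collect the boxes of the leaf blocks, left to right."""
    if not node.children:
        return [node.box]
    return [b for child in node.children for b in leaves(child)]


# A tiny page: one title line, a blank line, then two columns of text.
page = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
]
tree = xy_cut(page)
print(tree.cut, leaves(tree))
```

On this page the root is split horizontally into the title and the body, and the body is then split vertically into two columns. Raising `threshold` above zero makes the cut tolerant of sparse noise pixels inside the white gaps, at the cost of possibly cutting through faint content.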
Adjacency links among the leaves of the tree can be seen as an adjacency graph whose nodes correspond to the leaves of the tree. An adjacency graph [5] describes the structure of a document by giving the position of the nearest objects in the horizontal and vertical directions (above, below, left, right relations).

Correlation graph

The correlation graph is a template structure used to implement domain knowledge, possibly extracted automatically from document samples [6]. This representation is suited to variable-layout documents with some spatial structure, or to documents whose semantics can be recognized by looking for textual tags in the image. It can be applied either to a full document or to a part of it. The correlation graph describes a document understanding strategy that uses both the predefined template elements (which implement, for each field type, the reading of a field inside a search area in the image) and the computation of search areas for fields still to be read, based on the position of fields already found. Meaningful fields are of three main types: (i) fields to be read as ASCII strings by the recognition strategy; (ii) textual or geometric tags used by the recognition strategy to understand the document structure; and (iii) image fields that can be recognized by suitable methods (e.g. logos).

4. The invoice application

4.1. Passive invoice management

In the scenario of passive invoice management, STRETCH aims at providing, on the one hand, data entry automation for VAT recording purposes, interfacing an ERP system, and, on the other hand, the invoice acquisition, archiving and retrieval capabilities that make the electronic copy immediately available to all authorized users.

In the STRETCH environment, new invoices can be grouped into batches to be scanned. The acquisition process produces the electronic copy of the invoices in a suitable format, which can range from binary to colour images depending on user requirements. New invoices can be fed automatically to the information extraction procedure, which mainly consists of document classification and automatic ICR reading. The extracted data are used for indexing and as input to the VAT recording procedure and must therefore be error free, so a supervision phase before archiving and VAT registration is advisable. The electronic copy of each invoice is then archived in the Docu-base with the previously extracted archiving indexes. The ICR recognition results also provide automatic data entry to the VAT registration procedure, which is usually part of the ERP system. The retrieval function is reserved to authorized users and makes all archived documents available for immediate consultation, with the advantage of eliminating the circulation of paper copies. Content-based retrieval makes it possible to find invoices from any partial information.

The prototype is centered on the INFORMATION EXTRACTION (document classification and reading) and ARCHIVING processes. The SUPERVISION procedure simply consists in presenting the invoice image together with the recognized fields. The user can correct any information and then confirm the data, which will no longer be modified after archiving. The ARCHIVING process directly demonstrates how the recognition results map into the STRETCH internal knowledge representation structure, which is stored in a relational database at the moment. The RETRIEVAL functions are based on the presentation of a form for Query Definition.
SQL queries are allowed on the values of known fields, with standard AND-OR expressions.

Information extraction

Three document processing steps are activated in order to extract information for indexing and for the VAT registration procedure (see Figure 2): first, the M-X-Y tree generation produces the M-X-Y tree representation of the invoice (see Sect. 4.1); then the classification procedure based on the M-X-Y tree produces the document classification, i.e. the supplier identification; last, the reading strategy [6] for that supplier is applied, which is based on ICR techniques including field finding, neural character reading [7], tag finding and logo recognition [8]. The ICR reader locates and reads the information written on invoices issued by a given supplier. If the supplier identification fails, a general reading strategy can be applied. The ICR result is a set of text strings used as indexes by the archiving procedure and a set of data used as input by the VAT recording procedure.

Each information field is assigned a basic type that defines how its value is interpreted during retrieval. For example, it is possible to search for a date field older than a certain threshold, or for a string field similar to a certain word. The reading strategy internally employs a set of tags that have a significant spatial relation with the information fields, including Date, Invoice Number, Total and VAT. A set of the most relevant fields from the user requirements was selected for the prototype.
This set consists of:
Supplier (string): the supplier name inherited from the MXY-based classifier, used both as an archiving index and for VAT registration; the supplier logo is located as accessory information;
Date (date): date of issue, used both as an index and for VAT registration;
Invoice number (string): used as an index and for VAT registration;
Total (integer): the total amount of the invoice, used for VAT registration;
IVA (integer): the total amount of Italian VAT.

Preliminary experimental results

The test set for the demo system consisted of 250 real passive invoices of a company of the Finmeccanica Group. They show different layouts, various styles, and many different fonts and font sizes. All the invoices show a company logo, usually in one-to-one correspondence with the supplier, but the invoices issued by one supplier have neither a fixed layout nor a unique standard writing style. All the documents in the test set are composed of a single page. The acquisition produced binary (black and white) images at a resolution of 300 x 300 DPI. No specific filtering or enhancement was applied to the images. The performance of the information extraction stage was as follows: the M-X-Y tree-based classification achieved 97.8% correct classification in top position; fields were correctly located in 98.4% of cases; automatic reading of the field values produced a total of 31 misclassification errors (96.9% correct on fields, 100% on tags); problems were mainly encountered with very noisy images, dot matrix and italic fonts.

5. Conclusions

This paper has presented a short architectural and functional description of the STRETCH system, along with the current achievements. The demonstration system implemented for automated invoice processing has been briefly described and some experimental results presented. The system relies on an open three-tiered architecture, with the capability to interoperate with external applications, engines and databases. This openness takes into account the fact that advanced technology is nowadays available for document processing. STRETCH is expected to provide a document-logic viewpoint on top of archives of text or images that are either flat or explicitly indexed. The openness of STRETCH, together with the adopted standard middleware and object-oriented approach, will make it possible to integrate future technology innovations.

Acknowledgements

We would like to acknowledge the contributions of the whole STRETCH work team, in particular P. Penna (AET, Genova), E. Francesconi and S. Marinai (DSI, University of Firenze), and M. Diligenti (DI, University of Siena).

6. References

[1] M. Beigi et al., MetaSEEK: A Content-Based Meta-Search Engine for Images, SPIE Proceedings on Storage and Retrieval for Image and Video Databases, vol. 3312, Jan.
[2] F. Cesarini, M. Gori, S. Marinai, G. Soda, Structured document segmentation and representation by the modified X-Y tree, Proc. ICDAR 99 (to appear).
[3] G. Nagy and S. Seth, Hierarchical representation of optically scanned documents, in Proc. of the International Conference on Pattern Recognition.
[4] G. Nagy and M. Viswanathan, Dual representation of segmented technical documents, in Proc. First Int'l Conf. Document Anal. Recog.
[5] J. Yuan, Y. Y. Tang, and C. Y. Suen, Four directional adjacency graphs (FDAG) and their application in locating fields in forms, in Proc. Third Int'l Conf. Document Anal. Recog. (Montreal, Canada).
[6] L. Boato, E. Cattani, M. Davite, B. Villa, Automatic Programming of Variable Layout Image Documents Reading Applications based on Minimum Description Length Induction, AI*IA Workshop on Automatic Learning and Natural Language, Turin, Italy, Dec.
[7] A.M. Colla, P. Pedrazzi, Single and Coupled Neural Handprinted Character Classifiers, in M. Marinaro and P.G. Morasso (Eds.), ICANN 94, Proc. Intl. Conf. on Artificial Neural Networks, Sorrento, Italy, May, vol. II, Springer-Verlag (1994).
[8] M. Corvi, E. Ottaviani, Multiresolution logo recognition, Proc. Int. Workshop on Visual Form, Capri.

Figure 1. Main layers with functional modules and data. (Diagram not reproduced; it shows the Docu-client, Docu-server and Docu-base layers with their modules.)

Figure 2. The recognition process for the invoice application. (Diagram not reproduced; its labels are Image, MXY Generation, MXY Tree, MXY-based Classifier, Supplier Name, Document Class, Reading Strategy, and Information from Fields.)


More information

A Workbench for Prototyping XML Data Exchange (extended abstract)

A Workbench for Prototyping XML Data Exchange (extended abstract) A Workbench for Prototyping XML Data Exchange (extended abstract) Renzo Orsini and Augusto Celentano Università Ca Foscari di Venezia, Dipartimento di Informatica via Torino 155, 30172 Mestre (VE), Italy

More information

FreeForm Designer. Phone: +972-9-8309999 Fax: +972-9-8309998 POB 8792, Natanya, 42505 Israel www.autofont.com. Document2

FreeForm Designer. Phone: +972-9-8309999 Fax: +972-9-8309998 POB 8792, Natanya, 42505 Israel www.autofont.com. Document2 FreeForm Designer FreeForm Designer enables designing smart forms based on industry-standard MS Word editing features. FreeForm Designer does not require any knowledge of or training in programming languages

More information

Databases in Organizations

Databases in Organizations The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron

More information

Automatic Extraction of Signatures from Bank Cheques and other Documents

Automatic Extraction of Signatures from Bank Cheques and other Documents Automatic Extraction of Signatures from Bank Cheques and other Documents Vamsi Krishna Madasu *, Mohd. Hafizuddin Mohd. Yusof, M. Hanmandlu ß, Kurt Kubik * *Intelligent Real-Time Imaging and Sensing group,

More information

Self-Service Business Intelligence

Self-Service Business Intelligence Self-Service Business Intelligence BRIDGE THE GAP VISUALIZE DATA, DISCOVER TRENDS, SHARE FINDINGS Solgenia Analysis provides users throughout your organization with flexible tools to create and share meaningful

More information

LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE. indhubatchvsa@gmail.com

LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE. indhubatchvsa@gmail.com LOCAL SURFACE PATCH BASED TIME ATTENDANCE SYSTEM USING FACE 1 S.Manikandan, 2 S.Abirami, 2 R.Indumathi, 2 R.Nandhini, 2 T.Nanthini 1 Assistant Professor, VSA group of institution, Salem. 2 BE(ECE), VSA

More information

Client/server is a network architecture that divides functions into client and server

Client/server is a network architecture that divides functions into client and server Page 1 A. Title Client/Server Technology B. Introduction Client/server is a network architecture that divides functions into client and server subsystems, with standard communication methods to facilitate

More information

Email Spam Detection Using Customized SimHash Function

Email Spam Detection Using Customized SimHash Function International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 1, Issue 8, December 2014, PP 35-40 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Email

More information

SIPAC. Signals and Data Identification, Processing, Analysis, and Classification

SIPAC. Signals and Data Identification, Processing, Analysis, and Classification SIPAC Signals and Data Identification, Processing, Analysis, and Classification Framework for Mass Data Processing with Modules for Data Storage, Production and Configuration SIPAC key features SIPAC is

More information

Course Syllabus For Operations Management. Management Information Systems

Course Syllabus For Operations Management. Management Information Systems For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third

More information

Web. Studio. Visual Studio. iseries. Studio. The universal development platform applied to corporate strategy. Adelia. www.hardis.

Web. Studio. Visual Studio. iseries. Studio. The universal development platform applied to corporate strategy. Adelia. www.hardis. Web Studio Visual Studio iseries Studio The universal development platform applied to corporate strategy Adelia www.hardis.com The choice of a CASE tool does not only depend on the quality of the offer

More information

SOFT FLOW 2012 PRODUCT OVERVIEW

SOFT FLOW 2012 PRODUCT OVERVIEW SOFT FLOW 2012 PRODUCT OVERVIEW Copyright 2010-2012 Soft Click 1 About Soft Flow Platform Welcome to Soft Flow, the most flexible and easiest to use document management and business process management

More information

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY

More information

IFS-8000 V2.0 INFORMATION FUSION SYSTEM

IFS-8000 V2.0 INFORMATION FUSION SYSTEM IFS-8000 V2.0 INFORMATION FUSION SYSTEM IFS-8000 V2.0 Overview IFS-8000 v2.0 is a flexible, scalable and modular IT system to support the processes of aggregation of information from intercepts to intelligence

More information

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju

NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised

More information

ifinder ENTERPRISE SEARCH

ifinder ENTERPRISE SEARCH DATA SHEET ifinder ENTERPRISE SEARCH ifinder - the Enterprise Search solution for company-wide information search, information logistics and text mining. CUSTOMER QUOTE IntraFind stands for high quality

More information

Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives

Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives Describe how the problems of managing data resources in a traditional file environment are solved

More information

Software Life-Cycle Management

Software Life-Cycle Management Ingo Arnold Department Computer Science University of Basel Theory Software Life-Cycle Management Architecture Styles Overview An Architecture Style expresses a fundamental structural organization schema

More information

M3039 MPEG 97/ January 1998

M3039 MPEG 97/ January 1998 INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND ASSOCIATED AUDIO INFORMATION ISO/IEC JTC1/SC29/WG11 M3039

More information

Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

More information

Visualization methods for patent data

Visualization methods for patent data Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes

More information

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,

More information

Fast and Easy Delivery of Data Mining Insights to Reporting Systems

Fast and Easy Delivery of Data Mining Insights to Reporting Systems Fast and Easy Delivery of Data Mining Insights to Reporting Systems Ruben Pulido, Christoph Sieb rpulido@de.ibm.com, christoph.sieb@de.ibm.com Abstract: During the last decade data mining and predictive

More information

Course 103402 MIS. Foundations of Business Intelligence

Course 103402 MIS. Foundations of Business Intelligence Oman College of Management and Technology Course 103402 MIS Topic 5 Foundations of Business Intelligence CS/MIS Department Organizing Data in a Traditional File Environment File organization concepts Database:

More information

Common Questions and Concerns About Documentum at NEF

Common Questions and Concerns About Documentum at NEF LES/NEF 220 W Broadway Suite B Hobbs, NM 88240 Documentum FAQ Common Questions and Concerns About Documentum at NEF Introduction...2 What is Documentum?...2 How does Documentum work?...2 How do I access

More information

Code Generation for Mobile Terminals Remote Accessing to the Database Based on Object Relational Mapping

Code Generation for Mobile Terminals Remote Accessing to the Database Based on Object Relational Mapping , pp.35-44 http://dx.doi.org/10.14257/ijdta.2013.6.5.04 Code Generation for Mobile Terminals Remote Accessing to the Database Based on Object Relational Mapping Wen Hu and Yan li Zhao School of Computer

More information

Proc. of the 3rd Intl. Conf. on Document Analysis and Recognition, Montreal, Canada, August 1995. 1

Proc. of the 3rd Intl. Conf. on Document Analysis and Recognition, Montreal, Canada, August 1995. 1 Proc. of the 3rd Intl. Conf. on Document Analysis and Recognition, Montreal, Canada, August 1995. 1 A Map Acquisition, Storage, Indexing, and Retrieval System Hanan Samet Aya Soer Computer Science Department

More information

Oracle8i Spatial: Experiences with Extensible Databases

Oracle8i Spatial: Experiences with Extensible Databases Oracle8i Spatial: Experiences with Extensible Databases Siva Ravada and Jayant Sharma Spatial Products Division Oracle Corporation One Oracle Drive Nashua NH-03062 {sravada,jsharma}@us.oracle.com 1 Introduction

More information

Data Analytics and Reporting in Toll Management and Supervision System Case study Bosnia and Herzegovina

Data Analytics and Reporting in Toll Management and Supervision System Case study Bosnia and Herzegovina Data Analytics and Reporting in Toll Management and Supervision System Case study Bosnia and Herzegovina Gordana Radivojević 1, Gorana Šormaz 2, Pavle Kostić 3, Bratislav Lazić 4, Aleksandar Šenborn 5,

More information

File Magic 5 Series. The power to share information PRODUCT OVERVIEW. Revised November 2004

File Magic 5 Series. The power to share information PRODUCT OVERVIEW. Revised November 2004 File Magic 5 Series The power to share information PRODUCT OVERVIEW Revised November 2004 Copyrights, Legal Notices, Trademarks and Servicemarks Copyright 2004 Westbrook Technologies Incorporated. All

More information

Managing Large Imagery Databases via the Web

Managing Large Imagery Databases via the Web 'Photogrammetric Week 01' D. Fritsch & R. Spiller, Eds. Wichmann Verlag, Heidelberg 2001. Meyer 309 Managing Large Imagery Databases via the Web UWE MEYER, Dortmund ABSTRACT The terramapserver system is

More information

Master s Program in Information Systems

Master s Program in Information Systems The University of Jordan King Abdullah II School for Information Technology Department of Information Systems Master s Program in Information Systems 2006/2007 Study Plan Master Degree in Information Systems

More information

SERVICE-ORIENTED MODELING FRAMEWORK (SOMF ) SERVICE-ORIENTED SOFTWARE ARCHITECTURE MODEL LANGUAGE SPECIFICATIONS

SERVICE-ORIENTED MODELING FRAMEWORK (SOMF ) SERVICE-ORIENTED SOFTWARE ARCHITECTURE MODEL LANGUAGE SPECIFICATIONS SERVICE-ORIENTED MODELING FRAMEWORK (SOMF ) VERSION 2.1 SERVICE-ORIENTED SOFTWARE ARCHITECTURE MODEL LANGUAGE SPECIFICATIONS 1 TABLE OF CONTENTS INTRODUCTION... 3 About The Service-Oriented Modeling Framework

More information

HOW TO DO A SMART DATA PROJECT

HOW TO DO A SMART DATA PROJECT April 2014 Smart Data Strategies HOW TO DO A SMART DATA PROJECT Guideline www.altiliagroup.com Summary ALTILIA s approach to Smart Data PROJECTS 3 1. BUSINESS USE CASE DEFINITION 4 2. PROJECT PLANNING

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Content Problems of managing data resources in a traditional file environment Capabilities and value of a database management

More information

Intelligent Agents Serving Based On The Society Information

Intelligent Agents Serving Based On The Society Information Intelligent Agents Serving Based On The Society Information Sanem SARIEL Istanbul Technical University, Computer Engineering Department, Istanbul, TURKEY sariel@cs.itu.edu.tr B. Tevfik AKGUN Yildiz Technical

More information

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL This chapter is to introduce the client-server model and its role in the development of distributed network systems. The chapter

More information

AHUDesigner. The Air Handling Units selection software. Product description

AHUDesigner. The Air Handling Units selection software. Product description AHUDesigner The Air Handling Units selection software Product description Table of contents INTRODUCTION... 4 AHU SELECTION SOFTWARE FUNCTIONAL SPECIFICATIONS... 5 Definition of unit configuration... 5

More information

The Development of Multimedia-Multilingual Document Storage, Retrieval and Delivery System for E-Organization (STREDEO PROJECT)

The Development of Multimedia-Multilingual Document Storage, Retrieval and Delivery System for E-Organization (STREDEO PROJECT) The Development of Multimedia-Multilingual Storage, Retrieval and Delivery for E-Organization (STREDEO PROJECT) Asanee Kawtrakul, Kajornsak Julavittayanukool, Mukda Suktarachan, Patcharee Varasrai, Nathavit

More information

Implementation of OCR Based on Template Matching and Integrating it in Android Application

Implementation of OCR Based on Template Matching and Integrating it in Android Application International Journal of Computer Sciences and EngineeringOpen Access Technical Paper Volume-04, Issue-02 E-ISSN: 2347-2693 Implementation of OCR Based on Template Matching and Integrating it in Android

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,

More information

MODEL OF SOFTWARE AGENT FOR NETWORK SECURITY ANALYSIS

MODEL OF SOFTWARE AGENT FOR NETWORK SECURITY ANALYSIS MODEL OF SOFTWARE AGENT FOR NETWORK SECURITY ANALYSIS Hristo Emilov Froloshki Department of telecommunications, Technical University of Sofia, 8 Kliment Ohridski st., 000, phone: +359 2 965 234, e-mail:

More information

CONDIS. IT Service Management and CMDB

CONDIS. IT Service Management and CMDB CONDIS IT Service and CMDB 2/17 Table of contents 1. Executive Summary... 3 2. ITIL Overview... 4 2.1 How CONDIS supports ITIL processes... 5 2.1.1 Incident... 5 2.1.2 Problem... 5 2.1.3 Configuration...

More information

PROCESSING & MANAGEMENT OF INBOUND TRANSACTIONAL CONTENT

PROCESSING & MANAGEMENT OF INBOUND TRANSACTIONAL CONTENT PROCESSING & MANAGEMENT OF INBOUND TRANSACTIONAL CONTENT IN THE GLOBAL ENTERPRISE A BancTec White Paper SUMMARY Reducing the cost of processing transactions, while meeting clients expectations, protecting

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Operations Research and Knowledge Modeling in Data Mining

Operations Research and Knowledge Modeling in Data Mining Operations Research and Knowledge Modeling in Data Mining Masato KODA Graduate School of Systems and Information Engineering University of Tsukuba, Tsukuba Science City, Japan 305-8573 koda@sk.tsukuba.ac.jp

More information

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours.

01219211 Software Development Training Camp 1 (0-3) Prerequisite : 01204214 Program development skill enhancement camp, at least 48 person-hours. (International Program) 01219141 Object-Oriented Modeling and Programming 3 (3-0) Object concepts, object-oriented design and analysis, object-oriented analysis relating to developing conceptual models

More information

CHAPTER 6: TECHNOLOGY

CHAPTER 6: TECHNOLOGY Chapter 6: Technology CHAPTER 6: TECHNOLOGY Objectives Introduction The objectives are: Review the system architecture of Microsoft Dynamics AX 2012. Describe the options for making development changes

More information

CONFIOUS * : Managing the Electronic Submission and Reviewing Process of Scientific Conferences

CONFIOUS * : Managing the Electronic Submission and Reviewing Process of Scientific Conferences CONFIOUS * : Managing the Electronic Submission and Reviewing Process of Scientific Conferences Manos Papagelis 1, 2, Dimitris Plexousakis 1, 2 and Panagiotis N. Nikolaou 2 1 Institute of Computer Science,

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Database Optimizing Services

Database Optimizing Services Database Systems Journal vol. I, no. 2/2010 55 Database Optimizing Services Adrian GHENCEA 1, Immo GIEGER 2 1 University Titu Maiorescu Bucharest, Romania 2 Bodenstedt-Wilhelmschule Peine, Deutschland

More information

VISUALIZATION APPROACH FOR SOFTWARE PROJECTS

VISUALIZATION APPROACH FOR SOFTWARE PROJECTS Canadian Journal of Pure and Applied Sciences Vol. 9, No. 2, pp. 3431-3439, June 2015 Online ISSN: 1920-3853; Print ISSN: 1715-9997 Available online at www.cjpas.net VISUALIZATION APPROACH FOR SOFTWARE

More information

ANALYSIS OF GRID COMPUTING AS IT APPLIES TO HIGH VOLUME DOCUMENT PROCESSING AND OCR

ANALYSIS OF GRID COMPUTING AS IT APPLIES TO HIGH VOLUME DOCUMENT PROCESSING AND OCR ANALYSIS OF GRID COMPUTING AS IT APPLIES TO HIGH VOLUME DOCUMENT PROCESSING AND OCR By: Dmitri Ilkaev, Stephen Pearson Abstract: In this paper we analyze the concept of grid programming as it applies to

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Extracting Business. Value From CAD. Model Data. Transformation. Sreeram Bhaskara The Boeing Company. Sridhar Natarajan Tata Consultancy Services Ltd.

Extracting Business. Value From CAD. Model Data. Transformation. Sreeram Bhaskara The Boeing Company. Sridhar Natarajan Tata Consultancy Services Ltd. Extracting Business Value From CAD Model Data Transformation Sreeram Bhaskara The Boeing Company Sridhar Natarajan Tata Consultancy Services Ltd. GPDIS_2014.ppt 1 Contents Data in CAD Models Data Structures

More information

How To Develop Software

How To Develop Software Software Engineering Prof. N.L. Sarda Computer Science & Engineering Indian Institute of Technology, Bombay Lecture-4 Overview of Phases (Part - II) We studied the problem definition phase, with which

More information

Masters in Information Technology

Masters in Information Technology Computer - Information Technology MSc & MPhil - 2015/6 - July 2015 Masters in Information Technology Programme Requirements Taught Element, and PG Diploma in Information Technology: 120 credits: IS5101

More information

Big Data: Rethinking Text Visualization

Big Data: Rethinking Text Visualization Big Data: Rethinking Text Visualization Dr. Anton Heijs anton.heijs@treparel.com Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

Modern Databases. Database Systems Lecture 18 Natasha Alechina

Modern Databases. Database Systems Lecture 18 Natasha Alechina Modern Databases Database Systems Lecture 18 Natasha Alechina In This Lecture Distributed DBs Web-based DBs Object Oriented DBs Semistructured Data and XML Multimedia DBs For more information Connolly

More information

Requirements Analysis Concepts & Principles. Instructor: Dr. Jerry Gao

Requirements Analysis Concepts & Principles. Instructor: Dr. Jerry Gao Requirements Analysis Concepts & Principles Instructor: Dr. Jerry Gao Requirements Analysis Concepts and Principles - Requirements Analysis - Communication Techniques - Initiating the Process - Facilitated

More information

Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report

Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report G. Banos 1, P.A. Mitkas 2, Z. Abas 3, A.L. Symeonidis 2, G. Milis 2 and U. Emanuelson 4 1 Faculty

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Chapter 11 Mining Databases on the Web

Chapter 11 Mining Databases on the Web Chapter 11 Mining bases on the Web INTRODUCTION While Chapters 9 and 10 provided an overview of Web data mining, this chapter discusses aspects of mining the databases on the Web. Essentially, we use the

More information

B.Sc (Computer Science) Database Management Systems UNIT-V

B.Sc (Computer Science) Database Management Systems UNIT-V 1 B.Sc (Computer Science) Database Management Systems UNIT-V Business Intelligence? Business intelligence is a term used to describe a comprehensive cohesive and integrated set of tools and process used

More information

WEB APPLICATION FOR TIMETABLE PLANNING IN THE HIGHER TECHNICAL COLLEGE OF INDUSTRIAL AND TELECOMMUNICATIONS ENGINEERING

WEB APPLICATION FOR TIMETABLE PLANNING IN THE HIGHER TECHNICAL COLLEGE OF INDUSTRIAL AND TELECOMMUNICATIONS ENGINEERING WEB APPLICATION FOR TIMETABLE PLANNING IN THE HIGHER TECHNICAL COLLEGE OF INDUSTRIAL AND TELE ENGINEERING Dra. Marta E. Zorrilla Pantaleón Dpto. Applied Mathematics and Computer Science Avda. Los Castros

More information

Prof. Pietro Ducange Students Tutor and Practical Classes Course of Business Intelligence 2014 http://www.iet.unipi.it/p.ducange/esercitazionibi/

Prof. Pietro Ducange Students Tutor and Practical Classes Course of Business Intelligence 2014 http://www.iet.unipi.it/p.ducange/esercitazionibi/ Prof. Pietro Ducange Students Tutor and Practical Classes Course of Business Intelligence 2014 http://www.iet.unipi.it/p.ducange/esercitazionibi/ Email: p.ducange@iet.unipi.it Office: Dipartimento di Ingegneria

More information

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole Paper BB-01 Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole ABSTRACT Stephen Overton, Overton Technologies, LLC, Raleigh, NC Business information can be consumed many

More information

Microsoft Office 2010: Access 2010, Excel 2010, Lync 2010 learning assets

Microsoft Office 2010: Access 2010, Excel 2010, Lync 2010 learning assets Microsoft Office 2010: Access 2010, Excel 2010, Lync 2010 learning assets Simply type the id# in the search mechanism of ACS Skills Online to access the learning assets outlined below. Titles Microsoft

More information