Data Integration Hub for a Hybrid Paper Search
Jungkee Kim (1,2), Geoffrey Fox (2), and Seong-Joon Yoo (3)

(1) Department of Computer Science, Florida State University, Tallahassee FL 32306, U.S.A., [email protected]
(2) Community Grids Laboratory, Indiana University, Bloomington IN 47404, U.S.A.
(3) School of Computer Engineering, Sejong University, 98 Gunja-Dong, Gwangjin-gu, Seoul, Korea

Abstract. In this paper we describe the design of a hybrid search that combines simple metadata search with a traditional keyword search over unstructured context data. This paradigm gives the inquirer additional options to narrow the search with some semantic aspects through the XML metadata query. We demonstrate a paper search as a case study of the hybrid search, and describe a data integration hub that integrates such data dispersed on the Net.

1 Introduction

Discovering and sharing heterogeneous resources on the Net has been a long-term challenge since computer communication networks became popular. There are two traditional approaches to organizing the data to be searched: one is structured data and the other is unstructured data. A Web search engine is a typical example of search on the Internet. Its technologies are rooted in information retrieval, which represents search over unstructured data. Web search engines provide clues for resource location, but they have no semantic schema and often produce meaningless keyword search results.

The Semantic Web is an ambitious extension of the Web. It includes multiple relation links with directed labeled graphs, by which machines such as Web crawlers can interpret the relationships between resources. The ordinary Web, by contrast, has a single kind of relationship, and a machine cannot infer further meaning from it. To represent the relations of an object on the Web, the object terms should be defined under a specific description, an ontology. Domain experts are usually needed to design an ontology because of the sophisticated definitions required.
Currently, many Web pages include no such semantic content, and no unified definition of general semantic agreement exists. Our hybrid keyword search aims to provide an intermediate search paradigm on the Internet, offering semantic value through XML metadata that are simpler than those of the Semantic Web.

In this paper we describe our design of hybrid search systems. In earlier experiments we suffered from performance problems at a local level, and we proposed scalable hybrid search on distributed environments [4, 5]. In that architecture, a group of independent search providers share their information through their own search systems. But some data providers who possess only small amounts of data may want to join such a group without developing their own search services. If they did run their own services, the participation of many nodes each holding little data would increase communication traffic and reduce the chance of reaching the target information under a Time-to-Live (TTL) strategy. Partial integration may be one way to increase the portion of data queried in the search group. In this paper we also present our architecture for a data integration hub, an application that communicates through a message broker under centralized control. This hub can act as a partial integrator on distributed databases or peer-to-peer systems.

This paper is organized as follows. In the next section we describe one of our hybrid keyword search architectures. In Section 3 we present a data integration hub that integrates those papers. In Section 4 we summarize and conclude.

2 Hybrid Keyword Search

Our hybrid keyword search combines metadata search with a traditional keyword search over unstructured context data. Each chunk of unstructured data, usually represented by a file, has assigned metadata. We use XML, the de facto standard format for information exchange between machines, as a metalanguage for the metadata.

To demonstrate the practicality of the hybrid keyword search, we design and evaluate hybrid search systems based on a native XML database and a file-system-based text search library, as well as on a market-leading relational database management system that integrates XML and text management. We have already introduced a relational-database-based implementation [3, 4]. It used an XML-enabled relational Database Management System (DBMS) with nested subqueries to combine query results against unstructured documents and semistructured metadata. The other implementation is based on a native XML database and a text search library.
We use Apache Xindice [2] for XML instance repositories and XML query processing. Jakarta Lucene [1] manages context queries over unstructured data in our hybrid search. The query processing architecture is shown in figure 1.

In the Xindice database, we associate an XML instance with an unstructured document by assigning the file name of the document as the key of the XML instance. For example, an XML instance in a file named pt1.xml can be associated with unstructured data in the file apaper.pdf by inserting the XML instance into the data collection as follows:

    xindice ad -c /a collection path -f pt1.xml -n apaper.pdf

The assigned key (the file name of the unstructured document in this example) appears as an attribute value on the root element of the XML created by executing an XPath query. For example, the query result may start:

    <bibliography src:col="/a collection path" src:key="apaper.pdf" xmlns:src="

The key and result XML tuples are stored in a hash table. A separate keyword search against the unstructured documents returns the names of documents that include the given keyword in the text. The returned names are looked up in the hash table, and the combined results are collected in a Java hash table.

Fig. 1. Query Processing Architecture. (Metadata path: metadata in XML format, Xindice storage, metadata query processing, hash table of (key, XML). Text path: unstructured data document, binary-to-text converter, text document, Lucene indexing, text query processing, list of matched documents. Both paths feed the final result set of (key, metadata).)

For efficient text search, Jakarta Lucene provides an index class along with a stop-word-filtering analyzer class. The analyzer filters out stop words: articles and other words that are meaningless to the search. Some binary format files must be converted to pure text files before indexing. We pass the binary file name as well as the text file name as parameters to the index-generating class in order to indicate the original document format.

2.1 Case Study: Hybrid Paper Search

The initial demonstration of the hybrid search uses a simplified model: a paper search. The paper search is a content search across various types of documents. Each document has metadata presented as an XML instance. An example XML instance is shown in figure 2.

In this demonstration we use a relational DBMS instead of a native XML database. Two relational database tables are used, one for metadata and one for documents. For the XML instances representing the metadata, the XMLType of Oracle 9i [6] is used. A column of the BFILE large object type is used for the external document table. Those large object rows are indexed using Oracle Text, a text management system integrated into the Oracle DBMS. Through an Oracle Text index, we can search the target content. The document table has a special attribute for the document type: BINARY or TEXT. This attribute is necessary for the filtering option of Oracle Text, which filters binary files to pure text instances before making an index.
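The hash-table combination of Fig. 1 can be illustrated with a small sketch. This is not the Xindice/Lucene implementation: a plain Python dictionary stands in for the hash table of (key, XML) tuples, a toy inverted index stands in for the Lucene index, and the stop-word list, function names, and sample data are all hypothetical.

```python
import re

# Hypothetical stop-word list; a real analyzer ships its own.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "to"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def build_index(documents):
    """Map each term to the set of document names whose text contains it."""
    index = {}
    for name, text in documents.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(name)
    return index


def hybrid_search(metadata_hits, index, keyword):
    """Keep only the metadata hits whose key also matches the keyword query."""
    terms = tokenize(keyword)
    if not terms:
        return {}
    matched = set.intersection(*(index.get(t, set()) for t in terms))
    return {key: xml for key, xml in metadata_hits.items() if key in matched}


# Text already converted from binary formats (assumed sample data).
documents = {
    "apaper.pdf": "Design of a hybrid search using XML metadata",
    "bpaper.pdf": "Grid messaging with a broker",
}
# Result of the metadata query: document key -> XML instance.
metadata_hits = {
    "apaper.pdf": "<bibliography><year>2002</year></bibliography>",
    "cpaper.pdf": "<bibliography><year>2002</year></bibliography>",
}
index = build_index(documents)
result = hybrid_search(metadata_hits, index, "XML")
```

Keying both result sets by document file name is what makes the combination a simple lookup; the final result set holds only documents that satisfy the metadata query and the keyword query at once.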
    <bibliography>
      <authors>
        <author>J. Kim</author>
        <author>O. Balsoy</author>
        <author>M. Pierce</author>
        <author>G. Fox</author>
      </authors>
      <title>Design of a Hybrid Search in the OKC</title>
      <source>Proceedings of the International Conference on IKS</source>
      <year>2002</year>
    </bibliography>

Fig. 2. An Example XML Instance

A one-to-one relationship set is used for the relation between the paper document and its metadata, but a one-to-many or many-to-many relationship set could be used for other applications. The relationship table is not necessary for one-to-one relationships, but it is essential for decomposing relations to avoid anomalies in one-to-many or many-to-many relationships. With two data tables and a relationship table we can query keywords in the content, which can be associated with particular metadata through nested subqueries. For example, we can find documents that contain the keyword XML and were published in a given year. Figure 3 shows the relational schema of our hybrid paper search.

    papers(papernd: string, descriptions: XMLType)
    paperfiles(filename: string, doctype: string, contents: BFILE)
    filelocator(papernd: string, filename: string)

Fig. 3. Relational Schema of Hybrid Paper Search

3 Data Integration Hub for Hybrid Search

In our earlier work [5] we assumed that query clients only read resources on the other nodes. In this paper we take into account that clients may wish to act as data providers without providing an independent search service. Several or many data providers can share their information through a centralized database. They may upload, query, and download unstructured data with attached metadata via a central DBMS.

Another aspect of the integration hub, which was not introduced in the local hybrid paper search, is the validation of metadata against an XML schema. The storage for the local hybrid paper search is managed by database administrators, but ordinary users can upload their own data to the database of the integration hub. We use an Oracle DB operation to check the validity of metadata presented in XML instances against a registered XML schema.
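Returning to the relational schema of Fig. 3, the nested-subquery lookup from Section 2.1 can be sketched as follows. This is an illustration only and swaps in SQLite for Oracle: the XMLType column becomes plain text queried through a hypothetical user-defined function `xml_text`, the BFILE column becomes inline text, and a `LIKE` match stands in for an Oracle Text index. The row values are invented sample data.

```python
import sqlite3
import xml.etree.ElementTree as ET


def xml_text(xml, path):
    """Return the text of the first element matching path, or None."""
    node = ET.fromstring(xml).find(path)
    return None if node is None else node.text


conn = sqlite3.connect(":memory:")
conn.create_function("xml_text", 2, xml_text)  # stand-in for XMLType queries
conn.executescript("""
    CREATE TABLE papers      (papernd TEXT PRIMARY KEY, descriptions TEXT);
    CREATE TABLE paperfiles  (filename TEXT PRIMARY KEY, doctype TEXT, contents TEXT);
    CREATE TABLE filelocator (papernd TEXT, filename TEXT);
""")
conn.executemany("INSERT INTO papers VALUES (?, ?)", [
    ("p1", "<bibliography><title>Design of a Hybrid Search in the OKC</title>"
           "<year>2002</year></bibliography>"),
    ("p2", "<bibliography><title>Another Paper</title><year>2004</year></bibliography>"),
])
conn.executemany("INSERT INTO paperfiles VALUES (?, ?, ?)", [
    ("apaper.pdf", "BINARY", "hybrid search over XML metadata"),
    ("bpaper.txt", "TEXT", "XML messaging with a broker"),
])
conn.executemany("INSERT INTO filelocator VALUES (?, ?)", [
    ("p1", "apaper.pdf"),
    ("p2", "bpaper.txt"),
])

# Metadata condition on the outside, content keyword in the innermost subquery,
# joined through the relationship table.
QUERY = """
    SELECT papernd FROM papers
    WHERE xml_text(descriptions, 'year') = ?
      AND papernd IN (
          SELECT papernd FROM filelocator
          WHERE filename IN (
              SELECT filename FROM paperfiles
              WHERE contents LIKE '%' || ? || '%'))
"""
rows = conn.execute(QUERY, ("2002", "XML")).fetchall()
```

Both sample documents contain the keyword, so the metadata condition is what narrows the result to a single paper; this is the narrowing effect the hybrid search is meant to provide.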
The XML schema for our hybrid paper search is shown in figure 4.
    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema xmlns:xs=" elementFormDefault="qualified" attributeFormDefault="unqualified">
      <xs:element name="bibliography">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="authors">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="source" type="xs:string"/>
            <xs:element name="year" type="xs:decimal"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

Fig. 4. An XML Schema Example for Hybrid Paper Search

We demonstrate a data integration hub on the central DBMS using message-oriented middleware. This integration hub is an integrated version of the hybrid paper search introduced in section 2.1. The general architecture of a data integration hub is summarized in figure 5. Clients communicate with an integrant through a message broker, NaradaBrokering [7]. This broker includes JMS-compliant topic-based communication, a publish/subscribe model. As in the read-only query case, clients are publishers and the integrant is a subscriber. Clients and the integrant use the same topic. The integrant manages uploading of the metadata and unstructured data, query wrapping, and file transfer. Requests are classified by the job property, which is attached to the message sent from client to integrant. A temporary topic is also delivered to the integrant. This temporary topic uniquely identifies the requesting client, and the integrant can reply to the client by publishing a message on a dynamic virtual channel: the temporary topic whose session object was attached to the message.

Fig. 5. An Integration Hub Architecture. (Several clients connect through a message broker to the integrant, which is backed by a database or file system.)

We use different message types for each service request, as follows:

Upload request: the client must first ask the integrant for permission to upload, before any unstructured data files are sent. The message for this request includes the file name of the unstructured data, the metadata, and a unique client user name. They are string contents in a MapMessage type message. The integrant checks the validity of the metadata against the XML schema, because clients may provide data that are not well-formed or not valid. The central database should avoid redundancy for the unstructured data stored in a file system. The name and directory of the unstructured data, generated by combining the user name and the file name, is used as a primary key.

File upload: after an upload request is granted by the integrant, the client can send a file containing the unstructured data. For the file upload we use a BytesMessage type message, which is appropriate for sending a stream of uninterpreted bytes. The client sends a header message first, and the file is then published buffer by buffer. A job property, a file name, a user name, and a temporary topic are included in the header message. A JMS message receives a unique ID when it is created, and the ID is attached to each message. The integrant can classify the file upload messages by extracting the message ID. This mechanism is necessary because several file uploads can occur simultaneously.

Query request: clients are publishers for a query topic, and the integrant subscribes to the same topic. A MapMessage type message, which includes query parameters and properties, is published to the integrant. The query results are delivered back to the inquiring client, which is subscribed to a temporary topic.

Download request: the target unstructured data in a file can be identified from the query results. The client subsequently requests a file download with a MapMessage type message that includes the file name and a temporary topic. The listener on the client then captures the BytesMessage type messages published by the integrant, and the target file is written message by message on the client machine. Each message includes the file name property. The message broker is responsible for preserving the order of message transfer.
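The upload request and file upload steps above can be sketched without a broker. The following is a simplified model, not the NaradaBrokering or JMS API: messages are plain dictionaries, return values stand in for replies published on the client's temporary topic, and `metadata_ok` is a rough hand-written substitute for validation against the schema of Fig. 4. The job names, the `upload_id` and `last` fields, the refusal of a repeated key, and the sample users are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Child elements of <bibliography>, in the order given by the schema in Fig. 4.
REQUIRED = ["authors", "title", "source", "year"]


def metadata_ok(xml):
    """Rough stand-in for schema validation: well-formed, expected children."""
    try:
        root = ET.fromstring(xml)
    except ET.ParseError:
        return False
    return root.tag == "bibliography" and [c.tag for c in root] == REQUIRED


class Integrant:
    def __init__(self):
        self.granted = set()  # primary keys cleared for upload
        self.uploads = {}     # header message ID -> (key, buffer)
        self.files = {}       # primary key -> stored bytes

    def on_message(self, msg):
        """Dispatch on the job property attached to each message."""
        job = msg["job"]
        if job == "upload_request":
            key = msg["user"] + "/" + msg["filename"]  # user name + file name
            # Assumption: a repeated key or invalid metadata is refused.
            if key in self.files or not metadata_ok(msg["metadata"]):
                return "rejected"
            self.granted.add(key)
            return "granted"
        if job == "upload_header":
            key = msg["user"] + "/" + msg["filename"]
            if key not in self.granted:
                return "rejected"
            self.uploads[msg["id"]] = (key, bytearray())
            return "ready"
        if job == "upload_chunk":
            if msg["upload_id"] not in self.uploads:
                return "rejected"
            key, buf = self.uploads[msg["upload_id"]]
            buf.extend(msg["body"])
            if msg["last"]:  # assumed end-of-file marker
                self.files[key] = bytes(buf)
                del self.uploads[msg["upload_id"]]
                self.granted.discard(key)
            return "ok"
        return "unknown job"


hub = Integrant()
meta = ("<bibliography><authors><author>J. Kim</author></authors>"
        "<title>T</title><source>S</source><year>2002</year></bibliography>")
for user, name in (("alice", "a.pdf"), ("bob", "b.pdf")):
    hub.on_message({"job": "upload_request", "user": user,
                    "filename": name, "metadata": meta})
hub.on_message({"job": "upload_header", "id": "ID:1", "user": "alice", "filename": "a.pdf"})
hub.on_message({"job": "upload_header", "id": "ID:2", "user": "bob", "filename": "b.pdf"})
# Chunks from two simultaneous uploads arrive interleaved.
hub.on_message({"job": "upload_chunk", "upload_id": "ID:1", "body": b"AA", "last": False})
hub.on_message({"job": "upload_chunk", "upload_id": "ID:2", "body": b"BB", "last": True})
hub.on_message({"job": "upload_chunk", "upload_id": "ID:1", "body": b"aa", "last": True})
```

Because each buffer is filed under the ID of its header message, interleaved chunks from different clients end up in the right files, which is the reason the text gives for classifying upload messages by message ID.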
The database schema of our data integration hub is similar to that of the hybrid paper search in figure 3, but there is an additional table for uploading unstructured data: a temporary file locator table. Our system allows file uploads to a temporary directory only, and moves the files to the designated directory later. When a user requests a file upload and the same file name happens to exist already, the integrant assigns a new file name for the final destination. The original file name is stored in the metadata table, but the actual file name is stored in the unstructured data table. We assume that a user does not assign the same file name to two or more different unstructured data items. The name and directory for each row in the paper metadata table is generated by combining the unique user name and the file name, which makes it a potential primary key.

Each data integration hub has a message broker and an integrant. A group of data integration hubs may provide a global search over a distributed information system by using the cooperative network features built into NaradaBrokering, or by using an additional network layer such as a peer-to-peer overlay network.

4 Conclusion

In this paper we described our approaches to hybrid search at a local level with a case study of a hybrid paper search. Our demonstration showed that the hybrid search paradigm is feasible for practical semantic integration. We presented another case study, a data integration hub, which is an application that communicates through a message broker under centralized control. The integration hub can be a search service node in a distributed database, or a peer in a peer-to-peer overlay network in a more scalable environment [4, 5]. This scalable generalization may play a practical bridging role for information search, providing semantic value through metadata whose implementation is simpler than that of the Semantic Web.

References

1. Apache Software Foundation. Jakarta Lucene. World Wide Web.
2. Apache Software Foundation. Xindice. World Wide Web.
3. J. Kim, O. Balsoy, M. Pierce, and G. Fox. Design of a Hybrid Search in the Online Knowledge Center. In Proceedings of the IASTED International Conference on Information and Knowledge Sharing, November.
4. J. Kim and G. Fox. A Hybrid Keyword Search across Peer-to-Peer Federated Databases. In Proceedings of East-European Conference on Advances in Databases and Information Systems (ADBIS), September.
5. J. Kim and G. Fox. Scalable Hybrid Search on Distributed Databases. In Proceedings of International Workshop of Autonomic Distributed Data and Storage Systems Management, volume 3516 of Lecture Notes in Computer Science. Springer, May.
6. Oracle Corporation. Oracle9i Application Developer's Guide - XML, June.
7. S. Pallickara and G. C. Fox. NaradaBrokering: A Distributed Middleware Framework and Architecture for Enabling Durable Peer-to-Peer Grids. In Proceedings of International Middleware Conference, June 2003.
Web Service Robust GridFTP
Web Service Robust GridFTP Sang Lim, Geoffrey Fox, Shrideep Pallickara and Marlon Pierce Community Grid Labs, Indiana University 501 N. Morton St. Suite 224 Bloomington, IN 47404 {sblim, gcf, spallick,
A Survey Study on Monitoring Service for Grid
A Survey Study on Monitoring Service for Grid Erkang You [email protected] ABSTRACT Grid is a distributed system that integrates heterogeneous systems into a single transparent computer, aiming to provide
Classic Grid Architecture
Peer-to to-peer Grids Classic Grid Architecture Resources Database Database Netsolve Collaboration Composition Content Access Computing Security Middle Tier Brokers Service Providers Middle Tier becomes
Event-based middleware services
3 Event-based middleware services The term event service has different definitions. In general, an event service connects producers of information and interested consumers. The service acquires events
A Web Services Framework for Collaboration and Audio/Videoconferencing
A Web Services Framework for Collaboration and Audio/Videoconferencing Geoffrey Fox, Wenjun Wu, Ahmet Uyar, Hasan Bulut Community Grid Computing Laboratory, Indiana University [email protected], [email protected],
Managing XML Data to optimize Performance into Object-Relational Databases
Database Systems Journal vol. II, no. 2/2011 3 Managing XML Data to optimize Performance into Object-Relational Databases Iuliana BOTHA Academy of Economic Studies, Bucharest, Romania [email protected]
RESEARCH ISSUES IN PEER-TO-PEER DATA MANAGEMENT
RESEARCH ISSUES IN PEER-TO-PEER DATA MANAGEMENT Bilkent University 1 OUTLINE P2P computing systems Representative P2P systems P2P data management Incentive mechanisms Concluding remarks Bilkent University
TOP-DOWN APPROACH PROCESS BUILT ON CONCEPTUAL DESIGN TO PHYSICAL DESIGN USING LIS, GCS SCHEMA
TOP-DOWN APPROACH PROCESS BUILT ON CONCEPTUAL DESIGN TO PHYSICAL DESIGN USING LIS, GCS SCHEMA Ajay B. Gadicha 1, A. S. Alvi 2, Vijay B. Gadicha 3, S. M. Zaki 4 1&4 Deptt. of Information Technology, P.
Databases in Organizations
The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron
Service Description: NIH GovTrip - NBS Web Service
8 July 2010 Page 1 Service Description: NIH GovTrip - NBS Web Service Version # Change Description Owner 1.0 Initial Version Jerry Zhou 1.1 Added ISC Logo and Schema Section Ian Sebright 8 July 2010 Page
Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya
Chapter 6 Basics of Data Integration Fundamentals of Business Analytics Learning Objectives and Learning Outcomes Learning Objectives 1. Concepts of data integration 2. Needs and advantages of using data
XIII. Service Oriented Computing. Laurea Triennale in Informatica Corso di Ingegneria del Software I A.A. 2006/2007 Andrea Polini
XIII. Service Oriented Computing Laurea Triennale in Informatica Corso di Outline Enterprise Application Integration (EAI) and B2B applications Service Oriented Architecture Web Services WS technologies
DocuSign Connect Guide
Information Guide 1 DocuSign Connect Guide 2 Copyright 2003-2014 DocuSign, Inc. All rights reserved. For information about DocuSign trademarks, copyrights and patents refer to the DocuSign Intellectual
ENABLING SEMANTIC SEARCH IN STRUCTURED P2P NETWORKS VIA DISTRIBUTED DATABASES AND WEB SERVICES
ENABLING SEMANTIC SEARCH IN STRUCTURED P2P NETWORKS VIA DISTRIBUTED DATABASES AND WEB SERVICES Maria Teresa Andrade FEUP / INESC Porto [email protected] ; [email protected] http://www.fe.up.pt/~mandrade/
No Trade Secrets. Microsoft does not claim any trade secret rights in this documentation.
[MS-EDCSOM]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation for protocols, file formats, languages,
Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
Lightweight Data Integration using the WebComposition Data Grid Service
Lightweight Data Integration using the WebComposition Data Grid Service Ralph Sommermeier 1, Andreas Heil 2, Martin Gaedke 1 1 Chemnitz University of Technology, Faculty of Computer Science, Distributed
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
Data Grids. Lidan Wang April 5, 2007
Data Grids Lidan Wang April 5, 2007 Outline Data-intensive applications Challenges in data access, integration and management in Grid setting Grid services for these data-intensive application Architectural
Modernize your NonStop COBOL Applications with XML Thunder September 29, 2009 Mike Bonham, TIC Software John Russell, Canam Software
Modernize your NonStop COBOL Applications with XML Thunder September 29, 2009 Mike Bonham, TIC Software John Russell, Canam Software Agenda XML Overview XML Thunder overview Case Studies Q & A XML Standard
Tier Architectures. Kathleen Durant CS 3200
Tier Architectures Kathleen Durant CS 3200 1 Supporting Architectures for DBMS Over the years there have been many different hardware configurations to support database systems Some are outdated others
Remote Sensitive Image Stations and Grid Services
International Journal of Grid and Distributed Computing 23 Remote Sensing Images Data Integration Based on the Agent Service Binge Cui, Chuanmin Wang, Qiang Wang College of Information Science and Engineering,
STRATEGIES ON SOFTWARE INTEGRATION
STRATEGIES ON SOFTWARE INTEGRATION Cornelia Paulina Botezatu and George Căruţaşu Faculty of Computer Science for Business Management Romanian-American University, Bucharest, Romania ABSTRACT The strategy
Oracle Java CAPS Message Library for EDIFACT User's Guide
Oracle Java CAPS Message Library for EDIFACT User's Guide Part No: 821 2607 March 2011 Copyright 2008, 2011, Oracle and/or its affiliates. All rights reserved. License Restrictions Warranty/Consequential
K@ A collaborative platform for knowledge management
White Paper K@ A collaborative platform for knowledge management Quinary SpA www.quinary.com via Pietrasanta 14 20141 Milano Italia t +39 02 3090 1500 f +39 02 3090 1501 Copyright 2004 Quinary SpA Index
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data
Archiving, Indexing and Accessing Web Materials: Solutions for large amounts of data David Minor 1, Reagan Moore 2, Bing Zhu, Charles Cowart 4 1. (88)4-104 [email protected] San Diego Supercomputer Center
EDG Project: Database Management Services
EDG Project: Database Management Services Leanne Guy for the EDG Data Management Work Package EDG::WP2 [email protected] http://cern.ch/leanne 17 April 2002 DAI Workshop Presentation 1 Information in
How To Improve Performance In A Database
Some issues on Conceptual Modeling and NoSQL/Big Data Tok Wang Ling National University of Singapore 1 Database Models File system - field, record, fixed length record Hierarchical Model (IMS) - fixed
An Oracle White Paper October 2013. Oracle XML DB: Choosing the Best XMLType Storage Option for Your Use Case
An Oracle White Paper October 2013 Oracle XML DB: Choosing the Best XMLType Storage Option for Your Use Case Introduction XMLType is an abstract data type that provides different storage and indexing models
Big Data With Hadoop
With Saurabh Singh [email protected] The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
+ <xs:element name="productsubtype" type="xs:string" minoccurs="0"/>
otcd.ntf.001.01.auctiondetail.. otcd.ntf.001.01.auctionresult - + otcd.ntf.001.01.automaticterminationsummary
keyon Luna SA Monitor Service Administration Guide 1 P a g e Version Autor Date Comment
Luna SA Monitor Service Administration Guide Version Autor Date Comment 1.1 Thomas Stucky 25. July 2013 Update installation instructions. 1 P a g e Table of Contents 1 Overview... 3 1.1 What is the keyon
A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS
A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS Abdelsalam Almarimi 1, Jaroslav Pokorny 2 Abstract This paper describes an approach for mediation of heterogeneous XML schemas. Such an approach is proposed
The Direct Project. Implementation Guide for Direct Project Trust Bundle Distribution. Version 1.0 14 March 2013
The Direct Project Implementation Guide for Direct Project Trust Bundle Distribution Version 1.0 14 March 2013 Version 1.0, 14 March 2013 Page 1 of 14 Contents Change Control... 3 Status of this Guide...
Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related
Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Summary Xiangzhe Li Nowadays, there are more and more data everyday about everything. For instance, here are some of the astonishing
Unified XML/relational storage March 2005. The IBM approach to unified XML/relational databases
March 2005 The IBM approach to unified XML/relational databases Page 2 Contents 2 What is native XML storage? 3 What options are available today? 3 Shred 5 CLOB 5 BLOB (pseudo native) 6 True native 7 The
5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2
Class Announcements TIM 50 - Business Information Systems Lecture 15 Database Assignment 2 posted Due Tuesday 5/26 UC Santa Cruz May 19, 2015 Database: Collection of related files containing records on
Module 1: Getting Started with Databases and Transact-SQL in SQL Server 2008
Course 2778A: Writing Queries Using Microsoft SQL Server 2008 Transact-SQL About this Course This 3-day instructor led course provides students with the technical skills required to write basic Transact-
COM_2006_023_02.xsd <?xml version="1.0" encoding="utf-8"?> <xs:schema xmlns:xs="http://www.w3.org/2001/xmlschema" elementformdefault="qualified">
Writing Queries Using Microsoft SQL Server 2008 Transact-SQL
Course 2778A: Writing Queries Using Microsoft SQL Server 2008 Transact-SQL Length: 3 Days Language(s): English Audience(s): IT Professionals Level: 200 Technology: Microsoft SQL Server 2008 Type: Course
A Semantic Marketplace of Peers Hosting Negotiating Intelligent Agents
A Semantic Marketplace of Peers Hosting Negotiating Intelligent Agents Theodore Patkos and Dimitris Plexousakis Institute of Computer Science, FO.R.T.H. Vassilika Vouton, P.O. Box 1385, GR 71110 Heraklion,
An Oracle White Paper February 2009. Managing Unstructured Data with Oracle Database 11g
An Oracle White Paper February 2009 Managing Unstructured Data with Oracle Database 11g Introduction The vast majority of the information used by corporations, enterprises, and other organizations is referred
Katta & Hadoop. Katta - Distributed Lucene Index in Production. Stefan Groschupf Scale Unlimited, 101tec. sg{at}101tec.com
1 Katta & Hadoop Katta - Distributed Lucene Index in Production Stefan Groschupf Scale Unlimited, 101tec. sg{at}101tec.com foto by: [email protected] 2 Intro Business intelligence reports from
BUSINESSOBJECTS DATA INTEGRATOR
PRODUCTS BUSINESSOBJECTS DATA INTEGRATOR IT Benefits Correlate and integrate data from any source Efficiently design a bulletproof data integration process Accelerate time to market Move data in real time
Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world
Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3
Geodatabase Programming with SQL
DevSummit DC February 11, 2015 Washington, DC Geodatabase Programming with SQL Craig Gillgrass Assumptions Basic knowledge of SQL and relational databases Basic knowledge of the Geodatabase We ll hold
The basic data mining algorithms introduced may be enhanced in a number of ways.
DATA MINING TECHNOLOGIES AND IMPLEMENTATIONS The basic data mining algorithms introduced may be enhanced in a number of ways. Data mining algorithms have traditionally assumed data is memory resident,
CA DLP. Stored Data Integration Guide. Release 14.0. 3rd Edition
CA DLP Stored Data Integration Guide Release 14.0 3rd Edition This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation
Advanced Information Management
Anwendersoftware a Advanced Information Management Chapter 1: Introduction Holger Schwarz Universität Stuttgart Sommersemester 2009 Overview Basics Terms Database Management Systems Database Design Query
Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.
Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics
Universal Event Monitor for SOA 5.2.0 Reference Guide
Universal Event Monitor for SOA 5.2.0 Reference Guide 2015 by Stonebranch, Inc. All Rights Reserved. 1. Universal Event Monitor for SOA 5.2.0 Reference Guide.............................................................
Foundations of Business Intelligence: Databases and Information Management
Chapter 5 Foundations of Business Intelligence: Databases and Information Management 5.1 Copyright 2011 Pearson Education, Inc. Student Learning Objectives How does a relational database organize data,
Data Integration using Agent based Mediator-Wrapper Architecture. Tutorial Report For Agent Based Software Engineering (SENG 609.
Data Integration using Agent based Mediator-Wrapper Architecture Tutorial Report For Agent Based Software Engineering (SENG 609.22) Presented by: George Shi Course Instructor: Dr. Behrouz H. Far December
How To Create A Privacy Preserving And Dynamic Load Balancing System In A Distributed System
Enforcing Secure and Privacy-Preserving Information Brokering with Dynamic Load Balancing in Distributed Information Sharing. 1 M.E. Computer Engineering Department GHRCEM, Wagholi, Pune. [email protected]
A P2P SERVICE DISCOVERY STRATEGY BASED ON CONTENT
A P2P SERVICE DISCOVERY STRATEGY BASED ON CONTENT CATALOGUES Lican Huang Institute of Network & Distributed Computing, Zhejiang Sci-Tech University, No.5, St.2, Xiasha Higher Education Zone, Hangzhou,
Physical Database Design Process. Physical Database Design Process. Major Inputs to Physical Database. Components of Physical Database Design
Physical Database Design Process Physical Database Design Process The last stage of the database design process. A process of mapping the logical database structure developed in previous stages into internal
DBMS / Business Intelligence, SQL Server
DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independent State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.
Gplus Adapter 8.0. for Siebel CRM. Developer's Guide
Gplus Adapter 8.0 for Siebel CRM Developer's Guide The information contained herein is proprietary and confidential and cannot be disclosed or duplicated without the prior written consent of Genesys Telecommunications
ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS
INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS Tadeusz Pankowski 1,2 1 Institute of Control and Information Engineering Poznan University of Technology Pl. M.S.-Curie 5, 60-965 Poznan
A Peer-to-Peer Approach to Content Dissemination and Search in Collaborative Networks
A Peer-to-Peer Approach to Content Dissemination and Search in Collaborative Networks Ismail Bhana and David Johnson Advanced Computing and Emerging Technologies Centre, School of Systems Engineering,
Writing Queries Using Microsoft SQL Server 2008 Transact-SQL
Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Writing Queries Using Microsoft SQL Server 2008 Transact-SQL Course 2778-08;
Modern Databases. Database Systems Lecture 18 Natasha Alechina
Modern Databases Database Systems Lecture 18 Natasha Alechina In This Lecture Distributed DBs Web-based DBs Object Oriented DBs Semistructured Data and XML Multimedia DBs For more information Connolly
Napster and Gnutella: a Comparison of two Popular Peer-to-Peer Protocols. Anthony J. Howe Supervisor: Dr. Mantis Cheng University of Victoria
Napster and Gnutella: a Comparison of two Popular Peer-to-Peer Protocols Anthony J Howe Supervisor: Dr Mantis Cheng University of Victoria February 28, 2002 Abstract This article presents the reverse engineered
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &
NS DISCOVER 4.0 ADMINISTRATOR'S GUIDE. July, 2015. Version 4.0
NS DISCOVER 4.0 ADMINISTRATOR'S GUIDE July, 2015 Version 4.0 TABLE OF CONTENTS 1 General Information... 4 1.1 Objective... 4 1.2 New 4.0 Features Improvements... 4 1.3 Migrating from 3.x to 4.x... 5 2
14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:
14 Databases 14.1 Source: Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: Define a database and a database management system (DBMS)
The Role and uses of Peer-to-Peer in file-sharing. Computer Communication & Distributed Systems EDA 390
The Role and uses of Peer-to-Peer in file-sharing Computer Communication & Distributed Systems EDA 390 Jenny Bengtsson Prarthanaa Khokar [email protected] [email protected] Gothenburg, May
Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations
1 Topics for this week: 1. Good Design 2. Functional Dependencies 3. Normalization Readings for this week: 1. E&N, Ch. 10.1-10.6; 12.2 2. Quickstart, Ch. 3 3. Complete the tutorial at http://sqlcourse2.com/
P2P: centralized directory (Napster's Approach)
P2P File Sharing P2P file sharing Example Alice runs P2P client application on her notebook computer Intermittently connects to Internet; gets new IP address for each connection Asks for Hey Jude Application
A Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System
A Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System Mohammad Ghulam Ali Academic Post Graduate Studies and Research Indian Institute of Technology, Kharagpur Kharagpur,
Principles of Distributed Database Systems
M. Tamer Özsu Patrick Valduriez Principles of Distributed Database Systems Third Edition
UIMA and WebContent: Complementary Frameworks for Building Semantic Web Applications
UIMA and WebContent: Complementary Frameworks for Building Semantic Web Applications Gaël de Chalendar CEA LIST F-92265 Fontenay aux Roses [email protected] 1 Introduction The main data sources
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala'a AL_Hamami PHD Student, Lecturer [email protected] Soukaena Hassan Hashem PHD Student, Lecturer [email protected]
Distributed Databases in a Nutshell
Distributed Databases in a Nutshell Marc Pouly [email protected] Department of Informatics University of Fribourg, Switzerland Principles of Distributed Database Systems M. T. Özsu, P. Valduriez Prentice
PEER TO PEER FILE SHARING USING NETWORK CODING
PEER TO PEER FILE SHARING USING NETWORK CODING Ajay Choudhary 1, Nilesh Akhade 2, Aditya Narke 3, Ajit Deshmane 4 Department of Computer Engineering, University of Pune Imperial College of Engineering
Reference Architecture, Requirements, Gaps, Roles
Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture
A Pluggable Security Framework for Message Oriented Middleware
A Pluggable Security Framework for Message Oriented Middleware RUEY-SHYANG WU, SHYAN-MING YUAN Department of Computer Science National Chiao-Tung University 1001 Ta Hsueh Road, Hsinchu 300, TAIWAN, R.
MySQL for Beginners Ed 3
Oracle University Contact Us: 1.800.529.0165 MySQL for Beginners Ed 3 Duration: 4 Days What you will learn The MySQL for Beginners course helps you learn about the world's most popular open source database.
Middleware support for the Internet of Things
Middleware support for the Internet of Things Karl Aberer, Manfred Hauswirth, Ali Salehi School of Computer and Communication Sciences Ecole Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne,
Generating XML from Relational Tables using ORACLE. by Selim Mimaroglu Supervisor: Betty O'Neil
Generating XML from Relational Tables using ORACLE by Selim Mimaroglu Supervisor: Betty O'Neil 1 INTRODUCTION Database: A usually large collection of data, organized specially for rapid search and retrieval
Multicast vs. P2P for content distribution
Multicast vs. P2P for content distribution Abstract Many different service architectures, ranging from centralized client-server to fully distributed are available in today's world for Content Distribution
City Data Pipeline. A System for Making Open Data Useful for Cities. [email protected]
City Data Pipeline A System for Making Open Data Useful for Cities Stefan Bischof 1,2, Axel Polleres 1, and Simon Sperl 1 1 Siemens AG Österreich, Siemensstraße 90, 1211 Vienna, Austria {bischof.stefan,axel.polleres,simon.sperl}@siemens.com
Combining Unstructured, Fully Structured and Semi-Structured Information in Semantic Wikis
Combining Unstructured, Fully Structured and Semi-Structured Information in Semantic Wikis Rolf Sint 1, Sebastian Schaffert 1, Stephanie Stroka 1 and Roland Ferstl 2 1 {firstname.surname}@salzburgresearch.at
Base One's Rich Client Architecture
Base One's Rich Client Architecture Base One provides a unique approach for developing Internet-enabled applications, combining both efficiency and ease of programming through its "Rich Client" architecture.
