The Niagara Internet Query System
|
|
|
- Walter Miller
- 9 years ago
- Views:
Transcription
1 The Niagara Internet Query System Jeffrey Naughton, David DeWitt, David Maier, Ashraf Aboulnaga, Jianjun Chen, Leonidas Galanis, Jaewoo Kang, Rajasekar Krishnamurthy, Qiong Luo, Naveen Prakash Ravishankar Ramamurthy, Jayavel Shanmugasundaram, Feng Tian, Kristin Tufte, Stratis Viglas, Yuan Wang, Chun Zhang, Bruce Jackson, Anurag Gupta, Rushan Chen Abstract Recently, there has been a great deal of research into XML query languages to enable the execution of database-style queries over XML files. However, merely being an XML query-processing engine does not render a system suitable for querying the Internet. A useful system must provide mechanisms to (a) find the XML files that are relevant to a given query, and (b) deal with remote data sources that either provide unpredictable data access and transfer rates, or are infinite streams, or both. The Niagara Internet Query System was designed from the bottom-up to provide these mechanisms. In this article we describe the overall Niagara architecture, and how Niagara finds relevant XML documents by using a collaboration between the Niagara XML-QL query processor and the Niagara text-in-context XML search engine. The Niagara Internet Query System is public domain software that can be found at 1 Introduction One of the most exciting opportunities presented by the emergence of XML is the ability to query XML data over the Internet. In our view, a query system for web-accessible Internet data should have the following characteristics. First, the query itself should not have to specify the XML files that should be consulted for its answer. This flexibility is a departure from the way current database systems work; in SQL terminology, it amounts to supporting a FROM * construct in the query language, and ideally would allow a user to pose a query, and get an answer if the query is answerable from any combination of XML files anywhere in the Internet. Secondly, a useful query system cannot assume that all the streams of data feeding its operators progress at the same speed, or even that all of the data streams feeding its operators will terminate. Once again, this is a departure from the way conventional DBMS operate. The Niagara Internet Query System is designed to have these characteristics. In this paper we describe how Niagara supports queries that do not explicitly name source XML files; support for network resident and streaming data is described elsewhere [STD+00]. The remainder of this paper is organized as follows. Section 2 gives a brief overview of XML and XML- QL and presents the overall top-level architecture of the Niagara Internet Query System. Section 3 describes the text-in-context Niagara Search Engine and the SEQL language, while Section 4 describes how XML-QL queries are evaluated. We conclude in Section 5. Copyright 2001 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 1
2 2 XML, XML-QL, and Top-Level Architecture of Niagara 2.1 XML and XML-QL <book> <title> Java Programming </title> <price> 40 </price> <author id = gosling > <name> <firstname> James </firstname> <lastname> Gosling </lastname> </name> </author> </book> Figure 1: An Example XML Document WHERE <book> <title> Java Programming </title> <author> <lastname> $l </lastname> IN * CONSTRUCT <lastname> $l </lastname> Figure 2: An Example XML-QL Query Extensible Markup Language (XML) is a hierarchical data format for information representation and exchange in the World Wide Web. An XML document consists of nested element structures, starting with a root element. Element data can be in the form of attributes or sub-elements. Figure 1 shows an XML document that contains information about a book. In this example, there is a book element that has three sub-elements title, price and author. The author element has an id attribute with value gosling and is further nested to provide name information. Much more information about XML and related standards can be found on the W3C Web Site, There are many semi-structured query languages that can be used to query XML documents, including XML-QL [DFF+99], Lorel [AQM+97], UnQL [BDH+96] and XQL [R98], and what appears to be the emerging standard, XQuery [CFR+01]. Each of these query languages has a notion of path expressions for navigating the nested structure of XML. In Niagara, we initially chose XML-QL as the query language, although we are currently switching to XQuery. Figure 2 shows an XML-QL query to determine the last name of the author of a book having the title Java Programming. The query is executed over the XML documents specified in the IN construct and the last names thus selected are nested under a lastname element. As mentioned in the introduction, one of the design goals of the Niagara query system is to allow the user the flexibility to query the web without having to explicitly specify each individual XML file to be queried. We thus extend XML-QL to support the IN * construct, whereby the query is (conceptually) executed over all the XML files present in the World Wide Web. 2.2 Top-Level Architecture of the Niagara Query System Figure 3 illustrates the main components of the Niagara Internet Query System. Users craft XML-QL queries using a graphical user interface and send them to the query engine for execution. The connection manager in the query engine accepts queries for execution and is also responsible for maintaining sessions with clients. Each query received by the connection manager is parsed and optimized. A crucial step in the optimization process is reducing the number of XML files to be consulted in order to produce the result of the query (especially in view of the IN * construct). Niagara accomplishes this reduction by extracting a search engine query from the XML-QL query. This search engine query is sent for execution to the search engine, which responds with a list of potentially relevant URLs. During execution, data from the relevant input sources is asynchronously fetched from the Internet by the data manager (if it is not already cached), and the results of execution are streamed to the user as they become available. Users can request partial 2
3 Niagara Query Engine GUI Client Niagara Search Engine Browser Connection Manager XML-QL Parser Query Optimizer Execution Engine Data Manager Server SEQL Parser SEQL Optimizer Interpreter Index Manager Crawler THE INTERNET Figure 3: Niagara System Architecture Figure 4: Query result screen shot. results at any time during the execution. In this case, the execution engine returns the result so far while at the same time processing new input data coming off the Internet. The Niagara search engine, in addition to its role in answering user XML-QL queries, can also be used as a stand-alone search engine for XML documents over the World Wide Web. The query interface to the search engine, in either case, is SEQL (Search Engine Query Language). SEQL queries are parsed and optimized in the search engine and the search engine interpreter executes the optimized execution plan. The results of the query execution are returned to the query engine or user. The inverted indices used for efficient SEQL execution are built and updated by the index manager. The index manager receives information about new and updated XML files from a crawler. The GUI is a Java application that can also be run as an applet in a web browser. In addition to providing a simple graphical user interface to generate XML-QL, it is also capable of handling multiple concurrently executing XML-QL or SEQL queries. Both the query engine and search engine are also implemented in Java and are structured as multi-threaded servers. Since the query engine and search engine are individual servers, they run as separate processes (potentially on different machines). The next two sections describe the working of the search engine and the query engine in more detail. 3 The Niagara Text-in-Context Search Engine A novel feature of the Niagara Internet Query System is that users do not need to specify source XML files in their queries. Rather, it is the responsibility of the system itself to examine the query, and from the query to determine the set of XML files that could possibly contribute to an answer to the query. Niagara does this through the use of its search engine, the Niagara Text-in-Context Search Engine. When using an existing Internet search engine, the most common query is find all the documents that contain these keywords. It is possible to construct more advanced searches based upon such properties as proximity and simple Boolean combinations of keywords. However, it is impossible to query based upon the role of the keyword in a document, because that information is not even available in the document itself. XML changes all this. As a simple, admittedly contrived example, consider searching for all documents that have information about departures of a ship named Montreal. One could go to one of the existing search engines and ask for the URLs 3
4 of all documents that contain the string Montreal, with predictable results the query will return thousands of documents, most of which have nothing to do with the ship named Montreal. Using the Niagara textin-context search engine, one can instead ask for all documents that contain departure elements that contain shipname elements that contain the string Montreal. It should now be clear what we mean by text-in-context rather than just searching for words in documents, we search for containment relationships between elements and other elements (e.g., departure element contains shipname element ) and also containment relationships between elements and strings (e.g., shipname element contains the string Montreal ). The preceding paragraph is overly simplistic a user looking for departures of ships named Montreal with a traditional search engine would be unlikely to search only on Montreal, they would be far more likely to search for Montreal, ship, departure. This will yield a far more focussed search than just giving the keyword Montreal. It is an open question how this search would fare when compared to the XML structural search. The structural search is more precise, but the value of this precision can only be determined empirically as we gain experience with the engine. Since in our experiments the structural approach is not significantly slower with respect to query evaluation, we have decided to use it. 3.1 SEQL In this section we describe Search Engine Query Language (SEQL), the language executed by the search engine, briefly describe how the search engine evaluates SEQL, and how the XML-QL engine and the search engine interact. This process is described in more detail in [NDM+00]. SEQL is a simple language designed to specify patterns that can be matched by XML files. The output of a SEQL query is the list of URLs of the files that match the query. An atomic SEQL query is a word or an XML element tag. Such a query returns the URLs of the XML files containing the word or XML element tag, respectively. Complex SEQL queries can be built from the atomic SEQL queries using the binary operators contains, containedin and is In addition, SEQL supports a proximity operator and a numeric comparison operator. SEQL also supports the standard Boolean connectives and, or, and except, which represent the intersection, union and difference of the results of the simple SEQL queries, respectively. Finally, SEQL supports a special construct conformsto, which is used to restrict the result to only the URLs of those XML files that are declared to conform to a given DTD. 3.2 Evaluating SEQL We now turn our attention to the efficient evaluation of SEQL queries in the Niagara search engine. As mentioned in Section 2.2, the search engine has two logical parts, the crawler and the SEQL execution engine. The crawler locates XML documents in the web. Since at the time of writing this paper there are relatively few XML files on the web, to evaluate the system we manually built a local collection of XML files, and then used this crawler to crawl the local subtree and find these files. The crawler passes URLs of XML files to the search engine to be indexed. The indices used by the search engine are variants of inverted lists used for information retrieval. There are three categories of inverted lists used by the search engine, which are element lists, word lists and DTD lists. The search engine maintains one element list for every unique XML element name encountered in the crawled XML files. Each entry (or posting ) of the element list has the form (fileid, beginid, endid) where fileid identifies a file containing the XML element, beginid specifies the beginning position of the element s begin tag in that file, and endid specifies the ending position of the element s end tag in the same file. The search engine also maintains one word list for each unique text word encountered in the crawled XML files. Each posting in a word list for a given word is of the form (fileid, position) where fileid identifies a file containing the word and position indicates the position of the word in that file. Finally, the search engine 4
5 maintains one DTD list for each unique DTD that the crawled XML files conform to. Each posting in a DTD list for a given DTD is of the form (fileid), where fileid identifies a file that conforms to that DTD. All three types of lists are maintained sorted by fileid to enable the efficient execution of SEQL queries. 4 Generating and Evaluating XML-QL Queries In this section we describe how users pose queries using the Niagara GUI interface, and how those queries are evaluated by the query engine. 4.1 Extracting SEQL from XML-QL The Niagara query engine sends SEQL queries to the Niagara search engine to determine the XML files over which to run an XML-QL query. To do so, the XML-QL query engine extracts a SEQL query (or queries) from the original XML-QL query. The XML-QL query engine extracts SEQL during query optimization. The SEQL extraction process does not extract all possible constraints from the query; rather it uses heuristics to avoid generating SEQL that would be likely to be extremely inefficient to evaluate. The goal of the generated SEQL is to produce a superset of the URLs that need to be consulted to evaluate the XML-QL. It would be optimal if the superset were exactly the set actually required, but for efficiency of SEQL evaluation we do not always achieve this goal. Evaluating tradeoffs between the cost of the SEQL query and the precision with which it returns useful URLs is an interesting direction for future work. 4.2 A User Interface for XML Querying In a classical relational database management query-builder interface, the basic approach is to start with the database schema. Clearly something analogous is needed for querying XML (no user is going to type in XML- QL!), but if a query is being posed over all the XML files on the Internet where is the schema? In our system, we have taken the simple approach that this schema information is derived from document type descriptors (DTDs). Both the XML-QL engine and the text-in-context search engine have graphical user interfaces. To build a query, a user starts by selecting a set of DTDs to work with. After these DTDs have been selected, the GUI displays element names and users can do standard point and click or drag and drop query building over these DTDs. Once the query has been specified, and the user clicks on submit query, an XML-QL query is generated (in the case of the XML-QL engine) or a Search Engine Query Language Query is generated (in the case of the text-in-context search engine.) Note that the resulting query will not run solely over documents conforming to the DTDs selected by the user (unless the user specifies that this is desired by including a conformsto clause). Rather, the DTDs are used to generate a candidate set of tags over which to query. Any document that can match the query over these tags will be used in answering the query, whether or not it conforms to the DTD. Clearly this is only a first step to building a good user interface. What is needed is a higher level mapping from user concepts to XML element vocabularies. We regard this area as important for future research. It is our hope that this sort of mapping will be done at a higher level than the query engine, in a layer that understands user-level concepts and can map them to schema information stored as DTDs or XML Schemas. 4.3 A Simple Query For clarity and brevity of exposition, we present a very simple query. Consider the DTD in Figure 5 that describes XML documents representing movies. The elements are self-explanatory, with the exception of W4F DOC, which is an element added by the wrapper that converted this data to XML [SA99]. 5
6 <?xml encoding="iso "?> <!ELEMENT W4F_DOC (Movie)> <!ELEMENT Movie (Title,Year,Directed_By,Genres,Cast)> <!ELEMENT Title (#PCDATA)> <!ELEMENT Year (#PCDATA)> <!ELEMENT Directed_By (Director)*> <!ELEMENT Director (#PCDATA)> <!ELEMENT Genres (Genre)*> <!ELEMENT Genre (#PCDATA)> <!ELEMENT Cast (Actor)*> <!ELEMENT Actor (FirstName,LastName)> <!ELEMENT FirstName (#PCDATA)> <!ELEMENT LastName (#PCDATA)> Figure 5: Example Movie DTD. Figure 6: Niagara Query Interface Example. Figure 6 shows a screen shot of the Niagara GUI specifying the query retrieve the Movie title and the Cast of movies directed by Terry Gilliam over the DTD in Figure 5. The XML-QL query that is generated in response to the query specified in the GUI is shown in Figure 7. Figure 8 shows the SEQL query the query engine extracts from this XML-QL query. In response to the SEQL query, the search engine returns three URLs, which have been filtered from over 600 XML documents that conform to the movies DTD. These URLs are passed to the query engine, which evaluates the XML-QL query, giving the result shown in Figure 4. WHERE <W4F_DOC> <Movie> <Title>$v68 <Directed_By> <Director>$v71 <Cast>$v74 IN "*" conform_to " xml-movies/movies.dtd", $v71 = "Terry Gilliam" CONSTRUCT <result> <Title>$v68 <Cast>$v74 (W4F_DOC CONTAINS (Movie CONTAINS ((Directed_By CONTAINS (Director IS "Terry Gilliam")) AND (Title AND Cast) ) ) ) conformsto " xml-movies/movies.dtd" Figure 8: SEQL Extracted from Terry Gilliam query. Figure 7: Generated XML-QL. 6
7 5 Conclusion The Niagara Internet Query System is designed to enable users to pose XML queries over the Internet. It differs from traditional database systems in (a) how it decides which files to use as input, (b) how it handles input sources that have unpredictable performance or may be infinite streams or both, and (c) in its support for large numbers of triggers. In this paper we have focussed on the interaction between the search engine and the query engine; other aspects of the system are described elsewhere [CDT+00, STD+00]. The Niagara project is on-going; much more information about the system is available from the project homepage, Acknowledgements Funding for this work was provided by DARPA through NAVY/SPAWAR Contract No. N and by NSF Awards CDA and ITR References [AQM+97] S. Abiteboul, D. Quass, J. McHugh, J. Widom, J. Wiener, The Lorel Query Language for Semistructured Data, International Journal on Digital Libraries, 1(1), pp , April [BDH+96] [CFR+01] [CDT+00] [DFF+99] P. Buneman, S. Davidson, G. Hillebrand, D. Suciu, A Query Language and Optimization Techniques for Unstructured Data, Proceedings of the 1996 ACM SIGMOD Conference, Montreal, Canada, June D. Chamberlin, D. Florescu, J. Robie, J. Simeon, M. Stefanescu, XQuery: A Query Language for XML. W3C Working Draft, available at J. Chen, D. J. DeWitt, F. Tian, Y. Wang, NiagaraCQ: A Scalable Continuous Query System for Internet Databases, Proceedings of the 2000 ACM SIGMOD Conference, Dallas, TX, May A. Deutsch, M. Fernandez, D. Florescu, A. Levy, D. Suciu, XML-QL: A Query Language for XML, Proceedings of the Eighth International World Wide Web Conference, Toronto, May [NDM+00] J. Naughton, D. DeWitt, D. Maier, et al., The Niagara Internet Query System. Available from [R98] [SA99] [STD+00] Jonathan Robie, The Design of XQL, Texcel Research, November Sahuguet, Arnaud and Fabien Azavant. Web Ecology, Recycling HTML pages as XML documents using W4F, Proceedings of the 1999 WebDB Workshop, June J. Shanmugasundaram, K. Tufte, D. DeWitt, D. Maier, J. Naughton, Architecting a Network Query Engine for Producing Partial Results, Lecture Notes in Computer Science, Vol. 1997, Springer- Verlag Publishers, A short version of this paper appeared in the proceedings of WebDB 2000 and is also available from niagara/papers/partialresultsperformance.pdf. 7
Relational Databases for Querying XML Documents: Limitations and Opportunities. Outline. Motivation and Problem Definition Querying XML using a RDBMS
Relational Databases for Querying XML Documents: Limitations and Opportunities Jayavel Shanmugasundaram Kristin Tufte Gang He Chun Zhang David DeWitt Jeffrey Naughton Outline Motivation and Problem Definition
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Semistructured data and XML. Institutt for Informatikk INF3100 09.04.2013 Ahmet Soylu
Semistructured data and XML Institutt for Informatikk 1 Unstructured, Structured and Semistructured data Unstructured data e.g., text documents Structured data: data with a rigid and fixed data format
Model-Mapping Approaches for Storing and Querying XML Documents in Relational Database: A Survey
Model-Mapping Approaches for Storing and Querying XML Documents in Relational Database: A Survey 1 Amjad Qtaish, 2 Kamsuriah Ahmad 1 School of Computer Science, Faculty of Information Science and Technology,
Unraveling the Duplicate-Elimination Problem in XML-to-SQL Query Translation
Unraveling the Duplicate-Elimination Problem in XML-to-SQL Query Translation Rajasekar Krishnamurthy University of Wisconsin [email protected] Raghav Kaushik Microsoft Corporation [email protected]
Modern Databases. Database Systems Lecture 18 Natasha Alechina
Modern Databases Database Systems Lecture 18 Natasha Alechina In This Lecture Distributed DBs Web-based DBs Object Oriented DBs Semistructured Data and XML Multimedia DBs For more information Connolly
An XML Based Data Exchange Model for Power System Studies
ARI The Bulletin of the Istanbul Technical University VOLUME 54, NUMBER 2 Communicated by Sondan Durukanoğlu Feyiz An XML Based Data Exchange Model for Power System Studies Hasan Dağ Department of Electrical
Abstract 1. INTRODUCTION
A Virtual Database Management System For The Internet Alberto Pan, Lucía Ardao, Manuel Álvarez, Juan Raposo and Ángel Viña University of A Coruña. Spain e-mail: {alberto,lucia,mad,jrs,avc}@gris.des.fi.udc.es
Dr. Anuradha et al. / International Journal on Computer Science and Engineering (IJCSE)
HIDDEN WEB EXTRACTOR DYNAMIC WAY TO UNCOVER THE DEEP WEB DR. ANURADHA YMCA,CSE, YMCA University Faridabad, Haryana 121006,India [email protected] http://www.ymcaust.ac.in BABITA AHUJA MRCE, IT, MDU University
Chapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases
Website Standards Association. Business Website Search Engine Optimization
Website Standards Association Business Website Search Engine Optimization Copyright 2008 Website Standards Association Page 1 1. FOREWORD...3 2. PURPOSE AND SCOPE...4 2.1. PURPOSE...4 2.2. SCOPE...4 2.3.
Introduction to XML Applications
EMC White Paper Introduction to XML Applications Umair Nauman Abstract: This document provides an overview of XML Applications. This is not a comprehensive guide to XML Applications and is intended for
XFlash A Web Application Design Framework with Model-Driven Methodology
International Journal of u- and e- Service, Science and Technology 47 XFlash A Web Application Design Framework with Model-Driven Methodology Ronnie Cheung Hong Kong Polytechnic University, Hong Kong SAR,
Lesson 4 Web Service Interface Definition (Part I)
Lesson 4 Web Service Interface Definition (Part I) Service Oriented Architectures Module 1 - Basic technologies Unit 3 WSDL Ernesto Damiani Università di Milano Interface Definition Languages (1) IDLs
MTCache: Mid-Tier Database Caching for SQL Server
MTCache: Mid-Tier Database Caching for SQL Server Per-Åke Larson Jonathan Goldstein Microsoft {palarson,jongold}@microsoft.com Hongfei Guo University of Wisconsin [email protected] Jingren Zhou Columbia
Using Database Metadata and its Semantics to Generate Automatic and Dynamic Web Entry Forms
Using Database Metadata and its Semantics to Generate Automatic and Dynamic Web Entry Forms Mohammed M. Elsheh and Mick J. Ridley Abstract Automatic and dynamic generation of Web applications is the future
Translating between XML and Relational Databases using XML Schema and Automed
Imperial College of Science, Technology and Medicine (University of London) Department of Computing Translating between XML and Relational Databases using XML Schema and Automed Andrew Charles Smith acs203
Implementing a Web-based Transportation Data Management System
Presentation for the ITE District 6 Annual Meeting, June 2006, Honolulu 1 Implementing a Web-based Transportation Data Management System Tim Welch 1, Kristin Tufte 2, Ransford S. McCourt 3, Robert L. Bertini
Bitrix Site Manager 4.1. User Guide
Bitrix Site Manager 4.1 User Guide 2 Contents REGISTRATION AND AUTHORISATION...3 SITE SECTIONS...5 Creating a section...6 Changing the section properties...8 SITE PAGES...9 Creating a page...10 Editing
A Document Management System Based on an OODB
Tamkang Journal of Science and Engineering, Vol. 3, No. 4, pp. 257-262 (2000) 257 A Document Management System Based on an OODB Ching-Ming Chao Department of Computer and Information Science Soochow University
A LANGUAGE INDEPENDENT WEB DATA EXTRACTION USING VISION BASED PAGE SEGMENTATION ALGORITHM
A LANGUAGE INDEPENDENT WEB DATA EXTRACTION USING VISION BASED PAGE SEGMENTATION ALGORITHM 1 P YesuRaju, 2 P KiranSree 1 PG Student, 2 Professorr, Department of Computer Science, B.V.C.E.College, Odalarevu,
Developing XML Solutions with JavaServer Pages Technology
Developing XML Solutions with JavaServer Pages Technology XML (extensible Markup Language) is a set of syntax rules and guidelines for defining text-based markup languages. XML languages have a number
AN ENHANCED DATA MODEL AND QUERY ALGEBRA FOR PARTIALLY STRUCTURED XML DATABASE
THE UNIVERSITY OF SHEFFIELD DEPARTMENT OF COMPUTER SCIENCE RESEARCH MEMORANDA CS-03-08 MPHIL/PHD UPGRADE REPORT AN ENHANCED DATA MODEL AND QUERY ALGEBRA FOR PARTIALLY STRUCTURED XML DATABASE SUPERVISORS:
Web Browsing Quality of Experience Score
Web Browsing Quality of Experience Score A Sandvine Technology Showcase Contents Executive Summary... 1 Introduction to Web QoE... 2 Sandvine s Web Browsing QoE Metric... 3 Maintaining a Web Page Library...
Lightweight Data Integration using the WebComposition Data Grid Service
Lightweight Data Integration using the WebComposition Data Grid Service Ralph Sommermeier 1, Andreas Heil 2, Martin Gaedke 1 1 Chemnitz University of Technology, Faculty of Computer Science, Distributed
Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches
Concepts of Database Management Seventh Edition Chapter 9 Database Management Approaches Objectives Describe distributed database management systems (DDBMSs) Discuss client/server systems Examine the ways
KEYWORD SEARCH IN RELATIONAL DATABASES
KEYWORD SEARCH IN RELATIONAL DATABASES N.Divya Bharathi 1 1 PG Scholar, Department of Computer Science and Engineering, ABSTRACT Adhiyamaan College of Engineering, Hosur, (India). Data mining refers to
Introduction to IR Systems: Supporting Boolean Text Search. Information Retrieval. IR vs. DBMS. Chapter 27, Part A
Introduction to IR Systems: Supporting Boolean Text Search Chapter 27, Part A Database Management Systems, R. Ramakrishnan 1 Information Retrieval A research field traditionally separate from Databases
Web 2.0-based SaaS for Community Resource Sharing
Web 2.0-based SaaS for Community Resource Sharing Corresponding Author Department of Computer Science and Information Engineering, National Formosa University, [email protected] doi : 10.4156/jdcta.vol5.issue5.14
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
Last Week. XML (extensible Markup Language) HTML Deficiencies. XML Advantages. Syntax of XML DHTML. Applets. Modifying DOM Event bubbling
XML (extensible Markup Language) Nan Niu ([email protected]) CSC309 -- Fall 2008 DHTML Modifying DOM Event bubbling Applets Last Week 2 HTML Deficiencies Fixed set of tags No standard way to create new
Topics in basic DBMS course
Topics in basic DBMS course Database design Transaction processing Relational query languages (SQL), calculus, and algebra DBMS APIs Database tuning (physical database design) Basic query processing (ch
Relational Databases
Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 18 Relational data model Domain domain: predefined set of atomic values: integers, strings,... every attribute
OWL based XML Data Integration
OWL based XML Data Integration Manjula Shenoy K Manipal University CSE MIT Manipal, India K.C.Shet, PhD. N.I.T.K. CSE, Suratkal Karnataka, India U. Dinesh Acharya, PhD. ManipalUniversity CSE MIT, Manipal,
DTD Tutorial. About the tutorial. Tutorial
About the tutorial Tutorial Simply Easy Learning 2 About the tutorial DTD Tutorial XML Document Type Declaration commonly known as DTD is a way to describe precisely the XML language. DTDs check the validity
DataDirect XQuery Technical Overview
DataDirect XQuery Technical Overview Table of Contents 1. Feature Overview... 2 2. Relational Database Support... 3 3. Performance and Scalability for Relational Data... 3 4. XML Input and Output... 4
Concrete uses of XML in software development and data analysis.
Concrete uses of XML in software development and data analysis. S. Patton LBNL, Berkeley, CA 94720, USA XML is now becoming an industry standard for data description and exchange. Despite this there are
XML Programming with PHP and Ajax
http://www.db2mag.com/story/showarticle.jhtml;jsessionid=bgwvbccenyvw2qsndlpskh0cjunn2jvn?articleid=191600027 XML Programming with PHP and Ajax By Hardeep Singh Your knowledge of popular programming languages
Terms and Definitions for CMS Administrators, Architects, and Developers
Sitecore CMS 6 Glossary Rev. 081028 Sitecore CMS 6 Glossary Terms and Definitions for CMS Administrators, Architects, and Developers Table of Contents Chapter 1 Introduction... 3 1.1 Glossary... 4 Page
A Java proxy for MS SQL Server Reporting Services
1 of 5 1/10/2005 9:37 PM Advertisement: Support JavaWorld, click here! January 2005 HOME FEATURED TUTORIALS COLUMNS NEWS & REVIEWS FORUM JW RESOURCES ABOUT JW A Java proxy for MS SQL Server Reporting Services
WEBVIEW An SQL Extension for Joining Corporate Data to Data Derived from the World Wide Web
WEBVIEW An SQL Extension for Joining Corporate Data to Data Derived from the World Wide Web Charles A Wood and Terence T Ow Mendoza College of Business University of Notre Dame Notre Dame, IN 46556-5646
10CS73:Web Programming
10CS73:Web Programming Question Bank Fundamentals of Web: 1.What is WWW? 2. What are domain names? Explain domain name conversion with diagram 3.What are the difference between web browser and web server
Software Visualization Tools for Component Reuse
Software Visualization Tools for Component Reuse Craig Anslow Stuart Marshall James Noble Robert Biddle 1 School of Mathematics, Statistics and Computer Science, Victoria University of Wellington, New
City Data Pipeline. A System for Making Open Data Useful for Cities. [email protected]
City Data Pipeline A System for Making Open Data Useful for Cities Stefan Bischof 1,2, Axel Polleres 1, and Simon Sperl 1 1 Siemens AG Österreich, Siemensstraße 90, 1211 Vienna, Austria {bischof.stefan,axel.polleres,simon.sperl}@siemens.com
How To Use X Query For Data Collection
TECHNICAL PAPER BUILDING XQUERY BASED WEB SERVICE AGGREGATION AND REPORTING APPLICATIONS TABLE OF CONTENTS Introduction... 1 Scenario... 1 Writing the solution in XQuery... 3 Achieving the result... 6
LabVIEW Internet Toolkit User Guide
LabVIEW Internet Toolkit User Guide Version 6.0 Contents The LabVIEW Internet Toolkit provides you with the ability to incorporate Internet capabilities into VIs. You can use LabVIEW to work with XML documents,
Database Systems. Lecture 1: Introduction
Database Systems Lecture 1: Introduction General Information Professor: Leonid Libkin Contact: [email protected] Lectures: Tuesday, 11:10am 1 pm, AT LT4 Website: http://homepages.inf.ed.ac.uk/libkin/teach/dbs09/index.html
Application of XML Tools for Enterprise-Wide RBAC Implementation Tasks
Application of XML Tools for Enterprise-Wide RBAC Implementation Tasks Ramaswamy Chandramouli National Institute of Standards and Technology Gaithersburg, MD 20899,USA 001-301-975-5013 [email protected]
A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS
A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS Abdelsalam Almarimi 1, Jaroslav Pokorny 2 Abstract This paper describes an approach for mediation of heterogeneous XML schemas. Such an approach is proposed
Querying Combined Cloud-Based and Relational Databases
Querying Combined Cloud-Based and Relational Databases Minpeng Zhu and Tore Risch Department of Information Technology, Uppsala University, Sweden [email protected] [email protected] Abstract An increasing
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
Social Relationship Analysis with Data Mining
Social Relationship Analysis with Data Mining John C. Hancock Microsoft Corporation www.johnchancock.net November 2005 Abstract: The data mining algorithms in Microsoft SQL Server 2005 can be used as a
VIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR
VIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR Andrey V.Lyamin, State University of IT, Mechanics and Optics St. Petersburg, Russia Oleg E.Vashenkov, State University of IT, Mechanics and Optics, St.Petersburg,
XML: extensible Markup Language. Anabel Fraga
XML: extensible Markup Language Anabel Fraga Table of Contents Historic Introduction XML vs. HTML XML Characteristics HTML Document XML Document XML General Rules Well Formed and Valid Documents Elements
estatistik.core: COLLECTING RAW DATA FROM ERP SYSTEMS
WP. 2 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing (Bonn, Germany, 25-27 September
Technologies for a CERIF XML based CRIS
Technologies for a CERIF XML based CRIS Stefan Bärisch GESIS-IZ, Bonn, Germany Abstract The use of XML as a primary storage format as opposed to data exchange raises a number of questions regarding the
Data XML and XQuery A language that can combine and transform data
Data XML and XQuery A language that can combine and transform data John de Longa Solutions Architect DataDirect technologies [email protected] Mobile +44 (0)7710 901501 Data integration through
The Web Web page Links 16-3
Chapter Goals Compare and contrast the Internet and the World Wide Web Describe general Web processing Write basic HTML documents Describe several specific HTML tags and their purposes 16-1 Chapter Goals
INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS
INTEGRATION OF XML DATA IN PEER-TO-PEER E-COMMERCE APPLICATIONS Tadeusz Pankowski 1,2 1 Institute of Control and Information Engineering Poznan University of Technology Pl. M.S.-Curie 5, 60-965 Poznan
SnapLogic Salesforce Snap Reference
SnapLogic Salesforce Snap Reference Document Release: October 2012 SnapLogic, Inc. 71 East Third Avenue San Mateo, California 94401 U.S.A. www.snaplogic.com Copyright Information 2012 SnapLogic, Inc. All
Web Data Management - Some Issues
Web Data Management - Some Issues Properties of Web Data Lack of a schema Data is at best semi-structured Missing data, additional attributes, similar data but not identical Volatility Changes frequently
Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
Recovering Business Rules from Legacy Source Code for System Modernization
Recovering Business Rules from Legacy Source Code for System Modernization Erik Putrycz, Ph.D. Anatol W. Kark Software Engineering Group National Research Council, Canada Introduction Legacy software 000009*
Extensible Markup Language (XML): Essentials for Climatologists
Extensible Markup Language (XML): Essentials for Climatologists Alexander V. Besprozvannykh CCl OPAG 1 Implementation/Coordination Team The purpose of this material is to give basic knowledge about XML
XML Processing and Web Services. Chapter 17
XML Processing and Web Services Chapter 17 Textbook to be published by Pearson Ed 2015 in early Pearson 2014 Fundamentals of http://www.funwebdev.com Web Development Objectives 1 XML Overview 2 XML Processing
Indexing Techniques for Data Warehouses Queries. Abstract
Indexing Techniques for Data Warehouses Queries Sirirut Vanichayobon Le Gruenwald The University of Oklahoma School of Computer Science Norman, OK, 739 [email protected] [email protected] Abstract Recently,
Efficient Mapping XML DTD to Relational Database
Efficient Mapping XML DTD to Relational Database Mohammed Adam Ibrahim Fakharaldien 1, Khalid Edris 2, Jasni Mohamed Zain 3, Norrozila Sulaiman 4 Faculty of Computer System and Software Engineering,University
CSE 233. Database System Overview
CSE 233 Database System Overview 1 Data Management An evolving, expanding field: Classical stand-alone databases (Oracle, DB2, SQL Server) Computer science is becoming data-centric: web knowledge harvesting,
Advantages of XML as a data model for a CRIS
Advantages of XML as a data model for a CRIS Patrick Lay, Stefan Bärisch GESIS-IZ, Bonn, Germany Summary In this paper, we present advantages of using a hierarchical, XML 1 -based data model as the basis
CHAPTER 1 INTRODUCTION
CHAPTER 1 INTRODUCTION 1.1 Introduction Nowadays, with the rapid development of the Internet, distance education and e- learning programs are becoming more vital in educational world. E-learning alternatives
Object Oriented Databases. OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar
Object Oriented Databases OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar Executive Summary The presentation on Object Oriented Databases gives a basic introduction to the concepts governing OODBs
Data Integration using Agent based Mediator-Wrapper Architecture. Tutorial Report For Agent Based Software Engineering (SENG 609.
Data Integration using Agent based Mediator-Wrapper Architecture Tutorial Report For Agent Based Software Engineering (SENG 609.22) Presented by: George Shi Course Instructor: Dr. Behrouz H. Far December
