A Virtual Database Management System For The Internet

Alberto Pan, Lucía Ardao, Manuel Álvarez, Juan Raposo and Ángel Viña
University of A Coruña, Spain
{alberto,lucia,mad,jrs,avc}@gris.des.fi.udc.es
Address: Dpto. Electrónica y Sistemas. Campus de Elviña S/N, Universidad de A Coruña. Spain

Abstract

Virtual Databases (VDBs) differ from standard databases in that the data are not actually stored in the database. Instead, the data can be stored remotely in several heterogeneous semi-structured sources. Virtual databases offer a uniform way to query and integrate this information. We present a VDB system that focuses on reusing the public information available on the World Wide Web, providing programmers with an easy and quick way to use that information.

1. INTRODUCTION

The World Wide Web has become a huge repository for all kinds of information. Many applications could benefit substantially if they could query this repository easily and efficiently. However, WWW information is written in HTML pages, which are human-readable through a web browser but not machine-readable in a straightforward manner. This is due to the lack of semantic capabilities in HTML and to the fact that machine readability is not usually a concern for HTML authors when the pages are created.

Nevertheless, much of the WWW information is not completely unstructured. Many web sites provide information in a semi-structured way. Typical examples include HTML tables, outputs from on-line search forms, etc. Virtual Databases [1] provide a way to benefit from this huge repository. We present a Virtual Database Management System (VDBMS) that has proven to be useful in many real-world applications [2] [3] [4].

Virtual Databases (VDBs) differ from standard databases in that the data are not actually stored in the database. Instead, the data can be stored remotely in several heterogeneous semi-structured sources. Virtual databases offer a uniform way to query and integrate this information.
We present a VDB system that focuses on reusing the public information available on the World Wide Web, providing programmers with an easy and quick way to use that information.

Application developers can easily create a Virtual Database in our VDBMS by specifying the table structure of the database as they would in a standard DBMS. They then specify the web sources from which the data will be extracted, along with a simple description of each source. The VDBMS then automatically generates wrappers that provide transparent access to these sources, so that programmers can write standard database queries to access the data.

Key issues in building a VDBMS are performance and how easily programmers can generate wrappers automatically for the desired information sources.
To improve system performance we rely on asynchronous multi-threaded operation and on a cache system. Besides normal cache operation, our cache system can transparently pre-load the most frequently requested data and answer queries by filtering the results of previously cached ones. For easy and quick wrapper generation, we have developed an innovative tool that lets users write specifications describing the sources in a simple language. The entire process of adding a new information source often takes no more than 5-10 minutes.

As remarked previously, our VDBMS has already been used successfully in a number of real-world applications, including the first comparative shopping tool in the Spanish Internet [2] and several projects to provide web content to Internet enterprises and audience sites in domains such as traffic, flights, tourism and financial product comparison [3] [4] [11] [12].

Section 2 of this paper gives an overview of the system architecture. Section 3 shows how to create a table in the VDBMS, including the process needed to fill the table with data from remote web sources. Section 4 focuses on the wrapper generation tool. Section 5 focuses on the cache system. Section 6 shows some real-world examples using the VDBMS. Section 7 lists the conclusions of this work and outlines future improvements.

2. SYSTEM ARCHITECTURE

Figure 1 shows a schema of the VDBMS architectural components. Section 2.1 shows how the components of the architecture interact to answer a query against a table in the VDBMS.

[Figure 1: VDBMS architecture. Components shown: Application Program, Query Language, Query Interpreter, Data Dictionary, Query Engine, Cache, Filter Engine, Wrapper 1 ... Wrapper n, Specification Language, Source 1 ... Source n.]
2.1. Answering queries

The application program can make queries against the VDBMS in a specific language. Our query language is currently quite simple. Queries are restricted to a single table, and a typical query looks like this:

(field operator {value1,...,valuen}) relational-operator (field operator {value1,...,valuen}) sorgroup-operator {field1,...,fieldn}

For instance, if we had a table named BOOK with fields TITLE (String), AUTHOR (String) and PRICE (Money), we could write a query like this:

(title contains { java, xml }) and (author contains { Rick Smith }) and (price lessthan {(30,EURO)}) sortby asc {price}

to retrieve the rows in the table representing books with the words java and xml in their title, written by Rick Smith and with a price under 30 Euros, sorted by ascending price.

The query interpreter is in charge of parsing the query and transforming it into an internal format. The Data Dictionary of the VDBMS must be accessed at this point to ensure consistency between the query and the table schema. Then the query engine starts to resolve the query. If the cache is activated for the queried table, the query is sent to the cache system. If the cache system is able to resolve the query, it returns the results to the query engine, which can return them to the invoking application. See section 5 for a more detailed explanation of this process.

If the cache is not activated or cannot answer the query, the query engine looks in the Data Dictionary for the sources relevant to this query. Usually the relevant sources for a query against a certain table will be all the sources associated with that table, but it is possible to choose alternative sources depending on the fields of the table involved in the query. The query engine then dynamically creates one wrapper for each relevant source.
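To make the query structure concrete, the following is a minimal Python sketch of how the book query above might look once parsed, and how it could be evaluated over rows already extracted from the sources. The names `Condition`, `Query`, `OPERATORS` and the sample rows are hypothetical, and the paper does not describe its internal format; the sketch also assumes that all conditions are joined with `and` and that prices in different currencies never match.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def contains(cell: str, values: List[str]) -> bool:
    """True when every listed word occurs in the cell (case-insensitive)."""
    text = cell.lower()
    return all(v.lower() in text for v in values)


def lessthan(cell, values) -> bool:
    """Compare a (amount, currency) cell with a (limit, currency) value."""
    amount, currency = cell
    limit, limit_currency = values[0]
    return currency == limit_currency and amount < limit


# Hypothetical operator table; only the two operators of the example.
OPERATORS = {"contains": contains, "lessthan": lessthan}


@dataclass
class Condition:
    field: str
    operator: str
    values: list

    def matches(self, row: Row) -> bool:
        return OPERATORS[self.operator](row[self.field], self.values)


@dataclass
class Query:
    table: str
    conditions: List[Condition]      # assumed to be joined with "and"
    sort_by: Optional[str] = None
    ascending: bool = True

    def run(self, rows: List[Row]) -> List[Row]:
        hits = [r for r in rows if all(c.matches(r) for c in self.conditions)]
        if self.sort_by is not None:
            hits.sort(key=lambda r: r[self.sort_by], reverse=not self.ascending)
        return hits


# Made-up rows standing in for data extracted from web sources.
BOOKS = [
    {"title": "Java and XML", "author": "Rick Smith", "price": (25.0, "EURO")},
    {"title": "Java and XML in Depth", "author": "Rick Smith", "price": (45.0, "EURO")},
    {"title": "Learning Java", "author": "Rick Smith", "price": (20.0, "EURO")},
    {"title": "XML for Java Developers", "author": "Rick Smith", "price": (19.0, "EURO")},
]

query = Query(
    table="BOOK",
    conditions=[
        Condition("title", "contains", ["java", "xml"]),
        Condition("author", "contains", ["Rick Smith"]),
        Condition("price", "lessthan", [(30, "EURO")]),
    ],
    sort_by="price",
)
result_titles = [row["title"] for row in query.run(BOOKS)]
```

With these sample rows, `result_titles` holds the two matching books ordered by ascending price.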
In order to create the wrapper, the query engine looks in the Data Dictionary for a specification associated with the source (written by the table creator when the source was added). This specification is the input to the wrapper generation tool, which automatically generates the wrapper for the source (see section 4 for details and an example of a specification for adding a web source). The wrapper is then in charge of obtaining the partial results of the query provided by a given source. Therefore, the wrapper must be able to reformulate the query in terms of the remote source (so that the query is understood by the remote source), and it must also be able to understand the source output format in order to map the output given by the source to the table structure. When the remote sources are web sources (the usual case), making a query against the source means automatically filling in some kind of web form and executing an HTTP GET or
POST operation against the web server of the source (in that sense, a VDBMS can be seen as a sophisticated type of metasearch engine). Understanding the source results means parsing the HTML (and sometimes XML or Javascript) returned by the search to extract the items found (for instance, parsing the books returned by an on-line search form of an on-line bookshop).

Obviously, wrapper generation is a key point. If a VDBMS is intended to be a powerful tool, it is very important that wrappers can be created easily. That is, it should be easy and quick to write the specification for adding a given web source. Section 4 explains our wrapper generation system in more detail, but we want to remark that our wrapper generation tool has proven to be powerful and easy to use. Currently, we have extracted information from more than 250 different web sources in many application domains (see section 6 for some notable examples). The specifications are usually written by non-programmers (only some knowledge of HTML and HTTP concepts is needed to add new sources to our system), and the typical time for adding a source is between 10 and 20 minutes.

When the query engine receives each result from the wrappers, the filter engine works to ensure that the given results are consistent with the query and with the table schema. This is needed because wrappers are not forced to return complete, valid results in certain cases. For instance, to make the previous query about books on java and xml written by Rick Smith and with a price under 30 euros directly against a remote source, that source would need a web form which lets users search the books in its database by title, author and price. But many on-line bookshops do not have such detailed search interfaces. For instance, in many bookshops users can only search by title or by author but not by both, or perhaps it is possible to search by both title and author, but not by price.
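A source with a limited search form can still serve such a query if only the supported condition is sent to the source and the full query is re-checked locally on whatever comes back. The sketch below illustrates that idea in Python under stated assumptions: `CATALOG`, `title_only_source`, `filter_engine` and `answer` are hypothetical names, the catalogue entries are invented, and the stand-in source is an in-memory list rather than a real bookshop.

```python
from typing import Dict, List

Row = Dict[str, object]

# Invented catalogue standing in for a remote bookshop.
CATALOG: List[Row] = [
    {"title": "Java and XML", "author": "Rick Smith", "price": 25.0},
    {"title": "Java and XML Cookbook", "author": "Ann Lee", "price": 22.0},
    {"title": "Professional Java XML", "author": "Rick Smith", "price": 48.0},
    {"title": "Learning Python", "author": "Rick Smith", "price": 18.0},
]


def title_only_source(words: List[str]) -> List[Row]:
    """Stand-in for a source whose form can only search by title words."""
    return [b for b in CATALOG
            if all(w.lower() in str(b["title"]).lower() for w in words)]


def filter_engine(rows: List[Row], title_words: List[str],
                  author: str, max_price: float) -> List[Row]:
    """Re-check every condition of the original query on the returned rows."""
    kept = []
    for row in rows:
        title_ok = all(w.lower() in str(row["title"]).lower() for w in title_words)
        author_ok = author.lower() in str(row["author"]).lower()
        price_ok = float(row["price"]) < max_price
        if title_ok and author_ok and price_ok:
            kept.append(row)
    return kept


def answer(title_words: List[str], author: str, max_price: float) -> List[Row]:
    # The wrapper sends the more general query (title only) to the source...
    candidates = title_only_source(title_words)
    # ...and the remaining conditions are enforced locally.
    return filter_engine(candidates, title_words, author, max_price)


candidates = title_only_source(["java", "xml"])
final_titles = [r["title"] for r in answer(["java", "xml"], "Rick Smith", 30.0)]
```

Here the source returns three candidate books for the title-only search, and local filtering by author and price keeps only one of them.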
In such situations, the wrapper chooses to issue a more general query and lets the filter engine remove the unwanted results. For instance, the wrapper could search only by title, knowing that the filter engine will remove the results which do not match the author and price search criteria.

With all the results returned by the wrappers, the query engine constructs a result set. If the query contains sort or group-by operations over the data, the query engine uses the filter engine to execute them over the result set. It is important to note that the VDBMS can also operate in an asynchronous manner (see section 5 for details). That means that the application can access the result set before it is complete (so the application need not wait for all the results before processing those already received). Finally, the result set is returned to the application program. If the cache is activated, the result set is also stored in the cache.

3. CREATING A TABLE IN THE VDBMS

The process of creating a table in our VDBMS involves two steps: (1) define the table schema and (2) configure wrappers to search and extract data from web sources. Step (1) is not very different from the process of creating a table in a standard DBMS. A table schema consists of a list of fields. Each field has a data type and can have
associated constraints. Some data types currently available in our VDBMS are strings, integers, long integers, money, date, URL, etc. Some restrictions currently available are: uniqueness, field not null, field not searchable, must exist and be accessible (only applicable to URL fields), etc. Step (2) consists of configuring wrappers for each data source. For each source a wrapper must know: 1) how to search in the source and 2) how to understand the results of a search. Both (1) and (2) are done using a graphical web administration tool, and no code at all is required.

4. WRAPPER GENERATION TOOL

In this section we describe the wrapper generation tool, which is able to generate wrappers around semi-structured web sites without using any domain-specific heuristics (we have conducted several successful tests of the tool with many Web information sources in different domains). A wrapper for a web source must be able to do two different things with the source: 1) search in it and 2) understand the results of a search.

Task 1) requires being able to automatically generate and submit HTTP web forms. When a user of the VDBMS needs to add a new source, he or she provides our wrapper generation tool with the URL of the page where the desired HTTP form is located. The tool automatically finds the web forms in the page and presents them to the user. The user can then associate the fields of the web form with searchable fields of the table (for instance, the user would associate the field for searching by title in an on-line bookshop web form with the field title of the table BOOK). The tool then does the rest of the work and generates a URL pattern that will be used to search in the source. The user can also associate fields in the form with operators in the VDB. For instance, he or she can associate a search-by-keyword checkbox in an HTTP form with the containskeyword operator of the VDB.
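As a rough illustration of what a generated URL pattern might amount to for a GET form, the Python sketch below maps table fields to form fields and builds the search URL with the standard library's `urllib.parse.urlencode`. The function name `build_search_url`, the form field names `q_title` and `q_author`, the fixed `mode` parameter and the `bookshop.example` address are all hypothetical; the paper does not show the format of its URL patterns.

```python
from typing import Dict
from urllib.parse import urlencode


def build_search_url(action_url: str,
                     field_map: Dict[str, str],
                     fixed_params: Dict[str, str],
                     query_values: Dict[str, str]) -> str:
    """Build a GET URL for a search form.

    field_map    maps table fields (e.g. TITLE) to form field names.
    fixed_params are form fields with constant values, such as a checkbox
                 associated with an operator.
    query_values holds the values requested for each table field.
    """
    params = dict(fixed_params)
    for table_field, value in query_values.items():
        form_field = field_map.get(table_field)
        if form_field is None:
            # This source cannot search by this field; the condition is
            # left to be checked after the results come back.
            continue
        params[form_field] = value
    return f"{action_url}?{urlencode(params)}"


url = build_search_url(
    action_url="http://bookshop.example/search",       # hypothetical source
    field_map={"TITLE": "q_title", "AUTHOR": "q_author"},
    fixed_params={"mode": "keyword"},
    query_values={"TITLE": "java xml", "AUTHOR": "Rick Smith", "PRICE": "30"},
)
```

The PRICE value is dropped because this hypothetical form has no price field, which matches the case where the filter engine has to enforce the remaining conditions.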
Task 2) requires parsing the result of one or more HTTP requests (usually HTML pages) and extracting the obtained results from them. Our parser generation tool can be used, even by users without technical skills, to write specifications describing the external appearance in a web browser of a pattern of information to be extracted from a set of HTML pages. The tool is then able to generate a wrapper that automatically extracts information according to this pattern from the specified pages. Therefore, the tool avoids the need to write a specific parser for extracting the desired information.

We will consider a simple example to illustrate the use of the extraction tool. As our example we chose the Internet bookshop Amazon [5]. Figure 2 shows a snapshot of the answer of the Amazon Internet bookshop to the query for books which contain the word java in their title. The specification for extracting information from this page is shown in Figure 3.
Figure 2: Amazon snapshot

ANCHOR (TITLE) ~ IRRELEVANT? EOL
AUTHOR / IRRELEVANT EOL
Our Price : $ PRICE[CURRENCY=DOLLAR] ~ IRRELEVANT? EOL

Figure 3: AMAZON specification

With this specification and the example page shown, our tool will find an instance of the pattern for each book in the results page. The idea is that the user writing the specification tries to reproduce the visual aspect of the pattern to be matched. In the current state of the tool, the reserved word ANCHOR is used to indicate an HTML link and EOL indicates an end of line. Names such as TITLE or PRICE are character strings naming the attributes that we want to obtain from the occurrences of the pattern in the page. We will call these reference-names. For each instance of the pattern found, the tool produces a sequence of tuples matching each reference-name in the specified pattern with the real value found in the pattern. For instance, the first book in the results page would make the tool match a pattern with the following tuples: { (TITLE, Abstract data types in Java), (AUTHOR, Michael S. Jenkins), (PRICE, 40.46, CURRENCY=DOLLAR) }.

In this example we have to point out some other features:

1) It is possible to embed some application-specific meta-information in the specifications, enclosing it between [ and ] immediately after a reference-name. For instance, we write PRICE[CURRENCY=DOLLAR] in the last line of our specification. The tool generates for the PRICE reference-name a 3-tuple of the form (PRICE, the-extracted-price, CURRENCY=DOLLAR). It is the application's responsibility to use the context information of a reference-name correctly.

2) "IRRELEVANT" is a reserved name used to represent attributes inside the pattern that are not relevant for our purposes and so should not generate a pair in the
output. For instance, here we suppose that the availability information provided in the first line of the pattern is not relevant for our purposes and we do not want it to be returned.

3) We can use string separators to divide the text items inside the pattern. For instance, in the second line of the pattern we use the separator / to separate the author information from the other text on the same line, which we suppose to be irrelevant for our purposes.

4) It is usual to find patterns with optional parts. For instance, in Amazon the availability information appears only in some of the results. We can enclose these optional parts between the and ? characters.

There are many other features that we will not describe here for the sake of brevity. We mention some of them:

1) We can extract information from multi-page outputs by traversing HTML anchors, with hierarchical capabilities. For this purpose we can define sub-specifications inside the main specification.

2) Support for multi-valued items when the number of values is variable, for instance the main players in a movie or the authors of a book (note that this feature was not used in the Amazon example).

3) We can apply operators to the reference-names to transform the values assigned to the attribute or to filter certain matched patterns.

4) We can write alternate specifications for information sources that use different answer formats for the same query depending on the number and kind of the results obtained.

5) It is possible to assign default values to attributes.

6) Etc.

Design and implementation overview

Internally, our tool is divided into two modules. Both use the tool JFlex [6] to generate scanners. For parsing, we have built our own parser tool. The first module parses the specification written by the user and generates an internal representation of it.
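As an illustration of the tuples described for the Amazon example, the following Python sketch extracts them from a small piece of simplified HTML. It is not the paper's implementation: it hard-codes a regular expression that plays the role of the three-line pattern instead of parsing the specification language, and it uses Python's `re` module rather than a JFlex-generated scanner and a custom parser. `BOOK_PATTERN`, `extract_books` and `SAMPLE_PAGE` are hypothetical names, and the sample page is invented apart from the first book's title, author and price, which come from the example above.

```python
import re
from typing import List, Tuple

# One alternative per line of the pattern:
#   an HTML link holding the title, optionally followed by other text;
#   the author, a "/" separator, then text that is ignored;
#   "Our Price : $" and the price, optionally followed by other text.
BOOK_PATTERN = re.compile(
    r"<a\s[^>]*>(?P<TITLE>[^<]+)</a>[^\n]*\n"
    r"(?P<AUTHOR>[^/\n]+)/[^\n]*\n"
    r"Our Price\s*:\s*\$(?P<PRICE>\d+\.\d{2})[^\n]*"
)

# Invented, simplified page; real result pages are far messier.
SAMPLE_PAGE = """\
<a href="/book/1">Abstract data types in Java</a> Usually ships in 24 hours
Michael S. Jenkins / Paperback / Published 1998
Our Price : $40.46 You Save: $4.49
<a href="/book/2">A Made-Up Java Title</a>
Jane Doe / Hardcover
Our Price : $19.99
"""


def extract_books(page: str) -> List[List[Tuple[str, ...]]]:
    """Return one list of tuples per pattern instance found in the page."""
    results = []
    for match in BOOK_PATTERN.finditer(page):
        results.append([
            ("TITLE", match.group("TITLE").strip()),
            ("AUTHOR", match.group("AUTHOR").strip()),
            # The meta-information travels with the value as a third element.
            ("PRICE", match.group("PRICE"), "CURRENCY=DOLLAR"),
        ])
    return results


books = extract_books(SAMPLE_PAGE)
```

The availability text on the first line and the text after the / separator are matched but never captured, which mirrors the role of IRRELEVANT in the specification.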
The second module uses the internal representation of the specification to actually extract the information from the source. The scanner divides the source into tokens. The parser looks for patterns and also checks that the text items are correctly structured according to the specification. The parser also has to deal with optional parts of the specification, multi-valued items, etc. A higher layer controls more advanced behaviours, such as traversing links in multi-page outputs.

5. PERFORMANCE AND CACHE

Performance is one of the key issues involved in building a VDBMS. Very often the data accessible through a VDBMS are extracted from remote sources, and therefore performance is a major concern. In this section we outline our main strategies for improving performance in our VDBMS.

When data are extracted from remote sources, it is often desirable not to wait for the entire collection of data to be extracted before returning some results to the application.
For this reason our system can operate in an asynchronous, multi-threaded manner when extracting information from remote web sources. The system starts one thread for each information source. Each thread has a maximum lifetime, and when a thread exceeds it, the thread is suspended. Results are available to the application as soon as they are extracted. That means that the Result Set of the query can be accessed before the query has actually finished. The application can also perform asynchronous filtering and ordering operations. For instance, an application can execute a sort operation over an incomplete Result Set of an unfinished query. The results available at the moment the operation is executed will be sorted. New results will be added at the end of the Result Set as they arrive. The application could execute a new sort operation in order to sort the Result Set again when the query is complete.

Another key element for improving the performance of the VDBMS is the cache system. If the cache is activated for a table of the VDBMS, the result of a previous query can be used to answer a later one. Starting from a more general query with an entry in the cache, the cache is able to apply filtering processes to answer less general queries without needing to extract the data from the remote sources again. For instance, we can have a table BOOK filled with data extracted from the main Internet bookshops. Suppose the VDBMS receives the query (TITLE Contains java) AND (TITLE Contains xml). If there is an entry in the cache corresponding to the query (TITLE Contains java), the cache will answer the query by applying a filter to this cache entry, obtaining the results that also have the word xml in their title and therefore avoiding the need to extract the data from the remote sources.

The cache entries have a configurable lifetime. To achieve a higher cache hit rate, the system carries out pre-loads of frequently requested data which have timed out.
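The idea of answering a narrower query from a cached, more general one can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's cache: queries are reduced to sets of "field contains word" conditions joined with AND, `FilteringCache` and its methods are hypothetical names, the rows are invented, and the current time is passed in explicitly so that entry lifetime is easy to follow.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Condition = Tuple[str, str]   # (field, word) meaning "field contains word"
Row = Dict[str, str]


class FilteringCache:
    """Cache keyed by the set of conditions of each stored query."""

    def __init__(self, lifetime_seconds: float) -> None:
        self.lifetime = lifetime_seconds
        self._entries: Dict[FrozenSet[Condition], Tuple[float, List[Row]]] = {}

    def store(self, conditions: Iterable[Condition],
              rows: List[Row], now: float) -> None:
        self._entries[frozenset(conditions)] = (now, list(rows))

    def lookup(self, conditions: Iterable[Condition],
               now: float) -> Optional[List[Row]]:
        wanted = frozenset(conditions)
        for cached, (stored_at, rows) in self._entries.items():
            if now - stored_at > self.lifetime:
                continue  # entry has timed out
            if cached <= wanted:
                # The cached query is equal or more general: apply only the
                # conditions that the cached query did not already enforce.
                extra = wanted - cached
                return [r for r in rows
                        if all(word.lower() in r[field].lower()
                               for field, word in extra)]
        return None  # cache miss: the remote sources must be queried


cache = FilteringCache(lifetime_seconds=60)
cache.store([("TITLE", "java")],
            [{"TITLE": "Java and XML"}, {"TITLE": "Learning Java"}],
            now=0)

hit = cache.lookup([("TITLE", "java"), ("TITLE", "xml")], now=10)
miss_other_query = cache.lookup([("TITLE", "xml")], now=10)
miss_expired = cache.lookup([("TITLE", "java")], now=100)
```

The lookup for java and xml is answered from the java entry by filtering, whereas a query for xml alone cannot be answered from it because the cached rows may omit xml books that do not mention java.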
Data are also pre-loaded to obtain data from sources which failed in the past because of a network error or source-server congestion. Pre-loads of data can be scheduled by the system administrator, so that they are executed when the system workload is low. The cache system can operate over a single table or over the entire Virtual Database. Multiple servers can share the cache by persisting the cache entries in a shared storage space. This shared storage also serves as a second-level cache (the first level of the cache is stored in the memory of each server in order to obtain lower response times).

6. REAL WORLD EXAMPLES

The VDBMS explained in this paper has already been used successfully in some real-world applications. Some examples are:

- The first comparative shopping tool in the Spanish Internet [2]
- An MP3 search engine [3]
- A web service for comparison of financial products [4]
The comparative shopping tool defines a table for each type of product that can be searched. To fill the table of a product (e.g. books) with data, we extract information from the main Internet shops around the world which sell that product. It is then possible to make queries against a table to obtain the products satisfying certain conditions. Results can be filtered and sorted according to criteria such as price, shipping fees, delivery times, etc. This application was developed for the Spanish search engine Biwe and is currently accessible on its website. It will also be included soon in other well-known Spanish audience sites.

The MP3 search engine acts as a metasearch engine over the main MP3 crawlers on the Internet. In this case the VDB has a single table containing MP3 files. A special filter was added to this application: if required, the system can check that the MP3 files really exist on the server, a very common problem when downloading this kind of file.

Following the idea of comparisons between products of the same domain, we have also built a comparison tool for financial products. This tool looks for financial products such as deposits and mortgages and compares them across the main banks and financial entities with a presence in Spain. This application was developed for the Spanish bank ebankinter and is currently accessible on its website.

Besides these examples, our VDBMS has been used to provide web content to Internet enterprises and audience sites from many different domains and web information sources, such as traffic, flights, employment, auctions, entertainment, travel, financial information and so on.
Some examples are DGT [7], Spain's Directorate-General for Traffic; AENA [8], flight information for all the Spanish airports; InfoJobs.Net [9], a complete Spanish employment exchange; eBay auctions [10]; the Miami Herald newspaper [11] (entertainment information such as movies, theater, concerts, nightlife, restaurants, etc.); and the NASDAQ Stock Market [12].
7. CONCLUSIONS AND FUTURE WORK

We have presented a Virtual Database Management System that focuses on reusing the public information available on the World Wide Web, providing programmers with an easy and quick way to use that information in their application programs. In this way, programs can benefit from the huge amounts of useful semi-structured information available on the World Wide Web.

Our VDBMS lets programmers define the tables of a Virtual Database and fill them with data extracted from remote web sources. In order to add a web source, a simple specification describing the source is needed. With our innovative wrapper generation tool, this process can be carried out by non-programmers in minutes for average sources. To improve performance, our VDBMS includes a cache system able to pre-load useful queries and to answer new queries by filtering more general previous ones.

Our future work includes extending our query language with more complex structures, such as joins between different tables, making it more similar to object database query languages such as OQL. We are also improving our wrapper generation tool to include support for sources with complex Javascript.

REFERENCES

[2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]
More informationTopics. Introduction to Database Management System. What Is a DBMS? DBMS Types
Introduction to Database Management System Linda Wu (CMPT 354 2004-2) Topics What is DBMS DBMS types Files system vs. DBMS Advantages of DBMS Data model Levels of abstraction Transaction management DBMS
More informationA&D srl Consulting & Logistic Systems Galleria Spagna, 35-35127 Padova (PD) - Italy - Telefono +39.049.8792400 - Fax +39.049.8792408 Sede Legale:
INTEGRATED DOCUMENT MANAGEMENT GENERAL DIAGRAM 1 GENERAL CONCEPTS The integrated document management of a company is due to two trends: 1. electronic processing (scanning) of documents used within the
More informationComputer Networks. Lecture 7: Application layer: FTP and HTTP. Marcin Bieńkowski. Institute of Computer Science University of Wrocław
Computer Networks Lecture 7: Application layer: FTP and Marcin Bieńkowski Institute of Computer Science University of Wrocław Computer networks (II UWr) Lecture 7 1 / 23 Reminder: Internet reference model
More informationXQuery and the E-xml Component suite
An Introduction to the e-xml Data Integration Suite Georges Gardarin, Antoine Mensch, Anthony Tomasic e-xmlmedia, 29 Avenue du Général Leclerc, 92340 Bourg La Reine, France georges.gardarin@e-xmlmedia.fr
More informationSecurity Test s i t ng Eileen Donlon CMSC 737 Spring 2008
Security Testing Eileen Donlon CMSC 737 Spring 2008 Testing for Security Functional tests Testing that role based security functions correctly Vulnerability scanning and penetration tests Testing whether
More informationEUR-Lex 2012 Data Extraction using Web Services
DOCUMENT HISTORY DOCUMENT HISTORY Version Release Date Description 0.01 24/01/2013 Initial draft 0.02 01/02/2013 Review 1.00 07/08/2013 Version 1.00 -v1.00.doc Page 2 of 17 TABLE OF CONTENTS 1 Introduction...
More informationDeposit Identification Utility and Visualization Tool
Deposit Identification Utility and Visualization Tool Colorado School of Mines Field Session Summer 2014 David Alexander Jeremy Kerr Luke McPherson Introduction Newmont Mining Corporation was founded in
More informationIntroduction to XML Applications
EMC White Paper Introduction to XML Applications Umair Nauman Abstract: This document provides an overview of XML Applications. This is not a comprehensive guide to XML Applications and is intended for
More informationWeb Analytics Understand your web visitors without web logs or page tags and keep all your data inside your firewall.
Web Analytics Understand your web visitors without web logs or page tags and keep all your data inside your firewall. 5401 Butler Street, Suite 200 Pittsburgh, PA 15201 +1 (412) 408 3167 www.metronomelabs.com
More informationMiddleware- Driven Mobile Applications
Middleware- Driven Mobile Applications A motwin White Paper When Launching New Mobile Services, Middleware Offers the Fastest, Most Flexible Development Path for Sophisticated Apps 1 Executive Summary
More informationSpecify the location of an HTML control stored in the application repository. See Using the XPath search method, page 2.
Testing Dynamic Web Applications How To You can use XML Path Language (XPath) queries and URL format rules to test web sites or applications that contain dynamic content that changes on a regular basis.
More informationJAVA. EXAMPLES IN A NUTSHELL. O'REILLY 4 Beijing Cambridge Farnham Koln Paris Sebastopol Taipei Tokyo. Third Edition.
"( JAVA. EXAMPLES IN A NUTSHELL Third Edition David Flanagan O'REILLY 4 Beijing Cambridge Farnham Koln Paris Sebastopol Taipei Tokyo Table of Contents Preface xi Parti. Learning Java 1. Java Basics 3 Hello
More informationDeveloping a Web Server Platform with SAPI Support for AJAX RPC using JSON
Revista Informatica Economică, nr. 4 (44)/2007 45 Developing a Web Server Platform with SAPI Support for AJAX RPC using JSON Iulian ILIE-NEMEDI, Bucharest, Romania, inemedi@ie.ase.ro Writing a custom web
More informationCity Data Pipeline. A System for Making Open Data Useful for Cities. stefan.bischof@tuwien.ac.at
City Data Pipeline A System for Making Open Data Useful for Cities Stefan Bischof 1,2, Axel Polleres 1, and Simon Sperl 1 1 Siemens AG Österreich, Siemensstraße 90, 1211 Vienna, Austria {bischof.stefan,axel.polleres,simon.sperl}@siemens.com
More informationApplication of ontologies for the integration of network monitoring platforms
Application of ontologies for the integration of network monitoring platforms Jorge E. López de Vergara, Javier Aracil, Jesús Martínez, Alfredo Salvador, José Alberto Hernández Networking Research Group,
More informationChapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases
More informationCGHub Web-based Metadata GUI Statement of Work
CGHub Web-based Metadata GUI Statement of Work Mark Diekhans Version 1 April 23, 2012 1 Goals CGHub stores metadata and data associated from NCI cancer projects. The goal of this project
More informationBusiness Application Services Testing
Business Application Services Testing Curriculum Structure Course name Duration(days) Express 2 Testing Concept and methodologies 3 Introduction to Performance Testing 3 Web Testing 2 QTP 5 SQL 5 Load
More informationEfficiency of Web Based SAX XML Distributed Processing
Efficiency of Web Based SAX XML Distributed Processing R. Eggen Computer and Information Sciences Department University of North Florida Jacksonville, FL, USA A. Basic Computer and Information Sciences
More informationEasy configuration of NETCONF devices
Easy configuration of NETCONF devices David Alexa 1 Tomas Cejka 2 FIT, CTU in Prague CESNET, a.l.e. Czech Republic Czech Republic alexadav@fit.cvut.cz cejkat@cesnet.cz Abstract. It is necessary for developers
More informationEvaluator s Guide. PC-Duo Enterprise HelpDesk v5.0. Copyright 2006 Vector Networks Ltd and MetaQuest Software Inc. All rights reserved.
Evaluator s Guide PC-Duo Enterprise HelpDesk v5.0 Copyright 2006 Vector Networks Ltd and MetaQuest Software Inc. All rights reserved. All third-party trademarks are the property of their respective owners.
More information1 File Processing Systems
COMP 378 Database Systems Notes for Chapter 1 of Database System Concepts Introduction A database management system (DBMS) is a collection of data and an integrated set of programs that access that data.
More information16.1 MAPREDUCE. For personal use only, not for distribution. 333
For personal use only, not for distribution. 333 16.1 MAPREDUCE Initially designed by the Google labs and used internally by Google, the MAPREDUCE distributed programming model is now promoted by several
More informationFigure 1. perfsonar architecture. 1 This work was supported by the EC IST-EMANICS Network of Excellence (#26854).
1 perfsonar tools evaluation 1 The goal of this PSNC activity was to evaluate perfsonar NetFlow tools for flow collection solution and assess its applicability to easily subscribe and request different
More informationABSTRACT 1. INTRODUCTION. Kamil Bajda-Pawlikowski kbajda@cs.yale.edu
Kamil Bajda-Pawlikowski kbajda@cs.yale.edu Querying RDF data stored in DBMS: SPARQL to SQL Conversion Yale University technical report #1409 ABSTRACT This paper discusses the design and implementation
More informationA LANGUAGE INDEPENDENT WEB DATA EXTRACTION USING VISION BASED PAGE SEGMENTATION ALGORITHM
A LANGUAGE INDEPENDENT WEB DATA EXTRACTION USING VISION BASED PAGE SEGMENTATION ALGORITHM 1 P YesuRaju, 2 P KiranSree 1 PG Student, 2 Professorr, Department of Computer Science, B.V.C.E.College, Odalarevu,
More informationWeb Database Integration
Web Database Integration Wei Liu School of Information Renmin University of China Beijing, 100872, China gue2@ruc.edu.cn Xiaofeng Meng School of Information Renmin University of China Beijing, 100872,
More informationICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001
ICOM 6005 Database Management Systems Design Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 Readings Read Chapter 1 of text book ICOM 6005 Dr. Manuel
More informationGeneric Log Analyzer Using Hadoop Mapreduce Framework
Generic Log Analyzer Using Hadoop Mapreduce Framework Milind Bhandare 1, Prof. Kuntal Barua 2, Vikas Nagare 3, Dynaneshwar Ekhande 4, Rahul Pawar 5 1 M.Tech(Appeare), 2 Asst. Prof., LNCT, Indore 3 ME,
More informationA Near Real-Time Personalization for ecommerce Platform Amit Rustagi arustagi@ebay.com
A Near Real-Time Personalization for ecommerce Platform Amit Rustagi arustagi@ebay.com Abstract. In today's competitive environment, you only have a few seconds to help site visitors understand that you
More informationVMware vcenter Log Insight User's Guide
VMware vcenter Log Insight User's Guide vcenter Log Insight 1.5 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition.
More informationLesson 4 Web Service Interface Definition (Part I)
Lesson 4 Web Service Interface Definition (Part I) Service Oriented Architectures Module 1 - Basic technologies Unit 3 WSDL Ernesto Damiani Università di Milano Interface Definition Languages (1) IDLs
More informationVMware vcenter Log Insight User's Guide
VMware vcenter Log Insight User's Guide vcenter Log Insight 1.0 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition.
More informationPivot Charting in SharePoint with Nevron Chart for SharePoint
Pivot Charting in SharePoint Page 1 of 10 Pivot Charting in SharePoint with Nevron Chart for SharePoint The need for Pivot Charting in SharePoint... 1 Pivot Data Analysis... 2 Functional Division of Pivot
More informationA Scalability Model for Managing Distributed-organized Internet Services
A Scalability Model for Managing Distributed-organized Internet Services TSUN-YU HSIAO, KO-HSU SU, SHYAN-MING YUAN Department of Computer Science, National Chiao-Tung University. No. 1001, Ta Hsueh Road,
More informationUse Cases for the Business Transaction Protocol
Use Cases for the Business Transaction Protocol OASIS Business Transactions Technical Committee Models/Use Cases Technical Subcommittee bt-models@lists.oasis-open.org Introduction This document attempts
More informationImplementation Guide SAP NetWeaver Identity Management Identity Provider
Implementation Guide SAP NetWeaver Identity Management Identity Provider Target Audience Technology Consultants System Administrators PUBLIC Document version: 1.10 2011-07-18 Document History CAUTION Before
More informationEvaluation of Nagios for Real-time Cloud Virtual Machine Monitoring
University of Victoria Faculty of Engineering Fall 2009 Work Term Report Evaluation of Nagios for Real-time Cloud Virtual Machine Monitoring Department of Physics University of Victoria Victoria, BC Michael
More informationBPM for Quality Assurance Systems in Higher Education. Vicente Cerverón-Lleó, Juan Cabotà-Soro, Francisco Grimaldo-Moreno, Ricardo Ferrís-Castell
BPM for Quality Assurance Systems in Higher Education Vicente Cerverón-Lleó, Juan Cabotà-Soro, Francisco Grimaldo-Moreno, Ricardo Ferrís-Castell Motivation (1) The European Higher Education Area (EHEA)
More informationAn Oracle White Paper June 2014. RESTful Web Services for the Oracle Database Cloud - Multitenant Edition
An Oracle White Paper June 2014 RESTful Web Services for the Oracle Database Cloud - Multitenant Edition 1 Table of Contents Introduction to RESTful Web Services... 3 Architecture of Oracle Database Cloud
More informationPUBLIC Performance Optimization Guide
SAP Data Services Document Version: 4.2 Support Package 6 (14.2.6.0) 2015-11-20 PUBLIC Content 1 Welcome to SAP Data Services....6 1.1 Welcome.... 6 1.2 Documentation set for SAP Data Services....6 1.3
More informationNetwork Technologies
Network Technologies Glenn Strong Department of Computer Science School of Computer Science and Statistics Trinity College, Dublin January 28, 2014 What Happens When Browser Contacts Server I Top view:
More informationQlik REST Connector Installation and User Guide
Qlik REST Connector Installation and User Guide Qlik REST Connector Version 1.0 Newton, Massachusetts, November 2015 Authored by QlikTech International AB Copyright QlikTech International AB 2015, All
More informationAn Ontology Based Method to Solve Query Identifier Heterogeneity in Post- Genomic Clinical Trials
ehealth Beyond the Horizon Get IT There S.K. Andersen et al. (Eds.) IOS Press, 2008 2008 Organizing Committee of MIE 2008. All rights reserved. 3 An Ontology Based Method to Solve Query Identifier Heterogeneity
More informationEvaluation of Open Source Data Cleaning Tools: Open Refine and Data Wrangler
Evaluation of Open Source Data Cleaning Tools: Open Refine and Data Wrangler Per Larsson plarsson@cs.washington.edu June 7, 2013 Abstract This project aims to compare several tools for cleaning and importing
More informationSQL Server 2005 Reporting Services (SSRS)
SQL Server 2005 Reporting Services (SSRS) Author: Alex Payne and Brian Welcker Published: May 2005 Summary: SQL Server 2005 Reporting Services is a key component of SQL Server 2005. Reporting Services
More informationSimWebLink.NET Remote Control and Monitoring in the Simulink
SimWebLink.NET Remote Control and Monitoring in the Simulink MARTIN SYSEL, MICHAL VACLAVSKY Department of Computer and Communication Systems Faculty of Applied Informatics Tomas Bata University in Zlín
More informationGATE Mímir and cloud services. Multi-paradigm indexing and search tool Pay-as-you-go large-scale annotation
GATE Mímir and cloud services Multi-paradigm indexing and search tool Pay-as-you-go large-scale annotation GATE Mímir GATE Mímir is an indexing system for GATE documents. Mímir can index: Text: the original
More informationBIRT Document Transform
BIRT Document Transform BIRT Document Transform is the industry leader in enterprise-class, high-volume document transformation. It transforms and repurposes high-volume documents and print streams such
More informationSystem Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks
System Requirement Specification for A Distributed Desktop Search and Document Sharing Tool for Local Area Networks OnurSoft Onur Tolga Şehitoğlu November 10, 2012 v1.0 Contents 1 Introduction 3 1.1 Purpose..............................
More informationTDAQ Analytics Dashboard
14 October 2010 ATL-DAQ-SLIDE-2010-397 TDAQ Analytics Dashboard A real time analytics web application Outline Messages in the ATLAS TDAQ infrastructure Importance of analysis A dashboard approach Architecture
More informationIntegrating VoltDB with Hadoop
The NewSQL database you ll never outgrow Integrating with Hadoop Hadoop is an open source framework for managing and manipulating massive volumes of data. is an database for handling high velocity data.
More informationIntroduction to Database Systems. Module 1, Lecture 1. Instructor: Raghu Ramakrishnan raghu@cs.wisc.edu UW-Madison
Introduction to Database Systems Module 1, Lecture 1 Instructor: Raghu Ramakrishnan raghu@cs.wisc.edu UW-Madison Database Management Systems, R. Ramakrishnan 1 What Is a DBMS? A very large, integrated
More informationBRINGING INFORMATION RETRIEVAL BACK TO DATABASE MANAGEMENT SYSTEMS
BRINGING INFORMATION RETRIEVAL BACK TO DATABASE MANAGEMENT SYSTEMS Khaled Nagi Dept. of Computer and Systems Engineering, Faculty of Engineering, Alexandria University, Egypt. khaled.nagi@eng.alex.edu.eg
More informationMigrating to vcloud Automation Center 6.1
Migrating to vcloud Automation Center 6.1 vcloud Automation Center 6.1 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a
More informationNovell Identity Manager
AUTHORIZED DOCUMENTATION Manual Task Service Driver Implementation Guide Novell Identity Manager 4.0.1 April 15, 2011 www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with
More informationSAP Data Services 4.X. An Enterprise Information management Solution
SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification
More informationWeb 2.0-based SaaS for Community Resource Sharing
Web 2.0-based SaaS for Community Resource Sharing Corresponding Author Department of Computer Science and Information Engineering, National Formosa University, hsuic@nfu.edu.tw doi : 10.4156/jdcta.vol5.issue5.14
More informationWhat is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World
COSC 304 Introduction to Systems Introduction Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca What is a database? A database is a collection of logically related data for
More informationJava Application Developer Certificate Program Competencies
Java Application Developer Certificate Program Competencies After completing the following units, you will be able to: Basic Programming Logic Explain the steps involved in the program development cycle
More informationHow To Test Your Web Site On Wapt On A Pc Or Mac Or Mac (Or Mac) On A Mac Or Ipad Or Ipa (Or Ipa) On Pc Or Ipam (Or Pc Or Pc) On An Ip
Load testing with WAPT: Quick Start Guide This document describes step by step how to create a simple typical test for a web application, execute it and interpret the results. A brief insight is provided
More informationFlattening Enterprise Knowledge
Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it
More informationUnderstanding Slow Start
Chapter 1 Load Balancing 57 Understanding Slow Start When you configure a NetScaler to use a metric-based LB method such as Least Connections, Least Response Time, Least Bandwidth, Least Packets, or Custom
More informationCS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?
CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 Introduction to Databases CS2 Spring 2005 (LN5) 1 Why databases? Why not use XML? What is missing from XML: Consistency
More information