The Virtual Database: A Tool for Integrated Data Processing in a Distributed Environment




EnviroInfo 2004 (Geneva), Sh@ring EnviroInfo 2004

Marcel Frehner 1, Martin Brändli 2, Jürg Schenker 3

1 Swiss Federal Research Institute WSL, Landscape Inventories, Zürcherstrasse 111, CH-8903 Birmensdorf, Switzerland, email: marcel.frehner@wsl.ch
2 Swiss Federal Research Institute WSL, Landscape Inventories, Zürcherstrasse 111, CH-8903 Birmensdorf, Switzerland, email: martin.braendli@wsl.ch
3 Swiss Agency for the Environment, Forests and Landscape (SAEFL), Division of Nature, CH-3003 Bern, Switzerland, email: juerg.schenker@buwal.admin.ch

Abstract

Traditional desktop GIS are expensive to buy and require considerable experience and know-how to be used effectively. Web mapping systems have recently become a cheap and easy-to-use alternative, but they offer only restricted access to existing spatial data and limited spatial data handling capabilities. This paper presents an architecture called Virtual Database which makes spatial environmental information as well as advanced geoprocessing functionality available to any user who has access to the Internet. A particular feature of the Virtual Database is that it serves as a platform for the integration of distributed repositories of environmental data. Data access and integration conform to the OpenGIS specifications for Web Feature Services (WFS) and the Geography Markup Language (GML). Analysis functionality, in particular methods of spatial overlay, is provided by a spatial analysis engine software component. First experiences with the Virtual Database show its high flexibility concerning the integration of heterogeneous data repositories. High scalability of the system is achieved by a caching mechanism based on data replication. The potential of the Virtual Database is illustrated by sketching an application scenario from the field of environmental data handling.

1. Introduction

One of the medium-term goals of the Landscape Inventory Division at the Swiss Federal Research Institute WSL is the development and establishment of an integrated environmental and landscape information system. The system aims at offering a comprehensive solution for sharing spatial data, methods, and computing resources. An information system in general consists of appropriately compiled data, methods and models to process the data, and usually the possibility to access external data sources. An integrated information system should offer a unifying platform that facilitates the application of a diversity of technologies and methods (database, GIS, remote sensing, statistics, etc.) and permits the potentially arbitrary combination of any available data and methods, located either in-house or externally. Basing such a system on Internet/Intranet technology promises general access to the various data and computing resources.

This paper presents the Virtual Database, an architecture for the integration of distributed data repositories. Integration of data will enable combined visualization and analysis of distributed data and form the basis for a comprehensive environmental and landscape information system. The Virtual Database currently aims at integrating different distributed databases of the Swiss Agency for the Environment, Forests and Landscape (SAEFL). SAEFL is responsible for collecting and storing nature protection data (protected areas, fauna and flora) at the national level. Because this task takes place at various decentralized institutions, the goal of the Virtual Database is to offer a unifying platform for the combination of these data. Databases (called data components) from the Centre Suisse de Cartographie de la Faune, the Institute for Systematic Botany at University of Zurich, and the Swiss Federal Research Institute WSL are brought together in order to build an integrated data federation. Data are provided through an easily accessible environment enabling comprehensive exploration and analysis. Initially, the Virtual Database was designed for the purpose of integrating data for visualization and simple querying (Brändli/Sparenborg 2002).
However, since the Virtual Database is based on the distributed geographic information services paradigm (discussed in the next section) and takes advantage of open standardization initiatives and open source software development efforts, it is now being extended to include advanced spatial data handling capabilities, which is the particular focus of this paper.

The paper proceeds as follows: Section 2 discusses related work on exchanging, sharing and analyzing spatial data using the Internet. The design of the Virtual Database is sketched in section 3, followed by a description of implementation details, particularly on integrating analysis functionality, in section 4. The benefit of the Virtual Database is presented in section 5 by illustrating a specific application scenario. The paper ends with some conclusions and an outlook on planned work.

2. Related work

The establishment of this integrated environmental and landscape information system benefits from research efforts in the IT community and in particular the GIS community. Traditionally, geographic information systems provide spatial data handling capabilities for data input, storage, retrieval, management, manipulation, analysis, and output (Aronoff 1989, Burrough/McDonnell 1998). Geoprocessing functionality is usually supplied by a single, monolithic system, and the data are normally stored in a single database. Due to the popular use of the Internet, this closed architecture paradigm of GIS is shifting towards a distributed geographic information services paradigm (Tsou/Buttenfield 2002). Distribution includes both the storage of data in spatially distributed database systems and dispersed geoprocessing providers offering so-called geo-services (Peng/Tsou 2003) consisting of spatial data handling functionality. How geo-services might be located on the Internet is discussed in Tsou (2002). The price of this paradigm shift is the requirement for enhanced interoperability, reusability and flexibility of both data and geo-services.

Today, however, most spatial data handling applications on the Internet concern Web mapping or Web cartography, offering functionality for the use, distribution and production of maps by means of the Internet (Kraak 2001, Orthofer/Loibl 2004). Additionally, they allow for visualizing spatial data and submitting simple queries. Current standardization efforts such as the initiative by the Open GIS Consortium (OGC) support this type of geospatial data handling. OGC released the Web Map Service Implementation Specification (WMS), which standardizes the way map images, service-level metadata, and information about particular map features contained in a map are requested (OGC 2001). But an OGC-compliant Web map server does not necessarily include any further tools for spatial analysis and modeling. As data are returned in an image format, they cannot be accessed for additional processing. The need for exchanging and sharing spatial data on the Internet that goes beyond the transfer of query results as Web maps is well recognized.
Standards like OGC's Web Feature Service (WFS) Implementation Specification (OGC 2002) and OGC's Geography Markup Language (GML) Implementation Specification (OGC 2003) focus on the exchange of geographic data in a format that enables further client-side processing. Software producers such as ESRI, with the recently released ArcGIS Server, are extending their Web map server products with processing functions that consist of advanced spatial data handling operations.

Besides commercial developments, research efforts concerning Web-based GIS applications also aim at offering advanced spatial data handling capabilities. Tsou (2004) describes an application based on a set of Java applets that integrates GIS and remote sensing tools for address matching, network analysis, reselection, change detection in raster images, and image classification. The system is claimed to be particularly useful for non-GIS professionals who have so far hesitated to use GIS software for reasons of cost, complicated software installation and insufficient software training. Focusing on GIS tools, Anderson and Moreno-Sanchez (2002) demonstrate the implementation of spatial analysis capabilities around open specifications and open source software. Their results show that both open specifications and open source software libraries have become powerful and mature enough to be applied in Web-GIS projects. Maximal interoperability is achieved by strictly conforming to the guidelines of open specifications.

3. Design of the Virtual Database

The design and implementation of the Virtual Database follow the trend of Internet-based data exchange and take advantage of open standards, open interfaces and open source software development. The design requirements for the Virtual Database are in particular:

1. Integration of distributed data repositories which are stored using different database management systems. The autonomy of the individual components must not be restricted by the data federation.
2. Database functionality is limited to distributed queries. Inserts and updates are handled by applications of the individual components.
3. Uniform interfaces are defined for data access.
4. Retrieval, query, analysis and display of data from distributed database systems should be open to a wide audience and therefore take place by means of a Web browser-based client.

Figure 1: Architecture of the Virtual Database
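Requirements 1 to 3 can be pictured as a small read-only contract that every data component fulfils, plus a federation that only fans out queries. The following Java sketch is an illustration under stated assumptions, not the authors' code: the names `FeatureSource`, `Federation` and `InMemorySource` are hypothetical, and features are reduced to plain attribute maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical uniform, read-only view of one data component (requirements 2 and 3). */
interface FeatureSource {
    String name();

    /** Returns the features of one layer; there are deliberately no insert or update methods. */
    List<Map<String, String>> query(String layer);
}

/** Hypothetical federation: sends a query to every component and concatenates the answers. */
final class Federation {
    private final List<FeatureSource> sources;

    Federation(List<FeatureSource> sources) {
        this.sources = List.copyOf(sources);
    }

    List<Map<String, String>> query(String layer) {
        List<Map<String, String>> merged = new ArrayList<>();
        for (FeatureSource source : sources) {
            // Each component answers on its own; the federation never writes to it.
            merged.addAll(source.query(layer));
        }
        return merged;
    }
}

/** In-memory stand-in for a real repository, used only to exercise the sketch. */
final class InMemorySource implements FeatureSource {
    private final String name;
    private final Map<String, List<Map<String, String>>> layers;

    InMemorySource(String name, Map<String, List<Map<String, String>>> layers) {
        this.name = name;
        this.layers = layers;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public List<Map<String, String>> query(String layer) {
        // A component that does not hold the layer simply contributes nothing.
        return layers.getOrDefault(layer, List.of());
    }
}
```

Because the federation depends only on the interface, a further repository could be added by writing one more `FeatureSource` implementation, without touching the existing components.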

The design and implementation of the Virtual Database follow the principle of loose coupling of the individual data components (databases) and are structured into clearly separated but interrelated tiers. Figure 1 presents data components and the necessary software modules as elements of three separate tiers. These tiers are as follows:

1. Enterprise Information System Tier (EIS Tier): The EIS Tier consists of the distributed data repositories that have to be integrated. Data are either stored in database management systems or simply as files.
2. Middle Tier: The Middle Tier contains interfaces, so-called access layers, that enable access to the data repositories available from the EIS tier. The interfaces specify the way data must be served on the one hand and accessed on the other hand. In addition, interfaces to descriptive metadata must be provided in order to enable an assessment of the served data. The integration layer controls the access to the distributed data repositories and integrates the data retrieved from the access layers in order to provide a transparent data view. The spatial analysis engine performs any desired analysis operations. Map server software is responsible for rendering maps of integrated and analyzed data.
3. Client Tier: Data retrieved from the map server are displayed by a user-friendly thin Web client facilitating user interaction and display.

4. Implementation of the Virtual Database

4.1 Enterprise information system tier

The EIS tier of the Virtual Database consists of heterogeneous data repositories at different locations. Heterogeneity concerns the data structures as well as the storage and database management systems (DBMS) of the involved data components. Currently, data repositories from three different institutions are available from the EIS tier. The first repository is installed at WSL and is called Data Center for Nature and Landscape (DNL).
It is a database mainly storing inventory data of protected biotopes in Switzerland (Baltensweiler/Brändli 2004). Oracle is used as DBMS in combination with ESRI's Spatial Database Engine (SDE) for handling and processing spatial data types. The second database is located at the Centre Suisse de Cartographie de la Faune (CSCF) and stores data on endangered animal species using an Oracle DBMS. In contrast to the DNL database, spatial information (i.e. coordinates) on discovered animals is stored as standard columns in regular database tables. The third data component, installed at the Institute of Systematic Botany, University of Zurich, contains discovered locations of endangered and rare moss species. Again, attribute data are stored in an Oracle database. Location data are, however, stored in ESRI's shapefile format.
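To illustrate what the simplest of these storage variants implies, the sketch below turns a coordinate pair read from two ordinary table columns into a point geometry in GML 2 style markup. It is only an illustration: the class name `GmlPoint`, the use of the `gml:coordinates` element, and the reference system identifier passed as `srsName` are assumptions, since the paper does not state which GML elements or coordinate system the components use.

```java
import java.util.Locale;

/** Hypothetical helper: encodes x/y values from plain table columns as a GML 2 style point. */
final class GmlPoint {
    private GmlPoint() {
    }

    /**
     * Builds a gml:Point element. Locale.ROOT keeps the decimal separator a dot,
     * because the comma is already used to separate the two coordinates.
     * Assumes srsName contains no characters that need XML escaping.
     */
    static String fromColumns(double x, double y, String srsName) {
        return String.format(Locale.ROOT,
                "<gml:Point srsName=\"%s\"><gml:coordinates>%.1f,%.1f</gml:coordinates></gml:Point>",
                srsName, x, y);
    }
}
```

For example, `GmlPoint.fromColumns(600000.0, 200000.0, "EPSG:21781")` yields a point element whose coordinate text is `600000.0,200000.0`; the EPSG code here is an illustrative choice, not a value taken from the paper.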

4.2 Middle tier

As outlined above, the middle tier contains various functional components, i.e. several access layers, an integration layer, a map server, and a spatial analysis engine. Due to the complexity of the middle tier, Jakarta Struts (http://struts.apache.org) has been chosen as the framework for implementation. The Struts framework is based on the Java 2 Platform (http://java.sun.com) and makes use of Java Servlets, JavaServer Pages (JSP), JavaBeans, and XML, as well as various other open source software components provided by the Jakarta Project (http://jakarta.apache.org). Struts encourages the design and implementation of application architectures based on the Model-View-Controller (MVC) paradigm. The MVC paradigm suggests organizing interactive applications into three separate modules: one for the application model with its data representation and business logic, the second for views that provide data presentation and user input, and the third for a controller which dispatches user requests and controls application flow (Singh et al. 2004). Inside the Virtual Database, the access layer, the spatial analysis engine, the map server, and the integration layer make up the model part; the view consists of dynamic JSP pages for information presentation; and the Struts ActionServlet together with Struts Action classes forms the controller.

4.2.1 Access layer

Each EIS component requires an appropriate access layer that accounts for its individual data storage system. The implementation of the individual access layers follows OGC's Web Feature Service (WFS) implementation specification (OGC 2002). The specification defines interfaces for the manipulation of spatial features, i.e. querying, inserting, updating and deleting data, and bases the communication between the distributed computing platforms on HTTP.
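To make the HTTP-based communication more concrete, the following sketch assembles a WFS 1.0.0 request URL in key-value-pair form for the GetFeature operation. The parameter names (`SERVICE`, `VERSION`, `REQUEST`, `TYPENAME`, `BBOX`) come from the WFS specification; the class name `WfsRequest`, the endpoint and the feature type name used in the usage note are hypothetical, and the paper does not say whether its access layers are called with key-value pairs or with XML-encoded requests.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds a WFS 1.0.0 GetFeature URL (key-value-pair encoding). */
final class WfsRequest {
    private WfsRequest() {
    }

    static String getFeatureUrl(String endpoint, String typeName,
                                double minX, double minY, double maxX, double maxY) {
        // LinkedHashMap keeps the parameters in insertion order, which makes the URL predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("SERVICE", "WFS");
        params.put("VERSION", "1.0.0");
        params.put("REQUEST", "GetFeature");
        params.put("TYPENAME", typeName);
        params.put("BBOX", minX + "," + minY + "," + maxX + "," + maxY);

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            // Values are percent-encoded so that commas and other characters survive transport.
            query.add(entry.getKey() + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return endpoint + "?" + query;
    }
}
```

A call such as `WfsRequest.getFeatureUrl("http://example.org/wfs", "moss", 1.0, 2.0, 3.0, 4.0)` produces a URL that an access layer conforming to the specification would answer with GML-encoded features.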
Access layers of the Virtual Database implement the following interfaces, which are required in any basic read-only Web Feature Service (for a more detailed description see Brändli/Sparenborg 2002):

1. GetCapabilities: Returns details about service capabilities such as available data and functionality.
2. DescribeFeatureType: Returns a description of the data structure of available data.
3. GetFeature: Returns geospatial data encoded according to the Geography Markup Language (GML), which is based on an XML schema tailored for the exchange of spatial data.

Since the three data components of the EIS tier use different DBMS and file formats (ESRI's shapefile for moss data, for instance), the interfaces must be implemented accordingly. For example, access to the DNL database takes advantage of the ESRI SDE API for the retrieval of spatial data types. In contrast, access to regular Oracle database tables is supplied by the JDBC (Java Database Connectivity) implementation for Oracle.

Integrating distributed data repositories using an interface-based approach for data access is a highly flexible solution. The advantage of using interfaces is that neither existing database schemas nor file structures have to be changed or adapted. Conformity to the specified interfaces is achieved solely by adapting the access software of the access layers. In some cases, though, the database schemas have additionally been adapted by generating database views that join related tables in order to simplify access to the data.

4.2.2 Integration layer

The integration layer sends requests to each access layer of the involved data repositories. When the GML data are returned, the integration layer parses and merges the data according to their XML schemas. The current implementation does not consider any data heterogeneities such as differences in scale or different data accuracies. Handling of such heterogeneities and data uncertainties is postponed to future developments of the Virtual Database. The integration layer provides a transparent view of the distributed data repositories, ready for use by the map server component and the spatial analysis engine described below.

During development, scalability problems caused by the growing size of the accessed datasets had to be handled. Spatial data, such as polygons describing the boundaries of administrative or environmental protection areas, are quite large in comparison to the corresponding attribute data. Additionally, the necessary conversions from local data formats to GML substantially increase dataset size. A considerable growth in the amount of spatial data results in data transfer times that are unacceptable from a user's point of view. That is why a caching mechanism based on data replication was implemented.
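A replication-based cache of this kind might look roughly like the sketch below. It is a simplified illustration, not the authors' implementation: the class `ReplicaCache` and its methods are hypothetical, replicas are held in memory as GML strings keyed by layer name, and the `invalidate` hook stands in for whatever change notification the real system uses, which the paper does not describe.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical replica store kept inside the integration layer. */
final class ReplicaCache {
    private final Map<String, String> replicas = new ConcurrentHashMap<>();
    private final Function<String, String> remoteFetch;

    /** remoteFetch stands for the slow call to a remote access layer returning GML for a layer. */
    ReplicaCache(Function<String, String> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    /** Serves the local replica if present; otherwise fetches the layer once and keeps a copy. */
    String get(String layer) {
        return replicas.computeIfAbsent(layer, remoteFetch);
    }

    /** Drops the replica so that the next request fetches fresh data (assumed change hook). */
    void invalidate(String layer) {
        replicas.remove(layer);
    }
}
```

With this arrangement the expensive transfer of large polygon datasets happens only on the first request for a layer and again after a change, while all other requests are answered from the local copy.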
Replicated spatial data accessed from the distributed components are stored as part of the integration layer and updated as soon as data changes occur.

4.2.3 Map server

Visualization and query of spatial data and corresponding attribute data are based on out-of-the-box components of ESRI's Internet mapping software ArcIMS. ArcIMS offers an XML-based query language and various connectors, among them a Java connector with a corresponding JSP tag library which facilitates the composition of requests for visualization options, spatial queries, attribute queries, buffering, and other simple GIS operations (http://www.esri.com/software/arcgis/arcims/index.html). ArcIMS is mainly used for historical and institutional reasons. Open source products like MapServer (http://mapserver.gis.umn.edu/) would be equally suitable. A comprehensive list of map server software can be found at http://gislounge.com/ll/webgis.shtml. Peng (2003, 379) gives a detailed insight into a few popular commercial map servers.

4.2.4 Spatial analysis engine and overlay computation

A full-featured GIS is expected to provide functionality that can be categorized into five areas: (1) data acquisition; (2) preliminary data processing; (3) data storage and retrieval; (4) spatial search and analysis; (5) graphical display and interaction (Jones 1997, 38). The Virtual Database already offers many of these features. For instance, thematic layers can be selected in a Web form for display. Data can then be queried by attributes or by spatial selection, spatial objects can be buffered, attribute tables can be displayed, and object and layer metadata may be accessed. However, the Virtual Database does not claim to be a full-featured GIS; rather, it aims to satisfy the specialized needs of some particular research and public administration groups in an optimal way. Once new users begin showing interest, further functionality may be included.

As mentioned above, the map server component is already able to perform spatial queries and buffer operations. A dedicated spatial analysis engine extends this functionality with advanced spatial data handling capabilities, which we expect to improve the usefulness of the Virtual Database. The spatial analysis engine currently consists of a tool for computing vector overlays (described below). Spatial overlays add considerable value to the Virtual Database since they form the basis for many further GIS modeling and analysis tasks. Overlaying vector data requires the geometric intersection of lines and polygons as well as feature selection by either Boolean or set operations on the input layers (Jones 1997, 48-54). The current implementation of the overlay tool of the analysis engine makes use of ESRI's MapObjects Java Edition 2.0.
MapObjects provides methods for the computation of intersections at the level of single geometric objects like polylines and polygons. The overlay of two or more entire layers consisting of a great number of spatial objects is a complex task and requires high-performance software. First experiences with MapObjects indicate that its overlay performance is not very promising. Future versions of the analysis engine will therefore be based on alternative existing libraries, in particular the free open source libraries from Geotools (http://www.geotools.org) and the JTS Topology Suite (http://www.vividsolutions.com/jts), published under the LGPL license (http://www.gnu.org/copyleft/lesser.html). The user interface of the analysis engine is implemented as a JSP page allowing for the selection of two polygon layers. Figure 2 shows that the desired type of overlay can be selected from four Boolean operations, i.e. AND, OR, NOT, and XOR, each further illustrated by an intuitive graphic symbol.
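The four Boolean overlay types can be illustrated on a single pair of polygons with the constructive area geometry that ships with the Java standard library. This is a stand-in chosen so that the sketch runs without extra dependencies; it is neither MapObjects nor JTS, it handles geometry only (no attribute transfer), and it works on single shapes rather than entire layers. Reading NOT as "first input minus second input" is an assumption, and the class `Overlay` is hypothetical.

```java
import java.awt.geom.Area;

/** Hypothetical geometry-only overlay of two shapes using java.awt.geom.Area. */
final class Overlay {
    enum Op { AND, OR, NOT, XOR }

    private Overlay() {
    }

    static Area apply(Op op, Area first, Area second) {
        Area result = new Area(first);        // copy, so the inputs stay unchanged
        switch (op) {
            case AND:
                result.intersect(second);     // region covered by both inputs
                break;
            case OR:
                result.add(second);           // region covered by at least one input
                break;
            case NOT:
                result.subtract(second);      // first input without the second (assumed meaning)
                break;
            case XOR:
                result.exclusiveOr(second);   // region covered by exactly one input
                break;
        }
        return result;
    }
}
```

For two overlapping squares, for instance `new Area(new Rectangle2D.Double(0, 0, 2, 2))` and `new Area(new Rectangle2D.Double(1, 1, 2, 2))`, the AND result contains the point (1.5, 1.5) while the XOR and NOT results do not. A layer-level overlay would repeat such operations over many feature pairs, which is where the performance concerns mentioned above arise.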

Figure 2: User interface for spatial overlay analysis.

4.3 Client tier

The browser is designed as a thin client handling user input and display. Data assembly, graphical rendering, and spatial analysis are accomplished by the corresponding software components of the middle tier. Client-side code is thus limited to HTML and JavaScript, which promises optimal availability to potential users. Maps are published as JPEG raster images. Vector formats such as Vector Markup Language (VML) and Scalable Vector Graphics (SVG) could provide more comprehensive display functions like rapid zoom-in/out, customizable map symbols, and layer stacking order (Tsou 2004), but are not implemented in the current version of the Virtual Database. The main problems related to providing data as VML or SVG are data protection issues and the necessary plug-in installation.

5. An application scenario

Brändli and Höppner (2004) show the potential of the Virtual Database in regional planning: it improves data management, increases data and GIS availability, and simplifies data exploration and analysis. The following scenario illustrates how information retrieval may take place in the field of environmental data handling.

Suppose Tom, a regional commissioner for nature conservation, is interested in a particular nature reserve. The Virtual Database gives him access to the DNL, where various inventories on nature and landscape are stored. By browsing through the list of datasets he finds all available states of the respective inventory object. Tom selects them all and the system displays them as layers in a map. Exploring the map, Tom has the impression that the area of the reserve has been extended significantly between the last two states. He requests an overlay analysis from the server and gets back a new dataset displaying all geometry and attribute changes. At the moment, Tom is particularly interested in moss data. He therefore uses the Virtual Database again to select some rare moss species from the data repository at the Institute for Systematic Botany at University of Zurich and adds them to the map. Tom is pleased to find out that many of them can be found in the recently added areas of the nature reserve.

The scenario described involves overlay operations and spatial searches on multiple heterogeneous datasets. Given access to the Virtual Database, Tom does not need to tediously collect and pre-process the potentially heterogeneous data layers himself, but can directly search the Virtual Database and rely on the data being served by the system. The necessary spatial operations can be performed online and the results are returned as a map for further exploration and analysis. Because the necessary GIS functionality is accessible online, Tom can perform the required analysis without any locally installed GIS software.
A common Web-browser-based thin client is sufficient for using the Virtual Database including its spatial analysis capabilities. 6. Conclusions and outlook We presented the design and implementation of a software architecture for a Webbased spatial data handling application that offers access to distributed spatial data repositories. The advantage of the Virtual Database is that anybody with a Web browser and access to the Internet can use provided spatial data to perform comprehensive spatial data exploration and advanced analysis operations. The chosen approach for the access of distributed data based on standardized interfaces proved evidence for being highly flexible since no changes or adaptations of involved data repositories are necessary. High scalability for the handling of large datasets is 546

achieved by replicating the data and storing them as part of the integration layer. A current bottleneck of the application is the weak performance of the spatial analysis engine concerning overlay operations. We expect a significant increase of algorithm performance by the use and inclusion of alternative open source libraries. A fundamental problem related to the integration of datasets from distributed repositories is not yet solved, however. Existing data heterogeneities, data errors and data uncertainties are currently not considered. This concerns the integration layer on the one hand which aims at providing a transparent view on the data for exploration and analysis. The implementation of homogenization methods is necessary in order to completely satisfy this goal. Exploration and analysis operations on the other hand must take into account data errors and uncertainties in order to enable reliable interpretation and assessment of datasets and analysis results. A first step towards the handling of error and uncertainty characteristics of available datasets will be taken by considering existing metadata, in particular metadata on data quality (provided that these metadata are available!). Automatic interpretation of metadata for data integration, exploration and subsequent analysis as well as metadata propagation in case of data analysis will be the key research topics in the near future. Bibliography Aronoff, S. (1989): Geographic Information Systems: A Management Perspective. WDL Publications, Ottawa. Anderson, G., Moreno-Sanchez, R. (2002): Building Web-Based Spatial Information Solutions around Open Specifications and Open Source Software. Transactions in GIS, 7(4):447-466. Blackwell Publishing Ltd., Oxford. Baltensweiler, A., Brändli, M. (2004): Web-based Exploration of Environmental Data and Corresponding Metadata, in Particular Lineage Information. In: Scharl, A. (ed.): Environmental Online Communication. 
Advanced Information and Knowledge Processing Series: 127-132, Springer, London.
Brändli, M., Höppner, C. (2004): Die Virtuelle Datenbank: Technologie zur Unterstützung in der Regionalplanung. Proceedings CORP 2004.
Brändli, M., Sparenborg, J. (2002): SVG as graphical metadata for distributed spatial data processing. SVG Open / Carto.net Developers Conference, Zurich, Switzerland, July 15-17. URL: http://www.svgopen.org/2002/papers/braendli_sparenborg svg_for_metadata/index.html, accessed on: 12/08/2004.
Burrough, P. A., McDonnell, R. A. (1998): Principles of Geographical Information Systems. Oxford University Press, Oxford.
Duckham, M., McCreadie, J. E. (2002): Error-aware GIS Development. In: Shi et al. (2002): Spatial Data Quality. Taylor and Francis, London.
Heuvelink, G. B. M. (1998): Error Propagation in Environmental Modeling with GIS. Taylor and Francis, London.
Kraak, M.-J. (2001): Settings and needs for web cartography. In: Kraak, M.-J., Brown, A. (eds.): Web Cartography. Developments and prospects. Taylor and Francis, London and New York.
ISO (2003): ISO 19115:2003. Geographic Information Metadata. URL: http://www.iso.org.
Jones, Ch. (1997): Geographical Information Systems and Computer Cartography. Addison Wesley Longman Ltd, England.
OGC (2001): Web Map Service Implementation Specification. Version: 1.1.1. Open GIS Consortium, Inc. URL: http://www.opengis.org/docs/01-068r2.pdf, accessed on: 08/05/2004.
OGC (2002): Web Feature Service Implementation Specification. Version: 1.0.0. Open GIS Consortium, Inc. URL: http://www.opengis.org/docs/02-058.pdf, accessed on: 08/05/2004.
OGC (2003): OpenGIS Geography Markup Language (GML) Implementation Specification. Version: 3.00. Open GIS Consortium, Inc. URL: http://www.opengis.org/docs/02-023r4.pdf, accessed on: 08/05/2004.
Orthofer, R., Loibl, W. (2004): Sharing Environmental Maps on the Web: The Austrian EnviroMap System. In: Scharl, A. (ed.): Environmental Online Communication. Advanced Information and Knowledge Processing Series: 133-144, Springer, London.
Singh, I., Stearns, B., Johnson, M. (2002): Designing Enterprise Applications with the J2EE Platform, Second Edition. URL: http://java.sun.com/blueprints/guidelines/designing_enterprise_applications_2e/, accessed on: 08/05/2004.
Tsou, M.-H. (2002): An Operational Metadata Framework for Searching, Indexing, and Retrieving Distributed Geographic Information Services on the Internet. In: Geographic Information Science (GIScience 2002). Lecture Notes in Computer Science Vol. 2478: 313-332, Springer-Verlag, Berlin.
Tsou, M.-H., Buttenfield, B. P. (2002): A Dynamic Architecture for Distributing Geographic Information Services. Transactions in GIS, 6(4):355-381, Blackwell Publishing Ltd, Oxford.
Tsou, M.-H. (2004): Integrating Web-based GIS and image processing tools for environmental monitoring and natural resource management. Journal of Geographical Systems, 6:155-174, Springer-Verlag, Berlin.