WCS as a Download Service for Big (and Small) Data
  INSPIRE 2013, Florence, Italy, 2013-jun-25
  Peter Baumann (1), Stephan Meissl (2), Alan Beccati (1)
  (1) Jacobs University / rasdaman GmbH, Bremen, Germany
  (2) EOX GmbH, Vienna, Austria
  Supported by EU FP7 e-Infrastructure, contract 283610 EarthServer
Big Data Research @ Jacobs U
  Large-Scale Scientific Information Systems research group
  Focus: large-scale n-d raster services & beyond
  www.jacobs-university.de/lsis
  Main results: Array DBMS, rasdaman; research spin-off, rasdaman GmbH
  Geo service standards: chair, OGC raster-relevant working groups; editor, 10+ OGC stds & candidate stds
  Recently: ISO Array SQL
Overview
  Big Data
  Coverage data: the OGC Coverage Model
  Coverage services: the WCS Suite
  rasdaman
  Conclusion
Big Data: The 4 Vs [Doug Laney / Gartner & IBM]
  Volume, Velocity, Variety, Veracity
  ...plus several more Vs in blogs (the Vs deluge): Value, Verisimilitude, Variability, Visualization,...
Facing the Coverage Tsunami
  (figure: sensor feeds [OGC SWE] flowing into a coverage server)
Taming the Coverage Tsunami
  (figure: sensor feeds [OGC SWE] flowing into a coverage server)
Serving Coverages
  GMLCOV & WCS: download & processing
  SWE SOS: data capturing
  (figure labels: SOS, WCS, coverage server)
Overview
  Big Data
  Coverage data: the OGC Coverage Model
  Coverage services: the WCS Suite
  rasdaman
  Conclusions
Coverage Definition [OGC 09-146r2]
  ISO 19123 is abstract: many different implementations possible, not per se interoperable
  OGC coverage std is concrete and interoperable
  GML 3.2.1 Application Schema for Coverages (class diagram):
    «FeatureType» Coverage, a «FeatureType» GML::Feature
    domainSet: «Union» GML::DomainSet
    rangeType: «type» SWE Common::DataRecord
    rangeSet: «Union» GML::RangeSet
    contains hook for metadata
Gridded Coverage Types [Campalani 2013]
  GMLCOV::GridCoverage: not georeferenced, just pixels
  GMLCOV::RectifiedGridCoverage: georeferenced, possibly oblique
  GMLCOV::ReferenceableGridCoverage, GML 3.3 ReferenceableGridByVectors*: 1+ irregular axes
  GMLCOV::ReferenceableGridCoverage, GML 3.3 ReferenceableGridByArray*: 1+ warped axes
  Mix example: sat image timeseries
  *) plan: GML 3.4 for generalization
Big Data Variety: Coverages
  n-d "space/time-varying phenomenon" [ISO 19123, OGC 09-146r2]
  «FeatureType» AbstractCoverage (class diagram) with:
    GridCoverage, RectifiedGridCoverage, ReferenceableGridCoverage
    MultiPointCoverage, MultiCurveCoverage, MultiSurfaceCoverage, MultiSolidCoverage
Including Coverage Metadata
  Coverage has a "metadata" slot for <any> kind of metadata
  WCS will deliver this, without understanding the contents
  Enables catalogues, extended semantics, metadata extensions
  Ex: EO-WCS GetCoverage result
Overview
  Big Data
  Coverage data: the OGC Coverage Model
  Coverage services: the WCS Suite
  rasdaman
  Conclusions
WCS Service Model: Structure [OGC 09-110r4]
  Hook for future service-related coverage metadata
WCS Service Model: Operations
  GetCapabilities: which service extensions? Which formats? CRSs? Which coverages?
  DescribeCoverage: coverage metadata
  GetCoverage: main workhorse; delivers a coverage, or a subset thereof
  Direct access to values for processing
WCS: The Big Picture
Web Coverage Service Core
  Simple & efficient access to n-d spatio-temporal coverages
  In any format
  Subsetting = trim (cut out a sub-box, same dimension) and/or slice (cut at a point, reducing dimension)
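To make trim and slice concrete, here is a minimal sketch of how a GetCoverage request with both kinds of subset could be assembled in the WCS 2.0 key-value encoding. The endpoint http://example.org/wcs, the coverage name MyCoverage and the axis labels Lat, Long and t are placeholders chosen for illustration, not values from the talk; a real server advertises its axis labels in DescribeCoverage.

```python
from urllib.parse import urlencode


def get_coverage_url(endpoint, coverage_id, trims=None, slices=None, fmt=None):
    """Build a WCS 2.0 KVP GetCoverage URL (illustrative sketch).

    trims:  {axis: (low, high)} keeps the axis, restricted to an interval
    slices: {axis: point}       cuts at one point, removing the axis
    """
    params = [
        ("SERVICE", "WCS"),
        ("VERSION", "2.0.1"),
        ("REQUEST", "GetCoverage"),
        ("COVERAGEID", coverage_id),
    ]
    for axis, (low, high) in (trims or {}).items():
        params.append(("SUBSET", f"{axis}({low},{high})"))
    for axis, point in (slices or {}).items():
        params.append(("SUBSET", f"{axis}({point})"))
    if fmt:
        params.append(("FORMAT", fmt))
    # Keep the subset punctuation readable instead of percent-encoding it.
    return endpoint + "?" + urlencode(params, safe="(),:/")


# Hypothetical endpoint, coverage and axis names.
url = get_coverage_url(
    "http://example.org/wcs",
    "MyCoverage",
    trims={"Lat": (10, 20), "Long": (30, 40)},
    slices={"t": 5},
    fmt="image/tiff",
)
print(url)
```

Each SUBSET parameter addresses one axis, so a request can trim some axes and slice others in a single call.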
OGC Spatio-Temporal CRSs
  WGS84, URI-based:
    http://www.opengis.net/def/crs/epsg/0/4326
  Parametrized ("AUTO") CRSs:
    http://www.opengis.net/def/crs?authority=ogc&version=1.3&code=auto42003&UoM=m&CenterLongitude=-100&CenterLatitude=45
  Ad-hoc combination of CRSs:
    http://www.opengis.net/def/crs-compound?1=http://resolver/def/crs/EPSG/0/326NN&2=http://resolver/def/crs/EPSG/0/3855&3=http://resolver/def/crs/OGC/0.1/Unix-Time&4=http://resolver/def/crs/OGC/0.1/Unix-Time?label=forecast
  OGC CRS resolver, powered by rasdaman/secore
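The ad-hoc combination pattern shown on this slide (numbered components appended to a crs-compound URL) can be sketched as a small helper. This is only an illustration of the URL pattern: the function name is made up, and component URIs that carry their own query string (such as the ?label= variant) would need extra care that this sketch does not attempt.

```python
def compound_crs(resolver, component_uris):
    """Compose a crs-compound URI from component CRS URIs, numbered from 1.

    Assumes each component URI has no query string of its own.
    """
    if not component_uris:
        raise ValueError("at least one component CRS is required")
    query = "&".join(
        f"{number}={uri}" for number, uri in enumerate(component_uris, start=1)
    )
    return f"{resolver}/def/crs-compound?{query}"


# Horizontal CRS plus a time axis, using URI shapes from the slide.
crs = compound_crs(
    "http://www.opengis.net",
    [
        "http://www.opengis.net/def/crs/EPSG/0/4326",
        "http://www.opengis.net/def/crs/OGC/0.1/Unix-Time",
    ],
)
print(crs)
```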
WCS Extension: CRS (Warp) [11-053]
  2 optional CRS parameters added to GetCoverage request:
  subsettingCrs: CRS in which the subsetting bbox is expressed
    default: coverage's native CRS
    E.g. Lat/Lon AOI over several UTM coverages
  outputCrs: CRS in which the coverage will be delivered
    default: subsettingCrs
    E.g. re-projected output, depends on interpolation extension
  Ex: ...? REQUEST=GetCoverage & OUTPUTCRS=http://www.opengis.net/def/crs/EPSG/0/4326 & ...
WCS Extension: Range Subsetting [12-140]
  Range subsetting = band, channel extraction
  By way of example:
    extract red:               {request}? ... & RANGESUBSET=red & ...
    extract nir, red, green:   {request}? ... & RANGESUBSET=nir,red,green & ...
    all between nir and green: {request}? ... & RANGESUBSET=nir:green & ...
    combination:               {request}? ... & RANGESUBSET=band01,band03:band05,band19:band21 & ...
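The RANGESUBSET examples above combine single band names and name intervals. A minimal sketch of how such an expression could be expanded into an explicit band list follows; the band order (nir, red, green, blue) is an assumption for illustration, and in practice the order comes from the coverage's range type.

```python
def expand_range_subset(expression, band_order):
    """Expand a RANGESUBSET value into an explicit list of band names.

    expression: comma-separated items, each a band name or 'first:last'
    band_order: all band names of the coverage, in range type order
    """
    position = {name: i for i, name in enumerate(band_order)}
    selected = []
    for item in expression.split(","):
        if ":" in item:
            first, last = item.split(":")
            low, high = position[first], position[last]
            if low > high:
                raise ValueError(f"interval {item!r} runs backwards")
            # Intervals are inclusive on both ends.
            selected.extend(band_order[low:high + 1])
        else:
            if item not in position:
                raise KeyError(item)
            selected.append(item)
    return selected


bands = ["nir", "red", "green", "blue"]  # assumed range type order
print(expand_range_subset("red", bands))
print(expand_range_subset("nir:green", bands))
print(expand_range_subset("blue,nir:red", bands))
```

Note that the output order follows the request, so a client can also use range subsetting to reorder bands.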
Coverage Encoding
  Pure GML: complete coverage represented by GML
  Special Format: suitable file format (able to deliver all information)
  Multipart-Mixed: multipart MIME, type multipart/mixed
  (figure: GML coverage holding domain set, range type, range set, app metadata; NetCDF file holding the same four parts; multipart message with a GML coverage holding domain set, range type, app metadata and an xlink to a PNG file)
Web Coverage Processing Service
  "XQuery for rasters": ad-hoc navigation, extraction, aggregation, analytics
  Time series, image processing, summary data, sensor fusion
WCPS By Example
  "From MODIS scenes M1, M2, and M3, the absolute of the difference between red and nir, in HDF-EOS"
    for $c in ( M1, M2, M3 )
    return encode( abs( $c.red - $c.nir ), "hdf" )
  Result: ( hdf A, hdf B, hdf C )
WCPS By Example
  "From MODIS scenes M1, M2, and M3, the absolute of the difference between red and nir, in HDF-EOS"
  ...but only those where nir exceeds 127 somewhere
    for $c in ( M1, M2, M3 )
    where some( $c.nir > 127 )
    return encode( abs( $c.red - $c.nir ), "hdf" )
  Result: ( hdf A, hdf C )
WCPS By Example
  "From MODIS scenes M1, M2, and M3, the absolute of the difference between red and nir, in HDF-EOS"
  ...but only those where nir exceeds 127 somewhere inside region R
    for $c in ( M1, M2, M3 ), $r in ( R )
    where some( $c.nir > 127 and $r )
    return encode( abs( $c.red - $c.nir ), "hdf" )
  Result: ( hdf A )
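To illustrate what the three queries compute, here is a rough client-side sketch in plain Python of the same filter-then-compute logic. It is not how a WCPS server evaluates queries: the tiny 2x2 pixel arrays, the boolean region mask and the function names are invented for the example, and the encode step is left out.

```python
def some(mask):
    """True if at least one cell of a 2-d boolean grid is True."""
    return any(any(row) for row in mask)


def abs_difference(red, nir):
    """Cell-wise |red - nir| for two equally shaped 2-d grids."""
    return [[abs(r - n) for r, n in zip(red_row, nir_row)]
            for red_row, nir_row in zip(red, nir)]


def select(scenes, threshold=127, region=None):
    """Mimic: for $c in (...) where some($c.nir > t [and $r]) return abs(...)."""
    results = {}
    for name, scene in scenes.items():
        exceeds = [[value > threshold for value in row] for row in scene["nir"]]
        if region is not None:
            # Restrict the condition to cells inside the region mask.
            exceeds = [[e and m for e, m in zip(e_row, m_row)]
                       for e_row, m_row in zip(exceeds, region)]
        if some(exceeds):
            results[name] = abs_difference(scene["red"], scene["nir"])
    return results


# Made-up sample data: only M1 and M3 have an nir value above 127.
scenes = {
    "M1": {"red": [[10, 20], [30, 40]], "nir": [[5, 200], [30, 10]]},
    "M2": {"red": [[1, 2], [3, 4]], "nir": [[100, 120], [90, 127]]},
    "M3": {"red": [[50, 60], [70, 80]], "nir": [[40, 60], [130, 90]]},
}
region_r = [[False, True], [False, False]]

print(list(select(scenes)))                   # scenes passing the nir test
print(list(select(scenes, region=region_r)))  # additionally limited to R
```

With this sample data the selections mirror the slides: all three scenes without a condition, two with the nir condition, one when the condition is restricted to the region.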
WCPS Language Features
  Server Side Math:
    for $c in ( EuropeMultispectral )
    return encode( ( $c.nir - $c.red ) / ( $c.nir + $c.red ), "geotiff" )
  Process Subsets:
    for $d in ( EuropeTerrainModel )
    return encode( $d[ Lon(34:35), Lat(22:23), t("2008-01-01") ], "geotiff" )
  Coverage Integration:
    for $c in ( EuropeMultispectral ), $d in ( EuropeTerrainModel ), $r in ( EuropeGroundStations )
    return encode( <query operators on $c, $d, $r>, "geotiff" )
Overview
  Motivation: What is a coverage?
  Coverage data: the OGC Coverage Model
  Coverage services: the WCS Suite
  rasdaman
  Conclusion
WCS Implementations
  High acceptance of WCS 2 by implementers
  Specs simple to understand & implement
  Users increasingly planning to move to WCS 2 (e.g., NGA)
  Known implementations: rasdaman, MapServer, GeoServer, OPeNDAP, GMU, Trellis,...
  WCS Core Reference implementation: rasdaman
  WCS is the future-oriented OGC Big Data standard
rasdaman: Scalable Array Analytics
  "raster data manager": SQL + n-d arrays
  Scalable parallel "tile streaming" architecture
  Storage: [ database | preexisting file archive ]
  OGC WMS, WCS, WCPS, WPS geo service standards
  WCPS, WCS Core reference implementation
  (figure: rasdaman visitors, 2013+)
  www.rasdaman.org
In-Situ Databases
  Problem: large-scale data centers sometimes object to import ("cannot duplicate my 100 TB")
  Approach: reference external files, use as tiles; cf. [Ailamaki et al 2010]
    insert into MyTable ( ..., image, ... )
    values ( ..., referencing /my/path/ext1.tif [1000:1999,3000:3999], ... )
    update MyTable
    set image = referencing /oops/forgot/ext2.jpg [1000:1999,4000:4999]
  Challenges: efficiency, consistency, caching,...
  Data simultaneously used by other actors in parallel!
  In future: optimized storage of hot spots in DB
Let's Take a Closer Look...
  Divergent access patterns for ingest and retrieval
  Server must mediate between access patterns
  (figure: access patterns along the time axis t)
Configurable Tiling
  Sample tiling strategies [Furtado]: regular, directional, area of interest
  rasdaman storage layout language:
    insert into MyCollection values ...
    tiling area of interest [0:20,0:40], [45:80,80:85]
    tile size 1000000
    index d_index
    storage array compression zlib
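As an illustration of the simplest strategy named above, regular tiling, the following sketch partitions an n-d domain into equally shaped tiles, with smaller tiles at the upper borders. It shows the concept only and is not rasdaman's implementation; the function name and the example domain are made up.

```python
from itertools import product


def regular_tiles(domain, tile_shape):
    """Split an n-d domain into regular tiles.

    domain:     [(low, high), ...] per axis, bounds inclusive
    tile_shape: number of cells per tile along each axis
    Returns a list of tiles, each a list of (low, high) per axis.
    """
    per_axis = []
    for (low, high), size in zip(domain, tile_shape):
        # Border tiles are clipped to the domain's upper bound.
        per_axis.append([(start, min(start + size - 1, high))
                         for start in range(low, high + 1, size)])
    return [list(tile) for tile in product(*per_axis)]


# A 100 x 50 domain cut into tiles of at most 50 x 30 cells.
for tile in regular_tiles([(0, 99), (0, 49)], (50, 30)):
    print(tile)
```

Directional and area-of-interest tiling differ in how the per-axis cut points are chosen, for example by aligning tile borders with frequently requested regions such as the intervals in the storage layout statement above.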
Overview
  Motivation: What is a coverage?
  Coverage data: the OGC Coverage Model
  Coverage services: the WCS Suite
  rasdaman
  Conclusions
EarthServer: Big Earth Data Analytics
  100+ TB databases for Earth sciences + planetary science
  EU FP7-INFRA, 3 years, EUR 5.85 million
  Core platform: rasdaman
  Open standards, visualization tools, integrated query engine
Conclusion
  OGC WCS: spatio-temporal download service for Big Earth Data
  Beyond ortho imagery: curvilinear grids, point clouds, meshes, TINs...
  From simple access (WCS Core) to complex server-side processing (WCPS)
  Query language as flexible interface paradigm
  Semantic interoperability on data
  Visual clients can hide QL, servers can scale [Dali]
  rasdaman: reference implementation for WCS Core, WCPS
  Follow us: www.rasdaman.org, standards.rasdaman.org, www.earthserver.eu, www.ogcnetwork.net/wcs
We're Hiring!