Big Data: Volume & Velocity Data Management with ERDAS APOLLO
Alain Kabamba, Hexagon Geospatial
Intergraph is Part of the Hexagon Family
Hexagon is dedicated to delivering actionable information through design, measurement, and visualization technologies. Learn more at hexagon.com.
Dynamic GIS: Measuring Our Changing Earth
Geospatial Information Value Chain: Capture, Process, Share, Deliver
VOLUME
Data Deluge
Imagery data volumes are growing at an ever-increasing rate, faster than processing and storage costs are falling.
3,000 terabytes/year foreseen for Copernicus
Data Deluge
Greater resolution, greater coverage, greater frequency
Data Deluge
New data sources besides traditional airborne and satellite platforms
Rocketing User Demand
Every day, new users are demanding imagery, across all applications, on all devices
Not just some imagery, but the latest, highest-resolution imagery available
They want the entire historical archive accessible
Big Data Made Small
The most common purpose of Big Data is to produce Small Data (DataInformed, 10 Sept 2013, http://data-informed.com/common-purpose-big-data-produce-small-data)
Number of pixels in a single ECW file: 0.6 terapixels (2004), 1.7 terapixels (2005), 14 terapixels (2012)
Big Data Made Small
The World's Largest Geospatial Image?
A single aerial image covering Germany at 20 cm GSD
3,210,000 px by 4,340,000 px
38,000 GB uncompressed
50,000 GB with image pyramids
875 GB ECW compressed
370,000 source files, 1 ECW file
38 TB image of Germany
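The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes 3 bands at 8 bits per band and binary units (1 GB = 2**30 bytes); neither assumption is stated on the slide, so treat the result as a plausibility check only.

```python
# Cross-check of the Germany mosaic figures.
# Assumptions (not stated on the slide): 3 bands, 1 byte per band,
# and "gb" meaning binary gigabytes (2**30 bytes).
WIDTH_PX = 3_210_000
HEIGHT_PX = 4_340_000
BYTES_PER_PIXEL = 3  # assumed RGB, 8 bits per band
ECW_GB = 875         # compressed size quoted on the slide

pixels = WIDTH_PX * HEIGHT_PX
terapixels = pixels / 1e12
uncompressed_gb = pixels * BYTES_PER_PIXEL / 2**30
compression_ratio = uncompressed_gb / ECW_GB

print(f"{terapixels:.1f} terapixels")
print(f"{uncompressed_gb:,.0f} GB uncompressed")
print(f"roughly {compression_ratio:.0f}:1 compression")
```

Under these assumptions the pixel count lands near the 14 terapixels quoted for a single ECW file, and the uncompressed size is in the region of the quoted 38,000 GB.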
ECW & JPEG2000 Compression Technology
The power of ECW & JP2K compression technology: encoding speed, decoding speed, image quality, file size
Time saved, reduced storage costs, enhanced user performance, reduced data management
COMPRESS
Uncompressed original & pyramids: 1,300 GB
Uncompressed original: 1,000 GB
JPEG 2000 numerically lossless compression: 400 GB
ECW visually lossless compression: 50 GB
ECW image compression: instant storage savings, faster performance, full visual quality
JPEG 2000: strongly reduces size, keeps data integrity
ECW: COMPRESS and SIMPLIFY
Conventional approach: raw imagery + image tiles + tile pyramids + mosaic overviews + tile cache
Intergraph ECW: raw imagery + 1 ECW mosaic file
Efficient processing and simple data structure:
  Easier to manage
  Time effective
  Provides a single source of truth
  One format to serve all software clients
Imagery & Tile Cache Storage
Non-scalable solutions
Duplicates data
Complicates data publishing
Restricts users to specific projections
Restricts users to specific resolutions/scales
Tile caching places decoding speed above all other needs
Tile Cache: Worked Example (Storage)
Chart of imagery size and days needed to generate for: uncompressed, with pyramids, with tile cache to level 17, with tile cache to level 18, with tile cache to level 19, and ECW
Values recoverable from the chart: 38 TB (uncompressed), 71 TB (level 19 cache), 0.85 TB (ECW); other chart labels: +7, +29, +116, 7, 17
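The increments listed on this slide (+7, +29, +116) grow roughly fourfold per step, which would be consistent with how tile counts grow in a quadtree tile pyramid. A minimal sketch, assuming a standard web-map tiling scheme in which level z covers the world with 2**z by 2**z tiles (the slide does not state which scheme was used):

```python
# Tile counts in a quadtree tile pyramid.
# Assumption: level z holds 2**z by 2**z tiles, as in common web-map
# tiling schemes. The slide does not name the scheme it measured.

def tiles_at_level(z: int) -> int:
    """Number of tiles needed to cover the full extent at level z."""
    return 4 ** z

def tiles_up_to_level(z: int) -> int:
    """Total number of tiles in a cache built from level 0 through z."""
    return sum(tiles_at_level(level) for level in range(z + 1))

for z in (17, 18, 19):
    print(z, tiles_at_level(z), tiles_up_to_level(z))
```

Each extra cache level holds four times as many tiles as the one before it, so the deepest level dominates both the storage and the generation time.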
Storage Costs: Tile Cache & Pyramid vs. ECW
Amazon S3 cloud storage cost comparison example (cost per month):
  Original with pyramids: $4,600
  Plus tile cache, 17 levels: $4,700
  Plus tile cache, 19 levels: $6,200
  ECW: $82
98% lower costs using ECW
More than $4.6k monthly saving
Up to $73k annual saving
Data generated using the Amazon S3 Cloud Calculator
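The savings quoted on this slide follow from the four monthly costs. The sketch below recomputes them; the pairing of each dollar figure with a storage option is read off the order of the labels in the extracted slide text and should be treated as an assumption.

```python
# Monthly Amazon S3 costs in US dollars, as quoted on the slide.
# The label-to-value pairing is an assumption based on label order.
ECW_COST = 82
OTHER_COSTS = {
    "original with pyramids": 4600,
    "plus tile cache, 17 levels": 4700,
    "plus tile cache, 19 levels": 6200,
}

def monthly_saving(cost: int) -> int:
    """Dollars saved per month by storing ECW instead."""
    return cost - ECW_COST

def percent_saving(cost: int) -> float:
    """Saving as a percentage of the non-ECW cost."""
    return 100 * monthly_saving(cost) / cost

for name, cost in OTHER_COSTS.items():
    print(f"{name}: ${monthly_saving(cost):,}/month, "
          f"${12 * monthly_saving(cost):,}/year, "
          f"{percent_saving(cost):.1f}% lower")
```

With this pairing, every option comes out about 98% cheaper as ECW, and the 19-level tile cache gives the largest annual saving.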
ECW JPEG2000 SDK
Developer toolkit for reading and writing the ECW and JPEG 2000 file formats
Fully compliant ISO/IEC 15444 Part 2 JP2 support
Reading and writing NITF 2.1 EPJE and NPJE code-streams
Reading ECWP streams
C, C++ and JNI bindings; SWIG bindings in development
20+ examples
Hardware accelerated (SSE 3, 4, AVX & NEON)
Available across 7 different environments:
  Windows x86 (VC90, VC100, VC110)
  Linux x86 (GCC 4.4+)
  Android x86 & ARM
  iOS ARM
  Windows Phone
  Mac OS X x86 (future release)
Dissemination: APOLLO Essentials
Value chain: Capture, Process, Deliver (ERDAS APOLLO Essentials)
Delivery interfaces:
  WMS
  WMTS (tiles)
  WMS-T (time series)
  JPIP (JPEG 2000 streaming)
  ECWP (ECW streaming)
  GeoREST
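As an illustration of the WMS and WMS-T interfaces listed here, the sketch below builds a standard OGC WMS 1.3.0 GetMap request, with an optional TIME parameter for time-series layers. The endpoint URL and layer name are hypothetical placeholders, not actual APOLLO values.

```python
from urllib.parse import urlencode

def build_getmap_url(base_url, layer, bbox, width, height,
                     crs="EPSG:4326", image_format="image/png", time=None):
    """Build a WMS 1.3.0 GetMap URL from standard key-value parameters."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": crs,
        # WMS 1.3.0 with EPSG:4326 expects latitude before longitude.
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    if time is not None:
        params["TIME"] = time  # selects one instant of a time-series layer
    return f"{base_url}?{urlencode(params)}"

# Hypothetical endpoint and layer name, for illustration only.
url = build_getmap_url("https://example.com/wms", "germany_mosaic",
                       (47.0, 5.5, 55.5, 15.5), 1024, 1024,
                       time="2013-09-10")
print(url)
```

Because the request is plain key-value HTTP, any third-party client that speaks WMS can consume the same service.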
APOLLO Essentials: Silver Bullets
APOLLO Essentials and its compression technology provide:
Very efficient compression technology that reduces imagery storage by 5-20x
One of the most optimized geospatial image servers on the market
  64-bit, multi-threaded, hardware-accelerated server
  Highest performing image server available
  Can do more with fewer servers, less storage and less time
Reduced dissemination time
  Rapid dissemination of terabytes of data
Unique delivery methods
  To interoperate with third-party systems
  Reduce bandwidth requirements
  Improve user experience
VELOCITY
Data Deluge
Imagery data volumes are growing at an ever-increasing rate, faster than processing and storage costs are falling.
8 terabytes/day foreseen for Copernicus
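The daily figure here and the yearly figure in the Volume section describe the same stream. A quick check, assuming decimal units (1 TB = 10**12 bytes), which the slides do not specify:

```python
# Relating 8 TB/day to a yearly volume and a sustained ingest rate.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
TB_PER_DAY = 8
SECONDS_PER_DAY = 86_400

tb_per_year = TB_PER_DAY * 365
mb_per_second = TB_PER_DAY * 10**12 / SECONDS_PER_DAY / 10**6

print(f"{tb_per_year:,} TB/year")
print(f"{mb_per_second:.1f} MB/s sustained")
```

The yearly total comes out close to the 3,000 terabytes/year quoted earlier, and the sustained rate shows why cataloging has to be automated rather than manual.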
Data Crawling
APOLLO Advantage
Diagram labels: Generated, Crawling, Manage, Catalog, Capture, Process, Deliver
CRAWLING
The geospatial information crawlers are scheduled server jobs for continuous discovery of geospatial data at user-specified dataset store locations.
The crawlers:
  Run asynchronously on the server: set it and forget it!
  Run on a repeated, scheduled basis so that the catalog updates automatically
  Auto-discover imagery and terrain data (drop box concept)
  Auto-harvest imagery/sensor metadata such as LANDSAT, QUICKBIRD, SPOT, DIMAP, ISO 19115/19139, etc.
  Auto-provision data for optimized end-user consumption (pyramids, thumbnails and metadata generation, footprint computation and security configuration)
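The drop box idea can be sketched in a few lines. This is only a toy illustration of repeated discovery, not how the APOLLO crawlers are implemented; the extension list and the in-memory catalog are hypothetical.

```python
import os
import tempfile

# Hypothetical set of extensions treated as imagery in this sketch.
IMAGERY_EXTENSIONS = {".ecw", ".jp2", ".tif", ".img"}

def crawl(root, already_cataloged):
    """Return imagery files under root that are not yet in the catalog."""
    discovered = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            extension = os.path.splitext(name)[1].lower()
            if extension in IMAGERY_EXTENSIONS and path not in already_cataloged:
                discovered.append(path)
    return discovered

# Simulate a drop box folder that is crawled twice.
with tempfile.TemporaryDirectory() as dropbox:
    for name in ("scene_a.ecw", "scene_b.jp2", "notes.txt"):
        open(os.path.join(dropbox, name), "w").close()
    catalog = set()
    first_pass = crawl(dropbox, catalog)
    catalog.update(first_pass)
    second_pass = crawl(dropbox, catalog)

print(len(first_pass), len(second_pass))
```

A real crawler would run this pass on a schedule and follow each discovery with metadata harvesting and provisioning; the second pass finding nothing new is what makes repeated runs cheap.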
Advantage Workflow Architecture
Web Client (Access & Geospatial Security): view, download, browse, search
Data services: OGC Services (WMS, WCS, WMTS), Streaming Services (ECWP, JPIP), Download Service (Clip, Zip, Ship), Catalog (CS-W, REST)
Manager: crawl, secure, manage and style data; parse and edit metadata; configure server and service interfaces
Crawl, provision: imagery, terrain, sensor, LIDAR
Oracle, MS SQL, PostGIS
Hierarchical Raster Data Model (Variety, Value)
Hierarchical classification of data
Data gathered by collection, theme, type, domain
Ascendant metadata aggregation through the hierarchy
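Ascendant aggregation means that a parent node's metadata is derived from its children. A minimal sketch using bounding boxes as the aggregated metadata; the `Node` class and the choice of bounding boxes are illustrative assumptions, not the APOLLO data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

@dataclass
class Node:
    """Hypothetical catalog node: a dataset (leaf) or an aggregate."""
    name: str
    bbox: Optional[BBox] = None
    children: List["Node"] = field(default_factory=list)

def aggregate_bbox(node: Node) -> Optional[BBox]:
    """Set each aggregate's bbox to the union of its children's boxes."""
    if not node.children:
        return node.bbox
    boxes = [box for box in (aggregate_bbox(child) for child in node.children)
             if box is not None]
    if boxes:
        node.bbox = (min(b[0] for b in boxes), min(b[1] for b in boxes),
                     max(b[2] for b in boxes), max(b[3] for b in boxes))
    return node.bbox

# Two datasets grouped under a theme, which sits under a collection.
theme = Node("orthophotos", children=[
    Node("tile_west", bbox=(0.0, 0.0, 10.0, 10.0)),
    Node("tile_east", bbox=(5.0, 5.0, 20.0, 15.0)),
])
collection = Node("germany", children=[theme])
print(aggregate_bbox(collection))
```

The same bottom-up pass could carry other metadata, such as acquisition date ranges, so that a search at any level of the hierarchy sees a summary of everything beneath it.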
Open Interface: OGC Web Coverage Service (WCS)
Serves raw (unportrayed) data on the fly
Support for:
  spatial subsetting
  pixel-space subsetting
  temporal selection
  band ordering and selection
  mosaicking/time series
  coordinate transformation
  custom SRS management
  very big imagery through the whole chain
  any data depth (8 bits to 64 bits)
  any number of bands
  multiple input data formats: see IMAGINE decoders
  output formats: GeoTIFF, JPEG 2000, ECW, NITF, DTED, IMG
  decoder and encoder plug-ins
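Spatial subsetting through WCS can be illustrated with a key-value GetCoverage request. The sketch assumes the WCS 2.0.1 KVP encoding; the slide does not say which WCS version is served, and the endpoint, coverage identifier and axis labels below are hypothetical.

```python
from urllib.parse import urlencode

def build_getcoverage_url(base_url, coverage_id, subsets,
                          output_format="image/tiff"):
    """Build a WCS 2.0.1 GetCoverage URL with one subset per axis."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("format", output_format),
    ]
    # Each trimmed axis is sent as its own subset=axis(low,high) parameter.
    for axis, (low, high) in subsets.items():
        params.append(("subset", f"{axis}({low},{high})"))
    return f"{base_url}?{urlencode(params)}"

# Hypothetical endpoint, coverage id and axis labels, for illustration only.
coverage_url = build_getcoverage_url(
    "https://example.com/wcs", "germany_mosaic",
    {"Lat": (47.0, 48.0), "Long": (10.0, 11.0)},
)
print(coverage_url)
```

Unlike a WMS GetMap request, which returns a rendered picture, this request returns the underlying pixel values for the requested window.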
Online Processing
APOLLO Professional
Publish, Manage, Catalog, Online processing, Processes
What is ERDAS APOLLO Professional: Workflow
Web Client (Access & Geospatial Security), workflow steps:
  1. Select process
  2. Propose process inputs
  3. Configure process inputs and parameters
  4. Execute
  5. Write results
  6. Crawl, provision
  7. Propose process outputs
  8. View & download
Data services: OGC and Streaming Services (WMS, WCS, WMTS, ECWP, JPIP), Download Service (Clip, Zip, Ship), Process Service (WPS), Catalog (CS-W, REST)
Manager: secure and manage processes; configure execute permissions
ERDAS IMAGINE: model and publish process
Imagery, terrain, sensor and LIDAR
Oracle, MS SQL, PostGIS
© 2012 Intergraph Corporation
ERDAS APOLLO Geoprocessing
5-band data produces very clean change detection
B-1, B-2, B-3
Results
Conclusion
APOLLO provides tools to address the Big Data challenge.
Volume: Big Data to Small Data
  Hexagon ECW and JPEG 2000 compression is very efficient
  Easy and fast data dissemination
    Streaming: ECWP/JPIP
    WMS, WMS-T, WMTS
  SDK available to integrate that technology into your own system
Velocity: dynamic data management
  Data crawling
  Metadata aggregation
  Automatic metadata parsers for different satellite products (Landsat, SPOT, etc.)
Variety of interfaces to consume data, including OGC
  Download: WCS (from Level 0), Clip, Zip and Ship
  View: WMS, WMTS, WMS-T
  Processing: WPS
  Discovery
QUESTIONS?