Cloud-based Infrastructures Serving INSPIRE needs
INSPIRE Conference 2014, Workshop Sessions
Benoit BAURENS, AKKA Technologies (F)
Claudio LUCCHESE, CNR (I)
June 16th, 2014
This content by the InGeoCloudS consortium members is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at http://www.ingeoclouds.eu/.
INSPIRE Compliant Data and Services on the Cloud WORKSHOP, Monday June 16th, 2014
The Digitized Earth
Avalanche of data generated by various institutions all over Europe in numerous domains:
- Geography, Earth observation, Geology
- Public Administration
- Private Geo Companies
- And also citizens
Why consider cloud technologies?
- Data Quantity and Quality: we cannot act here
- Visibility, Accessibility and Sharing: we can «push up» here to improve the necessary services
Why consider cloud technologies?
- Geospatial sciences & applications: more and more needs for computations, simulations, peak responses to emergencies
- Enabling IT resources: we can act here, for accelerated procurement and for resource availability and pooling
Why consider cloud technologies?
Some common characteristics of geo application usage:
- Data-intensive scenarios: storage, but not only (e.g. query performance, synchronisation and integrity of data)
- Compute-intensive scenarios: geoprocessing and computation on demand, procurement and release of resources
- Access-intensive scenarios: worldwide and/or concurrent access to information during particular events, when maps, data and services are required well beyond normal usage
Why consider the Cloud?
Data must be provided / shared:
- Efficiently
- Ubiquitously
- In various forms
- At no (or low) cost
- Following standards and recommendations
- «Packed» into relevant services
What is Cloud Computing?
Cloud computing comes from the convergence of:
- Service-oriented architectures: loose coupling of services with operating systems and technologies
- Parallel computing: large-scale data analysis, up to thousands of machines
- Virtualization: independence from physical hardware
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. (NIST) http://csrc.nist.gov/publications/nistpubs/800-145/sp800-145.pdf
Why consider the Cloud?
Promises of the Cloud: cost reduction and simplification
- Reduced Total Cost of Ownership: technical staff, power supply, physical space, hardware, cables
- Economies of scale among partners: sharing of databases, servers, CPUs
- Pay as you go: operating costs versus infrastructure costs
- Easy procurement (on demand, self-service)
Why consider the Cloud?
Promises of the Cloud: not only about the money! A cloud platform shall also provide:
- Large computing power with ad hoc machines and network
- Various up-to-date operating systems and technologies
- Ubiquitous access and Quality of Service
About Cloud Computing
Towards perfect capacity management?
[Figure: the traditional stack (applications on an operating system on dedicated hardware), then the same stack replicated across many machines]
About Cloud Computing
Towards perfect capacity management?
- Starting costs remain reasonable
- Adaptive capacity: scaling down is as important as scaling up
Challenges of GeoData Services Infrastructures
- Diverse software requirements
- Diverse resource requirements
- Resource requirements vary over time
- Reduce costs
Challenges of GeoData Services Infrastructures and Cloud Computing
- Diverse software requirements <-> Virtualization: to support a larger number of software requirements
- Diverse resource requirements <-> Scalability: to support large data volumes and high throughput; to support increasing dataset sizes
- Resource requirements vary over time <-> Elasticity: to support a varying number of users; to support on-demand computations
- Reduce costs <-> Pay as you go: to reduce infrastructure costs during low platform usage
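The link between elasticity and pay as you go can be illustrated with a small calculation. The sketch below compares a fixed deployment sized for the peak with an elastic one that follows demand hour by hour. The demand profile and the per-server capacity are hypothetical values chosen for illustration, not figures from the project.

```python
import math

def servers_needed(requests_per_min, capacity_per_server):
    """Smallest number of servers that covers the demand (at least one)."""
    return max(1, math.ceil(requests_per_min / capacity_per_server))

def server_hours(demand, capacity_per_server):
    """Return (fixed, elastic) server-hours for an hourly demand profile."""
    per_hour = [servers_needed(d, capacity_per_server) for d in demand]
    fixed = max(per_hour) * len(demand)   # sized for the peak during every hour
    elastic = sum(per_hour)               # follows the demand hour by hour
    return fixed, elastic

# Hypothetical day: quiet most of the time, with a two-hour emergency peak.
DEMAND = [500] * 10 + [9000] * 2 + [500] * 12   # requests/min, 24 hourly values
CAPACITY = 3000                                  # requests/min per server (assumed)

fixed, elastic = server_hours(DEMAND, CAPACITY)
print(f"fixed: {fixed} server-hours, elastic: {elastic} server-hours")
```

With this assumed profile the elastic deployment uses 28 server-hours against 72 for the fixed one. The gap comes entirely from scaling down outside the peak.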
Why InGeoCloudS
The project aims to demonstrate that a Cloud infrastructure can be used by public organisations to provide more efficient, scalable and flexible services for creating, sharing and disseminating spatial environmental data.
Moving to the Cloud is not only virtualizing the existing service stack
[Figure: many copies of the traditional stack (applications, operating system, hardware)]
Moving to the Cloud: Think Layers!
- Identify a common software stack: homogeneous infrastructure, economies of scale, Cloud compliance
- Define data volume requirements: capacity planning
- Scalability challenges and requirements: high-throughput services
- Reliability: guarantee QoS for INSPIRE/OGC services
Geo Spatial Stack: Web Server, Map Server, Spatial Data Storage, Operating System
It is not simply about porting existing applications to the cloud, but rather about integrating them into a scalable geospatial framework.
InGeoCloudS Architecture: Auto Scaling Layers
InGeoCloudS Achievements: The Architecture
SCALABILITY + ELASTICITY + ON-DEMAND + MEASURED SERVICE
InGeoCloudS Scalable Services
- Elastic File Server
- Elastic Database Server
- Elastic Web Server
- Elastic Map Server
- Elastic Linked Data Store
All of the above are hot topics from a technological and scientific point of view.
Focus on
Elastic File Server
We evaluated several technologies: S3FS, S3Backer, pNFS, Lustre
Our choice was GlusterFS:
- No single point of failure: no file metadata server
- Scalable: servers can be added as needed at any time
- Supports standard protocols (e.g. NFS)
- Includes optimizations such as read-ahead, write-behind, asynchronous I/O, scheduling and caching
- Currently sponsored by Red Hat
Other Cloud-based storage solutions rely on the key-value access pattern, which is incompatible with every other technology in the Geo Spatial software stack.
This is almost a research challenge!
GlusterFS at work
- Transparent access for applications, similar to NFS
- Automatic setup on IGC instances
Elastic File Server Scalability
[Figure: GlusterFS write and read throughput (MB/s, axis from 0 to 800) versus number of servers (1, 2, 4, 8); data labels shown on the chart: 55, 77, 78, 125, 210, 342, 344, 730]
Elastic Database Server
- PostgreSQL (+PostGIS)
- PgPool load balancer
- Master/Slave architecture with streaming replication
- Scalability: parallel read operations; servers can be added as needed at any time
- Reliability: automatic failover, a slave replaces the Master
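The routing idea behind this setup can be sketched in a few lines. This is a toy model, not PgPool's implementation: the class, the server names and the "starts with SELECT" test are simplifying assumptions made for illustration. Reads are spread over the slaves in round-robin order, writes go to the Master, and a slave is promoted on failover.

```python
from itertools import count

class ReplicatedCluster:
    """Toy model of one master plus read-only slaves behind a load balancer."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._turn = count()          # round-robin counter for read queries

    def route(self, sql):
        """Send SELECT statements to a slave, everything else to the master."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return self.slaves[next(self._turn) % len(self.slaves)]
        return self.master

    def add_slave(self, name):
        """Scale out read capacity by registering another slave."""
        self.slaves.append(name)

    def fail_over(self):
        """Promote the first slave when the master is lost."""
        if not self.slaves:
            raise RuntimeError("no slave available to promote")
        self.master = self.slaves.pop(0)
        return self.master

# Hypothetical server names and queries.
cluster = ReplicatedCluster("db-master", ["db-slave-1", "db-slave-2"])
reads = [cluster.route("SELECT ST_AsText(geom) FROM wells") for _ in range(3)]
write = cluster.route("INSERT INTO wells (name) VALUES ('w1')")
promoted = cluster.fail_over()
```

Adding a slave increases read throughput only; all writes still pass through the single Master, which is why the slide lists parallel reads under scalability.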
InGeoCloudS Context and Challenges
The classical approach to environmental data dissemination
INSPIRE services: a different part of the system, potentially unrelated to data publication
The INSPIRE Architecture
- Applications Layer: Applications and Geoportals
- Service Bus
- Service Layer: Discovery Service, View Service, Download Service, Transformation Service, Invoke Service (Spatial Data Service), Registry Service
- Data Layer: Service metadata, Dataset metadata, Metadata (datasets, services), Spatial Data Set, Registers
A cloud-based infrastructure for scientific publication
- Metadata
- INSPIRE data model
- OGC services
InGeoCloudS project Focus
- Applications Layer: Applications and Geoportals
- Service Bus
- Service Layer: Discovery Service, View Service, Download Service, Transformation Service, Invoke Service (Spatial Data Service), Registry Service
- Data Layer: Service metadata, Dataset metadata, Metadata (datasets, services), Spatial Data Set, Registers
Geo Publication: A Key Service in InGeoCloudS
Objectives:
- Simplify the process of transforming geodata into geoservices
- Guarantee geoservice compliance with OGC standards and INSPIRE requirements
Three components in Data Publication:
- Read-only services with OGC:WMS (image) and OGC:WFS (data)
- A CRUD API for data providers to manage the configuration of each service
- Metadata management (ISO 1911 + OGC:CSW)
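For readers unfamiliar with the read-only services, a WMS request is an ordinary HTTP GET with standard query parameters. The sketch below builds a WMS 1.3.0 GetMap URL. The endpoint, layer name and bounding box are hypothetical; only the parameter names come from the OGC WMS 1.3.0 standard.

```python
from urllib.parse import urlencode

def wms_getmap_url(endpoint, layers, bbox, crs="EPSG:4326",
                   width=800, height=600, image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL using key-value-pair encoding."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                            # empty value asks for default styles
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),  # axis order follows the CRS definition
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{endpoint}?{urlencode(params)}"

# Hypothetical endpoint and layer. With EPSG:4326 in WMS 1.3.0 the bounding box
# is given as (min latitude, min longitude, max latitude, max longitude).
url = wms_getmap_url("https://example.org/wms", ["geology"], (41.0, -5.0, 51.5, 10.0))
print(url)
```

The resulting URL returns an image; the equivalent WFS request would return the features themselves, which is the image/data distinction made on the slide.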
Insights for GeoPublication: A SaaS approach for GIS+INSPIRE objectives
«Software as a Service» for INSPIRE and GIS teams
- The software: GIS desktop, GIS file
- The service in the cloud: my GIS interface (Web editor); my secure storage for my files and my maps
- Describe and share your datasets
- Share my maps on the Internet
- Provide my INSPIRE services
The strengths of the SaaS approach for GIS+INSPIRE
Publication for data providers and users:
- The process of publishing geo datasets is greatly simplified
- All the technical aspects of publishing geospatial datasets on the Web are masked
- No need to worry about the performance requirements (WMS response in under 5 s), the capacity (>20 requests/s) or the availability (99% or more) of your web services
- Benefit from improvements to the service over time without any install/update process: WFS 2.0, new GIS functionalities, usage statistics
- You are in your own workspace and you control your publication
Focus on
GeoPublication Component Architecture
[Figure: component diagram with the following labels]
- Entry points: HTTP/API, WMS, WFS, behind an HTTP load balancer
- Data publication API
- Three Mapserver servers
- File system mounted for all data providers: write access and read-only access
- DB access on port 3306
- ELASTIC GEOSPATIAL SERVER CLUSTER over the Elastic FS and DB, on the Cloud infrastructure
Example: number of requests with a WMS GetMap
[Figure: small Amazon instance versus large Amazon instance; the values 6 and 50 are labelled on the chart]
- Performance: WMS GetMap 800x600 in under 5 s
- Capacity: more than 20 simultaneous requests per second
- Availability: 99%
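The three thresholds above can be turned into a simple check on measured data. This is a minimal sketch under stated assumptions: the `Measurement` structure and the sample values are hypothetical, capacity is approximated as answered requests per second over the window, and availability as the share of requests that were answered rather than as uptime.

```python
from dataclasses import dataclass

@dataclass
class Measurement:
    response_times_s: list   # seconds, one entry per answered GetMap request
    window_s: float          # length of the measurement window in seconds
    failed_requests: int     # requests that received no valid answer

def check_qos(m, max_response_s=5.0, min_rate=20.0, min_availability=0.99):
    """Compare one measurement window with the three thresholds."""
    served = len(m.response_times_s)
    total = served + m.failed_requests
    return {
        "performance": served > 0 and max(m.response_times_s) < max_response_s,
        "capacity": served / m.window_s > min_rate,
        "availability": total > 0 and served / total >= min_availability,
    }

# Hypothetical measurements, not results from the project.
good = check_qos(Measurement([1.2] * 2475 + [4.0] * 25, window_s=100.0, failed_requests=10))
slow = check_qos(Measurement([6.0] * 100, window_s=10.0, failed_requests=0))
print(good)
print(slow)
```

A check of this kind only says whether a given instance size meets the thresholds for one workload; it does not replace a proper load test.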
Focus on
Elasticity Experiment: Elastic Web Server
[Figure: issued requests (requests/min, 0 to 12000), system load (average CPU utilization, 0 to 100) and number of servers over time (1 to 51), with the load threshold marked; the number of servers steps from 1 to 2, 3 and 4]
[Figure: the same chart (requests/min and average CPU utilization over time, 1 to 4 servers) with two annotations: "System load increases quickly" and "System load increases slowly: the system can sustain peak loads more easily"]
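The behaviour in the experiment can be approximated with a threshold rule. The sketch below is not the InGeoCloudS auto-scaling logic: the 0.7 load threshold, the per-server capacity, the four-server limit and the demand series are all assumed values. A server is added when the average load exceeds the threshold and removed when the remaining servers could stay below it.

```python
def autoscale(requests_per_min, capacity_per_server, load_threshold=0.7, max_servers=4):
    """Return the number of servers after each time step of a demand series."""
    servers = 1
    history = []
    for demand in requests_per_min:
        load = demand / (servers * capacity_per_server)
        if load > load_threshold and servers < max_servers:
            servers += 1          # scale up: load is above the threshold
        elif servers > 1 and demand / ((servers - 1) * capacity_per_server) < load_threshold:
            servers -= 1          # scale down: one server fewer would still cope
        history.append(servers)
    return history

# Hypothetical ramp up to a peak, followed by a drop in demand.
demand = [1000, 3000, 5000, 8000, 10000, 10000, 4000, 1000]
history = autoscale(demand, capacity_per_server=3500)
print(history)
```

With these assumed values the cluster grows one server per step to four, holds there during the peak, and then shrinks again as demand falls.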
ID Card of the Project: Who are we?
- 5 Geological Surveys bringing in 6 initial Use Cases (datasets and applications): Ground Water Management; Geo Hazards (Landslides, Earthquakes); GeoPublication and Web Mapping made easy
- 3 ICT organizations bringing key expertise: Cloud Computing; GIS; Semantic Web and Linked Data; Software architecture and integration
- EC Support
ID Card of the Project: Key Dates
[Timeline: Feb 2012, March 2013, October 2013, July 2014]
MAY 2012: Experts Workshop#1
ID Card of the Project: Some highlights on results
- Fundamental scalable/elastic services for data management: Database Server, File Server, Linked Data Store
- Geo Data publication services including INSPIRE services generation and (SaaS mode)
- A publicly available API: RESTful Web Services on a loosely coupled architecture
- Data providers' data and applications in the cloud
- Generic Services, Integration Portal and Management Tools
InGeoCloudS: Internet infrastructure
- IGC provides an Internet infrastructure open on the Web
- Fully featured RESTful APIs facilitating control and integration in business processes
- IGC is a scalable and sustainable infrastructure, as required by the INSPIRE network services rules
Wrap up: InGeoCloudS Pilot2
A Portal
Wrap up: InGeoCloudS Pilot2
Integrated Geo Applications
Wrap up: InGeoCloudS Network services for INSPIRE
[Figure: Portals and GIS Applications connected through the INTERNET WITH CLOUD to the following]
- Metadata discovery services (CSW)
- View services (WMS 1.3)
- Pre-defined download services (ATOM)
- Direct download services (WFS)
- Mapping and Publication Services for Data/Maps
- Metadata
- Storage (DB, files)
25/06/2014 InGeoCloudS / CNES Presentation, Toulouse, March 18th, 2014
Thanks for your attention.
www.ingeoclouds.eu
contact@ingeoclouds.eu
www.facebook.com/ingeoclouds
@ingeoclouds