Industry & SMEs Round Table 2014 Conference on Big Data from Space (BiDS '14) Dr. Florin Serban Dr. Catalin Cucu-Dumitrescu 12-14 November 2014, ESRIN, Frascati, Italy
Main aspects to be presented: Status and challenges of Big Data from space in Europe; Specific non-technical problems encountered by companies of various scale in working with Big Data; ASRC introduction; Round table; Open discussion; Next steps and recommendations 2
ASRC: Founded in 2007; 27 employees; turnover of more than EUR 1 million in each of the last 2 years; the only Romanian company that has developed its own capabilities for analysis, processing and interpretation of optical and radar Earth Observation data; offers innovative solutions for environmental monitoring and risk assessment (flood risk analysis, drought early warning, deforestation evaluation etc.). Clients: European Space Agency, German Aerospace Center, World Bank, national public authorities, private companies 3
ASRC Four main development directions: Monitoring services based on satellite and in situ data processing for: natural hazards risks (drought, floods, landslides / earthquakes), mining, urban and wet zones, agriculture, forestry, critical infrastructure, CO2 storage areas Web based applications and platforms for data searching, downloading, management and processing Educational software development Complementary ground based data acquisition sensors (radar) for different monitoring applications and services Other activities: Visual Analytics tools and services for EO and linked data access Modeling Tool Design 4
Status and Challenges of Big Data from Space in Europe Big Data: Expanding on 3 fronts at an increasing rate Expansion of Big Data in the 3Vs representation (Diya Soubra: The 3Vs that define Big Data) 5
Existing European Data Repositories ESA Data Policy (ERS, Envisat, Earth Explorers): free datasets - free of charge, based on user registration and acceptance of ESA Terms & Conditions; restrained datasets - free of charge, based on user registration and submission of a Project (Full) Proposal and acceptance of ESA Terms & Conditions; after the project evaluation a quota will be assigned. ESA Third Party Missions (Data Policy of individual data providers): reproduction cost (e.g. ALOS)/specific restrictions to the use of data (limitations of quota, geographical restrictions, etc.). (www.analyticbridge.com) 6
Existing European Data Repositories Evolution of ESA's EO Data Archives between 1986-2010 and future projections* *Günther Kohlhammer: (Big?) Data and Earth Observation, H/EO Ground Segment and Missions Operations Department, Big Data from Space, 5-7 June 2013 7
Existing European Data Repositories
Current and past ESA missions (ESA mission: data volume)
Sentinel-1: huge, potentially up to 2.4 TB/day (with the two satellites)
Swarm: modest data volume
CryoSat: ~50 GB per day
SMOS: ~10 GB per day
Future ESA missions (ESA mission: data volume)
Sentinel 2: huge, potentially up to 1.6 TB/day (with the two satellites)
Sentinel 3: huge, potentially up to 2.2 TB/day (with the three satellites)
ADM-Aeolus: 5 TB over the entire mission
EarthCARE: Level 1: 100 GB/day
(ESA) 8
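To put the Sentinel figures in perspective, the following is a rough back-of-the-envelope sketch rather than an official ESA estimate. It assumes the quoted upper-bound rates are sustained every day of a 365-day year and that 1 PB = 1000 TB; the names used are illustrative only.

```python
# Upper-bound daily volumes for the Sentinel missions as quoted
# on the slide, in TB/day (full constellations).
SENTINEL_TB_PER_DAY = {
    "Sentinel-1": 2.4,
    "Sentinel-2": 1.6,
    "Sentinel-3": 2.2,
}

DAYS_PER_YEAR = 365  # assumption: peak rate sustained all year
TB_PER_PB = 1000     # assumption: decimal units


def daily_total_tb(rates):
    """Sum the per-mission daily volumes (TB/day)."""
    return sum(rates.values())


def yearly_total_pb(rates, days=DAYS_PER_YEAR):
    """Scale the daily total to one year and convert to PB."""
    return daily_total_tb(rates) * days / TB_PER_PB


if __name__ == "__main__":
    print(f"Daily total:  {daily_total_tb(SENTINEL_TB_PER_DAY):.1f} TB/day")
    print(f"Yearly total: {yearly_total_pb(SENTINEL_TB_PER_DAY):.2f} PB/year")
```

Under these assumptions the three Sentinel missions alone would produce a few petabytes per year, which is the scale that motivates the archive and dissemination questions discussed in the following slides.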
Application Areas These are the important areas identified by ESA Ground Segment (GS) as priorities*: Dissemination and on-demand processing (because needs are variable and depending on user demand); Secondary archive and re-processing (because needs are limited in time); Temporary resources for integration, testing and demonstration (because needs are limited in time); System sizing (because needs are unknown). *S. Loekken, J. Farres: ESA Earth Observation Big Data R&D Past, Present, & Future Activities, Ground Segment and Mission Operations Department, Earth Observation Programmes Directorate, March 2014 9
Services for ESA EO GS Cloud services have made significant progress. US companies lead the competition: offer very sophisticated and integrated services including user management and communication; intend to develop a business based on information extraction from merged datasets (EO data with other data). Microsoft is cooperating with the European research community, while Google is strongly approaching EO data holders at all levels to offer its services*. *ESA @ ASI: Big Data, IT Technology and their Impact on EO in Europe, November 2013 10
Services for ESA EO GS ESA is under pressure to reduce the cost of its EO Ground Segment. Currently no European service provider can offer the level of services the big players (all of them US companies) can. (ESA) *ESA @ ASI: Big Data, IT Technology and their Impact on EO in Europe, November 2013 11
User expectations
1. Open Data
a. All data are discoverable, accessible online and free
b. Data are arranged in long time series of coherent data from different providers
2. Open Computing
a. Users are able to perform processing directly on the cloud using virtual servers
b. Users can choose their preferred cloud provider
3. Open Source Software
a. All basic/platform software is open and freely available
b. Applications can be easily ported across clouds
4. Open Collaboration*
a. Data and applications can be easily shared with other users
(eoxserver.org/doc/en/users/index.html)
*A. Minchella on behalf of ESA EOPI Team: Access to ESA & ESA TPM EO Data, ESA Advanced Training Course in Land Remote Sensing, 2 July 2013, Athens, Greece 12
Specific Non-Technical Problems Encountered by Companies of Various Scale in Working with Big Data 13
System: The Need to Develop a European Big Data Ecosystem (EBDE) A business ecosystem is an economic community supported by a foundation of interacting organizations and individuals* *Moore, J.F. The Death of Competition: Leadership and Strategy in the Age of Business Ecosystems, HarperBusiness (1996) The Big Data Value Chain** ** Framing a European Partnership for a Big Data Value Ecosystem, version 1.4, Vision for a European Big Data Value Partnership, February 2014 14
The Dimensions of a Big Data Value Ecosystem* * Diya Soubra: The 3Vs that define Big Data, posted on July 5, 2012 15
System: The Need to Develop a European Big Data Ecosystem (EBDE) The European Partnership for Big Data Value (EP-BDV) has identified several areas where the Big Data Value contractual Public Private Partnership should focus its actions: broadening the availability and accessibility of data sources; assessing the economic value of data assets; developing Big Data technologies and tools to best support data-driven applications and business opportunities; developing data-driven applications and business models that provide measurable value to the involved players and address the lack of convincing use cases; testing and benchmarking technologies, applications, and business models; addressing the lack of skills and expertise; addressing the issues related to security and privacy and increasing the level of trust in data and data-driven applications. 16
System: The Need to Develop a European Big Data Ecosystem (EBDE) Intensive discussions with stakeholders have clearly shown that, besides technology and applications, many infrastructural, economic, social and legal issues will have to be addressed in an interdisciplinary fashion. Especially for SMEs, these issues are central to a fast take-up of the opportunities offered by Big Data Value*: skills and training; reliable legal frameworks; reference applications and access to an ecosystem. The signature of the Big Data Value Public Private Partnership (BDV PPP) took place on 13 October 2014 (http://www.bigdatavalue.eu/#sthash.vqkffyxw.x61p5wcg.dpuf) *NESSI: DRAFT European Big Data Value Strategic Research & Innovation Agenda, Version 0.7, April 2014 17
System: The Need to Develop a European Big Data Ecosystem (EBDE) There are a number of drivers encouraging the scaling up of the EBDE*: ensuring appropriate access to finance for big data companies; establishing an enabling business environment for data storage, data transfers and communication networks; supporting entrepreneurship, leading to the creation of start-ups and SMEs that offer big data analytics and decision making solutions; fostering administrative simplification to enable companies to submit information to a single public administration; supporting big data SMEs in their internationalization process, e.g. reimbursing young companies when they move to international markets; developing and promoting an education system able to answer the specific needs of big data companies. *Laurent Probst et al.: Big Data Analytics & Decision Making, Directorate-General for Enterprise and Industry, Directorate B Sustainable Growth and EU 2020, Unit B3 Innovation Policy for Growth, September 2013 18
Legal: Specific Legal Aspects for Space Big Data Big Data's increasing economic importance also raises a number of legal issues: ownership of data; data protection law; copyright; contractual and liability problems. Who owns a piece of data and what rights come attached with a dataset? What defines fair use of data? Who is responsible when an inaccurate piece of data leads to negative consequences? Such types of legal issues will need clarification, probably over time, to capture the full potential of big data*. *McKinsey Global Institute: Big data, the next frontier for innovation, competition and productivity, 2011 19
Business: Challenges for Companies in the Field of Big Data from Space Organizations capitalizing on Big Data differ from traditional data analysis environments in three ways*: 1. They pay attention to data flows as opposed to stocks. 2. They rely on data scientists and product and process developers rather than data analysts. 3. They are moving analytics away from the IT function and into core business, operational and production functions. *Davenport, T.H., Barth, P. and Bean, R.: How Big Data Is Different. MIT Sloan Management Review, July 2012 20
Business: Challenges for Companies in the Field of Big Data from Space Companies can have different positions in the Big Data Ecosystem*: Established User Enterprises Data Generators and Providers Technology Providers Collaborative networks *S. Loekken, J. Farres: ESA Earth Observation Big Data R&D Past, Present, & Future Activities, Ground Segment and Mission Operations Department, Earth Observation Programmes Directorate, March 2014 21
Skills: The Demand for Specialists Qualified for Working with Big Data In order to leverage the potential of Big Data, a key challenge for Europe is to ensure the availability of highly and rightly skilled people: Data Scientists - solid knowledge in statistical foundations and advanced data analysis methods combined with a thorough understanding of scalable data management, with the associated technical and implementation aspects; deliver novel algorithms and approaches for the Big Data Value stack in general, such as advanced learning algorithms, predictive analytics mechanisms, etc. Data Engineers - develop and exploit techniques, processes, tools and methods for developing applications that actually turn data into value; understand the domain and the business of the organizations; bring knowledge and work at the intersection of technology, application domains and business. In order to educate and train Data Engineers, novel courses and forms of training are required*. *ESA @ ASI: Big Data, IT Technology and their Impact on EO in Europe, Nov 2013 22
Discussion Topics
Q1. Which aspects do you consider most important to address in a future consistent strategy on big data (e.g. data access, harmonization at EU / international level, education, etc.)?
Q2. Do we have a proper view of the requirements? If not, who is supposed to generate the requirements?
Q3. Who is who - what roles for which type of organization? For example, should private organizations perform data archiving?
Q4. How will value-added service delivery change in the big data era?
Q5. Space exploration, worldwide, is carried out mostly by national governments, and the data are hosted on government servers*. How do you see the future of space open data? Could selling or outsourcing this data have significant repercussions on national security, especially in politically charged times such as these? If yes, what should be seen as an alternative solution?
Q6. Is Big Data, as a general domain, a USA-dominated playground?** European ICT companies lag 1 to 2 years behind the USA. European scientific institutions have also been relatively late to become involved in Big Data research. There are only a few Big Data technology suppliers in Europe, which is a cause for concern for the EU. How do you see the current situation and future evolution of this European lag in the particular domain of Space Big Data? Is collaboration with the US desirable? If yes, in which way?
*A. Santhanam: The Data Behind Deep Space Exploration, October 30, 2014, Dataconomy Media GmbH 2014
** Pierre-Yves DANET et al.: Big and Open data Position Paper, Networked and Electronic Media, December 2013 23
Discussion Topics
Q7. What are the current main challenges for SMEs in using satellite data to provide services?
Q8. Buy or build the needed technology? When your company is developing a Big Data project, how do the Manager and the Information Officer choose the right solution that confers a competitive advantage?* Which are the pluses and minuses to be considered?
Q9. Open source versus proprietary source business models. Most of the supporting tools and storage architectures are now Open Source (Hadoop, Hive, Spark, Shark, HBase, Riak, Titan, etc.), leveling the playing field for tool vendors in this field. Opting for the open source route, however, comes with its own set of difficulties. What model does your company use? What is the rationale?
Q10. Training the internal Big Data specialists. Have joint research & innovation projects between academia and your company proved to be a good way to foster knowledge exchange, thus delivering experience with cutting-edge technology? At what level should the formation of Big Data specialists be addressed: education (university level) or training?
Q11. Is Big Data really within the reach of SMEs? Are cheap commodity hardware and open source software, together with Big Data cloud solutions, enough to ensure that? If not, what is to be done?
AOB
Next steps/recommendations
Conclusions
* Nicole Laskowski: The big data architecture dilemma for CIOs, August 2014 24
Thank you for your attention! ASRC Dr. Florin Serban florin.serban@asrc.ro http://asrc.ro/ 25