The Digicene: the Age of Big Data in the Geosciences



Similar documents
National Geothermal Data System and Global Geosciences Data Integration

Geothermal Technologies Program

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 ST CENTURY SCIENCE, ENGINEERING, AND EDUCATION (CIF21)

Data Intensive Research Initiative for South Africa (DIRISA)

ODIP: Establishing and operating an Ocean Data Interoperability Platform

DATA STEWARDSHIP from a geoscience and academic perspective

Doing Multidisciplinary Research in Data Science

Directorate for Geosciences

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21)

The Next Generation Science Standards (NGSS) Correlation to. EarthComm, Second Edition. Project-Based Space and Earth System Science

BIG DATA & DATA SCIENCE

Department of Geology

Collecting and Analyzing Big Data for O&G Exploration and Production Applications October 15, 2013 G&G Technology Seminar

Databases & Data Infrastructure. Kerstin Lehnert

Pan-European infrastructure for management of marine and ocean geological and geophysical data

History of the Military Geology Branch of the U.S. Geological Survey. Joseph M. Duracinsky

Infosys Oil and Gas Practice

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Drilling into Science: A Hands-on Cooperative Learning Oil Exploration Activity designed for Middle School and High School Students

Delivering Smart Answers!

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 ST CENTURY SCIENCE, ENGINEERING, AND EDUCATION (CIF21) $100,070,000 -$32,350,000 / %

Good morning. It is a pleasure to be with you here today to talk about the value and promise of Big Data.

A CONTENT STANDARD IS NOT MET UNLESS APPLICABLE CHARACTERISTICS OF SCIENCE ARE ALSO ADDRESSED AT THE SAME TIME.

IBM Big Data in Government

National Higher Education & Workforce Initiative Regional Economic Growth Through High skill, High demand Workforce Development

Answer Keys to Unit Tests

Proposal for New Program: Minor in Data Science: Computational Analytics

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER

Geoscientists follow paths of exploration and discovery in quest of solutions to some of society's most challenging problems.

354 Russell Senate Office Building 724 Hart Senate Office Building Washington, D.C Washington, D.C

Geoparks: Creating a Vision for North America

NASA Earth Science Research in Data and Computational Science Technologies Report of the ESTO/AIST Big Data Study Roadmap Team September 2015

The following was presented at DMT 14 (June 1-4, 2014, Newark, DE).

SURVEY REPORT DATA SCIENCE SOCIETY 2014

A New Era Of Analytic

The Challenges of Integrating Structured and Unstructured Data

Insight for Informed Decisions

Data Management Considerations for the Data Life Cycle

International Data Sharing Framework

Impact of Big Data in Oil & Gas Industry. Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India.

National Big Data R&D Initiative

Workprogramme

Paradigm High Tech and Innovative Software Solutions for the Oil and Gas Industry

African European Georesources Observation System

Oil Gas expo 2015 is comprised of 13 Main tracks and 131 sub tracks designed to offer comprehensive sessions that address current issues.

HDP Hadoop From concept to deployment.

DATA SCIENTIST TRAINING FOR LIBRARIANS #DST4L. C. Erdmann Designing Libraries

Microsoft - Oil and Gas

The Master in Geology

Big Data & Coal. How big data can help Coal Geology, Exploration and Resource Evaluation in a downturn 12/07/2013

KNOWLEDGE MANAGEMENT AT ECOPETROL

Exploiting Prestack Seismic from Data Store to Desktop

College of Agriculture, Engineering and Science INSPIRING GREATNESS

Analytics Centre of Excellence: Roles, Responsibilities and Challenges

Big Data in the context of Preservation and Value Adding

Empowering the Masses with Analytics

GEOG 482/582 : GIS Data Management. Lesson 10: Enterprise GIS Data Management Strategies GEOG 482/582 / My Course / University of Washington

BOOSTING THE COMMERCIAL RETURNS FROM RESEARCH

Using Google Earth to Explore Plate Tectonics

VMworld 2015 Track Names and Descriptions

UNH Strategic Technology Plan

Sharing the experiences of teaching business analytics in a University course

Are You Big Data Ready?

LIMITED RESOURCES: "A SHORTAGE IN THE SEA" QUESTION Are the things that we use from the ocean unlimited? Can we run out?

Iterative Database Design Challenges and Solutions for a Geomechanics Database

Organic Data Publishing: A Novel Approach to Scientific Data Sharing

IBM - Fueling the Oil & Gas Industry

Information Technology Career Path Overview

Transcription:

The Digicene: the Age of Big Data in the Geosciences Lee Allison Arizona Geological Survey USGIN Foundation, Inc. Earth Data Science in the Era of Big Data and Compute April 30, 2015

A system that works for one geologist, in the field and at their desk and for users of streaming, High Performance Computing

Data Long tail of science Pb Tb ~20% of earth scientists use/need HPC capabilities; 80% rely dominantly on desktop software and data sets they collect themselves Gb Mb Mainstream Scientists

Big Data: Data Integration ( interoperability ) Capabilities for moving data between data warehousing, business analytics, master data management, enterprise applications, and custom applications. Integrating structured data in relational databases with social media data, weblogs, and various unstructured data Dain Hansen, Oracle Corp 2012

Big Data is not just large data sets Digitizing, organizing, analyzing, modeling, visualizing [small] data from disparate sources

National Geothermal Data System Oregon Institute of Technology Geo Heat Center Boise State Free online access to: Stanford Reservoir Engineering AZGS: 50 State Geological Surveys U of Nevada Reno University of Utah Energy & Geosciences Institute U.S. DOE Geothermal Data Repository hosted at NREL U.S. Geological Survey Maps, data, & documents from 65+ providers nationwide Southern Methodist University Distributed network >10 million data records >3 million oil & gas wells >47,000 maps & reports Powered by USGIN

For mainstream science Not centralized big iron, but decentralized data wrangling Not one ring to rule them all but small pieces loosely joined Mass democratisation of the means of access, storage & processing of data A distributed ecosystem of information, an ecosystem of small data Distributed models not centralized ones Collaboration not control, and small data not big data Rufus Pollock Open Knowledge Foundation April 2013

Small data culture Big data environment Barriers to discovery, access, and integration of data have shaped the scientific practice since it s beginnings Most of us work in small data because there is no viable alternative Tackling interdisciplinary/multidisciplinary problems was not realistic Many large geoscience databases are not really big data Open data access and interoperability are changing the paradigm Digitization, integration, and modeling are creating unprecedented opportunities and challenges The new generation of geoscientists increasingly will need data science skills

Turning legacy data.

.into this NERC 2008

Creating common standards and protocols Engaging the vast number of distributed data resources Establishing practices for recognition of and respect for intellectual property Developing simple data and resource discovery and access systems Building mechanisms to encourage development of web service tools and workflows for data analysis Brokering the diverse disciplinary service buses Creating sustainable business models for maintenance and evolution of information resources Integrating the data management life-cycle into the practice of science

Big Data and the Geoscience Profession There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions. McKinsey Global Institute, 2011

Global Cyberinfrastucture and Data Management Initiatives Environment Earth Observation Geoscience Marine Petroleum/ Energy All FOCUS EarthCube (US only) Earth Science Information Partners (ESIP) (US) GeoSeas (EU) Renewable Energy Agency US Inspire (EU only) National/ Continental IGSN (EU, US, AUS) ODIP Oceans Data Interoperability Platform (US, EU, AUS) CoopEUS (EU, US) Cross Continental Belmont e- Infrastructure GEO/GEOSS Geoscience Information Council OneGeology Energistics National Data Repositories CODATA ICSU World data System Research Data Alliance Global USGIN involvement Ones GA is actively involved in Ones GA follows Ones mentioned in OneGeology agenda Ones not on OneGeology Agenda Geoscience Australia

Geothermal Petroleum Mining Geological Surveys Academia Professional Organizations National Geothermal Data System (NGDS) IRENA Global Atlas African Rift Geotherm al (ARGeo) Standards Leadership Council National Data Repositories (NDR) African Minerals Geoscience Initiative OneGeology USGIN USGS Community for Data Integration EarthCube Belmont Forum Intl Geo Sample Numbering (IGSN) IUGS CGI GeoSciML Coalition for Publishing Data in the Earth and Space Sciences Geoscience Cyberinfrastructure

Everything digital, online, & interoperable Productivity increase Enabling inter- and multidisciplinary research Enabling unheard-of analytical, computational, visualization, and modeling

Digitization Open access Integration/Interoperability Modeling, visualization, analytical capacity Capacity building Social construct

GLOBAL CONVERGENCE IN GEOSCIENCE CYBERINFRASTRUCTURE DAWN OF THE DIGICENE