More Than A Buzzword: Big Data in the Environmental Arena




More Than A Buzzword: Big Data in the Environmental Arena. 2015 National Environmental Monitoring Conference, July 15, 2015. Brooke Roecker, Senior Environmental Data Analyst; Mark Packard, PG, CPG, President/CEO.

Presentation Outline: Big Data Defined; Environmental Data Past & Present; Today's Tools and Approaches; Example Projects; Future Considerations.

Big Data in Environmental Monitoring/Remediation. Big Data Defined (yet again): "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." Laney, Douglas. "The Importance of 'Big Data': A Definition". Gartner. Retrieved 21 June 2012.

Velocity: high-frequency data. Variety: mixed data/attributes. Volume: very large datasets. VERACITY: accuracy of data. Source: Diya Soubra, The 3Vs that define Big Data. http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data

Where have we come from? Handwritten field logs; text files; spreadsheets; simple reports.

Where are we today? Older technologies remain; database storage; out-of-the-box storage/analysis tools: EQuIS, ENFOS, Locus, Project Portal. Limitations? What data are you not managing/analyzing?

Bigger data tools available today. High-frequency advancements: EQuIS Live; Project Portal Analytics Module. Analysis and modeling tools. Spatial: ArcGIS, EVS. Visualization: Tableau. Custom scripting (R, Python, T-SQL); SSAS, Weka; Matplotlib (Python), ggplot2 (R).

Project Examples

Project: Surface Water Monitoring. High-velocity data at 5-minute intervals; Teledyne ISCO samplers; historical data archived in raw MS Excel files. Challenge: dataset too large for big-picture trends; storing/archiving data long-term; centralized access for project team.

Project: Surface Water Monitoring. Solution: streamlined data resampling and import routine; resample from 5-minute to 12-hour averages (or totals). [Figure: raw data vs. resampled data]
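The resampling step on this slide can be illustrated with a minimal Python sketch. The slides do not show the actual import routine, so everything here is an assumption: the function name `resample`, the (timestamp, value) input layout, and the choice to align 12-hour buckets to 00:00 and 12:00. The sample data is synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Fixed origin so that 12-hour buckets start at 00:00 and 12:00 (an assumption).
EPOCH = datetime(1970, 1, 1)


def resample(readings, bucket=timedelta(hours=12)):
    """Group (timestamp, value) readings into fixed-width time buckets.

    Returns a list of (bucket_start, mean, total) tuples sorted by time.
    """
    groups = defaultdict(list)
    for ts, value in readings:
        # Integer number of whole buckets since EPOCH gives the bucket start.
        bucket_start = EPOCH + ((ts - EPOCH) // bucket) * bucket
        groups[bucket_start].append(value)
    return [
        (bucket_start, sum(values) / len(values), sum(values))
        for bucket_start, values in sorted(groups.items())
    ]


# Synthetic example: one day of 5-minute readings (288 values).
day_start = datetime(2015, 7, 15)
raw = [(day_start + timedelta(minutes=5 * i), float(i % 12)) for i in range(288)]
resampled = resample(raw)

for bucket_start, avg, total in resampled:
    print(bucket_start.isoformat(), avg, total)
```

Averages suit level-type parameters (stage, temperature), while totals suit accumulated quantities (rainfall, flow volume), which is presumably why the slide mentions both.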

Project: Surface Water Monitoring. Solution: data available to project team via Project Portal; Environmental Database module for resampled data; Documents module for raw data.

Project: RAD Site Monitoring. Challenge: high-volume, high-velocity sensor data with telemetry; automated database storage; visual analysis of high-volume weather data.

Project: RAD Site Monitoring. Solution: automated data import via web upload; database available to field staff via Project Portal; wind rose graphics to visualize data via EnviroInsite; QA of erroneous data points (particulates).
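A wind rose is built from a frequency table of wind direction by compass sector. The sketch below shows one way such a table could be computed, with a simple range check standing in for the QA step. It is not EnviroInsite's method; the 16-sector layout, the function names, and the `max_speed` limit are illustrative assumptions.

```python
from collections import Counter

# 16 compass sectors, each 22.5 degrees wide and centred on its bearing.
SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector(direction_deg):
    """Map a wind direction in degrees to its 16-point compass sector."""
    return SECTORS[int((direction_deg % 360 + 11.25) // 22.5) % 16]


def wind_rose_table(observations, max_speed=75.0):
    """Return (percent of valid observations per sector, rejected rows).

    observations: iterable of (direction_deg, speed) pairs.
    max_speed: hypothetical QA limit; out-of-range rows are set aside.
    """
    counts = Counter()
    rejected = []
    for direction, speed in observations:
        if not (0 <= direction <= 360) or not (0 <= speed <= max_speed):
            rejected.append((direction, speed))
            continue
        counts[sector(direction)] += 1
    total = sum(counts.values())
    table = {name: (100.0 * counts[name] / total if total else 0.0)
             for name in SECTORS}
    return table, rejected


# Synthetic observations, two of which fail the range check.
obs = [(0, 3.0), (350, 4.0), (90, 2.5), (180, -1.0), (400, 2.0)]
table, rejected = wind_rose_table(obs)
print({k: round(v, 1) for k, v in table.items() if v})
print("rejected:", rejected)
```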

Project: Mine Tunnel Monitoring. High-volume, high-velocity data: on-site sensor data from PLC system; public big data streams. Challenge: centralized database storage; real-time data access; real-time notifications/alarms.

Project: Mine Tunnel Monitoring. Solution: generic database design optimized for high-frequency data; predictive trend modeling calculations; data available via Project Portal; email alerts when incoming data parameters are out of spec.
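The out-of-spec alert can be sketched as a limit check followed by composing a message. The slides do not describe how the project's alerts were implemented, so the parameter names, limits, addresses, and function names below are hypothetical. The sketch builds the email with Python's standard `email.message.EmailMessage` but does not send it.

```python
from email.message import EmailMessage

# Hypothetical specification limits: parameter -> (low, high).
SPEC_LIMITS = {"flow_gpm": (5.0, 50.0), "ph": (6.0, 9.0)}


def out_of_spec(reading, limits=SPEC_LIMITS):
    """Return (name, value, low, high) for each parameter outside its limits."""
    violations = []
    for name, (low, high) in limits.items():
        value = reading.get(name)
        if value is not None and not (low <= value <= high):
            violations.append((name, value, low, high))
    return violations


def build_alert(reading, violations, sender, recipient):
    """Compose (but do not send) an alert email listing the violations."""
    msg = EmailMessage()
    msg["Subject"] = f"Out-of-spec reading at {reading['timestamp']}"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [f"{name} = {value} (allowed {low} to {high})"
             for name, value, low, high in violations]
    msg.set_content("\n".join(lines))
    return msg


# Synthetic incoming reading with one parameter out of range.
reading = {"timestamp": "2015-07-15T08:05", "flow_gpm": 62.0, "ph": 7.1}
violations = out_of_spec(reading)
if violations:
    alert = build_alert(reading, violations,
                        "monitor@example.com", "team@example.com")
    print(alert["Subject"])
    print(alert.get_content())
    # Sending would use smtplib.SMTP(...).send_message(alert) with a real server.
```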

Project: O&M Site Monitoring. High-velocity, automated SVE and GW treatment systems. [Figure: Groundwater Treatment System] Challenge: centralized storage; centralized monitoring; system troubleshooting.

Project: O&M Site Monitoring System. Solution: multivariate analysis to review system variables; secondary analysis to identify fluctuations.
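The slides do not say which multivariate or secondary analyses were used. As one possible reading, the sketch below computes pairwise Pearson correlations between system variables and then flags readings that depart sharply from the preceding window. The variable names, window size, threshold, and data are all illustrative assumptions.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (neither constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """columns: dict of name -> series. Returns {(a, b): correlation}."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b])
            for a in names for b in names}


def flag_fluctuations(values, window=5, threshold=3.0):
    """Indices whose value lies more than `threshold` standard deviations
    from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(values)):
        prior = values[i - window:i]
        sd = pstdev(prior)
        if sd > 0 and abs(values[i] - mean(prior)) > threshold * sd:
            flagged.append(i)
    return flagged


# Synthetic treatment-system variables (hypothetical names and units).
vacuum = [10, 11, 10, 11, 10, 11, 10, 25, 10, 11]
flow = [2 * v + 1 for v in vacuum]
corr = correlation_matrix({"vacuum": vacuum, "flow": flow})
spikes = flag_fluctuations(vacuum)
print(round(corr[("vacuum", "flow")], 3), spikes)
```

Strongly correlated variables can be reviewed together, and flagged indices point to the timestamps worth troubleshooting first.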

What's Next? Use of emerging technologies. Distributed data sourcing: Hadoop HDFS, NoSQL. Distributed processing: batch processing (MapReduce, Apache Hive); real-time processing/streaming (Cloudera Impala, Apache S4); PolyBase (cross-querying HDFS and SQL Server).
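For readers unfamiliar with the batch model named above, the MapReduce pattern can be sketched in a single Python process: a map step emits key/value pairs, a shuffle groups them by key, and a reduce step aggregates each group. This is a conceptual illustration only and does not use the Hadoop API. The record layout, station names, and function names are assumptions.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (station, value) for one (station, timestamp, value) record."""
    station, _timestamp, value = record
    yield station, value


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Aggregate one key's values; here, the maximum reading per station."""
    return key, max(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


# Synthetic sensor records (hypothetical station IDs).
records = [("MW-1", "t1", 4.2), ("MW-2", "t1", 1.1), ("MW-1", "t2", 5.0)]
result = run_job(records)
print(result)
```

In a real cluster the map and reduce steps run in parallel on many nodes over data stored in HDFS; the structure of the computation is the same.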

Summary / Key Takeaways: Free big data: go out and use it! Big data to investigate the unknown. Greater project intelligence & decision making. linkedin.com/pulse/auto-industry-bored-big-data-brian-pasch

Thank you! Contributors: Myles Hook, ddms; Jon Turner, ddms; Heidi Gaedy, ddms; Ed Larson, ddms; Angela Remer, ddms; Emily Mulford, EarthSoft Inc. Brooke Roecker, BRoecker@ddmsinc.com; Mark Packard, PG, CPG, MPackard@ddmsinc.com