The Matsu Wheel: A Cloud-based Scanning Framework for Analyzing Large Volumes of Hyperspectral Data



Similar documents
Best PracDces for Building and Deploying PredicDve Models Over Big Data. Module 12: Case Study Matsu

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

High Productivity Data Processing Analytics Methods with Applications

A Namibia Early Flood Warning System A CEOS Pilot Project

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

A Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud

A New Cloud-based Deployment of Image Analysis Functionality

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

VITO Centre of Image Processing

International Journal of Engineering Research ISSN: & Management Technology November-2015 Volume 2, Issue-6

BIG DATA What it is and how to use?

Recognization of Satellite Images of Large Scale Data Based On Map- Reduce Framework

CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH

COASTAL MONITORING & OBSERVATIONS LESSON PLAN Do You Have Change?

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

Microsoft Research Windows Azure for Research Training

DATA ANALYTICS SERVICES. G-CLOUD SERVICE DEFINITION.

Leveraging Big Data Technologies to Support Research in Unstructured Data Analytics

Microsoft Research Microsoft Azure for Research Training

Cloud-based Geospatial Data services and analysis

Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis

Hadoop on Windows Azure: Hive vs. JavaScript for Processing Big Data

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture

HiBench Installation. Sunil Raiyani, Jayam Modi

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Silviu Panica, Marian Neagul, Daniela Zaharie and Dana Petcu (Romania)

Joint Polar Satellite System (JPSS)

How Companies are! Using Spark

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012

TerraColor White Paper

Microsoft Big Data Solutions. Anar Taghiyev P-TSP

NASA s Big Data Challenges in Climate Science

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

The basic data mining algorithms introduced may be enhanced in a number of ways.

Assignment # 1 (Cloud Computing Security)

Scalable Developments for Big Data Analytics in Remote Sensing

Active and Passive Microwave Remote Sensing

Chase Wu New Jersey Ins0tute of Technology

HDP Hadoop From concept to deployment.

Where is... How do I get to...

FREE computing using Amazon EC2

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Classification Techniques in Remote Sensing Research using Smart Data Analytics

Data Semantics Aware Cloud for High Performance Analytics

Hadoop Architecture. Part 1

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

A Study of Data Management Technology for Handling Big Data

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Comprehensive Analytics on the Hortonworks Data Platform

Big Data Explained. An introduction to Big Data Science.

E6893 Big Data Analytics Lecture 2: Big Data Analytics Platforms

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

The USGS Landsat Big Data Challenge

VOL. 5, NO. 2, August 2015 ISSN ARPN Journal of Systems and Software AJSS Journal. All rights reserved

Supervised Classification workflow in ENVI 4.8 using WorldView-2 imagery

Big Data Spatial Analytics An Introduction

THE STATE OF GEO BIG DATA IN OPEN SOURCE. Rob Emanuele

Cloud Scale Distributed Data Storage. Jürmo Mehine

Ubuntu and Hadoop: the perfect match

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Understanding Big Data Analytics Applications in Earth Science Morris Riedel, Rahul Ramachandran/Kuo Kwo-Sen, Peter Baumann Big Data Analytics

ENVI THE PREMIER SOFTWARE FOR EXTRACTING INFORMATION FROM GEOSPATIAL IMAGERY.

Mambo Running Analytics on Enterprise Storage

Origins, Evolution, and Future Directions of MATLAB Loren Shure

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

The Challenges of Geospatial Analytics in the Era of Big Data

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

QLIKVIEW AND BIG DATA

Moving From Hadoop to Spark

COMP9321 Web Application Engineering

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control

The premier software for extracting information from geospatial imagery.

McIDAS-V - A powerful data analysis and visualization tool for multi and hyperspectral environmental satellite data

Workshop on Hadoop with Big Data

The Inside Scoop on Hadoop

The 4 Pillars of Technosoft s Big Data Practice

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015

Business Intelligence for Big Data

Machine Learning for Fraud Detection

BIG DATA CHALLENGES AND PERSPECTIVES

Transcription:

The Matsu Wheel: A Cloud-based Scanning Framework for Analyzing Large Volumes of Hyperspectral Data Maria Patterson, PhD Open Science Data Cloud Center for Data Intensive Science (CDIS) University of Chicago HyspIRI Symposium, 5 June, 2014

The Open Science Data Cloud (OSDC) is an open-source, cloud-based infrastructure that allows scientists to manage, share, and analyze medium to large size scientific datasets. Application for resources available to anyone doing scientific research: www.opensciencedatacloud.org

User view: 1) login

User view: 2) launch virtual machine

User view: 3) run analysis

Project Matsu Joint effort between the Open Cloud Consortium (lead, Robert Grossman) and NASA (lead, Dan Mandl) to develop open source technology for cloud-based processing of satellite imagery to support earth sciences. The OSDC is used to process Earth Observing 1 (EO-1) satellite imagery from the Advanced Land Imager and the Hyperion instruments and to make this data available to interested users. Namibia flood dashboard, WCPS Hadoop-based Matsu Wheel scanning data algorithm

Matsu Analytic Wheel Earth Observing-1 New data observed by EO-1 and downloaded to NASA NASA Goddard Space Flight Center NASA images sent to OSDC Public Data Commons cloud for permanent storage OSDC Public Data Commons (GlusterFS) Data read into HDFS only once HDFS Metadata stored Wheel analytics run over data using MapReduce Additional analytics plug in easily contours + clusters rare pixel finder report generators spectral blobs NoSql Database (Accumulo) Analytic results stored supervised classifier Analytic reports generated by Wheel are accessible via web browser Secondary analysis can be done from analytic database

Matsu Analytic Wheel Earth Observing-1 New data observed by EO-1 and downloaded to NASA NASA Goddard Space Flight Center NASA images sent to OSDC Public Data Commons cloud for permanent storage OSDC Public Data Commons (GlusterFS) Data read into HDFS only once HDFS The Wheel watches for new data to become Additional Metadata stored analytics plug in easily available, using Apache Storm. Wheel analytics run over data using MapReduce contours + clusters supervised classifier NoSql Database The Wheel analytics (Accumulo) run each night, daily reports Analytic results stored rare pixel finder report spectral When new data are detected, loaded into generators Hadoop s blobs distributed file system for analysis using MapReduce. available the morning after data are received. Analytic reports generated by Wheel are accessible via web browser Secondary analysis can be done from analytic database

Matsu Analytic Wheel Earth Observing-1 New data observed by EO-1 and downloaded to NASA NASA Goddard Space Flight Center NASA images sent to OSDC Public Data Commons cloud for permanent storage OSDC Public Data Commons (GlusterFS) Data read into HDFS only once HDFS The Wheel is efficient for processing large volumes of data with many types of analysis by simply requiring a common input format. Metadata stored NoSql Database (Accumulo) Wheel analytics run over data using MapReduce Additional analytics plug in easily report generators Analytic results stored contours + clusters supervised classifier rare pixel finder spectral blobs Analytic reports generated by Wheel are accessible via web browser Secondary analysis can be done from analytic database

Matsu Analytic Wheel Earth Observing-1 New data observed by EO-1 and downloaded to NASA NASA Goddard Space Flight Center NASA images sent to OSDC Public Data Commons cloud for permanent storage OSDC Public Data Commons (GlusterFS) Data read into HDFS only once HDFS Metadata stored Wheel analytics run over data using MapReduce Additional analytics plug in easily contours + clusters rare pixel finder report generators spectral blobs NoSql Database (Accumulo) Analytic results stored supervised classifier Analytic reports generated by Wheel are accessible via web browser Secondary analysis can be done from analytic database

Matsu Wheel Daily Reports matsu-analytics.opensciencedatacloud.org

Matsu Wheel Daily Reports

Matsu Wheel Daily Reports

Matsu Wheel Daily Reports

Matsu Wheel Daily Reports

Matsu Wheel Daily Reports

Matsu Wheel Daily Reports

Matsu Wheel is open source github.com/opencloudconsortium/matsu-project

New wheel analytic (beta): Support Vector Machine (SVM) classifier A supervised machine learning classification algorithm Train the classifier by hand classifying areas in a set of training images Beta classifier has 4 classes: clouds, dry land, vegetation, water

New wheel analytic (beta): Support Vector Machine (SVM) classifier

Continuing work SVM classifier adapt regionally to geographic area (classes depend on geography) Incorporate SVM classifier into Matsu Wheel Additional wheel analytics Web Map Service and tiling using Geoserver Add additional data to the Wheel What you can do Make your data available to Project Matsu Port your analysis tools and applications Use the Matsu cloud to facilitate making discoveries that require integrating multiple large datasets Contribute a Wheel analytic