SeaCloudDM: Massive Heterogeneous Sensor Data Management in the Internet of Things



Similar documents
ANIMATION a system for animation scene and contents creation, retrieval and display

CUBE INDEXING IMPLEMENTATION USING INTEGRATION OF SIDERA AND BERKELEY DB

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Monitoring and Mining Sensor Data in Cloud Computing Environments

The basic data mining algorithms introduced may be enhanced in a number of ways.

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Transforming the Telecoms Business using Big Data and Analytics

Middleware support for the Internet of Things

Topics in basic DBMS course

Multi-dimensional index structures Part I: motivation

SPATIAL DATA CLASSIFICATION AND DATA MINING

Oracle Database 11g Comparison Chart

Research of Smart Distribution Network Big Data Model

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

Time series IoT data ingestion into Cassandra using Kaa

Master s Program in Information Systems

Lesson 15 - Fill Cells Plugin

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method

Oracle Big Data SQL Technical Update

Cloud Computing and Advanced Relationship Analytics

RESEARCH ON THE FRAMEWORK OF SPATIO-TEMPORAL DATA WAREHOUSE

Distributed Database for Environmental Data Integration

Foundations of Business Intelligence: Databases and Information Management

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Find the Information That Matters. Visualize Your Data, Your Way. Scalable, Flexible, Global Enterprise Ready

ICT Perspectives on Big Data: Well Sorted Materials

Horizontal IoT Application Development using Semantic Web Technologies

Technologies & Applications

Contents RELATIONAL DATABASES

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Technical. Overview. ~ a ~ irods version 4.x

Internet of Things (IoT): A vision, architectural elements, and future directions

Professor, D.Sc. (Tech.) Eugene Kovshov MSTU «STANKIN», Moscow, Russia

GEOG 482/582 : GIS Data Management. Lesson 10: Enterprise GIS Data Management Strategies GEOG 482/582 / My Course / University of Washington

Complex Event Processing (CEP) Why and How. Richard Hallgren BUGS

How To Understand The History Of Navigation In French Marine Science

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Fast Innovation requires Fast IT

Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches

Design of Electric Energy Acquisition System on Hadoop

Software Requirements Specification. Schlumberger Scheduling Assistant. for. Version 0.2. Prepared by Design Team A. Rice University COMP410/539

CloudDB: A Data Store for all Sizes in the Cloud

Vector storage and access; algorithms in GIS. This is lecture 6

Integrating VoltDB with Hadoop

ECS 165A: Introduction to Database Systems

Sanjeev Kumar. contribute

COMP 5311: Topics in DB Research Comp 5311 Database Management Systems. 7. Topics in DB Research

Teradata s Big Data Technology Strategy & Roadmap

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

Distributed Computing and Big Data: Hadoop and MapReduce

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April ISSN

Big Data and Analytics in Government

ISSN: A Review: Image Retrieval Using Web Multimedia Mining

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

Clustering through Decision Tree Construction in Geology

The emergence of big data technology and analytics

SAP BOBJ. Participants will gain the detailed knowledge necessary to design a dashboard that can be used to facilitate the decision making process.

Exploiting Data at Rest and Data in Motion with a Big Data Platform

An Architecture Model of Sensor Information System Based on Cloud Computing

Optimization of Image Search from Photo Sharing Websites Using Personal Data

DATA MINING CONCEPTS AND TECHNIQUES. Marek Maurizio E-commerce, winter 2011

Reverse Engineering in Data Integration Software

Ramesh Bhashyam Teradata Fellow Teradata Corporation

Digital Modernization of Oilfields Digital Oilfield to Intelligent Oilfield. Karamay Hongyou Software Co., Ltd.

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

CHAPTER-24 Mining Spatial Databases

R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

Manufacturing and the Internet of Everything

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

Gaming as a Service. Prof. Victor C.M. Leung. The University of British Columbia, Canada

Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives

Course MIS. Foundations of Business Intelligence

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001

OBJECT RECOGNITION IN THE ANIMATION SYSTEM

iservdb The database closest to you IDEAS Institute

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

Indexing and Retrieval of Historical Aggregate Information about Moving Objects

USING COMPLEX EVENT PROCESSING TO MANAGE PATTERNS IN DISTRIBUTION NETWORKS

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

INTERNET OF THINGS Recent Advances and Applications MengChu Zhou, Tongji University and New Jersey Institute of Technology

Chapter 7. Using Hadoop Cluster and MapReduce

Transcription:

SeaCloudDM: Massive Heterogeneous Sensor Data Management in the Internet of Things Jiajie Xu Institute of Software, Chinese Academy of Sciences (ISCAS) 2012-05-15

Outline 1. Challenges in IoT Data Management 2. SeaCloudDM: Architectures and Solutions 3. Conclusion

Outline 1. Research Bac 2. SeaCloudDM: Architecture and Solutions 3. Conclusion

Background Internet of Things: The Internet of Things (IoT) refers to uniquely identifiable objects (things) and their virtual representations in an Internet-like structure. IoT-CloudDB Query Processing Applications Universal IoT data management platform intelligent Acquire Store Process Use

Challenges and Limitations 1 An IoT system may contain various kinds of sensors whose sampling data have heterogeneous data structures (heterogeneity) For instance, traffic sensors: GPS sensors RFID readers video-based traffic-flow analysis sensors, traffic loop sensors road condition sensors...

Challenges and Limitations 2 The data to be managed in IoT are massive, dynamic data Multimedia Data from video cameras, telemetric devices, and other similar devices; Moving Objects Databases Sensor data are sampled frequently, resulting in large data size; Sensor data are dynamically change data streams with arriving of new data and deleting of old data.

Challenges and Limitations 3 Spatial-temporal attribute is intrinsic for IoT data (Spatial- Temporal Logic) Queries can not be answered through keyword match with SensorID. Instead, through Spatial-Temporal Constrains; Queries can not be answered through keyword match with Time-stamps. Instead, through Spatial-Temporal interpolations.

Outline 1. Research Background 2. SeaCloudDM: Architectures and So 3. Conclusion

Architecture of Sea-Cloud-based massive heterogeneous sensor Data Management (SeaCloudDM) framework Data Analysis & Application Layer Cloud Data Management Layer Sensor Deployment Layer Sea-Computing Layer OLAP, Statistical Analysis & Data Mining based on Massive Sensor Data Information Retrieval and Intelligent Recommendation IoT Applications (ITS, Smart-Grid, Emergency Management ) RDB-KV Cloud: Relational-DB and Key-Value Combined Cloud Storage RDB-KV DB: Unified DB for Heterogeneous Sensor Data Management Sea-Computing: Sensor Connection & Raw Sampling Data Processing Sea-Cloud-based Cooperative Data Management Mechanism Sea-Computing Node Sea-Computing Node Sea-Computing Node Sea-Computing Node Sea-Computing Node Sea-Computing Node traffic sensors hydrological sensors geological sensors video analysis sensors telemetric analysis sensors moving video analysis sensors

1. Sea-Data Processing Mechanism Key samplings (numerical) Heterogeneous input from various kinds of sensors, standardized (uniform) output data to Cloud Data management layer ; SamplingValue = (t, (x, y), npos, schema, value) SamplingSequence=(schema, (t, ((x, y), npos, value, flag))) Derived Numerical Values Multimedia Analysis Extraction of Key Samplings Numerical Samplings RFID Samplings Multimedia data analysis: multimedia data numerical data; Key samplings identification and extraction Multimedia Samplings State-threshold based extraction method.

2. Cloud-Data Management Mechanism Relational DataBase and Key-Value store combined (RDB-KV) model Homogeneous database nodes for heterogeneous sensor sampling data (both historical and present); DBMS-kernel based data types, operators and indices for sensor data management, supporting both SQL and keyword search; Sensor Sampling Sequence: for data stream and for spatialtemporal interpolation query processing. RDB-KV Cloud SQL / ST-Logic Keyword Search

2. Cloud-Data Management Mechanism (Cont d) RDB-KV Database for Uniformed Storage of Heterogeneous Sensor Data Example: Data Types, Operators for sensor sampling data Operators Data Types Spatial-Temporal Queries Keyword Queries

2. Cloud-Data Management Mechanism (Cont d) Sensor-Sampling-Sequence Spatial-Temporal Tree (S4T-Tree) Spatial R-Tree Index static objects Grid-Sketched Spatial-Temporal R-Tree Index moving objects T Grid Cell MBR 1 MBR 2 Sketched Trajectory Original Trajectory 2 1 1 Y 1 2 3 4 1 2 3 X Original Trajectory Unit MBR 3 MBR 4 MBR 5 MBR 6 stu 1 stu 3 stu 2 stu 4 stu 8 stu 5 stu 9 stu n Maximum Bounding Rectangle (MBR) Sketched Trajectory Unit (STUs) RDB + KV Database Engine: Keyword-Tuple Reverted-File Index keysearch Operator

2. Cloud-Data Management Mechanism (Cont d) RDB-KV Cloud for Management of Massive Sensor Sampling Data Geo-based Data distribution: Sampling sequence truncation; Home (registered) location based distribution (Hilbert Curve Symbolization). Global Indices: Spatial-Temporal, Keyword index master 根 Root 结 点 Node master Keyword Range Partition Table t site 1 site 2 site 3 site 4 叶 Leaf 结 点 Nodes site 1 site 2 site 3 site 4 叶 Leaf 结 点 Node s 的 管 辖 Service 区 域 Area KeywordsiteID Mapping B+ Tree α(site 1 ) α(site 2 ) α(site 4 ) α(site 3 ) IoTData Table FTKB+ - Tree

Outline 1. Research Background 2. SeaCloudDM: Architectures and Solutions

Conclusion SeaCloudDM framework with a sea-cloud-based cooperative data management mechanism is proposed; Sea (70%~90%) + Cloud (10%~30%) Data Management A RDB-KV database model is proposed, with related data types, operators, and indices defined; manages heterogeneous sensor sampling data in a uniformed manner A RDB-KV cloud data management model is proposed to manage massive sensor sampling data. combines the advantages of both relational databases and key-value stores

Thanks!