1 Cloud computing based big data ecosystem and requirements
Yongshun Cai (蔡永顺), Associate Rapporteur of ITU-T SG13 Q17, China Telecom
Dong Wang (王东), Rapporteur of ITU-T SG13 Q18, ZTE Corporation
2 Agenda
o ITU-T progress with big data
o Big data characteristics and challenges
o Cloud computing based big data ecosystem
o Use cases of big data applications & services
o Requirements and capabilities of cloud computing based big data
o Conclusion
3 ITU-T progress with big data
2013, July:
o Initiation of 1st big data working item Y.Bigdata-reqts (Requirements and capabilities for cloud computing based big data) by ITU-T SG13 Q17
  o Overview of cloud computing based big data; big data system context and its activities; cloud computing based big data requirements, capabilities, use cases and scenarios
  o Targeted to be consented in July
November:
o ITU-T watch report "Big Data: Big today, normal tomorrow" by TSAG
2014, July:
o Initiation of 2nd big data working item Y.IoT-BigData-reqts (Specific requirements and capabilities of the Internet of Things for Big Data) by ITU-T SG13 Q2
2015, April:
o Initiation of further big data working items:
  o Y.BigDataEX-reqts (Big Data Exchange Requirements and Framework for Telecom Big Data)
  o Y.Suppl.BigData-RoadMap (Supplement on Big Data Standardization Roadmap)
  o Y.BDaaS-arch (Functional architecture of Big Data as a Service)
4 Big data definition and characteristics
Definition
o Big data: a category of technologies and services in which capabilities are provided to collect, store, search, share, analyse and visualize data that have the characteristics of high volume, high velocity and high variety.
Characteristics
o Volume (large): refers to the amount of data collected, stored, analysed and visualized, which requires big data technologies to handle.
o Variety (diverse): refers to the different data types and data formats that are handled by big data applications and platforms.
o Velocity (high): refers to both how fast the data is collected and how fast it is processed to deliver the expected results.
5 Big data challenges
Heterogeneity and incompleteness
o Data processed using big data technologies can miss some attributes, and noise can be introduced during data transmission. Even after data cleaning and error correction, some incompleteness and errors in the data are likely to remain.
Scale
o Managing large and rapidly increasing volumes of data is a challenging issue for data processing. In the past, the data scale challenge was mitigated by the evolution of processing and storage resources.
Timeliness
o The acquisition rate and timeliness needed to find, within a limited time, the elements of a large data set that meet a specified criterion are new challenges faced by data processing.
Privacy
o Data about human individuals, such as demographic information, internet activities, commuting patterns, social interactions, and energy or water consumption, are being collected and analysed for different purposes.
6 Cloud computing based big data ecosystem
The roles in cloud computing can support the three main roles of big data as follows:
o CSN (cloud service partner): data provider
o CSP (cloud service provider): big data application provider
o CSP (cloud service provider): big data infrastructure provider
o CSC (cloud service customer): big data service customer
7 Use cases 1: Personalization and customized service
o Each web access may be retained as a log record of the Web sites visited
o The CSP:BDIP stores relevant information about the Web service user's activities
o The CSP:BDAP may learn the patterns or preferences of the user by analysing the user's activities
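The analysis step in this use case can be sketched as a frequency count over access logs. This is a minimal illustration, not a method taken from the Recommendation: the log layout (user id, site category), the sample records and the function name top_preferences are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical access-log records: (user_id, category of the visited site).
ACCESS_LOG = [
    ("u1", "sports"),
    ("u1", "news"),
    ("u1", "sports"),
    ("u2", "music"),
    ("u2", "music"),
    ("u2", "news"),
]


def top_preferences(log, n=1):
    """Return each user's n most frequently visited categories."""
    counts = defaultdict(Counter)
    for user_id, category in log:
        counts[user_id][category] += 1
    return {
        user: [category for category, _ in counter.most_common(n)]
        for user, counter in counts.items()
    }


if __name__ == "__main__":
    print(top_preferences(ACCESS_LOG))
```

A real service would read the logs from the infrastructure provider's storage and would likely weight recency as well as frequency; the sketch only shows the shape of the computation.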
8 Use cases 2: Intelligent transport big data analysis
o The traffic of vehicles and the status of roads are collected in real time. The big data applications for intelligent transport systems are built.
o Real-time and accurate route navigation
o Accurate scheduling for bus departures
o Automatic recognition of violations of laws on road traffic
9 Requirements & capabilities 1: Data collection
o Data source recognition, which offers the capabilities to locate the data sources and detect the types of data being exposed;
o Data adaptation, which offers the capabilities to transform and organize the accepted data with the targeted structure and attributes (numbering, location, ownership, etc.);
o Data transmission, which offers the capabilities to transfer data sets from one location to another while keeping their integrity and consistency;
o Data extraction, which offers the capabilities to extract semi-structured and unstructured data;
o Data import, which offers the capabilities to import large amounts of static data and real-time data;
o Data integration, which offers the capabilities to integrate data from different data sources (different data types or schemas) using metadata or ontology.
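Data adaptation and metadata-driven data integration can be illustrated with a small sketch. The two source names, their field names and the target schema below are hypothetical; the point is only that per-source metadata (here a field mapping) lets records with different schemas be merged into one structure.

```python
# Hypothetical per-source metadata: maps each source's field names
# onto a shared target schema (record_id, location, value).
FIELD_MAPS = {
    "sensor_feed": {"id": "record_id", "loc": "location", "val": "value"},
    "billing_db": {"rec_no": "record_id", "site": "location", "amount": "value"},
}


def adapt(source, record):
    """Data adaptation: rename a record's fields to the target schema."""
    mapping = FIELD_MAPS[source]
    adapted = {target: record[field] for field, target in mapping.items()}
    adapted["source"] = source  # keep the origin as an attribute
    return adapted


def integrate(batches):
    """Data integration: merge adapted records from several sources."""
    return [
        adapt(source, record)
        for source, records in batches.items()
        for record in records
    ]


if __name__ == "__main__":
    batches = {
        "sensor_feed": [{"id": 1, "loc": "A", "val": 3.5}],
        "billing_db": [{"rec_no": 7, "site": "B", "amount": 10}],
    }
    for row in integrate(batches):
        print(row)
```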
10 Requirements & capabilities 2: Data storage
o Data registration, which offers the capabilities to create, update and delete metadata with corresponding changes to data storage;
  Note: for unstructured data, the data registration component can request the transformation of raw data into semi-structured data (e.g. JSON, BSON) for data registration;
o Data persistence, which offers the capabilities to store data with low redundancy and low cost, converged with processing capability;
o Data semantic intellectualization, which offers the capabilities to define semantic relationships among different data sets for knowledge sharing;
o Data access, which offers the capabilities to access data through multiple interfaces, such as APIs;
o Data indexing, which offers the capabilities to generate and update indexes for data sets;
o Data duplication and backup, which offers the capabilities to duplicate and back up data sets.
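As a rough sketch of data registration and data indexing working together, the toy registry below wraps unstructured text in a JSON document, records metadata for it, and keeps a keyword index in step with registrations and deletions. The Registry class, its method names and the metadata fields are assumptions for illustration only.

```python
import json


class Registry:
    """Toy metadata registry with a simple inverted index (illustrative only)."""

    def __init__(self):
        self.metadata = {}  # data_id -> metadata dict
        self.index = {}     # keyword -> set of data_ids

    def register(self, data_id, raw_text, owner):
        # Wrap the unstructured text in a semi-structured JSON document.
        document = json.dumps({"id": data_id, "owner": owner, "body": raw_text})
        self.metadata[data_id] = {"owner": owner, "size": len(raw_text)}
        # Update the index so the new data set can be found by keyword.
        for word in set(raw_text.lower().split()):
            self.index.setdefault(word, set()).add(data_id)
        return document

    def delete(self, data_id):
        # Remove the metadata and the matching index entries together.
        self.metadata.pop(data_id, None)
        for ids in self.index.values():
            ids.discard(data_id)

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


if __name__ == "__main__":
    registry = Registry()
    print(registry.register("d1", "Road traffic report", "city"))
    print(registry.search("traffic"))
```

Keeping metadata and index updates in the same method is one simple way to honour the requirement that metadata changes carry "corresponding changes to data storage".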
11 Requirements & capabilities 3: Data analysis
o Data preparation, which offers the capabilities to transform data into a form that can be analysed. This capability includes exploring, de-noising, changing and shaping the data;
o Data analysis, which offers the capabilities to investigate, inspect and model data in order to discover useful information;
o Data visualization, which offers the capabilities to display and present data analysis results;
o Workflow automation, which offers the automation of processes, in whole or in part, during which data or functions are passed from one step to another for action according to a set of procedural rules;
o Analysis algorithm adaptation, which offers the capabilities to apply algorithms for classification, regression, clustering, association rules, ranking and so on according to the requirements.
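Workflow automation, in the sense of passing data from one step to the next according to a fixed rule, can be sketched as an ordered list of functions. The three steps below (dropping missing readings, summarising, rendering as text) are simplified stand-ins for data preparation, analysis and visualization; all function names and the sample readings are hypothetical.

```python
from statistics import mean


def prepare(values):
    """Data preparation: drop missing readings (a very simple clean-up)."""
    return [v for v in values if v is not None]


def analyse(values):
    """Data analysis: summarise the cleaned values."""
    return {"count": len(values), "mean": mean(values), "max": max(values)}


def visualise(summary):
    """Data visualization: render the summary as plain text."""
    return ", ".join(f"{key}={value}" for key, value in summary.items())


# Procedural rule: the ordered steps; each step's output feeds the next.
WORKFLOW = [prepare, analyse, visualise]


def run_workflow(data, steps=WORKFLOW):
    for step in steps:
        data = step(data)
    return data


if __name__ == "__main__":
    print(run_workflow([10.0, None, 20.0, 30.0]))
```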
12 Requirements & capabilities 4: Data management
o Data provenance, which offers the capabilities to manage information pertaining to any source of data, including the party or parties involved in the generation, introduction and/or mash-up processes for data;
o Data preservation, which offers the capabilities to manage the series of activities necessary to ensure continued access to data for as long as necessary;
o Data privacy, which offers the capabilities to manage the right of individuals to control or influence the information related to them;
o Data security, which offers the capabilities to handle the network and service aspects of security, including administrative, operational and maintenance issues;
o Data ownership, which offers the capabilities to manage the digital rights of data possession and disposition as the status of data changes (e.g. data integration);
o Processing task tracking and control, which offers information such as the success of jobs and tasks, running time and resource utilization.
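Processing task tracking can be sketched as a wrapper that records whether a task succeeded and how long it ran. This is only an assumed shape for such a record: the function name track and the record fields are hypothetical, and resource utilization, which the requirement also mentions, is left out to keep the example short.

```python
import time


def track(task_name, func, *args):
    """Run a task and report its outcome and running time."""
    started = time.perf_counter()
    record = {"task": task_name, "success": True, "error": None, "result": None}
    try:
        record["result"] = func(*args)
    except Exception as exc:  # record the failure instead of propagating it
        record["success"] = False
        record["error"] = repr(exc)
    record["seconds"] = time.perf_counter() - started
    return record


if __name__ == "__main__":
    print(track("sum", sum, [1, 2, 3]))
    print(track("divide", lambda: 1 / 0))
```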
13 Conclusion
Y.Bigdata-reqts: Requirements and capabilities for cloud computing based big data
o First ITU-T big data related Recommendation, reusing the definitions provided in the cloud vocabulary (Rec. ITU-T Y.3500 ISO/IEC) and the roles in the cloud computing reference (Rec. ITU-T Y.3502 ISO/IEC)
o This Recommendation is expected to be delivered in Q
User view approach:
o Big data system context with activities
o Cloud computing based big data ecosystem and activities
Use case to functions derivation:
o Use cases of cloud computing in support of big data
o Use cases of cloud computing based big data as analysis services (BDaaS)
Main requirements / capabilities cover the whole process of big data processing:
o Data collection
o Data storage/persistence
o Data analysis
o Data visualization/application
o Data management/security/privacy
As a basis or reference for other big data Recommendations:
o Y.BDaaS-arch
o Y.IoT-BigData-reqts
o Y.BigDataEX-reqts
14 Meeting plan in 2015
o July 13-23, Geneva: Rapporteur meeting
o Nov 30 - Dec 11, Geneva: Plenary meeting
Welcome to join us: T/studygroups/ /13/Pages/default.aspx