Challenges of Big Data Platform




Challenges of Big Data Platform
HUAWEI TECHNOLOGIES CO., LTD.
www.huawei.com

Contents
Categories of data in carrier network
Network insight
Customer behavior insight
Society activity insight
Challenges

Five categories of data
Enterprise Management: generated manually; volume 100TB~10TB. Sources: E-Learning, ERP, Account, HR, CMS. Characteristics: structured (table), unstructured (graphics, text, video).
OSS: generated by machine; volume 1TB~100TB, xx GB / year. Sources: NE parameter, NE config, NE log, alert, perf. Characteristics: structured (table), unstructured (time series data).
Network Element: generated by machine; volume 10PB / year, 1~3 years accumulation. Sources: CHR, CDR, SDR, MR, counter, NE log. Characteristics: semi-structured (signaling, call records), unstructured (time series data).
BSS: generated manually; volume 100GB~10TB, xx TB / year. Sources: billing, MKT report, order, user profile, CRM. Characteristics: structured (table).
VAS: generated manually; volume 100TB~10PB, 100TB / year. Sources: order, usage, service content, GIS, ISP. Characteristics: structured (table, point sets), semi-structured (column cluster), unstructured (graphics, text, video, time-series data).
Figure labels: enterprise management domain (MRP, SCM, FRM, ERP, HRM), OSS, BSS, VAS, integration, probe or NE (NodeB, RNC, SGSN, GGSN/DPI).

Evolutions of data analytic business in big data era
Past: the typical analytic business was operation analysis, based on statistics, offline, on isolated data.
Nowadays: new business such as network optimization and customer experience, with large volume, real-time processing and various kinds of data types.
Figure labels: enterprise management (HR/FRM/SRM: HR and financial reports); BSS (CRM/billing, order, VAS, offer design); OSS (performance, alerts); NE data; operational analytic system with operation reports / KPI reports (statistics), stats of network management performance and network schedule (statistics); CEM, NPM/SQM, AD promotion. Evolution: from statistics, offline, isolated to large volume, real-time, convergence of various data types.
Scenario table (columns: business; data set >100TB; volume/flow; velocity as flow rate and accumulation rate; variety as format and sources; scale-out. Slide note: >60% of scenarios):
Operation report: statistics data; offline statistics scenario with low accumulation rate; no requirements on scale-out; CRM and billing data, structured.
Billing verification: <100T; offline; fixed; no requirements on scale-out; billing data, structured.
Network optimization, customer experience indicator, precise marketing: network equipment data, 10PB; network data, 10PB, ~200Gbps, archive 1 year's data; elastic data processing cluster of over 100 servers handling 1PB of data; customer profile 100GB~300GB, fixed volume, in-memory computing; ~100,000 packages/s from NEs such as RAN, PS, etc.; network signaling, xDR, traffic statistics and NE configuration data, with semi-structured data taking the majority; CRM, billing and xDR data, structured and semi-structured.

Evolutions driven by carrier business
Three categories of big data business:
Network Insight: analytics based on network data, combined with user data, to adjust network layout. Focus on network status (location, equipment workload) and adjust the network dynamically.
Customer Insight: analytics based on user data, combined with network equipment data, to recognize the characteristics of customer behavior. Understand who is using the network and which services they consume, and optimize the business accordingly.
Society Insight: analytics based on the laws behind data, to dig out the value of data. Based on these laws, guide carriers to develop new valuable business.

Categories and characteristics of carrier big data business
Business: Network Insight, Customer Insight, Society Insight.
Data: NE data, summary data, operational data, VAS and external data, archived data.
Data sources: TS, DPI, MR, log, xDR, dial test, traffic test; order, account, UP, complaints, user account, user consuming; CRM, CBS, IPCC, VAS, network, marketing; LBS, VAS, Internet usage; user profile, xDR, log, traffic statistics.
Capability: ad-hoc query with real-time response; multi-dimension visualization with rich and complex model representation and query; queries that are not complex but highly concurrent; complex queries; complex data mining algorithms that need guidance from data scientists and industry experts.
Data storage and integration: raw data, large volume (10PB level), low cost; low data volume, summarized data; moderate volume; a mix of raw and summarized data whose volume varies by domain, 10PB level on average, requiring low cost.
ETL: high-performance loading; real-time update, complex model; cross-domain data integration.
Key requirements: high performance and low cost; real time and high concurrency; complex queries, complex models and algorithms.

Business requirements on Network Insight
For a carrier network that provides service for 40M users, there are several challenges. Volume: 120T -> 5.6P. Integration: 33 nodes -> 6 nodes. Query response time: 100s -> 15s.
Processing procedure: ingress (PS, CS, NMS, EMS) -> 1 archive -> 2 summarize -> 3 DW preprocessing -> 4 multi-dimension analytics -> representation and management.
Current: 20M users, 25Gbps, 60 days of raw data, 120TB. Target: 40M users, 200Gbps, 1 year of raw data, 5.6PB. Record rates shown on the slide: 140k records/s and 354k records/s.
Requirements on data summarization and storage (1: archive and query raw data; 2: raw data summarization; 3: statistics / analysis libs):
Feeding rate of 90,000 rows/s.
Ensure stable query performance on 1 year's data, 5.6P.
Compression rate: 10:1.
Support a few ad-hoc queries.
Support complex queries involving 10 tables.
20 concurrent reporting queries, responding in 15 seconds.
Requirements on data analytics and processing (4: multi-dimension analytics):
Multi-dimension: 14 dimensions.
General analytics: combinations of 5 to 9 dimensions of SDR BKPI; combinations of 10 to 14 dimensions in BKPI.
Second-level response time on 1.4 billion rows.
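The sizing figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (1 PB = 1000 TB), and the function names are hypothetical helpers rather than part of any Huawei tooling. It shows the compressed footprint of one year of raw data at the quoted 10:1 ratio, the growth factor from 120T to 5.6P, and how many distinct dimension combinations the multi-dimension analytics requirement implies.

```python
from math import comb

# Figures quoted on the slide. Decimal units (1 PB = 1000 TB) are an assumption.
TARGET_RAW_TB = 5600      # "1 year's raw data, 5.6PB"
CURRENT_RAW_TB = 120      # "60 days raw data, 120TB"
COMPRESSION_RATIO = 10    # "Compression rate: 10:1"
TOTAL_DIMENSIONS = 14     # "14 dimensions"


def compressed_tb(raw_tb: float, ratio: float) -> float:
    """Storage needed after compression, in TB."""
    return raw_tb / ratio


def growth_factor(current_tb: float, target_tb: float) -> float:
    """How many times larger the target volume is than the current one."""
    return target_tb / current_tb


def dimension_combinations(total: int, smallest: int, largest: int) -> int:
    """Number of distinct dimension subsets with sizes smallest..largest."""
    return sum(comb(total, k) for k in range(smallest, largest + 1))


if __name__ == "__main__":
    print("compressed TB:", compressed_tb(TARGET_RAW_TB, COMPRESSION_RATIO))
    print("growth factor:", round(growth_factor(CURRENT_RAW_TB, TARGET_RAW_TB), 1))
    print("5..9 of 14:", dimension_combinations(TOTAL_DIMENSIONS, 5, 9))
    print("10..14 of 14:", dimension_combinations(TOTAL_DIMENSIONS, 10, 14))
```

Under these assumptions the year of raw data compresses to 560 TB, and there are 13,442 possible combinations of 5 to 9 dimensions and 1,471 combinations of 10 to 14 dimensions, which suggests why precomputing every combination is unattractive and second-level response on 1.4 billion rows is a hard requirement.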

Business requirements on Customer Insight
Precise AD promotion based on user behavior information, with refined event content requirements from suppliers.
Promote electronic magazines to people taking public transport.
Promote Wi-Fi offers to people in coffee shops without Wi-Fi service.
Promote cosmetics vouchers to women in shopping malls.
Figure labels: 8 AM, go to office; working days, weekends, holidays, vacations.
Big Data Platform: get the subscriber's location; based on behaviors, analyze users' consumption characteristics, favorite content and offers.

Business requirements on Customer Insight
Two general requirements on BI technologies: a high-performance DW at low cost, and analysis & mining algorithms based on user behaviors and values.
Pain point 1: poor OLAP performance, with minute-level response times on several hundred GB of data. The OLAP system is built on a ROLAP solution, such as Cognos, DB2, etc.
Pain point 2: poor DW performance and high cost. Raw data storage and computation take up more than 70% of the DW's capacity, reaching the maximum volume and capability of a traditional database.
Pain point 3: high software and hardware cost. The solution is composed of high-end servers, disk arrays and a commercial DBMS, with expensive licenses and hardware.
Figure labels (processing procedure): application layer (customer insight, marketing management, network analysis, finance analysis); service capabilities for information archive and processing (item inquiry, dynamic policy, ingress, characteristic profile, traffic analysis, performance assessment, retrieval, text processing, content visualization, classification, location service, graphic service); infrastructure (distributed statistics and analysis for data mining, DBMS query engine, distributed platform, hardware, distributed file system, distributed database, distributed computation; aggregation, classification, association, predicates).
Requirements:
Query: point queries and analytic queries from RTD. Exploratory queries such as customer segmentation require full table scans and multi-table joins. Queries on 1024 predefined KPIs. Tags and labeling, 500+ indicators, 50+ graphic computations.
Data mining: customized models (user modeling); user / item / content / properties / similarity, MinHash (CF); behavior targeting and customer profiling based on behavior and values.
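The slide names MinHash in the context of collaborative filtering (CF) but does not say how it is applied. As a generic illustration, the sketch below estimates the Jaccard similarity between two users' item sets from fixed-length MinHash signatures, which is the usual reason to use MinHash when user histories are too large to compare directly. The function names, the seeded BLAKE2b hash family and the signature length are all assumptions made for this example.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """64-bit hash of an item under one seeded hash function (illustrative choice)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes: int = 256):
    """Signature = minimum hash value of the set under each seeded hash function."""
    unique = set(items)
    if not unique:
        raise ValueError("cannot build a MinHash signature for an empty set")
    return [min(_hash(seed, item) for item in unique) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b) -> float:
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b) -> float:
    """Exact Jaccard similarity, for comparison on small sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical item histories for two subscribers.
    user_a = {"news", "music", "video", "games", "maps", "mail"}
    user_b = {"news", "music", "video", "shopping", "maps", "chat"}
    sig_a, sig_b = minhash_signature(user_a), minhash_signature(user_b)
    print("exact:", exact_jaccard(user_a, user_b))
    print("estimate:", estimated_jaccard(sig_a, sig_b))
```

Signatures have a fixed size regardless of how many items a user has touched, so similar users can be found by comparing signatures instead of full histories; more hash functions give a tighter estimate at a higher cost.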

Business requirements on Society Insight
Focus on anonymous wireless users and location-based applications for government, industry and enterprise.
Traffic application: congestion information made possible through telco signaling data.
Population analytics: traffic planning, city resource distribution, abnormal events.

Business requirements on Society Insight
The goal is to dig out the laws of group activity through data mining algorithms applied to maps and dimensional data. The core part is the data analysis layer.
Visualization: OD graph & matrix, population density, OD transport classification, traffic congestion detection.
Analysis: UniBI reporting tools; population density, OD table, OD transportation mode classification, traffic congestion detection (HDFS + Map/Reduce).
Preprocessing: map preprocessing (district segmentation, extract district coordinates; road segmentation, extract road coordinates); data cleaning, integration, exploration and selection (HDFS + HQL).
Data sources: MR (Time, IMSI, Longitude, Latitude, RNCID, CellID).
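To make the OD (origin-destination) table concrete, here is a minimal single-machine sketch of how it could be derived from MR records with the fields listed on the slide. It is not the HDFS + Map/Reduce pipeline from the deck: the `MR` record type, the grid-based `grid_district` function (a stand-in for real district segmentation) and the 0.01-degree cell size are all assumptions for illustration, and a real deployment would work on anonymized identifiers.

```python
from collections import Counter, defaultdict
from math import floor
from typing import NamedTuple


class MR(NamedTuple):
    """One measurement report with the fields named on the slide."""
    time: int
    imsi: str
    longitude: float
    latitude: float
    rnc_id: int
    cell_id: int


def grid_district(longitude: float, latitude: float, cell_deg: float = 0.01):
    """Hypothetical district lookup: snap coordinates to a square grid cell."""
    return (floor(longitude / cell_deg), floor(latitude / cell_deg))


def od_table(records, district_of=grid_district) -> Counter:
    """Count moves between districts, per user, in time order."""
    trails = defaultdict(list)
    for record in records:
        trails[record.imsi].append(record)

    counts = Counter()
    for trail in trails.values():
        trail.sort(key=lambda r: r.time)
        districts = [district_of(r.longitude, r.latitude) for r in trail]
        for origin, destination in zip(districts, districts[1:]):
            if origin != destination:  # ignore reports that stay in one district
                counts[(origin, destination)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        MR(1, "user-a", 116.305, 39.905, 1, 10),
        MR(2, "user-a", 116.306, 39.906, 1, 10),
        MR(3, "user-a", 116.355, 39.955, 2, 20),
        MR(5, "user-b", 116.355, 39.955, 2, 20),
        MR(1, "user-b", 116.305, 39.905, 1, 10),
    ]
    for (origin, destination), n in od_table(sample).items():
        print(origin, "->", destination, n)
```

Grouping by user and then counting transitions maps naturally onto a map phase keyed by IMSI and a reduce phase that orders each trail, which is presumably why the analysis layer on the slide sits on HDFS + Map/Reduce.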

Summary of big data business requirements
Huawei product lines are attempting to build new big data businesses. They have various requirements on big data components, mainly MPP DB, in-memory analytics DB, streaming computation, MOLAP, parallel computation, and analytics & mining algorithms.
Requirements:
Data storage and computation
MPP DB: support 10PB-level volume; 100+ node linear scalability; respond to queries on 0.1 billion rows in 1 minute; 10:1 compression ratio.
Real-time analytics
In-memory DB: 100TB, columnar, wide tables with 2000-5000 columns, 30,000 updates/s, ad-hoc queries responding in 3 seconds, to support real-time business policy adjustment and real-time KPI calculation.
Streaming processing: 1 million events per second; 1 microsecond latency for each event.
Data analytics
MOLAP: support SQL and MDX; <5s response time in 80~90% of scenarios; 1s response latency on TB-scale data with hundreds of dimensions; real-time dashboard.
Data mining: high accuracy, various algorithms, online data mining, quick response.
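These targets translate into simple throughput budgets. The sketch below is back-of-the-envelope arithmetic only; it assumes the 1-minute query runs on a 100-node cluster with perfectly linear scaling (the slide lists both figures but does not tie them together), and the helper names are hypothetical.

```python
def required_rows_per_second(rows: int, seconds: float) -> float:
    """Aggregate scan rate needed to touch `rows` rows within `seconds`."""
    return rows / seconds


def per_node_rate(total_rate: float, nodes: int) -> float:
    """Per-node share of the scan rate, assuming perfectly linear scaling."""
    return total_rate / nodes


def per_event_budget_us(events_per_second: int) -> float:
    """Average time available per event, in microseconds, on one serial pipeline."""
    return 1_000_000 / events_per_second


if __name__ == "__main__":
    total = required_rows_per_second(100_000_000, 60)  # 0.1 billion rows in 1 minute
    print("aggregate rows/s:", round(total))
    print("rows/s per node (100 nodes):", round(per_node_rate(total, 100)))
    print("budget per event (us):", per_event_budget_us(1_000_000))
```

At 1 million events per second a single serial pipeline has on average 1 microsecond per event, which matches the latency figure on the slide; anything slower per event has to be absorbed by parallelism.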

Thank you
www.huawei.com
Copyright 2011 Huawei Technologies Co., Ltd. All Rights Reserved. The information in this document may contain predictive statements including, without limitation, statements regarding the future financial and operating results, future product portfolio, new technology, etc. There are a number of factors that could cause actual results and developments to differ materially from those expressed or implied in the predictive statements. Therefore, such information is provided for reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the information at any time without notice.