We are building the next generation of Big Data and Analytics solutions!


Background
Stephen Karl Ranson, CEO
26 Years - Experience in the IT Industry
12 Years - Solutions Architect - International Profile
10 Years - IT Director - Private Banking, Oslo
4 Years - CEO - Cloud Explorers - Big Data International
Passionate about Technology. Genuine Interest in All Things Digital. Resourceful. Innovative. Out of the Box Thinker. Disruption. I (love) DATA SCIENCE.

Big Data - Less Fluff! More Concrete! PEOPLE ARE DOING IT! NOW! IT'S TIME TO BEGIN: 2015!!! If you wait, you will be too late!

Big Data - A Brief History of Big Data

2001 (during the BI boom) - META Group (now Gartner) analyst Doug Laney wrote a report addressing the growth challenges and opportunities facing future Data Warehouse/BI projects in terms of three dimensions, the 3 Vs: (V)olume, (V)ariety and (V)elocity.
In 2001 the Internet had 458,000,000 users (7.6% of the world population) and 29,254,370 sites online. In 2015 it has 3,074,220,500+ users (43.9% of the world population, roughly 6.7 times as many as 14 years earlier) and 1,219,400,120+ sites online (roughly 42 times as many).
The Internet is used for websites, mail, file transfer, online services (Cloud), streaming content, devices and VoIP, from browsers, smartphones, cameras, cars and televisions to machinery, embedded devices and refrigerators.

2004 - Doug Cutting, whilst working on the open source project Nutch, read two white papers from Google explaining their approaches to problems he was trying to solve. The papers described GFS (Google File System) and MR (MapReduce), and he implemented the initial thinking in Nutch.

2006 - Yahoo hired Doug and his team, and they branched their work out of Nutch into a new project to help Yahoo solve many of the same challenges that Google had faced and solved. They called it HADOOP <- V. Important!

2012 - Gartner revisited the 3 Vs with more perspective, added some more Vs and tried to give a clearer definition:
Business Intelligence - Uses descriptive statistics on data with high information density to measure things, detect trends etc.
Big Data - Uses inductive statistics and concepts from nonlinear system identification to infer laws (regressions, nonlinear relationships and causal effects) from large sets of data with low information density, to reveal relationships and dependencies and to predict outcomes and behaviors.

Simply put, Big Data is a large volume of mostly unstructured data which cannot be handled by standard data management systems such as a DBMS, RDBMS or ORDBMS.

Big Data Characteristics (The 6 Vs and a C)

Volume - The quantity of data generated. The size of the data determines its value and potential, and whether it can be considered Big Data at all. The name Big Data itself refers to size, hence this characteristic.
Variety - The category to which the data belongs. Analysts need to know this in order to use the data effectively and to their advantage.
Velocity - The speed at which data is generated and processed to meet the demands and challenges of growth and development.
Variability - The inconsistency the data can show at times, which hampers the ability to handle and manage it effectively and can be a problem for those who analyse it.
Veracity - The quality of the data being captured, which can vary greatly. The accuracy of an analysis depends on the veracity of the source data.
Value - Enabling business decisions and giving business insight and advantage.
Complexity - Data management can become very complex, especially when large volumes of data come from multiple sources. The data must be linked, connected and correlated before the information it is supposed to convey can be grasped. This is termed the complexity of Big Data.

Big Data Formats

Structured - High degree of organization, typically found in relational databases or spreadsheets. Maps easily to data types, or to user defined types based on standard types. Can be searched using standard algorithms and manipulated in well defined ways.
Semistructured - (Such as log files.) A little more difficult to understand. Normally stored as text files with some basic form of order, such as tab delimited or comma separated columns. Unlike a database, which returns a known meaning for each resulting column, every extracted column has to be assigned a type and a meaning.
Unstructured - No advantage of having structure coded into the data set. (Still, can data stored in a computer ever be unstructured?) This is data with too little structure to make sense of it. Analysis with traditional approaches is difficult and costly, and the volumes are typically high for this class of data.

Five main types of data found today:
Sentiment data
Clickstream data
Sensor data or machine data
Geolocation data
Server logs
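To make the semistructured case concrete, here is a minimal Python sketch of assigning a type and a meaning to each column of a tab delimited log. The column names, the sample lines and the `parse_log` helper are all assumptions made up for illustration; they do not come from the slides.

```python
import csv
from datetime import datetime
from io import StringIO

# Hypothetical schema: one (name, converter) pair per column position.
# A real log would need its own schema; this one is invented for the sketch.
SCHEMA = [
    ("timestamp", datetime.fromisoformat),
    ("host", str),
    ("status", int),
    ("bytes_sent", int),
]


def parse_log(text, delimiter="\t"):
    """Turn delimited log text into a list of typed dictionaries."""
    records = []
    for row in csv.reader(StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        # Pair each raw string with its column name and converter.
        record = {name: convert(value) for (name, convert), value in zip(SCHEMA, row)}
        records.append(record)
    return records


# Two made-up log lines in the assumed format.
SAMPLE = (
    "2015-03-01T12:00:00\tweb01\t200\t5120\n"
    "2015-03-01T12:00:01\tweb02\t404\t312\n"
)

records = parse_log(SAMPLE)
for record in records:
    print(record["host"], record["status"], record["bytes_sent"])
```

The schema lives outside the data, which is the key difference from a relational table: change the log layout and only `SCHEMA` has to change.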

Big Data Sources

Advanced Analytics

SINGLE VIEW OF ENTITY - The first of three common patterns in analytics applications. A single view of an entity (like a customer, a product or a machine) is now possible because platforms like Hadoop can store and organize previously unmanageable volumes and varieties of data.
DATA DISCOVERY - New, voluminous data types such as machine and sensor data, geolocation data, clickstream data and sentiment data are valuable when correlated with other data sets in a shared enterprise data lake. The patterns within the data lake can then fuel machine learning applications.
PREDICTIVE ANALYTICS - As data scientists and analysts reveal patterns and correlations inside massive data sets, new models emerge to explain business performance. Most importantly, these models can reliably predict future events based on previously dissociated data.
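The single view of an entity can be sketched in a few lines of Python: records from several sources are grouped under one key per customer. The source names, the field names and the `single_view` helper are hypothetical, and a real deployment would do this across a cluster and not in memory.

```python
from collections import defaultdict


def single_view(*sources):
    """Merge (source_name, records) pairs into one profile per customer_id."""
    profiles = defaultdict(dict)
    for source_name, records in sources:
        for record in records:
            key = record["customer_id"]
            # Keep every field except the join key under the source's name.
            fields = {k: v for k, v in record.items() if k != "customer_id"}
            profiles[key].setdefault(source_name, []).append(fields)
    return dict(profiles)


# Made-up sample data from two assumed sources.
crm = [{"customer_id": 1, "name": "Kari"}, {"customer_id": 2, "name": "Ola"}]
clicks = [
    {"customer_id": 1, "page": "/offers"},
    {"customer_id": 1, "page": "/checkout"},
]

view = single_view(("crm", crm), ("clickstream", clicks))
print(view[1])
```

Customer 1 ends up with both a CRM record and two clickstream events in a single profile, while customer 2 has only the CRM record.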

HADOOP
Hello, my name is Hadoop! I am named after Doug Cutting's son's toy yellow elephant :-)
WHAT DO I DO? Storage + Computation = Big Data!
STORAGE - Elastic / Reliable / Unlimited
COMPUTATION - Framework / Scalable / Data Crunching / Analysis
(C) Copyright Cloud Explorers Solutions AS 2015
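The computation half follows the MapReduce idea from the Google paper mentioned earlier. Below is a minimal in-memory Python sketch of the three phases (map, shuffle, reduce) using the classic word count. It only illustrates the programming model and is not Hadoop's API; the function names are made up, and on a cluster the framework runs these phases in parallel across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data is big", "data lake"])
print(counts)
```

Because each map call looks at one line only and each reduce call looks at one key only, the work can be split across as many machines as the data needs.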

HADOOP IS AN AQUARIUM? I provide a powerful environment for the Big Data Fishes :-)!

Let's Meet some of the FISH! (14+ and growing)

Ambari - HADOOP ADMIN TOOLS FOR MANAGING AND MAINTAINING A CLUSTER
Avro - FRAMEWORK FOR DATA SERIALIZATION INTO A COMPACT BINARY FORMAT
Flume - DATAFLOW SERVICE FOR MOVEMENT OF LARGE VOLUMES OF LOG FILES INTO HADOOP
HBase - DISTRIBUTED COLUMNAR DATABASE USING HDFS (LARGE TABLES)
HCatalog - PROVIDES A RELATIONAL VIEW OF DATA STORED IN HADOOP
Hive - DISTRIBUTED DATA WAREHOUSE FOR HDFS & SQL STYLE QUERY LANGUAGE (HIVEQL)
Solr - POWERFUL NOSQL SEARCH ENGINE WITH INDEXES, FACETS, PIVOTS, SEMANTIC
Hue - ADMINISTRATIVE INTERFACE FOR HADOOP (GUI)
Mahout - LIBRARY OF MACHINE LEARNING ALGORITHMS IMPLEMENTED AS MAP REDUCE ON HADOOP
Oozie - WORKFLOW MANAGEMENT TOOL HANDLING SCHEDULING AND CHAINING
Pig - PLATFORM FOR ANALYSIS OF VERY LARGE DATA SETS WITH ITS OWN QUERY LANGUAGE, PIG LATIN
Sqoop - TOOL FOR EFFICIENTLY MOVING LARGE AMOUNTS OF DATA FROM DBS TO HDFS
ZooKeeper - SIMPLE INTERFACE TO CENTRALIZED CO-ORDINATION OF SERVICES
Apache Storm - REAL TIME DATA STREAMING WITH REAL TIME SEARCH

THE BIG DATA - SHIPPING NOW! You Can Start Today!

Big Data - The Data Lake. LET'S STOCK OUR LAKE WITH OUR DATA FISH!

Big Data - The Data Lake
Enterprise Business Data: Profiles, Clients, Identity, Relationships, E-mail, Documents, Reports, Facts, Analysis/Mining, History, CRM, ERP/Accounting, Transactions, Content, Data Warehouses

Omni Channel/Contact Points - Marketing Conversation Channel Integration Channel Usability Channel Transparency Informative Brand Experience Awareness Convenient Web Coupons Vocabulary Business KPIs SMS Mobile E-Commerce Data Consistent Continuity Real time Social Big Data KPIs Email Response Data TV/Radio? Post Store Store In Sync Store Staff Knowledgeable & Informed

Omni Channel/Contact Points - Really Means - DATA! Omni Channel/Contact Points Relationship - Critical Relationship DATA! Web Coupons Vocabulary Business KPIs Your Mobile Social Big Data KPIs Business Email TV/Radio? Post Store

Quality Data Commercial Data

Free & Diverse Data: Open Data, Geodata

Social Data, Geodata

BIG DATA ACTIONS! Entirely New, New Generation DATA DRIVEN BUSINESS! Big Data KPIs, Business KPIs, Richer Vocabulary. Information, Insight, Knowledge.

Big Data - Data Lake

Open Data - Free public data: Environment, Infrastructure, Finance, Health, Education, Reference, Society
Commercial - Quality data: Brokers, High quality lists, Survey, Panel Data, Clickstream, Directories, Telephone numbers, Board and Company roles...
Web Commerce - Existing web commerce sites: Logs, Trackers, Purchases, Abandoned Carts, Interest, Usage...
Client Data - Facts, Analysis/Mining, History, Clients, Identity, Relationships, Profiles, Reports, Documents, E-mail, CRM, Transactions, Accounting, Content, Data Warehouses
Response Data - Response data from all channels: Web Logs, Mail Logs, Telephony Logs, Social... Red thread across all channels...
Geo Data - Mapping and geospatial data: size of house, nearest shops, cafes...
Real Time Sensors (STORM) - iBeacon, Wifi, Web Sessions, Sensors, POS, Payment Terminals, Temperature, Weather, Traffic, News, Events, IOT
Social - Social data: FB, Twitter, Instagram, Snapchat, Google, Blogs, Natural Language Processing, Sentiment Analysis, Networks, Likes, Interests, Trends, Discussions, Product Awareness and Feedback

Layers: BATCH LAYER, SPEED LAYER, PUBLISH LAYER
Functions: Search, Segment
Outputs: Visualize, Analysis, Reports, Dashboards, Infographics, API/Events

Steps 1, 2, 3: EXTRACT, Import, Wash / Enrich, Segment / Search, Visualize / Analysis, Search / Dashboards, Reports / Infographics / Events, Export, Publish, Operationalize, ACTIONS

Seeing is believing!

Quick Demo! - BIG DATA IS NOW!

THANK YOU :-)