Big Data & Cloud Computing. Faysal Shaarani




Agenda
- Business Trends in Data
- What is Big Data?
- Traditional Computing vs. Cloud Computing
- Snowflake Architecture for the Cloud

Business Trends in Data
- Data has become a critical decision-making tool and driver for business over time.
- Different types of data (the Internet of Things will only increase the volume and structure of this data).
- Data volume and variety have made it very difficult and costly to ingest, process, and distill information for timely and accurate decision making.

What is Big Data?
- Various attempts have been made to define it.
- Data sets too large and complex to manipulate or interrogate with standard methods or tools.
- Businesses that are not looking into it are likely to be in trouble.

Big Data is not just about Volume

Big Data at a Microscopic Level
- Structured data
- Semi-structured data
- Multi-structured data
- Governed by the 5 V's: Volume, Velocity, Variety, Veracity, and Value

What Makes up Big Data?

Big Data Analytics is a Must
- Large amounts of data available
- Competitive advantage
- Better strategic and operational business decisions
- Identification of hidden patterns and unknown correlations
- Effective marketing, customer satisfaction, and increased revenue

Applications for Big Data Analytics

Big Data for Smarter Healthcare

Big Data Market Size

The Data World is Different Today

Conventional Data Warehouses:
- Static, predictable queries on highly refined data were the norm.
- Knobs and parameters for users or DBAs to tune based on their knowledge of the queries and workloads to be run.
- That is nearly impossible to manage in today's world.

Cloud Data Warehouses:
- Reduce up-front project costs.
- Pay-as-you-go, on-demand, elastic scalability model.
- Enable organizations to scale their applications as required while paying only for the resources they use.
- Provide significant benefits for both the business and IT.

Limitations of Traditional Databases
- Limited elasticity: compute and data
- Complex to manage: data indexing, partitioning, DBAs, query tuning
- Costly: infrastructure management, licensing, tools, and skills
- Both shared-nothing and shared-disk database architectures have two dimensions of scalability (data and compute). Neither is elastic.

Benefits of Cloud Computing
- Effectively unlimited resources, elasticity on demand
- Pay only for what you use
- Bring solutions to market quickly
- No need to involve IT

Public Clouds & Ecosystem Tools
- Infrastructure: AWS, Azure, Google Cloud
- Data warehousing: Redshift, Snowflake, others
- BI tools: Tableau, Looker, MicroStrategy, SAS
- ETL: Talend, Informatica, CloverETL

On-premise Databases in the Cloud
- Any on-premise database can be hosted in the cloud, e.g. Oracle, MySQL, SQL Server, DB2, etc.
- Amazon Redshift (based on open-source PostgreSQL, moved to the cloud):
  - Fast, fully managed, petabyte-scale data warehouse service
  - Simple and cost-effective way to analyze all your data using any existing business intelligence tools
  - Just $0.25 per CPU hour and $1,000 per terabyte per year
  - No commitments or upfront costs
  - Less than a tenth of the cost of most other data warehousing solutions

Databases Architected for the Cloud
- Snowflake:
  - Fast, fully managed, petabyte-scale data warehouse service
  - Simple and cost-effective way to analyze all your data using any existing business intelligence tools
  - Just $1 to $2 per server per hour and $200 per terabyte per month
  - No commitments or upfront costs
  - Less than a tenth of the cost of most other data warehousing solutions
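The pay-as-you-go arithmetic behind these rates can be sketched as follows. This is a minimal illustration, not a pricing calculator: the `Pricing` class, the `monthly_cost` function, the 730-hour month, the $1.50 midpoint of the quoted "$1 to $2" range, and the 8-hours-a-day usage pattern are all assumptions made for the example.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


@dataclass(frozen=True)
class Pricing:
    compute_per_hour: float      # USD per server per hour
    storage_per_tb_month: float  # USD per terabyte per month


def monthly_cost(pricing: Pricing, compute_hours: float, storage_tb: float) -> float:
    """Monthly bill = compute actually used + storage held."""
    return (pricing.compute_per_hour * compute_hours
            + pricing.storage_per_tb_month * storage_tb)


# Rates from the slide; 1.50 is the assumed midpoint of "$1 to $2 per server per hour".
snowflake = Pricing(compute_per_hour=1.50, storage_per_tb_month=200.0)

# One server over 10 TB: running around the clock vs. 8 hours on 22 working days.
always_on = monthly_cost(snowflake, compute_hours=HOURS_PER_MONTH, storage_tb=10)
on_demand = monthly_cost(snowflake, compute_hours=8 * 22, storage_tb=10)

print(f"always on: ${always_on:,.2f}")
print(f"on demand: ${on_demand:,.2f}")
```

In this hypothetical scenario the storage charge is the same in both cases; only the compute term shrinks when the server is suspended outside working hours.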

Data Warehousing Cloud Service
- Database is separate from Virtual Warehouse
- One Virtual Warehouse, multiple Databases
- One Database, multiple Virtual Warehouses
- Virtual Warehouse scales independently from Database
- Data loading does not impact query performance

Diagram labels: ETL & Data Loading; Finance Users; Marketing Users; Test/Dev Users; Sales Users; Biz Dev User; Virtual Warehouses; Databases.

Data Warehousing Cloud Service
- Supports structured and semi-structured data: JSON and Avro
- The tools you know + Snowflake web UI

Diagram labels: ETL & Data Loading; Finance Users; Marketing Users; Test/Dev Users; Sales Users; Biz Dev User; Databases.

Multidimensional Elasticity
- Three dimensions of elasticity: data, workload, users

Diagram labels: Workload Elasticity; Data Elasticity; ETL & Data Loading; Finance Users; Test/Dev Users; Marketing Users; Sales Users; Biz Dev User; Virtual Warehouses; Databases.

Inside Multidimensional Elasticity
- CSV loading: running on EC2
- Columnar compressed FDN files: stored on S3
- Virtual Warehouse, running on EC2:
  - Adaptively caches FDN files in local flash storage
  - Query optimization and runtime execution prune data for efficiency
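The slide does not say how data is pruned. One common technique, used here purely as an assumption, keeps the minimum and maximum value of a column for each file and skips any file whose range cannot match the query predicate. The `FileStats` class, the `prune` function, and the file names below are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileStats:
    name: str       # hypothetical file name
    min_value: int  # smallest value of the filtered column in this file
    max_value: int  # largest value of the filtered column in this file


def prune(files: List[FileStats], lo: int, hi: int) -> List[FileStats]:
    """Keep only files whose [min, max] range overlaps the query range [lo, hi]."""
    return [f for f in files if f.max_value >= lo and f.min_value <= hi]


files = [
    FileStats("part-0001", min_value=1, max_value=100),
    FileStats("part-0002", min_value=101, max_value=200),
    FileStats("part-0003", min_value=201, max_value=300),
]

# A predicate such as "value BETWEEN 150 AND 220" can only match rows in the
# second and third files, so the first file never needs to be read.
to_scan = prune(files, lo=150, hi=220)
print([f.name for f in to_scan])
```

The check only needs the per-file metadata, so files can be ruled out before any of them is fetched from storage.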

Snowflake Architecture
- User Interface: ODBC Driver, JDBC Driver, Web UI
- Cloud Services: Optimization, Query Mgmt, Warehouse Mgmt, Security, Metadata
- Virtual Warehouse Processing: EC2
- Database Storage: S3
- Cloud Infrastructure: Amazon AWS

Diagram labels: Data; Sales; Marketing; Materials; Customer Service; Financial Analysts; Quality Control; Loading.

Relational Processing of Semi-Structured Data
1. Variant data type compresses storage of semi-structured data
2. Data is analyzed during load to discern repetitive attributes within the hierarchy
3. Repetitive attributes are columnar compressed and statistics are collected for relational query optimization
4. SQL extensions enable relational queries against both semi-structured and structured data
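Step 2 can be illustrated with a toy sketch. It is not Snowflake's implementation: the `flatten` and `columnarize` helpers, the dotted path notation, and the 50% frequency threshold are assumptions chosen for the example. Attributes that appear in at least half of the records are pulled out into columns, and rarer attributes stay with their record.

```python
import json
from collections import Counter


def flatten(obj, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def columnarize(records, min_fraction=0.5):
    """Split records into columns for frequent paths plus per-record leftovers."""
    rows = [dict(flatten(r)) for r in records]
    counts = Counter(path for row in rows for path in row)
    threshold = min_fraction * len(rows)
    frequent = sorted(p for p, c in counts.items() if c >= threshold)
    # Missing values become None so every column has one entry per record.
    columns = {p: [row.get(p) for row in rows] for p in frequent}
    leftovers = [{p: v for p, v in row.items() if p not in columns} for row in rows]
    return columns, leftovers


records = [json.loads(s) for s in (
    '{"user": {"id": 1, "city": "Oslo"}, "event": "click"}',
    '{"user": {"id": 2}, "event": "view", "ref": "ad"}',
    '{"user": {"id": 3, "city": "Rome"}, "event": "click"}',
)]

columns, leftovers = columnarize(records)
print(columns)
print(leftovers)
```

Once the repetitive attributes sit in columns, per-column statistics can be collected in the same way as for ordinary relational columns, which is what step 3 relies on.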

Security: General Availability Features
- 2-factor authentication
- Federated authentication
- Data encryption over the Internet
- Encryption of data at rest
- Roles and privilege management
- Auditing/security logging of all operations

Diagram labels: Account; Service Account; Snowflake Operations.

10TB Query Workload Comparison

Metric             | Oracle           | Redshift                           | Snowflake                    | Snowflake Improvement
Upfront Commitment | $1.7M            | $48,000 (1-Year Reserved Instance) | $10,000                      | 5x
Load               | 9 hours          | 14 hours, $100                     | 1.4 hours, $45               | 10x
Query              | 1.5 hours        | 3.5 hours, $25                     | 40 min, $20                  | 5x
Resize             | Forklift upgrade | 3 hours, full data migration       | 5 minutes, no data migration | 30x
Monthly Idle Cost  | $47,000          | $5,500                             | $500 (10TB DB Storage)       | 10x
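The improvement figures in the comparison appear to be rounded ratios of the Redshift value to the Snowflake value. The slide does not state its baseline, so that reading is an assumption. The short check below recomputes the ratios from the quoted numbers; the `improvement` helper and the dictionary keys are illustrative names.

```python
def improvement(baseline: float, snowflake: float) -> float:
    """How many times smaller the Snowflake figure is than the baseline."""
    return baseline / snowflake


# (Redshift, Snowflake) pairs from the comparison; times converted to minutes.
figures = {
    "upfront commitment (USD)": (48_000, 10_000),
    "load time (min)": (14 * 60, 84),   # 14 hours vs. 1.4 hours
    "query time (min)": (210, 40),      # 3.5 hours vs. 40 minutes
    "resize time (min)": (3 * 60, 5),   # 3 hours vs. 5 minutes
    "monthly idle cost (USD)": (5_500, 500),
}

ratios = {name: round(improvement(b, s), 2) for name, (b, s) in figures.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

Under this reading, the computed ratios are 4.8, 10.0, 5.25, 36.0, and 11.0, which the slide reports as 5x, 10x, 5x, 30x, and 10x.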