Monetizing Millions of Mobile Users with Cloud Business Analytics
|
|
- Judith Atkins
- 8 years ago
- Views:
Transcription
1 Monetizing Millions of Mobile Users with Cloud Business Analytics MicroStrategy World 2013 David Abercrombie Data Analytics Engineer
2 Agenda Tapjoy Big Data Architecture MicroStrategy Cloud Implementation MicroStrategy Cloud vs. Amazon EC2 Vertica
3 Tapjoy
4 Tapjoy by the Numbers 5,300+ Active Apps 1.3MM Daily Ad Conversions 339MM Global Reach
5 A Few of Our Advertisers CPG RETAIL ALCOHOL QSR INSURANCE TELCO FINANCIAL ENTERTAINMEN T HEALTH & BEAUTY AUTOMOTIVE TRAVEL TECHNOLOGY
6 A Few of Our Brand Advertisers
7 Recent Recognition
8 Big Data Architecture
9 Operational Data Systems Ruby application servers in Amazon EC2 Memcached Amazon Simple DB => Riak in data center Amazon RDS => Heroku PostgreSQL Hadoop on EC2 (HDFS, Hive, Pig, etc.) Vertica on EC2 => data center R, Mahout, Mallet, etc. Syslog-ng => RabbitMQ Amazon S3
10 Analytic Data Volume About one billion analytic rows per day 10 to 15+ thousand per second Even more in transactional databases Several dozen terabytes online Compression and parallelism: Vertica Aggregate 100 billion rows in a second!
11 Transactional System Ruby Apps web_requests Gzip Files Other Data Files Was Syslog NG Will be Rabbit MQ ETL 1 Amazon S3 Storage Area ETL 2 Hadoop Main Backup DW Vertica BI Prod MSTR ETL 4 RDS MySQL ETL 3 ETLs 1) Hadoop 2) Load Vertica 3) Load Vertica 4) Aggregation
12 MicroStrategy Cloud Implementation
13 MicroStrategy Uses Executive dashboards Canned reports Ad hoc analysis External partner dashboards Mobile Mobile, web and iframe Replace Tableau, etc.
14 Why Cloud? Obvious: No installation No maintenance No need for special admin skills Other: We have had no data center! Start-up, we are all stretched thin
15 Cloud challenges User and group management (MSTR LDAP) Project-level setup Mobile URL configs Deployment (dev, QA, prod) Training, docs, and consulting Password reset No end-to-end metrics Work your Cloud reps they are great!
16 Schema Design 25 dimensions 100 metrics 80 tables Partly denormalized snowflake Aggregated facts, one-hour granularity Custom ELT (within Vertica)
17 MicroStrategy Design Challenges Non-additive metrics Type 2 Slow Changing Dimensions Tech Note Row-level security for external partners Tech Note Ensuring parallel SQL for Vertica
18 MicroStrategy Cloud vs. Amazon EC2
19 EC2 response time anomalies
20 EC2 response time anomalies Good ( prod ) Bad ( bi ) Minimum 120 ms 150 ms First Quartile 132 ms 344 ms Median 153 ms 2,660 ms Mean 432 ms 10,180 ms Third Quartile 219 ms 11,230 ms Maximum 9,652 ms 211,500 ms Standard Deviation 976 ms 18,080 ms
21 EC2 load anomalies (response time) top - 18:53:52 up 16 days, 1:09, 4 users, load average: 82.65, 86.79, Tasks: 147 total, 1 running, 144 sleeping, 2 stopped, 0 zombie Cpu(s): 0.1%us, 2.0%sy, 11.9%ni, 81.9%id, 3.9%wa, 0.0%hi, 0.0%si, 0.2%st Mem: k total, k used, 44780k free, 1136k buffers Swap: 0k total, 0k used, 0k free, k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1666 dbadmin g 5.5g 76m S :02 vertica 175 root S :12.49 kswapd spread S :03.80 spread root S :08.50 pdflush 1 root S :02.01 init 2 root RT S :01.48 migration/0 3 root S :00.50 ksoftirqd/0 4 root RT S :00.01 watchdog/0 5 root S :00.95 events/0 6 root S :01.44 khelper
22 Vertica
23 Clustered Database Move the computation to the data, never move the data to the computation Dr. Michael Stonebraker Moving Computation is Cheaper than Moving Data Hadoop documentation
24 Vertica/MicroStrategy Resources Track 2 Session 8, 4:45 today in Lafleur 1: HP Vertica: A Step-by-Step Plan for Implementing Mobile Business Intelligence MicroStrategy Technical Note 42360
25 Vertica Magic MPP shared nothing Parallel SQL Column store Compression Run Length Encoding
26 Vertica Parallel SQL Client connects to any node Becomes the initiator node Initiator creates execution plan Initiator sends plans to other nodes Nodes execute plans on their local data Nodes return results to initiator Initiator combines result subsets from nodes Initiator send answer to client
27 Vertica Column Store Pioneered at C-Store project at MIT, etc. Read only the columns you need Reduces IO Each IO is more relevant Enables Run Length Encoding
28 Vertica Projections Disk data structures Like Oracle materialized views Like index-only access path Can have many per table Optimize for different queries Watch load time and disk Disk files each contain only one column
29 Run Length Encoding Sorted: identical values together in same bucket Each bucket maps to a set of rows in subsequent columns Gender Class Pass Name SELECT Name F M Fresh Junior Senior Soph Fresh Junior Senior Soph F T F T F T F T F T F T F T F T FROM Students WHERE Class= Junior AND Gender= M AND Pass= F
30 308 million rows, 28 columns, 8 GB Column Name Bytes Rows per Byte device_size_code 5,290 58,100 month_key 5,460 56,300 day_key 34,000 9,030 hour_key 571, platform_key 9,160, conversions 308,000, currency_key 609,000, publisher_partner_key 722,000, advertiser_partner_key 729,000, offer_key 742,000, publisher_app_key 784,000, tapjoy_revenue 903,000, virtual_currency_rewarded 1,110,000,
31 Denormalize! Column Name Bytes Rows per Byte device_size_code 5,290 58,100 month_key 5,460 56,300 day_key 34,000 9,030 hour_key 571, platform_key 9,160, Very little extra disk and IO to denormalize few MB on 8 GB table Moderately denormalized snowflake schema
32 Vertica Segmentation (sharding) Distribute data evenly among nodes For parallel query processing Typically hash() a high-cardinality column But not date or timestamp More than one segmentation allowed Mutiple Vertica projections
33 Parallel Joins Local joins require identical segmentation Must include equality on segmentation columns Replicate dimensions to all nodes
34 Predicate Pushdown and Fact Levels MicroStrategy facts must be at same level to be combined into a single metric 921 Advanced project design p 128. Vertica aggregation views can change a fact level Reduce dimensionality Vertica will push a filter predicate down into a view before aggregating, helping performance
35 Changing fact level with SQL view CREATE VIEW fact_table_1_ab ( dimension_a, dimension_b, high_cardinality_id, fact_1 ) AS SELECT dimension_a, dimension_b, high_cardinality_id, sum(metric_1) FROM fact_table_1 GROUP BY dimension_a, dimension_b, high_cardinality_id;
36 Predicate Pushdown and Fact Levels EXPLAIN SELECT * FROM fact_table_1_ab WHERE dimension_a = 'A'; Access Path: +-GROUPBY HASH (SORT OUTPUT) (PATH ID: 3) Aggregates: sum(fact_table_1.metric_1) Group By: fact_table_1.dimension_a, fact_table_1.dimension_b, fact_table_1.high_cardinality_id Execute on: All Nodes +---> STORAGE ACCESS for fact_table_1 (PATH ID: 4) Projection: mstr.fact_table_1_super_b0 Materialize: fact_table_1.dimension_a, fact_table_1.dimension_b, fact_table_1.high_cardinality_id, fact_table_1.metric_1 Filter: (fact_table_1.dimension_a = 'A') Execute on: All Nodes
37 Vertica Challenges Data structure design is key Must match SQL and business needs Parallel joins Poor update/delete and referential integrity Complicates ELT BTW, bulk loads are fast!
38 Agenda Tapjoy Big Data Architecture MicroStrategy Cloud Implementation MicroStrategy Cloud vs. Amazon EC2 Vertica
39 Tapjoy, Inc. All Rights Reserved. Tapjoy, Inc. Confidential and Proprietary - Please Do Not Copy or Distribute Without Tapjoy, Inc. s Prior Written Consent. The data provided herein is for information purposes only and while all efforts are made to ensure accuracy, errors may arise. Tapjoy and the Tapjoy logo are trademarks or registered trademarks of Tapjoy, Inc. All third party logos and trademarks mentioned are the property of their respective owners.
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationPerformance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and
More informationFacultat d'informàtica de Barcelona Univ. Politècnica de Catalunya. Administració de Sistemes Operatius. System monitoring
Facultat d'informàtica de Barcelona Univ. Politècnica de Catalunya Administració de Sistemes Operatius System monitoring Topics 1. Introduction to OS administration 2. Installation of the OS 3. Users management
More informationCost-Effective Business Intelligence with Red Hat and Open Source
Cost-Effective Business Intelligence with Red Hat and Open Source Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 1 Agenda Introductions Quick survey What is BI?: reporting,
More informationPerformance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics platform. PENTAHO PERFORMANCE ENGINEERING
More informationAmazon Redshift & Amazon DynamoDB Michael Hanisch, Amazon Web Services Erez Hadas-Sonnenschein, clipkit GmbH Witali Stohler, clipkit GmbH 2014-05-15
Amazon Redshift & Amazon DynamoDB Michael Hanisch, Amazon Web Services Erez Hadas-Sonnenschein, clipkit GmbH Witali Stohler, clipkit GmbH 2014-05-15 2014 Amazon.com, Inc. and its affiliates. All rights
More informationNext-Generation Cloud Analytics with Amazon Redshift
Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional
More informationBuilding Scalable Big Data Infrastructure Using Open Source Software. Sam William sampd@stumbleupon.
Building Scalable Big Data Infrastructure Using Open Source Software Sam William sampd@stumbleupon. What is StumbleUpon? Help users find content they did not expect to find The best way to discover new
More informationBig Data? Definition # 1: Big Data Definition Forrester Research
Big Data Big Data? Definition # 1: Big Data Definition Forrester Research Big Data? Definition # 2: Quote of Tim O Reilly brings it all home: Companies that have massive amounts of data without massive
More informationSQL Server PDW. Artur Vieira Premier Field Engineer
SQL Server PDW Artur Vieira Premier Field Engineer Agenda 1 Introduction to MPP and PDW 2 PDW Architecture and Components 3 Data Structures 4 PDW Tools Data Load / Data Output / Administrative Console
More informationBig Data Use Case. How Rackspace is using Private Cloud for Big Data. Bryan Thompson. May 8th, 2013
Big Data Use Case How Rackspace is using Private Cloud for Big Data Bryan Thompson May 8th, 2013 Our Big Data Problem Consolidate all monitoring data for reporting and analytical purposes. Every device
More informationParallel Data Warehouse
MICROSOFT S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE Parallel Data Warehouse Stefan Cronjaeger Microsoft May 2013 AGENDA PDW overview Columnstore and Big Data Business Intellignece Project Ability
More informationBackground on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros
David Moses January 2014 Paper on Cloud Computing I Background on Tools and Technologies in Amazon Web Services (AWS) In this paper I will highlight the technologies from the AWS cloud which enable you
More informationWhy NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
More informationAGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
More informationManaging Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
More informationBig Data & Cloud Computing. Faysal Shaarani
Big Data & Cloud Computing Faysal Shaarani Agenda Business Trends in Data What is Big Data? Traditional Computing Vs. Cloud Computing Snowflake Architecture for the Cloud Business Trends in Data Critical
More informationReal Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
More information5 Signs You Might Be Outgrowing Your MySQL Data Warehouse*
Whitepaper 5 Signs You Might Be Outgrowing Your MySQL Data Warehouse* *And Why Vertica May Be the Right Fit Like Outgrowing Old Clothes... Most of us remember a favorite pair of pants or shirt we had as
More informationScalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
More informationUnlock your data for fast insights: dimensionless modeling with in-memory column store. By Vadim Orlov
Unlock your data for fast insights: dimensionless modeling with in-memory column store By Vadim Orlov I. DIMENSIONAL MODEL Dimensional modeling (also known as star or snowflake schema) was pioneered by
More informationAnalytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world
Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3
More informationExadata in the Retail Sector
Exadata in the Retail Sector Jon Mead Managing Director - Rittman Mead Consulting Agenda Introduction Business Problem Approach Design Considerations Observations Wins Summary Q&A What it is not... Introductions
More informationBig Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
More informationSQream Technologies Ltd - Confiden7al
SQream Technologies Ltd - Confiden7al 1 Ge#ng Big Data Done On a GPU- Based Database Ori Netzer VP Product 26- Mar- 14 Analy7cs Performance - 3 TB, 18 Billion records SQream Database 400x More Cost Efficient!
More informationZynga Analytics Leveraging Big Data to Make Games More Fun and Social
Connecting the World Through Games Zynga Analytics Leveraging Big Data to Make Games More Fun and Social Daniel McCaffrey General Manager, Platform and Analytics Engineering World s leading social game
More informationPerformance Management in Big Data Applica6ons. Michael Kopp, Technology Strategist @mikopp
Performance Management in Big Data Applica6ons Michael Kopp, Technology Strategist NoSQL: High Volume/Low Latency DBs Web Java Key Challenges 1) Even Distribu6on 2) Correct Schema and Access paperns 3)
More informationQLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM
QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment
More informationBig Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth
MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com
More informationInnovative technology for big data analytics
Technical white paper Innovative technology for big data analytics The HP Vertica Analytics Platform database provides price/performance, scalability, availability, and ease of administration Table of
More informationData Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com
Data Warehousing and Analytics Infrastructure at Facebook Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com Overview Challenges in a Fast Growing & Dynamic Environment Data Flow Architecture,
More informationBeyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.
Beyond Web Application Log Analysis using Apache TM Hadoop A Whitepaper by Orzota, Inc. 1 Web Applications As more and more software moves to a Software as a Service (SaaS) model, the web application has
More informationSystem Requirements Table of contents
Table of contents 1 Introduction... 2 2 Knoa Agent... 2 2.1 System Requirements...2 2.2 Environment Requirements...4 3 Knoa Server Architecture...4 3.1 Knoa Server Components... 4 3.2 Server Hardware Setup...5
More informationhmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau
Powered by Vertica Solution Series in conjunction with: hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau The cost of healthcare in the US continues to escalate. Consumers, employers,
More informationTesting 3Vs (Volume, Variety and Velocity) of Big Data
Testing 3Vs (Volume, Variety and Velocity) of Big Data 1 A lot happens in the Digital World in 60 seconds 2 What is Big Data Big Data refers to data sets whose size is beyond the ability of commonly used
More informationSisense. Product Highlights. www.sisense.com
Sisense Product Highlights Introduction Sisense is a business intelligence solution that simplifies analytics for complex data by offering an end-to-end platform that lets users easily prepare and analyze
More informationHow, What, and Where of Data Warehouses for MySQL
How, What, and Where of Data Warehouses for MySQL Robert Hodges CEO, Continuent. Introducing Continuent The leading provider of clustering and replication for open source DBMS Our Product: Continuent Tungsten
More informationBig Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect
on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze
More informationWhere We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344
Where We Are Introduction to Data Management CSE 344 Lecture 25: DBMS-as-a-service and NoSQL We learned quite a bit about data management see course calendar Three topics left: DBMS-as-a-service and NoSQL
More informationStudent Project 1 - Explorative Data Analysis with Hadoop and Spark
Student Project 1 - Explorative Data Analysis with Hadoop and Spark 42matters is a rapidly growing start up, leading the development of next generation mobile user modeling technology. Our solutions are
More informationApache Kylin Introduction Dec 8, 2014 @ApacheKylin
Apache Kylin Introduction Dec 8, 2014 @ApacheKylin Luke Han Sr. Product Manager lukhan@ebay.com @lukehq Yang Li Architect & Tech Leader yangli9@ebay.com Agenda What s Apache Kylin? Tech Highlights Performance
More informationExadata for Oracle DBAs. Longtime Oracle DBA
Exadata for Oracle DBAs Longtime Oracle DBA Why this Session? I m an Oracle DBA Familiar with RAC, 11gR2 and ASM About to become a Database Machine Administrator (DMA) How much do I have to learn? How
More informationSession 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges
Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges James Campbell Corporate Systems Engineer HP Vertica jcampbell@vertica.com Big
More informationResource Sizing: Spotfire for AWS
Resource Sizing: for AWS With TIBCO for AWS, you can have the best in analytics software available at your fingertips in just a few clicks. On a single Amazon Machine Image (AMI), you get a multi-user
More informationOracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
More informationDatabase Design Patterns. Winter 2006-2007 Lecture 24
Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each
More informationSearch Big Data with MySQL and Sphinx. Mindaugas Žukas www.ivinco.com
Search Big Data with MySQL and Sphinx Mindaugas Žukas www.ivinco.com Agenda Big Data Architecture Factors and Technologies MySQL and Big Data Sphinx Search Server overview Case study: building a Big Data
More information2010 Ingres Corporation. Interactive BI for Large Data Volumes Silicon India BI Conference, 2011, Mumbai Vivek Bhatnagar, Ingres Corporation
Interactive BI for Large Data Volumes Silicon India BI Conference, 2011, Mumbai Vivek Bhatnagar, Ingres Corporation Agenda Need for Fast Data Analysis & The Data Explosion Challenge Approaches Used Till
More informationAn Architectural Review Of Integrating MicroStrategy With SAP BW
An Architectural Review Of Integrating MicroStrategy With SAP BW Manish Jindal MicroStrategy Principal HCL Objectives To understand how MicroStrategy integrates with SAP BW Discuss various Design Options
More informationAnalyzing Big Data with AWS
Analyzing Big Data with AWS Peter Sirota, General Manager, Amazon Elastic MapReduce @petersirota What is Big Data? Computer generated data Application server logs (web sites, games) Sensor data (weather,
More informationDelivering Intelligence to Publishers Through Big Data
Delivering Intelligence to Publishers Through Big Data 2015-05- 21 Jonathan Sharley Team Lead, Data Operations www.sovrn.com Who is Sovrn? Ø An advertising network with direct relationships to 20,000+
More informationMark Bennett. Search and the Virtual Machine
Mark Bennett Search and the Virtual Machine Agenda Intro / Business Drivers What to do with Search + Virtual What Makes Search Fast (or Slow!) Virtual Platforms Test Results Trends / Wrap Up / Q & A Business
More informationEnabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings
Solution Brief Enabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings Introduction Accelerating time to market, increasing IT agility to enable business strategies, and improving
More informationMoving From Hadoop to Spark
+ Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com sujee@elephantscale.com Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee
More informationHadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh
1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets
More informationThe Arts & Science of Tuning HANA models for Performance. Abani Pattanayak, SAP HANA CoE Nov 12, 2015
The Arts & Science of Tuning HANA models for Performance Abani Pattanayak, SAP HANA CoE Nov 12, 2015 Disclaimer This presentation outlines our general product direction and should not be relied on in making
More informationHP Vertica and MicroStrategy 10: a functional overview including recommendations for performance optimization. Presented by: Ritika Rahate
HP Vertica and MicroStrategy 10: a functional overview including recommendations for performance optimization Presented by: Ritika Rahate MicroStrategy Data Access Workflows There are numerous ways for
More informationKNIME & Avira, or how I ve learned to love Big Data
KNIME & Avira, or how I ve learned to love Big Data Facts about Avira (AntiVir) 100 mio. customers Extreme Reliability 500 employees (Tettnang, San Francisco, Kuala Lumpur, Bucharest, Amsterdam) Company
More informationINTRODUCING DRUID: FAST AD-HOC QUERIES ON BIG DATA MICHAEL DRISCOLL - CEO ERIC TSCHETTER - LEAD ARCHITECT @ METAMARKETS
INTRODUCING DRUID: FAST AD-HOC QUERIES ON BIG DATA MICHAEL DRISCOLL - CEO ERIC TSCHETTER - LEAD ARCHITECT @ METAMARKETS MICHAEL E. DRISCOLL CEO @ METAMARKETS - @MEDRISCOLL Metamarkets is the bridge from
More informationPreview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.
Preview of Oracle Database 12c In-Memory Option 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any
More informationParquet. Columnar storage for the people
Parquet Columnar storage for the people Julien Le Dem @J_ Processing tools lead, analytics infrastructure at Twitter Nong Li nong@cloudera.com Software engineer, Cloudera Impala Outline Context from various
More informationThe Technology Evaluator s Cheat Sheets. Business Intelligence & Analy:cs
The Technology Evaluator s Cheat Sheets Business Intelligence & Analy:cs Summary So1ware Stacks Full Stacks (DB + ETL Tools + Front- End So1ware) Back- End Stacks (DB and/or ETL Tools Only) Front- End
More informationBig Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
More informationTiber Solutions. Understanding the Current & Future Landscape of BI and Data Storage. Jim Hadley
Tiber Solutions Understanding the Current & Future Landscape of BI and Data Storage Jim Hadley Tiber Solutions Founded in 2005 to provide Business Intelligence / Data Warehousing / Big Data thought leadership
More informationIntroduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
More informationAmerica s Most Wanted a metric to detect persistently faulty machines in Hadoop
America s Most Wanted a metric to detect persistently faulty machines in Hadoop Dhruba Borthakur and Andrew Ryan dhruba,andrewr1@facebook.com Presented at IFIP Workshop on Failure Diagnosis, Chicago June
More informationMonitor and Manage Your MicroStrategy BI Environment Using Enterprise Manager and Health Center
Monitor and Manage Your MicroStrategy BI Environment Using Enterprise Manager and Health Center Presented by: Dennis Liao Sales Engineer Zach Rea Sales Engineer January 27 th, 2015 Session 4 This Session
More informationSAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013
SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase
More informationBringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
More informationA Comparison of Approaches to Large-Scale Data Analysis
A Comparison of Approaches to Large-Scale Data Analysis Sam Madden MIT CSAIL with Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel Abadi, David DeWitt, and Michael Stonebraker In SIGMOD 2009 MapReduce
More informationUsing Hadoop to Expand Data Warehousing
Using Hadoop to Expand Data Warehousing Mike Peterson VP of Platforms and Data Architecture, Neustar Feb 28, 2013 1 Copyright Think Big Analytics and Neustar Inc. Why do this? Transforming to an Information
More informationBringing Big Data Modelling into the Hands of Domain Experts
Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the
More informationWINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS
WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS Managing and analyzing data in the cloud is just as important as it is anywhere else. To let you do this, Windows Azure provides a range of technologies
More informationOpen source Google-style large scale data analysis with Hadoop
Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical
More informationData Warehouse and Business Intelligence Testing: Challenges, Best Practices & the Solution
Warehouse and Business Intelligence : Challenges, Best Practices & the Solution Prepared by datagaps http://www.datagaps.com http://www.youtube.com/datagaps http://www.twitter.com/datagaps Contact contact@datagaps.com
More informationMicrosoft Analytics Platform System. Solution Brief
Microsoft Analytics Platform System Solution Brief Contents 4 Introduction 4 Microsoft Analytics Platform System 5 Enterprise-ready Big Data 7 Next-generation performance at scale 10 Engineered for optimal
More informationIntegrating Apache Spark with an Enterprise Data Warehouse
Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software
More informationJun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC
Jun Liu, Senior Software Engineer Bianny Bian, Engineering Manager SSG/STO/PAC Agenda Quick Overview of Impala Design Challenges of an Impala Deployment Case Study: Use Simulation-Based Approach to Design
More informationTuning Microsoft SQL Server for SharePoint. Daniel Glenn
Tuning Microsoft SQL Server for SharePoint Daniel Glenn Daniel Glenn @DanielGlenn http://knowsp.com SharePoint and Collaboration Practice Leader @ InfoWorks, Inc. www.infoworks-tn.com PASS Nashville Business
More informationData Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team
Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team Data Warehouse we used to know High-End workload High-End hardware Special know-how
More informationLecture Data Warehouse Systems
Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores
More informationOpen Source Technologies on Microsoft Azure
Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions
More informationHadoop & its Usage at Facebook
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System dhruba@apache.org Presented at the The Israeli Association of Grid Technologies July 15, 2009 Outline Architecture
More informationOpen source large scale distributed data management with Google s MapReduce and Bigtable
Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory
More informationBringing Big Data into the Enterprise
Bringing Big Data into the Enterprise Overview When evaluating Big Data applications in enterprise computing, one often-asked question is how does Big Data compare to the Enterprise Data Warehouse (EDW)?
More informationRealtime Apache Hadoop at Facebook. Jonathan Gray & Dhruba Borthakur June 14, 2011 at SIGMOD, Athens
Realtime Apache Hadoop at Facebook Jonathan Gray & Dhruba Borthakur June 14, 2011 at SIGMOD, Athens Agenda 1 Why Apache Hadoop and HBase? 2 Quick Introduction to Apache HBase 3 Applications of HBase at
More informationOracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.
Oracle BI EE Implementation on Netezza Prepared by SureShot Strategies, Inc. The goal of this paper is to give an insight to Netezza architecture and implementation experience to strategize Oracle BI EE
More informationHow to Leverage Cloud to Quickly Build Scalable Applications
How to Leverage Cloud to Quickly Build Scalable Applications Chris Keyser Principal Solution Architect David Polley Senior Director Cloud Product Management Cloud Growth Recent IDC cloud research shows
More informationData processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
More informationIncreasing XenServer s VM density
Increasing XenServer s VM density Jonathan Davies, XenServer System Performance Lead XenServer Engineering, Citrix Cambridge, UK 24 Oct 2013 Jonathan Davies (Citrix) Increasing XenServer s VM density 24
More informationReplicating to everything
Replicating to everything Featuring Tungsten Replicator A Giuseppe Maxia, QA Architect Vmware About me Giuseppe Maxia, a.k.a. "The Data Charmer" QA Architect at VMware Previously at AB / Sun / 3 times
More informationBig Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016
Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible
More informationGoogle Analytics and Google Analytics Premium: limits and quotas
Table Of Contents Data collection & Processing limits Accounts and Profiles Reports Admin Area Google Analytics data fields Lengths Google Analytics API Data collection & Processing limits 10 million hits
More informationTesting Big data is one of the biggest
Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing
More informationAn Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database
An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct
More informationBIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014
BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014 Ralph Kimball Associates 2014 The Data Warehouse Mission Identify all possible enterprise data assets Select those assets
More informationThe Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
More informationInge Os Sales Consulting Manager Oracle Norway
Inge Os Sales Consulting Manager Oracle Norway Agenda Oracle Fusion Middelware Oracle Database 11GR2 Oracle Database Machine Oracle & Sun Agenda Oracle Fusion Middelware Oracle Database 11GR2 Oracle Database
More informationCollege of Engineering, Technology, and Computer Science
College of Engineering, Technology, and Computer Science Design and Implementation of Cloud-based Data Warehousing In partial fulfillment of the requirements for the Degree of Master of Science in Technology
More information