Big Data is Dead, Long Live Business Intelligence?
|
|
- Scott Garrison
- 7 years ago
- Views:
Transcription
1 berlin Big Data is Dead, Long Live Business Intelligence? Michael Muckel, Head of Data Platform Markus Schmidberger, Data Platform Architect Berlin, April 12 th , Amazon Web s, Inc. or its Affiliates. All rights reserved.
2 Glomex: A ProSiebenSat.1 company Page 2
3 Glomex The Global Media Exchange Glomex Video Value Platform Publishers Content providers Media Delivery Platform Non-P7S1 publishers External broadcasters Web-only content owners Media Exchange Platform Page 3
4 Glomex Data Platform Video Value Platform Media Delivery Platform Media Exchange Platform Data Platform Real-time-Monitoring Batch Analytics Machine Learning Page 4
5 Key Components of our New Data Platform Real-Time Monitoring Enable our development teams to serve our content to our users in the best quality possible. Analytics Provide our teams access to the data to enable data-driven development of new features and products. Content Discovery Find the most relevant content for our customers and their users. Page 5
6 Lambda Architecture AWS Lambda Graphic provided by Page 6
7 Simplify Data Processing data ingest / collect store process / analyze visualize / serve answers Time to Answer (Latency) Throughput Cost more concrete numbers at the end Page 7
8 Data Processing in Big Data World Collect Store ETL Analyze Consume IoT Logging Applications ios Web Apps Mobile Apps Logstash A Android Transactional Data Search Data File Data Stream Data Search SQL NoSQL Cache File Storage Stream Storage Amazon ElastiCache Amazon DynamoDB Amazon RDS Amazon ES Amazon S3 Amazon Glacier Apache Kafka Amazon Kinesis Amazon DynamoDB Hot Warm Hot Cold Hot ML Stream Processing Batch Interactive Amazon ML Amazon Redshift Impala Pig Streaming Amazon Kinesis AWS Lambda Amazon Elastic MapReduce Fast Slow Fast Analysis & Visualization Notebooks IDE Predictions Amazon QuickSight Apps & APIs Page 8
9 Our Data Platform Architecture Data Platform - Micro Layout Content API Content Content Discovery other modules CDN files CDN Log Data Management Metadata KPI & Analytics Data API Portal data stream data stream AdProxy Log VAS Log Data Lake Data Layer Technical Monitoring Real-Time Dashboards data stream Player Feedback Data Quality Dev / Ops Analytics Data Platform Access Team External Data Data Science Analytics Data Science UI Data Platform Monitoring INGEST STORE PROCESS & ANALYSE VISUALIZE & SERVE Page 9
10 Real-Time Player Monitoring Data Platform - Micro Layout Content API Content Content Discovery other modules CDN files CDN Log Data Management Metadata KPI & Analytics Data API Portal data stream data stream AdProxy Log VAS Log Data Lake Data Layer Technical Monitoring Real-Time Dashboards data stream Player Feedback Data Quality Dev / Ops Analytics Data Platform Access Team External Data Data Science Analytics Data Science UI Data Platform Monitoring INGEST STORE PROCESS & ANALYSE VISUALIZE & SERVE Page 10
11 Monitoring Video-Streaming Experience Focus on Metrics from the User s Perspective From Server-Uptime To (anonymized) Real-User Monitoring Page 11
12 1 Analyze 3 Automate 2 Take Actions Page 12
13 Our Ingest Process Page 13
14 Kinesis Firehose is doing his job Next session: Streaming Data: The Opportunity and How to Work With It Page 14
15 Data Facts 20 GB 5 Billion ~100 ms Per day click-stream data in Kinesis Firehose Record processed per day Data freshness to S3 Page 15
16 ElasticSearch + Grafana for real-time analyses Not AWS managed! Page 16
17 ElasticSearch on Spot Instances Page 17
18 CDN Batch Processing Data Platform - Micro Layout Content API Content Content Discovery other modules CDN files CDN Log Data Management Metadata KPI & Analytics Data API Portal data stream data stream AdProxy Log VAS Log Data Lake Data Layer Technical Monitoring Real-Time Dashboards data stream Player Feedback Data Quality Dev / Ops Analytics Data Platform Access Team External Data Data Science Analytics Data Science UI Data Platform Monitoring INGEST STORE PROCESS & ANALYSE VISUALIZE & SERVE Page 18
19 Processing CDN-Logs 25 GB 300 Million Per day as zipped log-files Record processed per day + Normal challenges with external data sources Out-of-order deliver / Data quality issues / Varying file sizes / etc. Page 19
20 Requirements for our Data Processing Pipeline Monitor Complete Pipeline Enable Reprocessing of Historical Datasets Be Ready to Scale Page 20
21 Our CDN Pipeline Page 21
22 AWS Lambda Limits 5 min 512 MB AWS Lambda Timeout AWS Lambda temp disk How to process 800MB gziped logfile? How to split compressed gzip files? Splitter using Amazon SQS and Amazon EC2 Spot Instances Page 22
23 Our Meta Data Store AWS Big Data Blog: Maintaining-an-Amazon-S3-Metadata-Index-without-Servers
24 Our Meta Data Store Page 24
25 Be serverless and serve data Amazon Kinesis AWS Lambda AWS Lambda Amazon API Gateway Page 25
26 CDN Batch Facts 2.3 min 600 rec/sec 6 1 $ / hour Average run-time of AWS Lambda Processing time Parallel AWS Lambda functions Cost for 25 GB/day CDN processing AWS Lambda duration Redshift CPU Page 26
27 Data Science Environment Data Platform - Micro Layout Content API Content Content Discovery other modules CDN files CDN Log Data Management Metadata KPI & Analytics Data API Portal data stream data stream AdProxy Log VAS Log Data Lake Data Layer Technical Monitoring Real-Time Dashboards data stream Player Feedback Data Quality Dev / Ops Analytics Data Platform Access Team External Data Data Science Analytics Data Science UI Data Platform Monitoring INGEST STORE PROCESS & ANALYSE VISUALIZE & SERVE Page 27
28 Data Science Environment Project Jupyter: Page 28
29 Data Science Environment - Architecture Data Sources Amazon Kinesis Amazon Redshift Amazon S3 Elasticsearch Cluster Technology Amazon EMR In development Development Github In development Page 29
30 Our Lambda Architecture on AWS Data Platform - Lambda Architecture Batch Layer other player modules CDN files Amazon Redshift AWS Lambda Amazon API Gateway Portal AWS Lambda Amazon Elastic MapReduce + Spark Serving Layer EC2 with Caravel S3 EC2 with Jupyther Team data stream Instance with Kinesis Agent Amazon KinesisFirehose AWS Lambda EC2 with ElasticSearch EC2 with Grafana Speed Layer Applications Page 30
31 Key Takeaways Lambda Architecture Enrich your traditional, batch-driven BI-workflow with real-time analytics Use Lambda-Architecture as a guiding principle and adapt it to your needs Page 31
32 Key Takeaways Focus on features development and robust pipelines not on infrastructure management AWS managed services provide an robust way to run complex big data infrastructures Follow best-practices provided by AWS and the community Page 32
33 Key Takeaways Provide an open data environments Trust the creativity of your engineering teams to find insights in your datasets Structure your data that it can be access in processed and raw form Notebooks provide easy access to even large distributed datasets Page 33
34 Michael Muckel, Head of Data Platform Markus Schmidberger, Data Platform Architect We are hiring Data Scientists Data Engineers Project Managers
SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES
SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES AWS GLOBAL INFRASTRUCTURE 10 Regions 25 Availability Zones 51 Edge locations WHAT
More informationHadoop & Spark Using Amazon EMR
Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?
More informationCAPTURING & PROCESSING REAL-TIME DATA ON AWS
CAPTURING & PROCESSING REAL-TIME DATA ON AWS @ 2015 Amazon.com, Inc. and Its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent
More informationReal Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
More informationBig Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect
on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze
More informationCloud Big Data Architectures
Cloud Big Data Architectures Lynn Langit QCon Sao Paulo, Brazil 2016 About this Workshop Real-world Cloud Scenarios w/aws, Azure and GCP 1. Big Data Solution Types 2. Data Pipelines 3. ETL and Visualization
More informationIntegrating a Big Data Platform into Government:
Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government
More informationBig Data Web Analytics Platform on AWS for Yottaa
Big Data Web Analytics Platform on AWS for Yottaa Background Yottaa is a young, innovative company, providing a website acceleration platform to optimize Web and mobile applications and maximize user experience,
More informationMicroservices on AWS
Microservices on AWS AWS Summit Berlin 2016 Matthias Jung, Solutions Architect Julien Simon, Evangelist April, 12 th, 2016 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationChukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
More informationThing Big: How to Scale Your Own Internet of Things. Walter'Pernstecher'-'pernstec@amazon.de' Dr.'Markus'Schmidberger'-'schmidbe@amazon.
Thing Big: How to Scale Your Own Internet of Things Walter'Pernstecher'-'pernstec@amazon.de' Dr.'Markus'Schmidberger'-'schmidbe@amazon.de' Internet of Things is the network of physical objects or "things"
More informationBERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
BERLIN 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Build Your Mobile App Faster with AWS Mobile Services Jan Metzner AWS Solutions Architect @janmetzner Danilo Poccia AWS Technical
More informationDatenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
More informationSTREAM ANALYTIX. Industry s only Multi-Engine Streaming Analytics Platform
STREAM ANALYTIX Industry s only Multi-Engine Streaming Analytics Platform One Platform for All Create real-time streaming data analytics applications in minutes with a powerful visual editor Get a wide
More informationCIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.
CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Cloud Computing and Amazon Web Services Cloud Computing Amazon
More informationAmazon Redshift & Amazon DynamoDB Michael Hanisch, Amazon Web Services Erez Hadas-Sonnenschein, clipkit GmbH Witali Stohler, clipkit GmbH 2014-05-15
Amazon Redshift & Amazon DynamoDB Michael Hanisch, Amazon Web Services Erez Hadas-Sonnenschein, clipkit GmbH Witali Stohler, clipkit GmbH 2014-05-15 2014 Amazon.com, Inc. and its affiliates. All rights
More informationThe Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
More informationHDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
More informationTalend Real-Time Big Data Sandbox. Big Data Insights Cookbook
Talend Real-Time Big Data Talend Real-Time Big Data Overview of Real-time Big Data Pre-requisites to run Setup & Talend License Talend Real-Time Big Data Big Data Setup & About this cookbook What is the
More informationIntroduction to AWS in Higher Ed
Introduction to AWS in Higher Ed Lori Clithero loricli@amazon.com 206.227.5054 University of Washington Cloud Day 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 2 Cloud democratizes
More informationBig data blue print for cloud architecture
Big data blue print for cloud architecture -COGNIZANT Image Area Prabhu Inbarajan Srinivasan Thiruvengadathan Muralicharan Gurumoorthy Praveen Codur 2012, Cognizant Next 30 minutes Big Data / Cloud challenges
More informationAmazon Web Services. Lawrence Berkeley LabTech Conference 9/10/15. Jamie Baker Federal Scientific Account Manager AWS WWPS bakjames@amazon.
Web Services Lawrence Berkeley LabTech Conference 9/10/15 Jamie Baker Federal Scientific Account Manager AWS WWPS bakjames@amazon.com 2015, Web Services, Inc. or its Affiliates. All rights reserved. AWS
More informationBig Data Use Case: Business Analytics
Big Data Use Case: Business Analytics Starting point A telecommunications company wants to allude to the topic of Big Data. The established Big Data working group has access to the data stock of the enterprise
More informationEnd to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ
End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,
More informationBIG DATA & DATA SCIENCE
BIG DATA & DATA SCIENCE ACADEMY PROGRAMS IN-COMPANY TRAINING PORTFOLIO 2 TRAINING PORTFOLIO 2016 Synergic Academy Solutions BIG DATA FOR LEADING BUSINESS Big data promises a significant shift in the way
More informationMaking big data simple with Databricks
Making big data simple with Databricks We are Databricks, the company behind Spark Founded by the creators of Apache Spark in 2013 Data 75% Share of Spark code contributed by Databricks in 2014 Value Created
More informationNext-Generation Cloud Analytics with Amazon Redshift
Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional
More informationTHE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS
THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS WHITE PAPER Successfully writing Fast Data applications to manage data generated from mobile, smart devices and social interactions, and the
More informationBIG DATA. Using the Lambda Architecture on a Big Data Platform to Improve Mobile Campaign Management. Author: Sandesh Deshmane
BIG DATA Using the Lambda Architecture on a Big Data Platform to Improve Mobile Campaign Management Author: Sandesh Deshmane Executive Summary Growing data volumes and real time decision making requirements
More informationHadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
More informationBackground on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros
David Moses January 2014 Paper on Cloud Computing I Background on Tools and Technologies in Amazon Web Services (AWS) In this paper I will highlight the technologies from the AWS cloud which enable you
More informationBig Data Research in the AMPLab: BDAS and Beyond
Big Data Research in the AMPLab: BDAS and Beyond Michael Franklin UC Berkeley 1 st Spark Summit December 2, 2013 UC BERKELEY AMPLab: Collaborative Big Data Research Launched: January 2011, 6 year planned
More informationDashboard Engine for Hadoop
Matt McDevitt Sr. Project Manager Pavan Challa Sr. Data Engineer June 2015 Dashboard Engine for Hadoop Think Big Start Smart Scale Fast Agenda Think Big Overview Engagement Model Solution Offerings Dashboard
More informationBeyond Lambda - how to get from logical to physical. Artur Borycki, Director International Technology & Innovations
Beyond Lambda - how to get from logical to physical Artur Borycki, Director International Technology & Innovations Simplification & Efficiency Teradata believe in the principles of self-service, automation
More informationAnalytics on Spark & Shark @Yahoo
Analytics on Spark & Shark @Yahoo PRESENTED BY Tim Tully December 3, 2013 Overview Legacy / Current Hadoop Architecture Reflection / Pain Points Why the movement towards Spark / Shark New Hybrid Environment
More informationAnalyzing Big Data with AWS
Analyzing Big Data with AWS Peter Sirota, General Manager, Amazon Elastic MapReduce @petersirota What is Big Data? Computer generated data Application server logs (web sites, games) Sensor data (weather,
More informationAutomated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer
Automated Data Ingestion Bernhard Disselhoff Enterprise Sales Engineer Agenda Pentaho Overview Templated dynamic ETL workflows Pentaho Data Integration (PDI) Use Cases Pentaho Overview Overview What we
More informationBig Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth
MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com
More informationBig Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics Platform Using NetflixOSS and Other Open Source Software Sudhir Tonse (@stonse) Danny Yuan (@g9yuayon) Netflix is a log generating company that also happens to stream movies
More informationBIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
More informationFrom Spark to Ignition:
From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for
More informationLambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com
Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...
More informationBuilding 1000 node cluster on EMR Manjeet Chayel
Building 1000 node cluster on EMR Manjeet Chayel What is EMR? Amazon Elas+c MapReduce Hadoop- as- a- service Map- Reduce engine What is EMR? Integrated with tools Massively parallel Integrated to AWS services
More informationHow to Leverage Cloud to Quickly Build Scalable Applications
How to Leverage Cloud to Quickly Build Scalable Applications Chris Keyser Principal Solution Architect David Polley Senior Director Cloud Product Management Cloud Growth Recent IDC cloud research shows
More informationWHITE PAPER. Building Big Data Analytical Applications at Scale Using Existing ETL Skillsets INTELLIGENT BUSINESS STRATEGIES
INTELLIGENT BUSINESS STRATEGIES WHITE PAPER Building Big Data Analytical Applications at Scale Using Existing ETL Skillsets By Mike Ferguson Intelligent Business Strategies June 2015 Prepared for: Table
More informationNative Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
More informationThe Internet of Things and Big Data: Intro
The Internet of Things and Big Data: Intro John Berns, Solutions Architect, APAC - MapR Technologies April 22 nd, 2014 1 What This Is; What This Is Not It s not specific to IoT It s not about any specific
More informationCRITEO INTERNSHIP PROGRAM 2015/2016
CRITEO INTERNSHIP PROGRAM 2015/2016 A. List of topics PLATFORM Topic 1: Build an API and a web interface on top of it to manage the back-end of our third party demand component. Challenge(s): Working with
More informationUnified Batch & Stream Processing Platform
Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built
More informationNative Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
More informationCapitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
More informationSOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera
SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce
More informationOracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
More informationSAP and Hortonworks Reference Architecture
SAP and Hortonworks Reference Architecture Hortonworks. We Do Hadoop. June Page 1 2014 Hortonworks Inc. 2011 2014. All Rights Reserved A Modern Data Architecture With SAP DATA SYSTEMS APPLICATIO NS Statistical
More informationScalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
More informationGanzheitliches Datenmanagement
Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist
More informationBIG DATA ANALYTICS For REAL TIME SYSTEM
BIG DATA ANALYTICS For REAL TIME SYSTEM Where does big data come from? Big Data is often boiled down to three main varieties: Transactional data these include data from invoices, payment orders, storage
More informationBuilding your Big Data Architecture on Amazon Web Services
Building your Big Data Architecture on Amazon Web Services Abhishek Sinha @abysinha sinhaar@amazon.com AWS Services Deployment & Administration Application Services Compute Storage Database Networking
More informationUnlocking the True Value of Hadoop with Open Data Science
Unlocking the True Value of Hadoop with Open Data Science Kristopher Overholt Solution Architect Big Data Tech 2016 MinneAnalytics June 7, 2016 Overview Overview of Open Data Science Python and the Big
More informationDeploy. Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture
Friction-free self-service BI solutions for everyone Scalable analytics on a modern architecture Apps and data source extensions with APIs Future white label, embed or integrate Power BI Deploy Intelligent
More informationBig Data for everyone Democratizing big data with the cloud. Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de
Big Data for everyone Democratizing big data with the cloud Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de Does this Data make me look big? Overview Designing big data solutions in
More informationElasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack
Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper
More informationQLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering
QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...
More informationBig Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016
Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible
More informationInformation Builders Mission & Value Proposition
Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns
More informationSPARK USE CASE IN TELCO. Apache Spark Night 9-2-2014! Chance Coble!
SPARK USE CASE IN TELCO Apache Spark Night 9-2-2014! Chance Coble! Use Case Profile Telecommunications company Shared business problems/pain Scalable analytics infrastructure is a problem Pushing infrastructure
More informationBig Data Trends and HDFS Evolution
Big Data Trends and HDFS Evolution Sanjay Radia Founder & Architect Hortonworks Inc Page 1 Hello Founder, Hortonworks Part of the Hadoop team at Yahoo! since 2007 Chief Architect of Hadoop Core at Yahoo!
More informationIntroduction to Amazon Web Services! Leo Zhadanovsky! @leozh leo@amazon.com! Senior Solutions Architect
Introduction to Amazon Web Services! Leo Zhadanovsky! @leozh leo@amazon.com! Senior Solutions Architect AWS HISTORY About How didamazon Amazon Web Services! Deep experience in building and operating global
More informationUpcoming Announcements
Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within
More informationMoving From Hadoop to Spark
+ Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com sujee@elephantscale.com Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee
More informationLinux A first-class citizen in Windows Azure. Bruno Terkaly bterkaly@microsoft.com Principal Software Engineer Mobile/Cloud/Startup/Enterprise
Linux A first-class citizen in Windows Azure Bruno Terkaly bterkaly@microsoft.com Principal Software Engineer Mobile/Cloud/Startup/Enterprise 1 First, I am software developer (C/C++, ASM, C#, Java, Node.js,
More informationBig Data at Spotify. Anders Arpteg, Ph D Analytics Machine Learning, Spotify
Big Data at Spotify Anders Arpteg, Ph D Analytics Machine Learning, Spotify Quickly about me Quickly about Spotify What is all the data used for? Quickly about Spark Hadoop MR vs Spark Need for (distributed)
More informationInternet of Things. Opportunity Challenges Solutions
Internet of Things Opportunity Challenges Solutions Copyright 2014 Boeing. All rights reserved. GPDIS_2015.ppt 1 ANALYZING INTERNET OF THINGS USING BIG DATA ECOSYSTEM Internet of Things matter for... Industrial
More informationBig Data & Cloud Computing. Faysal Shaarani
Big Data & Cloud Computing Faysal Shaarani Agenda Business Trends in Data What is Big Data? Traditional Computing Vs. Cloud Computing Snowflake Architecture for the Cloud Business Trends in Data Critical
More informationTE's Analytics on Hadoop and SAP HANA Using SAP Vora
TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -
More informationTechnology Enablement
SOLUTION OVERVIEW 1 ABOUT TECHMILEAGE Founded in 2008 / Tempe, Arizona Over 100 engagements Full range of business & technology services Software Development, Big Data, Cloud/AWS, BI, Advanced Analytics
More informationTapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?
More informationA very short talk about Apache Kylin Business Intelligence meets Big Data. Fabian Wilckens EMEA Solutions Architect
A very short talk about Apache Kylin Business Intelligence meets Big Data Fabian Wilckens EMEA Solutions Architect 1 The challenge today 2 Very quickly: OLAP Online Analytical Processing How many beers
More informationBIG DATA TECHNOLOGY. Hadoop Ecosystem
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
More informationHow To Create A Data Visualization With Apache Spark And Zeppelin 2.5.3.5
Big Data Visualization using Apache Spark and Zeppelin Prajod Vettiyattil, Software Architect, Wipro Agenda Big Data and Ecosystem tools Apache Spark Apache Zeppelin Data Visualization Combining Spark
More informationAzure Data Lake Analytics
Azure Data Lake Analytics Compose and orchestrate data services at scale Fully managed service to support orchestration of data movement and processing Connect to relational or non-relational data
More informationBig Data Spatial Analytics An Introduction
2013 Esri International User Conference July 8 12, 2013 San Diego, California Technical Workshop Big Data Spatial Analytics An Introduction Marwa Mabrouk Mansour Raad Esri iu UC2013. Technical Workshop
More informationCLOUD COMPUTING FOR THE ENTERPRISE AND GLOBAL COMPANIES Steve Midgley Head of AWS EMEA
CLOUD COMPUTING FOR THE ENTERPRISE AND GLOBAL COMPANIES Steve Midgley Head of AWS EMEA AWS Introduction Why are enterprises choosing AWS? What are enterprises using AWS for? How are enterprise getting
More informationPulsar Realtime Analytics At Scale. Tony Ng April 14, 2015
Pulsar Realtime Analytics At Scale Tony Ng April 14, 2015 Big Data Trends Bigger data volumes More data sources DBs, logs, behavioral & business event streams, sensors Faster analysis Next day to hours
More informationPredictive Analytics with Storm, Hadoop, R on AWS
Douglas Moore Principal Consultant & Architect February 2013 Predictive Analytics with Storm, Hadoop, R on AWS Leading Provider Data Science and Engineering Services Accelerating Your Time to Value using
More informationProfessional Hadoop Solutions
Brochure More information from http://www.researchandmarkets.com/reports/2542488/ Professional Hadoop Solutions Description: The go-to guidebook for deploying Big Data solutions with Hadoop Today's enterprise
More informationThe Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
More informationUnified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia
Unified Big Data Processing with Apache Spark Matei Zaharia @matei_zaharia What is Apache Spark? Fast & general engine for big data processing Generalizes MapReduce model to support more types of processing
More informationHow To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
More informationBig Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel
Big Data and Analytics: Getting Started with ArcGIS Mike Park Erik Hoel Agenda Overview of big data Distributed computation User experience Data management Big data What is it? Big Data is a loosely defined
More informationBig Data and Industrial Internet
Big Data and Industrial Internet Keijo Heljanko Department of Computer Science and Helsinki Institute for Information Technology HIIT School of Science, Aalto University keijo.heljanko@aalto.fi 16.6-2015
More informationBuilding Cloud-powered Mobile Apps
Building Cloud-powered Mobile Apps Jenny Sun, AWS Solution Architect August 30, 2014 Session Goals Mobile Apps on AWS How to build a mobile app today? Social Logins Geo Tagging File and Data Storage Push
More informationHow Companies are! Using Spark
How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made
More informationBIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
More informationIntroduction to Big Data! with Apache Spark" UC#BERKELEY#
Introduction to Big Data! with Apache Spark" UC#BERKELEY# This Lecture" The Big Data Problem" Hardware for Big Data" Distributing Work" Handling Failures and Slow Machines" Map Reduce and Complex Jobs"
More informationRyan Horn, Lead Software Engineer at Twilio. November 12, 2014 Las Vegas. BDT312 Using the Cloud to Scale from a Database to a Data Platform
BDT312 Using the Cloud to Scale from a Database to a Data Platform Ryan Horn, Lead Software Engineer at Twilio November 12, 2014 Las Vegas 2014 Amazon.com, Inc. and its affiliates. All rights reserved.
More informationTime-Series Databases and Machine Learning
Time-Series Databases and Machine Learning Jimmy Bates November 2017 1 Top-Ranked Hadoop 1 3 5 7 Read Write File System World Record Performance High Availability Enterprise-grade Security Distribution
More informationProductionizing a 24/7 Spark Streaming Service on YARN
Productionizing a 24/7 Spark Streaming Service on YARN Issac Buenrostro, Arup Malakar Spark Summit 2014 July 1, 2014 About Ooyala Cross-device video analytics and monetization products and services Founded
More information