Big Data Web Analytics Platform on AWS for Yottaa


Case Study

Background

Yottaa is a young, innovative company that provides a website acceleration platform to optimize web and mobile applications and maximize user experience, security, and profitability. The platform delivers fast, personalized experiences to users on any device, browser, location, or connection. It also makes it possible to discover each user's browsing context and tailor the app's content and optimization profile to the specific needs of the moment.

Business Challenge

The client needed an Operation Intelligence platform, but the leading tools on the market could not quickly detect issues in system performance and security. The client therefore decided to build the platform in-house and approached SoftServe, renowned for its expertise in complex Big Data solutions and architecture design methods, for help. It was also important to use an Agile approach: innovation, expansion, continuous improvement, and rapid time to market were the key drivers, which required everyone on the team to focus constantly on results.

Big Data Challenges

The project needed to solve several challenges, including:

- High throughput: 1 billion messages per day
- Large volume: an estimated 300 TB
- Near-real-time and batch processing at the same time
- < 1 min event processing latency
- < 3 sec query response time
- Semi-structured data sources (web logs)
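To put the throughput figure above in perspective, the following Python sketch converts the stated 1 billion messages per day into an average per-second rate and into the number of messages that arrive within the 1-minute latency budget. It assumes a perfectly even arrival rate, which real web traffic does not have, so the results are a lower bound for sizing rather than a capacity plan. The function names are illustrative and do not come from the project.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(messages_per_day: float) -> float:
    """Average messages per second, assuming an even arrival rate (an assumption)."""
    return messages_per_day / SECONDS_PER_DAY


def arriving_in_window(messages_per_day: float, window_s: float) -> float:
    """Messages that arrive during one latency window at the average rate."""
    return average_rate(messages_per_day) * window_s


if __name__ == "__main__":
    per_day = 1_000_000_000  # throughput figure from the case study
    print(f"average rate: {average_rate(per_day):,.0f} messages/s")
    # prints "average rate: 11,574 messages/s"
    print(f"arriving per 60 s window: {arriving_in_window(per_day, 60):,.0f}")
    # prints "arriving per 60 s window: 694,444"
```

In other words, the pipeline has to absorb roughly 11,600 messages every second on average, and close to 700,000 new messages arrive during each one-minute latency window.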

Project Description

Importance of PoC (Proof of Concept) in Big Data Projects

Technical risks are an integral part of any Big Data project. High complexity and immature technology are the current reality that software architects and engineers face. Building a full-scale prototype is not realistic because of the high cost and time requirements, while architecture analysis alone is insufficient to prove many important system properties, such as performance and scalability. Instead, an MVP (minimum viable product), throwaway prototypes, and vertical evolutionary prototypes help in areas that architecture analysis cannot sufficiently address. For this project, a throwaway prototype (also known as a rapid prototype or proof of concept) was chosen to quickly evaluate the riskiest technology selections, and an MVP was chosen to get early feedback from end users and update the product roadmap accordingly.

AWS as an Environment for PoC

With no hardware to procure and no infrastructure to maintain and scale, Amazon Web Services is an extremely effective time-to-market accelerator. It was the perfect platform for this project, particularly because the exact software and hardware requirements were not yet clear at the early stages. Particular benefits of the platform were:

1. The opportunity to experiment with software and hardware while looking for the appropriate solution (the environment can be provisioned and deprovisioned in just a few clicks).
2. Tight integration between S3 and EMR, which enabled an on-demand Hadoop cluster (Amazon EMR) and resulted in significant cost savings. Hadoop itself has two functions, storage and computation, but with S3 fully covering storage, there was no need to keep the Hadoop cluster up and running when the computing function was not in use. Because the web logs were stored on S3, it was possible to terminate the Hadoop cluster without data loss and launch it again only when required.

Elasticsearch as a Platform for Dashboarding

During the discovery phase, SoftServe's Big Data experts suggested Elasticsearch as the primary data store for real-time analytics. The goal was to implement near-real-time scenarios with high query performance (< 3 sec) and minimal data latency (< 1 min) in a highly concurrent environment. The PoC phase was important for mitigating performance risks when using Aggregations, a new functionality in Elasticsearch, and focused primarily on three tasks:

Task 1. Quickly populate Elasticsearch with log data, instead of fully integrating with the existing infrastructure, which typically takes much longer.
Task 2. Discover the optimal hardware and configuration for Elasticsearch and tune it for the required workload.
Task 3. Create an interactive visualization for testing and demos.

The SoftServe team used EMR and the Elasticsearch-Hadoop driver for Task 1. Pig and Hive were used to parse log data and load it from S3 into Elasticsearch (see diagram below), with each Hive external table pointing to an Elasticsearch index.

For Task 2, the team experimented with different EC2 instance types (general purpose, storage optimized, compute optimized, etc.) and block storage options (SSD and HDD drives, instance storage, EBS with provisioned IOPS, etc.). Performance and load tests showed that the CPU was the bottleneck for the required workload, so compute-optimized instances (the c4 family) were selected as the base EC2 instance type for the Elasticsearch cluster.

For Task 3, an initial version of the interactive dashboard was created with Kibana in less than a day.

From PoC to Production

While some of SoftServe's Big Data experts were evaluating Elasticsearch, using EMR as a processing engine to load data from S3 into Elasticsearch (the ETL approach), others were implementing a Flume -> Kafka -> log consumers processing pipeline for the near-real-time scenario. The Elasticsearch mappings (data model) were agreed at the beginning, which allowed work on the Elasticsearch PoC and on the near-real-time scenario to proceed in parallel. This enabled the MVP and the pre-production system to be created in a short timeframe.

Next Steps

Amazon EMR already supports Apache Spark, a powerful computing engine that can supplement the Operation Intelligence platform by introducing stream processing through Spark Streaming and advanced analytics (Spark MLlib) out of the box ( new-apache-spark-on-amazon-emr/). Now that Spark fully supports the Kafka Direct Approach, it has everything required for seamless integration with the system, so the SoftServe team can leverage all the advantages of AWS for the future Lambda Architecture.
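As a rough illustration of the cluster-on-demand pattern described for S3 and EMR, the Python sketch below builds the request for a transient EMR cluster: it runs one Hive step and then shuts itself down, while the logs stay on S3. This is not the project's actual configuration. The cluster name, release label, instance types, instance count, bucket, and script path are all hypothetical placeholders. The dictionary keys follow the EMR RunJobFlow API as exposed by boto3's `run_job_flow`.

```python
def transient_emr_request(log_bucket: str, script_uri: str) -> dict:
    """Build keyword arguments for boto3's EMR client method run_job_flow.

    All concrete values (names, release label, instance types, counts)
    are illustrative assumptions, not taken from the case study.
    """
    return {
        "Name": "weblog-etl-on-demand",        # hypothetical cluster name
        "ReleaseLabel": "emr-5.36.0",          # assumed EMR release
        "Applications": [{"Name": "Hive"}, {"Name": "Pig"}],
        "LogUri": f"s3://{log_bucket}/emr-logs/",
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",   # assumed instance type
            "InstanceCount": 3,                 # assumed cluster size
            # Terminate the cluster once all steps finish: storage lives
            # on S3, so nothing is lost when the cluster goes away.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "load-logs-into-elasticsearch",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "hive-script", "--run-hive-script",
                        "--args", "-f", script_uri,
                    ],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",   # default EMR instance profile
        "ServiceRole": "EMR_DefaultRole",       # default EMR service role
    }


if __name__ == "__main__":
    request = transient_emr_request(
        "example-weblogs",                            # hypothetical bucket
        "s3://example-weblogs/scripts/load_logs.hql",  # hypothetical script
    )
    print(request["Instances"]["KeepJobFlowAliveWhenNoSteps"])
```

With AWS credentials configured, the request could be submitted with `boto3.client("emr").run_job_flow(**request)`. The key setting is `KeepJobFlowAliveWhenNoSteps`: with it set to `False`, compute is paid for only while the step runs.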

Value Delivered

AWS proved to be an extremely effective time-to-market accelerator for the Operation Intelligence platform because no time was required for infrastructure deployment and configuration. The pay-as-you-go model allows costs to be optimized and provides the flexibility to increase capacity over time as needed.

SoftServe has continued helping the client with implementation and technology advice. The first production version has been successfully released, and the system is now evolving with new features.

SoftServe's Big Data accelerators, experienced team, and effective cloud technologies, along with the client's strong product vision, were crucial to the project's success. The prototyping approach, from PoC to MVP to full-featured production, once again showed its effectiveness and allowed the agreed business and technical goals to be achieved.

About SoftServe

SoftServe is a leading technology solutions company specializing in software development and consultancy services. Since 1993 we've been partnering with organizations from start-ups to large enterprises to help them accelerate growth and innovation, transform operational efficiency, and deliver new products to market. To achieve this we've built a strong team of the brightest, most inquiring minds in the industry, and we form close, collaborative relationships with our clients so we can really understand their needs and deliver intuitive software that exceeds their expectations. Our experience stretches from Big Data/Analytics, Cloud, Security and UX Design to the Internet of Things, Digital Health and Digital Transformation. We have offices across the globe and development centers across Eastern Europe.

Offices: USA (HQ), Ukraine (HQ), Bulgaria, Germany, Netherlands, Poland, UK
Email: info@softserveinc.com


Big Data for everyone Democratizing big data with the cloud. Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de Big Data for everyone Democratizing big data with the cloud Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de Does this Data make me look big? Overview Designing big data solutions in

More information

Qlik UKI Consulting Services Catalogue

Qlik UKI Consulting Services Catalogue Qlik UKI Consulting Services Catalogue The key to a successful Qlik project lies in the right people, the right skills, and the right activities in the right order www.qlik.co.uk Table of Contents Introduction

More information

Big Data Infrastructure at Spotify

Big Data Infrastructure at Spotify Big Data Infrastructure at Spotify Wouter de Bie Team Lead Data Infrastructure June 12, 2013 2 Agenda Let s talk about Data Infrastructure, how we did it, what we learned and how we ve failed Some Context

More information

Einsatzfelder von IBM PureData Systems und Ihre Vorteile.

Einsatzfelder von IBM PureData Systems und Ihre Vorteile. Einsatzfelder von IBM PureData Systems und Ihre Vorteile demirkaya@de.ibm.com Agenda Information technology challenges PureSystems and PureData introduction PureData for Transactions PureData for Analytics

More information

Create and Drive Big Data Success Don t Get Left Behind

Create and Drive Big Data Success Don t Get Left Behind Create and Drive Big Data Success Don t Get Left Behind The performance boost from MapR not only means we have lower hardware requirements, but also enables us to deliver faster analytics for our users.

More information

A Unified View of Network Monitoring. One Cohesive Network Monitoring View and How You Can Achieve It with NMSaaS

A Unified View of Network Monitoring. One Cohesive Network Monitoring View and How You Can Achieve It with NMSaaS A Unified View of Network Monitoring One Cohesive Network Monitoring View and How You Can Achieve It with NMSaaS Executive Summary In the past few years, the enterprise computing technology has changed

More information

Dominik Wagenknecht Accenture

Dominik Wagenknecht Accenture Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna

More information

Making big data simple with Databricks

Making big data simple with Databricks Making big data simple with Databricks We are Databricks, the company behind Spark Founded by the creators of Apache Spark in 2013 Data 75% Share of Spark code contributed by Databricks in 2014 Value Created

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com

More information

Introducing Oracle Exalytics In-Memory Machine

Introducing Oracle Exalytics In-Memory Machine Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

The big data revolution

The big data revolution The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing

More information

Talend Real-Time Big Data Sandbox. Big Data Insights Cookbook

Talend Real-Time Big Data Sandbox. Big Data Insights Cookbook Talend Real-Time Big Data Talend Real-Time Big Data Overview of Real-time Big Data Pre-requisites to run Setup & Talend License Talend Real-Time Big Data Big Data Setup & About this cookbook What is the

More information

Expand Your Infrastructure with the Elastic Cloud. Mark Ryland Chief Solutions Architect Jenn Steele Product Marketing Manager

Expand Your Infrastructure with the Elastic Cloud. Mark Ryland Chief Solutions Architect Jenn Steele Product Marketing Manager Expand Your Infrastructure with the Elastic Cloud Mark Ryland Chief Solutions Architect Jenn Steele Product Marketing Manager Today we re going to talk about The Cloud Scenarios Questions You Probably

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

How enterprises will use the cloud for big data analytics

How enterprises will use the cloud for big data analytics How enterprises will use the cloud for big data analytics Lynn Langit November 10, 2014 This report is underwritten by Cazena. TABLE OF CONTENTS Executive summary... 3 A majority of enterprises are interested

More information

Big Data Driven Knowledge Discovery for Autonomic Future Internet

Big Data Driven Knowledge Discovery for Autonomic Future Internet Big Data Driven Knowledge Discovery for Autonomic Future Internet Professor Geyong Min Chair in High Performance Computing and Networking Department of Mathematics and Computer Science College of Engineering,

More information

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015 Pulsar Realtime Analytics At Scale Tony Ng April 14, 2015 Big Data Trends Bigger data volumes More data sources DBs, logs, behavioral & business event streams, sensors Faster analysis Next day to hours

More information

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof. CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Cloud Computing and Amazon Web Services Cloud Computing Amazon

More information

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS WHITEPAPER BASHO DATA PLATFORM BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS INTRODUCTION Big Data applications and the Internet of Things (IoT) are changing and often improving our

More information

DataStax Enterprise, powered by Apache Cassandra (TM)

DataStax Enterprise, powered by Apache Cassandra (TM) PerfAccel (TM) Performance Benchmark on Amazon: DataStax Enterprise, powered by Apache Cassandra (TM) Disclaimer: All of the documentation provided in this document, is copyright Datagres Technologies

More information

10 Practical Tips for Cloud Optimization

10 Practical Tips for Cloud Optimization Real Life in the Cloud The Cloud Sprawl Cloud Control Challenges 1. Transparency 2. Governance. Predictability Cloud Optimization in Action 10 Cloud Optimization Guidelines to Keep in Mind The 11th Guideline:

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

Big Data simplified. SAPSA Impuls, Stockholm 2014-11-13 Martin Faiss & Niklas Packendorff, SAP

Big Data simplified. SAPSA Impuls, Stockholm 2014-11-13 Martin Faiss & Niklas Packendorff, SAP Big Data simplified SAPSA Impuls, Stockholm 2014-11-13 Martin Faiss & Niklas Packendorff, SAP Complexity built up over decades hampers the ability to innovate; radical simplification is needed to unlock

More information

Hadoop in the Hybrid Cloud

Hadoop in the Hybrid Cloud Presented by Hortonworks and Microsoft Introduction An increasing number of enterprises are either currently using or are planning to use cloud deployment models to expand their IT infrastructure. Big

More information