EFFICIENT ANALYSIS OF APPLICATION SERVERS IN THE CLOUD

1 EFFICIENT ANALYSIS OF APPLICATION SERVERS IN THE CLOUD
Progress report meeting, December 2012
Phuong Tran Gia
Under the supervision of Prof. Michel R. Dagenais
Dorsal Laboratory, École Polytechnique de Montréal

2 Outline
- Completed tasks
- Future work
- Challenges
- Questions

3 Completed tasks
Build background in:
- Distributed and large-scale systems
- The design and implementation of the Linux kernel
- Userspace and kernel tracing
- LTTng documentation
- Project proposal of the Cloud Computing project
- State-of-the-art study, including Zipkin (Twitter) and Dapper (Google)

6 Zipkin
A distributed tracing framework
Why Zipkin?
- Helping developers gain deeper knowledge about how certain requests perform in a distributed system
- Helping developers gather timing data for the disparate services at Twitter

7 Zipkin Architecture

8 Zipkin Architecture
[Architecture diagram. Labels shown:]
- Cluster #01 and Cluster #02: Thrift, Finagle, MySQL, Memcache, Scribe daemon (client side)
- Scribe/Zookeeper
- Zipkin Collector (Zipkin-collector core, Scribe server side, Finagle server side, Zipkin collector daemon)
- Cassandra DB
- Zipkin Server: Zipkin-web, Query daemon, Thrift Admin, Finagle (client side), Ruby Thrift

9 Finagle in Zipkin
- A core module of Zipkin.
- Finagle is an asynchronous network stack for the JVM that we can use to build asynchronous Remote Procedure Call (RPC) clients and servers in Java, Scala, or any JVM-hosted language.
- github.com/twitter/finagle

10 Finagle in Zipkin [1]
[Figure: How Finagle-http works]

11 Zipkin terminology
Annotation: string data associated with a particular timestamp, service and host
Example annotation:
  time: :17:05
  value: something happened
  server:
  service: timelineservice

17 Zipkin terminology
Span: represents one specific method call; made up of a set of annotations. Has a name and an id.
Example span timeline:
  T: 0 ms    Client Send
  T: 10 ms   Server Receive
  T: 20 ms   Read 30 Kbytes from file
  T: 90 ms   Server Send
  T: 100 ms  Client Receive
Trace: a set of spans all associated with the same request.
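
The three terms above can be illustrated with a small data model. The following is a minimal Python sketch, not Zipkin's actual Thrift schema or API: the class layout, the span name `get_timeline`, the host names, and the helper functions are all assumptions made for illustration. Only the annotation values and timestamps come from the example timeline on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """String data tied to a timestamp, a service and a host."""
    timestamp_ms: int
    value: str
    service: str
    host: str


@dataclass
class Span:
    """One method call: a name, an id, and a set of annotations."""
    name: str
    span_id: int
    trace_id: int
    annotations: List[Annotation] = field(default_factory=list)

    def time_of(self, value: str) -> Optional[int]:
        """Return the timestamp of the first annotation with this value."""
        for annotation in self.annotations:
            if annotation.value == value:
                return annotation.timestamp_ms
        return None


def example_span() -> Span:
    """Rebuild the slide's timeline (names and hosts are hypothetical)."""
    span = Span(name="get_timeline", span_id=1, trace_id=42)
    events = [
        (0, "Client Send", "client", "host-a"),
        (10, "Server Receive", "timelineservice", "host-b"),
        (20, "Read 30 Kbytes from file", "timelineservice", "host-b"),
        (90, "Server Send", "timelineservice", "host-b"),
        (100, "Client Receive", "client", "host-a"),
    ]
    for ts, value, service, host in events:
        span.annotations.append(Annotation(ts, value, service, host))
    return span


def client_duration_ms(span: Span) -> int:
    """Total time seen by the caller: Client Receive minus Client Send."""
    return span.time_of("Client Receive") - span.time_of("Client Send")


def server_duration_ms(span: Span) -> int:
    """Time spent inside the server: Server Send minus Server Receive."""
    return span.time_of("Server Send") - span.time_of("Server Receive")


def group_into_traces(spans: List[Span]) -> Dict[int, List[Span]]:
    """A trace is the set of spans that share the same trace id."""
    traces: Dict[int, List[Span]] = {}
    for span in spans:
        traces.setdefault(span.trace_id, []).append(span)
    return traces


if __name__ == "__main__":
    s = example_span()
    print(client_duration_ms(s), server_duration_ms(s))
```

With the slide's numbers, the caller observes 100 ms in total and the server accounts for 80 ms of it, so the remaining 20 ms is spent outside the server (for example, on the network). This kind of breakdown is what the four client/server annotations make possible.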

18 Zipkin UI

23 Zipkin UI - Dependencies

25 Google's Dapper [2]
- Collect traces from production requests
- Low overhead
- Minimum of extra work for developers

26 Future work
- Build an analysis module for application servers by intercepting asynchronous RPC.
- Start tracing MySQL/PostgreSQL and Redis/Cassandra services and Tomcat/Apache2 application servers.
- Could be integrated with a streaming data module (like Scribe) for cluster management.
- Large-scale system performance profiling
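
As a rough illustration of what intercepting an asynchronous RPC could look like, here is a minimal Python sketch. It is an assumption made for illustration only, not the planned analysis module and not Finagle's API: the `traced_rpc` decorator, the `RECORDED` list, and the `fetch_user` call are hypothetical names. The wrapper records a "Client Send" event before the call and a "Client Receive" event when the awaited result comes back, mirroring the annotation names used earlier in the deck.

```python
import asyncio
import functools
import time
from typing import Any, Awaitable, Callable, List, Tuple

# In-memory event log (hypothetical); a real tracer would ship these
# events to a collector instead of keeping them in a list.
RECORDED: List[Tuple[str, str, float]] = []


def traced_rpc(name: str) -> Callable:
    """Wrap an async RPC so its send and receive times are recorded."""
    def decorator(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            RECORDED.append((name, "Client Send", time.monotonic()))
            try:
                return await func(*args, **kwargs)
            finally:
                # Recorded even if the call raises, so failures are visible too.
                RECORDED.append((name, "Client Receive", time.monotonic()))
        return wrapper
    return decorator


@traced_rpc("fetch_user")
async def fetch_user(user_id: int) -> dict:
    """Stand-in for a remote call; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    return {"id": user_id}


if __name__ == "__main__":
    print(asyncio.run(fetch_user(7)))
    for event in RECORDED:
        print(event)
```

Wrapping at the call boundary keeps the application code unchanged, which matches the "minimum of extra work for developers" goal; the trade-off is that every call pays for two timestamp reads and two list appends.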

27 Challenges
- Supporting many protocols
- Distributed tracing
- Performance improvements
- Tracing overhead management (adaptive sampling, production workloads)
- Flexible and easy to use
- Integration with LTTng
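
To make the "adaptive sampling" item concrete, the sketch below shows one simple way a tracer could limit its overhead: it adjusts the sampling probability at the end of each time window so that roughly a target number of requests are traced in the next one. This is an illustrative assumption, not the algorithm used by Dapper or Zipkin; the class name `AdaptiveSampler`, its parameters, and the window-based policy are all hypothetical.

```python
import random
from typing import Callable


class AdaptiveSampler:
    """Keep the number of traced requests per window near a target."""

    def __init__(
        self,
        target_per_window: int,
        initial_rate: float = 1.0,
        rng: Callable[[], float] = random.random,
    ) -> None:
        self.target = target_per_window
        self.rate = initial_rate   # probability of tracing a request
        self.rng = rng             # injectable for deterministic tests
        self.seen = 0              # requests observed in the current window

    def should_sample(self) -> bool:
        """Decide whether to trace the current request."""
        self.seen += 1
        return self.rng() < self.rate

    def end_window(self) -> float:
        """Recompute the rate from the load observed in this window."""
        if self.seen > 0:
            # Under heavy load the rate drops; under light load it is
            # capped at 1.0 (trace everything).
            self.rate = min(1.0, self.target / self.seen)
        self.seen = 0
        return self.rate


if __name__ == "__main__":
    sampler = AdaptiveSampler(target_per_window=10)
    for _ in range(1000):
        sampler.should_sample()
    print(sampler.end_window())
```

The design choice here is to react to observed load rather than fix a rate up front: a busy service is sampled sparsely, while a quiet one is traced almost completely.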

28 Questions
- Which typical application servers in the cloud environment are of interest to industrial partners?

29 References
[1] Johan Oskarsson, Zipkin - Runtime Open House, July 27, 2012.
[2] Benjamin H. Sigelman, Luiz Andre Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan, Chandan Shanbhag, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, Google Technical Report dapper, April 2010.


More information

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 14

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 14 Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 14 Big Data Management IV: Big-data Infrastructures (Background, IO, From NFS to HFDS) Chapter 14-15: Abideboul

More information

Openbus Documentation

Openbus Documentation Openbus Documentation Release 1 Produban February 17, 2014 Contents i ii An open source architecture able to process the massive amount of events that occur in a banking IT Infraestructure. Contents:

More information

Open Source Technologies on Microsoft Azure

Open Source Technologies on Microsoft Azure Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions

More information

www.expaway.com Offerte del 13 giugno 2014

www.expaway.com Offerte del 13 giugno 2014 www.expaway.com Offerte del 13 giugno 2014 TR1414A - SOFTWARE DEVELOPER/ ARCHITECT (GERLINGEN) Location: Gerlingen (9 km west of Stuttgart) Field of operation: Consumer Services Founded: 2011 and German

More information

Tier Architectures. Kathleen Durant CS 3200

Tier Architectures. Kathleen Durant CS 3200 Tier Architectures Kathleen Durant CS 3200 1 Supporting Architectures for DBMS Over the years there have been many different hardware configurations to support database systems Some are outdated others

More information

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 3, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts

More information

Monitoring, Tracing, Debugging (Under Construction)

Monitoring, Tracing, Debugging (Under Construction) Monitoring, Tracing, Debugging (Under Construction) I was already tempted to drop this topic from my lecture on operating systems when I found Stephan Siemen's article "Top Speed" in Linux World 10/2003.

More information

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce

More information

Cloud Storage Solution for WSN Based on Internet Innovation Union

Cloud Storage Solution for WSN Based on Internet Innovation Union Cloud Storage Solution for WSN Based on Internet Innovation Union Tongrang Fan 1, Xuan Zhang 1, Feng Gao 1 1 School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang,

More information

[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important?

[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important? 1 Week National summer training program on Big Data & Hadoop Why big data & Hadoop is important? Highlights of Big Data & Hadoop Implement a Hadoop Project Learn to write Complex MapReduce programs Perform

More information

Design and Implementation of One-way IP Performance Measurement Tool

Design and Implementation of One-way IP Performance Measurement Tool Design and Implementation of One-way IP Performance Measurement Tool Jaehoon Jeong 1, Seungyun Lee 1, Yongjin Kim 1, and Yanghee Choi 2 1 Protocol Engineering Center, ETRI, 161 Gajong-Dong, Yusong-Gu,

More information

Adding scalability to legacy PHP web applications. Overview. Mario Valdez-Ramirez

Adding scalability to legacy PHP web applications. Overview. Mario Valdez-Ramirez Adding scalability to legacy PHP web applications Overview Mario Valdez-Ramirez The scalability problems of legacy applications Usually were not designed with scalability in mind. Usually have monolithic

More information

The Cloud to the rescue!

The Cloud to the rescue! The Cloud to the rescue! What the Google Cloud Platform can make for you Aja Hammerly, Developer Advocate twitter.com/thagomizer_rb So what is the cloud? The Google Cloud Platform The Google Cloud Platform

More information

This course is intended for database professionals who need who plan, implement, and manage database solutions. Primary responsibilities include:

This course is intended for database professionals who need who plan, implement, and manage database solutions. Primary responsibilities include: Course Page - Page 1 of 5 Designing Solutions for Microsoft SQL Server 2014 M-20465 Length: 3 days Price: $1,795.00 Course Description The focus of this three-day instructor-led course is on planning and

More information

Matt Benton, Mike Bull, Josh Fyne, Georgie Mackender, Richard Meal, Louise Millard, Bogdan Paunescu, Chencheng Zhang

Matt Benton, Mike Bull, Josh Fyne, Georgie Mackender, Richard Meal, Louise Millard, Bogdan Paunescu, Chencheng Zhang GROUP 5 Hadoop Revision Report Matt Benton, Mike Bull, Josh Fyne, Georgie Mackender, Richard Meal, Louise Millard, Bogdan Paunescu, Chencheng Zhang 9 Dec 2010 Table of Contents 1. Introduction... 2 1.1.

More information

An overview of Drupal infrastructure and plans for future growth. prepared by Kieran Lal and Gerhard Killesreiter for the Drupal Association

An overview of Drupal infrastructure and plans for future growth. prepared by Kieran Lal and Gerhard Killesreiter for the Drupal Association An overview of Drupal infrastructure and plans for future growth prepared by Kieran Lal and Gerhard Killesreiter for the Drupal Association Drupal.org Old Infrastructure Problems: Web servers not efficiently

More information

Rackspace Cloud Databases and Container-based Virtualization

Rackspace Cloud Databases and Container-based Virtualization Rackspace Cloud Databases and Container-based Virtualization August 2012 J.R. Arredondo @jrarredondo Page 1 of 6 INTRODUCTION When Rackspace set out to build the Cloud Databases product, we asked many

More information