Divy Agrawal and Amr El Abbadi, Department of Computer Science, University of California at Santa Barbara


2 Sudipto Das (Microsoft summer intern), Shyam Antony (now at Microsoft), Aaron Elmore (Amazon summer intern). 8/5/2010

3 Different from earlier attempts: Distributed Computing, Distributed Databases, Grid Computing. Cloud Computing is REAL. Organic growth: Yahoo!, Microsoft, Amazon, Google. Poised to be an integral aspect of the global computing infrastructure.

4 [Diagram: Client Sites; HAProxy (Load Balancer) with Elastic IP; MySQL Master DB; Replication; MySQL Slave DB] The database becomes the scalability bottleneck and cannot leverage elasticity.

5 [Diagram: Client Sites; HAProxy (Load Balancer) with Elastic IP; MySQL Master DB; Replication; MySQL Slave DB]

6 [Diagram: Client Sites; HAProxy (Load Balancer) with Elastic IP; Scalable and Elastic Key Value Stores] But limited consistency and operational flexibility.

7 Scalability, Elasticity, Fault tolerance, Self Manageability. Sacrifice consistency? Can be done, but is it a foregone conclusion?

8 Separate System and Application State. Limit interactions to a single node. Decouple Ownership from Data Storage. Limited distributed synchronization is practical.

9 Data Fusion: enrich Key Value stores. G-Store: Efficient Transactional Multi-key Access [ACM SoCC 2010]. Data Fission: cloud-enabled relational databases. ElasTraS: Elastic TranSactional Database [HotCloud 2009; Tech. Report 2010].

11 Many applications need multi-key accesses: online multi-player games, collaborative applications. Enrich functionality of the Key Value stores.

12 Apps select any set of keys to form a group: a granule of on-demand transactional access. Data store provides transactional group access. Non-overlapping groups.

13 [Diagram: Horizontal Partitions of the Keys; Key Group; Group Formation Phase] Keys located on different nodes. A single node gains ownership of all keys in a KeyGroup.

14 Conceptually akin to locking. Allows collocation of ownership. Transfer key ownership from followers to leader. Guarantee safe transfer in the presence of system dynamics: dynamic migration of data and its control; failures.
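
The group formation idea on slides 12 to 14 can be sketched in a few lines. This is a minimal single-process illustration, not G-Store's actual protocol: the class `KeyGroupManager` and its methods are hypothetical names, and a real ownership transfer would use logged messages between nodes so that it stays safe under failures and data migration.

```python
class KeyGroupManager:
    """Toy model of key grouping (all names here are hypothetical)."""

    def __init__(self, placement):
        # placement: key -> node that stores the key (its "follower" node)
        self.home = dict(placement)
        self.owner = dict(placement)   # current owner of each key
        self.group_of = {}             # key -> group id while the key is grouped
        self.groups = {}               # group id -> (leader, keys)

    def create_group(self, group_id, leader, keys):
        keys = set(keys)
        if group_id in self.groups:
            raise ValueError(f"group {group_id!r} already exists")
        missing = sorted(k for k in keys if k not in self.home)
        if missing:
            raise KeyError(f"unknown keys: {missing}")
        # Groups must not overlap: a key already in a group cannot join another.
        busy = sorted(k for k in keys if k in self.group_of)
        if busy:
            raise ValueError(f"keys already in a group: {busy}")
        # All checks passed, so move ownership of every key to the leader.
        for k in keys:
            self.owner[k] = leader
            self.group_of[k] = group_id
        self.groups[group_id] = (leader, keys)

    def delete_group(self, group_id):
        # Dissolving the group hands each key back to its home node.
        _leader, keys = self.groups.pop(group_id)
        for k in keys:
            self.owner[k] = self.home[k]
            del self.group_of[k]

    def can_run_locally(self, node, keys):
        # A multi-key transaction needs no distributed synchronization
        # when one node owns every key it touches.
        return all(self.owner.get(k) == node for k in keys)


mgr = KeyGroupManager({"alice": "n1", "bob": "n2", "carol": "n3"})
mgr.create_group("game-1", leader="n1", keys=["alice", "bob"])
print(mgr.can_run_locally("n1", ["alice", "bob"]))   # True
mgr.delete_group("game-1")
print(mgr.owner["bob"])                              # n2
```

Performing every check before any ownership changes mirrors the "akin to locking" remark: a request that would overlap an existing group is rejected without partially moving keys.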

15 [Diagram: G-Store. Application Clients obtain Transactional Multi-Key Access through a Grouping Middleware Layer resident on top of a Key-Value Store. Each node shows a Grouping Layer, a Transaction Manager, and Key-Value Store Logic, above Distributed Storage.]

17 Designed to make RDBMS cloud-friendly. Database viewed as a collection of partitions. Suitable for standard OLTP workloads: large single-tenant database instance (database partitioned at the schema level); multi-tenant with large number of small databases (each partition is a self-contained database).
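
One way to picture "a database as a collection of partitions" is a router that maps each request to exactly one self-contained partition. The sketch below is illustrative only: it assumes a TPC-C-like schema partitioned on a hypothetical `warehouse_id`, uses simple range partitioning, and the names `PartitionMap` and `partition_for` are invented for this example rather than taken from ElasTraS.

```python
import bisect


class PartitionMap:
    """Range partitioning on one key (hypothetical sketch)."""

    def __init__(self, upper_bounds):
        # upper_bounds[i] is the exclusive upper bound of partition i.
        bounds = list(upper_bounds)
        if bounds != sorted(bounds):
            raise ValueError("upper bounds must be sorted")
        self.upper_bounds = bounds

    def partition_for(self, warehouse_id):
        if warehouse_id < 0:
            raise KeyError(f"no partition for warehouse {warehouse_id}")
        # Index of the first bound strictly greater than warehouse_id.
        idx = bisect.bisect_right(self.upper_bounds, warehouse_id)
        if idx == len(self.upper_bounds):
            raise KeyError(f"no partition for warehouse {warehouse_id}")
        return idx


# Three partitions: warehouses 0-99, 100-199, 200-299.
pmap = PartitionMap([100, 200, 300])
print(pmap.partition_for(42))    # 0
print(pmap.partition_for(100))   # 1
```

In the multi-tenant case the same role could be played by a plain dictionary from tenant id to partition, since each small tenant database is its own partition.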

18 Elastic to deal with workload changes. Dynamic load balancing of partitions. Automatic recovery from node failures. Transactional access to database partitions.
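
Load balancing and failure recovery over partitions can be illustrated with a toy assignment table. This is a sketch under strong assumptions, not the ElasTraS implementation: the `Master` class, its greedy least-loaded policy, and the per-partition load numbers are all hypothetical. It relies on the principle from slide 8 that ownership is decoupled from data storage, so reassigning a partition changes only which node serves it.

```python
class Master:
    """Toy controller mapping partitions to serving nodes (hypothetical)."""

    def __init__(self, nodes):
        self.assignment = {node: set() for node in nodes}  # node -> partitions
        self.load = {}                                     # partition -> load estimate

    def node_load(self, node):
        return sum(self.load[p] for p in self.assignment[node])

    def _least_loaded(self):
        # Ties are broken by node name so the result is deterministic.
        return min(self.assignment, key=lambda n: (self.node_load(n), n))

    def add_partition(self, partition, load):
        self.load[partition] = load
        target = self._least_loaded()
        self.assignment[target].add(partition)
        return target

    def add_node(self, node):
        # Scale out: the new node starts empty and attracts new partitions.
        self.assignment.setdefault(node, set())

    def fail_node(self, node):
        # Recovery: hand the failed node's partitions to the survivors,
        # heaviest partition first.
        orphans = self.assignment.pop(node)
        if not self.assignment:
            raise RuntimeError("no surviving nodes")
        for p in sorted(orphans, key=lambda p: (-self.load[p], p)):
            self.assignment[self._least_loaded()].add(p)


master = Master(["otm1", "otm2"])
master.add_partition("p1", 5)
master.add_partition("p2", 3)
master.fail_node("otm2")
print(sorted(master.assignment["otm1"]))   # ['p1', 'p2']
```

A production system would also need to move partitions between live nodes as load shifts; that step is omitted here to keep the sketch short.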

19 [Diagram: ElasTraS architecture. Application Clients (Application Logic, ElasTraS Client) send the DB Read/Write Workload to the OTMs. Each OTM shows a Txn Manager, a Log Manager, and DB Partitions P1, P2, ..., Pn. TM Master: Health and Load Management. Metadata Manager (MM): Lease Management. Master Proxy and MM Proxy. Durable Writes go to Distributed Fault-tolerant Storage.]

20 Performed in Amazon EC2. Used TPC-C for evaluation. Cluster size: 10 to 30 nodes. Number of concurrent clients: 100 to 1800. Data size: ~1 TB. Each node: 8 cores, 7 GB RAM, 1.7 TB disk. Max throughput: 0.2M TPC-C transactions/sec on 30 machines, using HDFS and ZooKeeper.
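
As a rough sanity check on the headline figure, the reported peak can be divided across the cluster. This assumes, purely for illustration, that load was spread evenly over all 30 machines, which the slide does not state.

```python
peak_txn_per_sec = 0.2e6   # "0.2M" as reported on the slide
machines = 30              # largest cluster size reported

# Even-split assumption: every machine serves the same share of the peak.
per_machine = peak_txn_per_sec / machines
print(f"{per_machine:,.0f} transactions/sec per machine")   # 6,667
```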

21 Cloud Computing poses fundamental challenges to database researchers: Scalability, Reliability, and Data Consistency. Need to understand the new applications. Live data migration is critical. Challenging multi-node/center atomic operations. Clear characterization of properties and guarantees.


More information

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) Journal of science e ISSN 2277-3290 Print ISSN 2277-3282 Information Technology www.journalofscience.net STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) S. Chandra

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

Aaron J. Elmore, Carlo Curino, Divyakant Agrawal, Amr El Abbadi. [aelmore,agrawal,amr] @ cs.ucsb.edu ccurino @ microsoft.com

Aaron J. Elmore, Carlo Curino, Divyakant Agrawal, Amr El Abbadi. [aelmore,agrawal,amr] @ cs.ucsb.edu ccurino @ microsoft.com Aaron J. Elmore, Carlo Curino, Divyakant Agrawal, Amr El Abbadi [aelmore,agrawal,amr] @ cs.ucsb.edu ccurino @ microsoft.com 2 Moving to the Cloud Why cloud? (are you really asking?) Economy-of-scale arguments

More information

Design and Evaluation of a Hierarchical Multi-Tenant Data Management Framework for Cloud Applications

Design and Evaluation of a Hierarchical Multi-Tenant Data Management Framework for Cloud Applications Design and Evaluation of a Hierarchical Multi-Tenant Data Management Framework for Cloud Applications Pieter-Jan Maenhaut, Hendrik Moens, Veerle Ongenae and Filip De Turck Ghent University, Faculty of

More information

Alfresco Enterprise on AWS: Reference Architecture

Alfresco Enterprise on AWS: Reference Architecture Alfresco Enterprise on AWS: Reference Architecture October 2013 (Please consult http://aws.amazon.com/whitepapers/ for the latest version of this paper) Page 1 of 13 Abstract Amazon Web Services (AWS)

More information

SWIFT. Page:1. Openstack Swift. Object Store Cloud built from the grounds up. David Hadas Swift ATC. HRL davidh@il.ibm.com 2012 IBM Corporation

SWIFT. Page:1. Openstack Swift. Object Store Cloud built from the grounds up. David Hadas Swift ATC. HRL davidh@il.ibm.com 2012 IBM Corporation Page:1 Openstack Swift Object Store Cloud built from the grounds up David Hadas Swift ATC HRL davidh@il.ibm.com Page:2 Object Store Cloud Services Expectations: PUT/GET/DELETE Huge Capacity (Scale) Always

More information

BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011

BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011 BookKeeper Flavio Junqueira Yahoo! Research, Barcelona Hadoop in China 2011 What s BookKeeper? Shared storage for writing fast sequences of byte arrays Data is replicated Writes are striped Many processes

More information

Cloud computing - Architecting in the cloud

Cloud computing - Architecting in the cloud Cloud computing - Architecting in the cloud anna.ruokonen@tut.fi 1 Outline Cloud computing What is? Levels of cloud computing: IaaS, PaaS, SaaS Moving to the cloud? Architecting in the cloud Best practices

More information

A Survey of Distributed Database Management Systems

A Survey of Distributed Database Management Systems Brady Kyle CSC-557 4-27-14 A Survey of Distributed Database Management Systems Big data has been described as having some or all of the following characteristics: high velocity, heterogeneous structure,

More information

This paper defines as "Classical"

This paper defines as Classical Principles of Transactional Approach in the Classical Web-based Systems and the Cloud Computing Systems - Comparative Analysis Vanya Lazarova * Summary: This article presents a comparative analysis of

More information

TOWARDS TRANSACTIONAL DATA MANAGEMENT OVER THE CLOUD

TOWARDS TRANSACTIONAL DATA MANAGEMENT OVER THE CLOUD TOWARDS TRANSACTIONAL DATA MANAGEMENT OVER THE CLOUD Rohan G. Tiwari Database Research Group, College of Computing Georgia Institute of Technology Atlanta, USA rtiwari6@gatech.edu Shamkant B. Navathe Database

More information

Scalable Architecture on Amazon AWS Cloud

Scalable Architecture on Amazon AWS Cloud Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect

More information

DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES

DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES Bettina Kemme Dept. of Computer Science McGill University Montreal, Canada Gustavo Alonso Systems Group Dept. of Computer Science ETH Zurich,

More information