Introduction to Middleware

Similar documents
A programming model in Cloud: MapReduce

Ubiquitous access Inherently distributed Many, diverse clients (single purpose rich) Unlimited computation and data on demand

Middleware Lou Somers

Cloud data store services and NoSQL databases. Ricardo Vilaça Universidade do Minho Portugal

Where We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344

Cloud Computing at Google. Architecture

Lecture 6 Cloud Application Development, using Google App Engine as an example

Tier Architectures. Kathleen Durant CS 3200

A Survey Study on Monitoring Service for Grid

The Service Availability Forum Specification for High Availability Middleware

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

Cluster, Grid, Cloud Concepts

White Paper: 1) Architecture Objectives: The primary objective of this architecture is to meet the. 2) Architecture Explanation

BSC vision on Big Data and extreme scale computing

JReport Server Deployment Scenarios

2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts

SDFS Overview. By Sam Silverberg

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) /21/2013

Middleware- Driven Mobile Applications

Development of nosql data storage for the ATLAS PanDA Monitoring System

SOFT 437. Software Performance Analysis. Ch 5:Web Applications and Other Distributed Systems

Operating Systems: Basic Concepts and History

NoSQL Data Base Basics

Scality RING High performance Storage So7ware for pla:orms, StaaS and Cloud ApplicaAons

Cloud Application Development (SE808, School of Software, Sun Yat-Sen University) Yabo (Arber) Xu

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

A Middleware Strategy to Survive Compute Peak Loads in Cloud

Service Oriented Architectures

Database Scalability and Oracle 12c

Big Data With Hadoop

Eclipse Exam Tutorial - Pros and Cons

Classic Grid Architecture

Efficiency of Web Based SAX XML Distributed Processing

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

WSO2 Message Broker. Scalable persistent Messaging System

Bigtable is a proven design Underpins 100+ Google services:

The Service Revolution software engineering without programming languages

Cloud Computing. Chapter 3 Platform as a Service (PaaS)

JBoss & Infinispan open source data grids for the cloud era

Distribution transparency. Degree of transparency. Openness of distributed systems

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Virtual machine interface. Operating system. Physical machine interface

SQL Server 2014 New Features/In- Memory Store. Juergen Thomas Microsoft Corporation

Comparing SQL and NOSQL databases

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

Responsive, resilient, elastic and message driven system

System Models for Distributed and Cloud Computing

Business Process Management

Last time. Today. IaaS Providers. Amazon Web Services, overview

Portable Scale-Out Benchmarks for MySQL. MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc.

Infrastructure that supports (distributed) componentbased application development

CORBA and object oriented middleware. Introduction

A Virtual Cloud Computing Provider for Mobile Devices

Scalability of web applications. CSCI 470: Web Science Keith Vertanen

Qualogy M. Schildmeijer. Whitepaper Oracle Exalogic FMW Optimization

Distributed Systems. Tutorial 12 Cassandra

Bringing Big Data Modelling into the Hands of Domain Experts

References. Introduction to Database Systems CSE 444. Motivation. Basic Features. Outline: Database in the Cloud. Outline

Introduction to Database Systems CSE 444

Distributed storage for structured data

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Enterprise Application Integration

Hypertable Architecture Overview

Use of Hadoop File System for Nuclear Physics Analyses in STAR

C/S Basic Concepts. The Gartner Model. Gartner Group Model. GM: distributed presentation. GM: distributed logic. GM: remote presentation

CLOUD DATABASE DATABASE AS A SERVICE

Introduction to Database Systems CSE 444. Lecture 24: Databases as a Service

Storing and Processing Sensor Networks Data in Public Clouds

DISTRIBUTED AND PARALLELL DATABASE

SOA, case Google. Faculty of technology management Information Technology Service Oriented Communications CT30A8901.

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

In Memory Accelerator for MongoDB

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Middleware for Heterogeneous and Distributed Information Systems

LinuxWorld Conference & Expo Server Farms and XML Web Services

Performance Analysis of Lucene Index on HBase Environment

BLM 413E - Parallel Programming Lecture 3

Development Model for the Cloud Paradigm Shift of the Same Old Same Old? Dr. Umit Yalcinalp, Salesforce.com Developer Evangelist

Aneka: A Software Platform for.net-based Cloud Computing

Using Object Database db4o as Storage Provider in Voldemort

Agenda. Some Examples from Yahoo! Hadoop. Some Examples from Yahoo! Crawling. Cloud (data) management Ahmed Ali-Eldin. First part: Second part:

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Lecture 10: HBase! Claudia Hauff (Web Information Systems)!

SWIFT. Page:1. Openstack Swift. Object Store Cloud built from the grounds up. David Hadas Swift ATC. HRL 2012 IBM Corporation

Assignment # 1 (Cloud Computing Security)

Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson,Nelson Araujo, Dennis Gannon, Wei Lu, and

Data Management in the Cloud -

Cognos8 Deployment Best Practices for Performance/Scalability. Barnaby Cole Practice Lead, Technical Services

Distributed File Systems

HDMQ :Towards In-Order and Exactly-Once Delivery using Hierarchical Distributed Message Queues. Dharmit Patel Faraj Khasib Shiva Srivastava

Integrated Application and Data Protection. NEC ExpressCluster White Paper

Liferay Performance Tuning

Chapter 4 Cloud Computing Applications and Paradigms. Cloud Computing: Theory and Practice. 1

Scalable Architecture on Amazon AWS Cloud

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

x64 Servers: Do you want 64 or 32 bit apps with that server?

Data Management in the Cloud

Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl

Top 10 reasons your ecommerce site will fail during peak periods

Transcription:

Introduction to Middleware Petr Tůma DEPARTMENT OF DISTRIBUTED AND DEPENDABLE SYSTEMS FACULTY OF MATHEMATICS AND PHYSICS CHARLES UNIVERSITY IN PRAGUE http://d3s.mff.cuni.cz

What is middleware anyway? Think about programming paradigms 1/3 functional programming it helps to view a program as an evaluation of a complex mathematical function with no side effects procedural programming it helps to view a program as a hierarchical composition of smaller procedures object oriented programming it helps to view a program as interactions that change state of otherwise independent objects

What is middleware anyway? 2/3 Programming paradigms give us useful concepts and tools to solve certain problems in certain situations. Note also design patterns application frameworks programming methodologies...

What is middleware anyway? 3/3 Middleware are tools (and often concepts) that are helpful in writing applications for distributed or parallel environments. how to install (deploy) applications how to achieve high performance how to communicate easily how to synchronize easily how to access data...

... or in simpler words Middleware is the stuff that system designers believe belongs in applications and application designers believe belongs in systems. [Paraphrasing K. J. Klingenstein]

Communication

How to communicate over network? Network sends messages, right?... View communication as sending messages! Just what could the interface look like? send (target, message) receive (buffer) Dull?

Example: MPI 1/2 Count all prime numbers in given range. Trivial implementation Iterate over all numbers in range Test each number separately Note Easy to split into smaller tasks Similar to many useful applications

Example: MPI Standard for communication in clusters... 2/2 Support for complex data types Coordinated communication patterns Portability between platforms and languages Example uses just two functions from dozens Scatter (in_buffer, out_buffer) Splits input buffer to multiple participants Reduce (in_buffer, operation) Performs operation between buffers

Example: JGroups 1/2 Implement cooperative drawing board. Trivial implementation Send all drawing operations to all members Just draw operations as they come Note Membership for drawing Ordering becomes important

Example: JGroups Toolkit for multicast communication... 2/2 Easy use of multicast Membership, reliability, ordering, atomicity Example uses these functions Channels with membership management Reliable send and receive Callback interface Serialization

Common messaging features... Message structure Complex messages (data structures, XML) Message portability (automated conversion) Communication patterns Many to many communication Queueing, threads, callbacks Communication properties Reliability, resiliency (including persistence) Ordering (sender, causal, total)

Can we communicate otherwise? Do we send messages in normal code?... Events really delivered as method calls! Proxies on client look like real objects Proxy methods are stubs that send messages All processing really happens on the server No need for writing communication code

Example: RMI Support for communication in Java... Seamless integration Simple features Example points to think about How fast can all this be? How are arguments passed? Where do threads come from?

Example: CORBA Standard in heterogeneous environments... Supports C, C++, Ada, Java, Python... Many extensions Real time environments Embedded environments Component deployment Example points to think about Interface description and language mapping How long would this take to write manually?

Common remote call features... Interface definitions Functions, objects, components Mappings to multiple languages and environments Parallelism support Threading models Event loop integration Protocol support Standard protocols (SOAP over HTTP, IIOP) Optimizations (shared memory, compression)

Can we communicate otherwise? Threads on one node share memory?... We can try sharing memory too! Intercepting memory access just iceberg tip Help from virtual memory manager hardware Even easier in interpreted environments Appropriate consistency model Immediate synchronization has high overhead Relaxed consistency models can be useful Special addressing modes

Example: OpenChord DHT 1/2 Implement distributed data storage. Trivial implementation Distribute data to nodes using hashing Use replication for reliability Note Efficient routing is essential Other issues like membership

Example: OpenChord DHT Implementation of a distributed hash table... Simple interface put (key, value) get (key) Logarithmic routing complexity Robust in face of failures Example points to think about Benefits of peer to peer environments Applications (robust storage, caches, scale) 2/2

Synchronization

How to synchronize parallel activities? We have threads and locks, so what?... We need something simple and scalable! Threading interfaces are relatively cumbersome Blocking synchronization does not scale Fine grained locking leads to deadlocks Notoriously difficult to get right

Example: OpenMP Standard for fine grained parallel constructs... We do not really want to care about threads Just give hints where parallelism is possible Example points to think about What can compiler decide by itself? What about asymmetric architectures? How about the operating system thread API?

Example: deucestm 1/2 Implement simple account transfers. Trivial implementation Balance kept in simple array Transfer is just decrement and increment Note Think about synchronization How likely are races or deadlocks?

Example: deucestm Java transactional memory implementation... Implemented as an agent Atomicity in annotations Example points to think about Concurrency vs efficiency Code robustness Composability 2/2

Example: ProActive Java distributed agent environment... Computation done by independent agents Asynchronous communication Synchronization on futures Example points to think about Relaxed synchronization Application of migration

Persistence

How to connect to persistent storage? Relational databases seem to be quite good?... So map our data to tables! Many objects-to-relations mapping technologies (Almost) seamless usage in source code But relational databases are not all

Example: Google BigTable Highly scalable distributed storage... 1/2 Table with row keys, column keys, timestamps, values Relatively simple interface Individual row manipulation atomic RowMutation rm (table, key); rm.set (column, value); Operation op; Apply (&op, &rm) Single row look up and multiple row scan

Example: Google BigTable Highly scalable distributed storage... 2/2 Impressive performance Reported results on 1786 four core nodes, 500 GB table, 500 nodes as table servers Random 1kB reads 120 MB/s Sequential 1kB reads 1235 MB/s Random 1kB writes 1000 MB/s Many interesting applications Google Crawler, 800 TB table, 1 trillion cells Google Earth, 70 TB table, 9 billion cells...

Moving To Clouds

Combine all this... How about having an environment that combines all these nifty technologies? Remote communication Transparent data persistence Simplified application deployment Management and monitoring tools...

Combine even more... Throw in virtualization technology... Computing resources become dynamic Virtualization possible at different levels Entire operating system Amazon Elastic Cloud Microsoft Azure Application server Google AppEngine

Example: Google App Engine Application server cloud for Java and Python... Dynamic web Servlets, JSP Django Persistent storage JDO, JPA Datastore with query language Asynchronous tasks Load balancing

Example: Amazon Elastic Cloud Virtualized system cloud... Storage System volumes Boot from block store volume Attach volume as a device Snapshot and replication Object buckets Retrieve by key Computing instances On demand Reserved Interfaces for controlling everything

...

Courses... NSWI068 Objektové a komponentové systémy Some middleware technologies, components NSWI080 Middleware Multiple middleware technologies NDBI016 Transactions Theoretical background for correctness Algorithms, software transactional memory NPRG042 Programování v paralelním prostředí Algorithms, some cluster middleware NSWI035 Principy distribuovaných systémů Distributed algorithms

http://d3s.mff.cuni.cz