Using Linux Clusters as VoD Servers



Víctor M. Gulías Fernández (gulias@lfcia.org)
Computer Science Department, University of A Coruña

Outline
- Background: The Borg Cluster
- Video on Demand. Introduction
- Goals and initial design proposal
- Key technologies
- Design refinement
- Conclusions and ongoing work

Antecedents
- 1994. Functional paradigm + parallel computation
  - Declarative, no side effects
  - High level of expressiveness, abstraction and prototyping
- 1995. Yale Haskell concurrent extension
  - Explicit concurrency, higher-order communications
- 1996-1999. Runtime system (RTS) of a distributed functional language
  - Semi-implicit model (annotation, dynamic load balance)
  - Explicit model, higher-order asynchronous communication
- Interesting architecture: Beowulf clusters

Borg, LFCIA's Beowulf Cluster
[Figure: cluster diagram with a frontend and a row of CPU nodes. Legend entries recovered from the figure:]
- FORERUNNER 3810 (external world)
- 10 Mb Ethernet link
- Dual Pentium II 350 MHz, 384 MB RAM, 8 GB SCSI HD
- AMD K6 300 MHz, 96 MB RAM, 4 GB IDE HD (23, up to 47)
- 100 Mb Fast Ethernet link (2 per node)
- 3COM SuperStack II 3300 switch (4, 24 slots per switch)
- 1 Gb link (2 independent networks)

Video on Demand. Requirements
A Video on Demand (VoD) server is a system that provides video services in which a user can request any Video Object (VO) at any time.
Applications: movie on demand, distance learning, interactive news, etc.
Main requirements:
- Large storage capacity
- High bandwidth
- Predictable response time (preferably low)
- Support for a high number of concurrent users
- Scalability
- Adaptability
- Fault tolerance
- Low cost

Existing (Commercial) VoD Solutions
Project starting point, 1999 (MPEG-2 at 2 Mbps [LAN], Real at 28 Kbps [WAN]):
- Apple: QuickTime Streaming Server (now Darwin Streaming Server)
- IBM: DB2 Digital Library Video Charger
- Microsoft: NetShow Theater (now Windows Media Server)
- Oracle: Video Server
- Real Networks: RealVideo Server
- SGI: WebForce MediaBase (now Kasenna)
- Sun: StorEdge Media Central
Some solutions are extensions or adaptations of other commercial products.
Expensive, closed solutions that are neither scalable nor flexible (adaptable).

VoDka: Distributed Functional VoD Server
- Hierarchical storage system, based on Linux clusters built with commodity hardware.
- Design using GoF's design patterns and OTP behaviours.
- Development languages:
  - Monitoring, control, scheduling: Erlang/OTP
  - Input/output: C (prototype in Erlang/OTP)
- Emphasis on reusability:
  - Use of open source tools (Linux, Erlang/OTP, ...)
  - Adaptation of existing solutions (Darwin)
  - Subsystems split out as independent projects (OpenMonet, Erlatron)

Initial Design Proposal
Three-level hierarchical structure:
- Tertiary level (massive storage): tape loaders, jukeboxes... Large capacity.
- Secondary level (large-scale cache): cluster nodes with local storage. Scheduling, aggregated bandwidth.
- Primary level (streaming; buffering and protocol adaptation): cluster heads. Protocol adaptation; delivers the output streams.

Borg Configuration as VoD Server
[Figure: network diagram. Labels recovered from the figure:]
- K7 500, 128 MB, 32-bit bus
- Alpha Server DS20, 64-bit buses
- borg24 (192.168.155.24)
- 3COM SuperStack II 3300 switch
- Network 1: borg0_1 (192.168.155.254), borg1_1 ... borg23_1 (192.168.155.1...23)
- Network 0: borg0_0 (192.168.154.254), borg1_0 ... borg23_0 (192.168.154.1...23)
AutoPAK VXA tape loader (~$8,000):
- 3 SCSI: 2 heads, 1 arm
- 15 slots (33 GB tapes)
  - ~100 two-hour movies at 300 Kb/s
  - ~15 two-hour movies at 2 Mb/s
- Each head: 5 MB/s
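The tape figures above can be cross-checked with a short calculation. The sketch below is only a sanity check under stated assumptions: decimal units (1 Kb = 1000 bits, 1 GB = 10^9 bytes), constant-bitrate movies, no tape format overhead, and the movie counts on the slide read as per-tape figures. The function names are illustrative and not part of VoDka.

```python
# Assumptions (not stated on the slide): decimal units, constant bitrate,
# no tape format overhead, and movie counts interpreted as per tape.

TAPE_BYTES = 33 * 10**9        # 33 GB per tape (from the slide)
HEAD_BYTES_PER_S = 5 * 10**6   # 5 MB/s per head (from the slide)


def movie_bytes(bitrate_bps: int, hours: int) -> int:
    """Size in bytes of a constant-bitrate movie."""
    return bitrate_bps * hours * 3600 // 8


def movies_per_tape(bitrate_bps: int, hours: int = 2) -> int:
    """Upper bound on whole movies that fit on one tape."""
    return TAPE_BYTES // movie_bytes(bitrate_bps, hours)


def load_seconds(bitrate_bps: int, hours: int = 2) -> float:
    """Time for one head to read a whole movie from tape."""
    return movie_bytes(bitrate_bps, hours) / HEAD_BYTES_PER_S


if __name__ == "__main__":
    for label, rate in (("300 Kb/s", 300_000), ("2 Mb/s", 2_000_000)):
        print(label, movie_bytes(rate, 2), movies_per_tape(rate), load_seconds(rate))
```

Under these assumptions a two-hour movie takes 270 MB at 300 Kb/s and 1.8 GB at 2 Mb/s, so a tape holds at most 122 and 18 of them respectively, which is consistent with the rounded ~100 and ~15 on the slide. Reading a whole 2 Mb/s movie through one head takes about six minutes, which suggests why a disk cache level between tape and streaming is useful.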

Key Technologies: Erlang/OTP
- Designed, developed and used by Ericsson AB for their complex distributed control systems
- Suitable for distributed, fault-tolerant, soft real-time systems
- Main design features:
  - Functional: no side effects
  - Concurrent: asynchronous message passing, value transparent transport, higher-order communications
  - Distributed: transparent location of processes over nodes
  - Built-in fault tolerance primitives
  - Hot code swapping without stopping the system
- Open Telecom Platform: libraries and design patterns for distributed systems
  - Generic servers
  - Supervision mechanisms
  - Distributed database (location transparency, fragmentation, replication, integration with the language)
  - Integration: SNMP, ASN.1, interface with C, CORBA, Java, ...

Key Technologies: Linux Cluster
- Group experience with high-speed networks, distributed systems and clustering over Linux
- Source code availability: any part of the software can be modified to adapt it, correct it, locate problems or add instrumentation.
- Homogeneous licence (GPL): easy legal treatment (versus, e.g., the current MPEG4 situation, where each component has a different licence, and their interaction is sometimes annoying and even contradictory).
- Compatibility: code developed with Linux can be ported without problems to Solaris, AIX, Tru64, IRIX, etc.
- Adherence to standards (POSIX 1003.*, SVID, 4.xBSD)
- Good performance
- Wide availability of development tools
- Support for different hardware platforms (x86, Alpha, SPARC, ARM, S/390 (zSeries), IA64, SH3, MIPS...)

Lessons learned
- Need for flexibility: the hierarchical architecture is redefined as a composition of modules with a standard API.
  - The number of levels and the way they are composed should be flexible (e.g.: the whole server as data source of another server)
  - In each level: different protocols both for storage and for transfer
- Need for factorization: communication process abstraction (pipes, transfers), server reflection (VoDka Browser), monitoring (observer + mediator), etc.
- Independence with regard to protocols and file formats (at this time in constant change)
- Bottlenecks are not so much in the network as in the nodes: disk access
- TCP/IP is convenient for inter-module communication, but is a heavy protocol when dealing with hundreds of Mbit/s
- There is no efficient off-line massive storage with commodity hardware. Tape loaders are efficient but too expensive (IBM 3575: $15,000).

Design refinement. Current proposal (I)
Separation of the management subsystem (a common web application) and the video server kernel.
[Figure: Video on Demand kernel architecture. Labels: VODKA server (Smirnoff); relevant data; observer; Request1, Request2 (MO), Response1; servlets; XSLT; consults; management data (MO); agent; management subsystem.]
The figure represents the structure of the SMIRNOFF (Second Monitoring and Improved Release Now Offering Further Features) version.

Design refinement. Current proposal (II)
Internal transfers in VoDka after a request from a remote client.
[Figure: lookup and transfer messages inside VODKA. Labels: Monitor; Streaming Sched; Stream Group; RTP and H.263 frontends; Streamer Group with PIPE and TCP transfers; storage Sched and Group with (File) and (TAPE) drivers; DS1, DS2, DD1, DD2; lookup requests answered by lookupans(a), lookupans(b), lookupans(c) and the merged lookupans(a,b,c); scheduling functions f(state,mo)={ds1,dd2} and f(state,mo)=ds2; video stream; STREAM I/O and STORAGE I/O.]
A whole server can act as a data source for the storage level, using the appropriate input at the storage driver.
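The lookup fan-out and the scheduling function f(state, mo) named in the figure can be illustrated with a small sketch. It is not VoDka's implementation (which the slides place in Erlang/OTP). The class names, the COST table and the shape of `state` (a map from driver name to active transfers) are all hypothetical, and the sketch selects only a data source, whereas the slide's f can also return a destination.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDriver:
    """Hypothetical leaf of the storage level (e.g. a file or tape driver)."""
    name: str
    kind: str                      # "file" or "tape" in this sketch
    objects: set = field(default_factory=set)

    def lookup(self, mo):
        """Answer a lookup: this driver if it holds the object, else nothing."""
        return [self] if mo in self.objects else []


@dataclass
class StorageGroup:
    """Hypothetical group: forwards a lookup to members and merges answers."""
    members: list

    def lookup(self, mo):
        answers = []
        for member in self.members:
            answers.extend(member.lookup(mo))
        return answers


# Assumed relative access costs; the slides give no cost model.
COST = {"file": 1, "tape": 10}


def schedule(state, mo, group):
    """Toy f(state, mo): cheapest, then least-loaded, source holding mo."""
    candidates = group.lookup(mo)
    if not candidates:
        return None
    return min(candidates, key=lambda d: (COST[d.kind], state.get(d.name, 0)))
```

For example, with a file driver and a tape driver both holding object "a", `schedule({}, "a", group)` returns the file driver, and an object stored only on tape falls back to the tape driver. Because a group answers the same `lookup` call as a driver, groups can be nested, which mirrors the remark that a whole server can act as a data source.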

Design refinement. Current proposal (III)
[Figure: VODKA with monitors over a streaming level (Streaming Sched, Stream Group, RTP and H.263 frontends), n cache levels (Sched, Streamer Group, Group with (File) drivers) and a storage level (Sched, Group with (File) and (TAPE) drivers).]
The figure shows the flexibility of VoDka SMIRNOFF: 1 streaming level, n cache levels, and the massive storage level.
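One way to picture the "1 streaming level, n cache levels, massive storage" composition is a chain in which every level exposes the same fetch operation and uses the next level as its data source. The sketch below is a simplification under assumptions: the `Level` class and `build_chain` helper are hypothetical, whole objects are copied on a miss, and there is no eviction, scheduling or streaming.

```python
class Level:
    """One storage level; `backing` is the next (slower) level, or None."""

    def __init__(self, name, backing=None, contents=None):
        self.name = name
        self.backing = backing
        self.store = dict(contents or {})

    def fetch(self, mo):
        """Return (data, name of the level that served it)."""
        if mo in self.store:
            return self.store[mo], self.name
        if self.backing is None:
            raise KeyError(mo)
        data, origin = self.backing.fetch(mo)
        self.store[mo] = data      # keep a copy; no eviction in this sketch
        return data, origin


def build_chain(names, last_level_contents):
    """Build levels from fastest to slowest; the last one holds the contents."""
    level = Level(names[-1], contents=last_level_contents)
    for name in reversed(names[:-1]):
        level = Level(name, backing=level)
    return level
```

With `top = build_chain(["streaming", "cache1", "cache2", "tape"], {"movie": b"frames"})`, the first `top.fetch("movie")` is served by "tape" and the second by "streaming". Changing the number of cache levels only changes the list of names, which is the kind of flexibility the figure describes.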

Design refinement. Current proposal (IV)
[Figure: deployment of the proposal on the cluster. Hosts: borg24, borg25, covas, borg1, borg22, borg23. Components: Streaming Sched, Sched, Frontend, Streamer, Streamer Group, Group, (File) and (TAPE) drivers. Links: 10 Mb/s, 100 Mb/s, ATM.]

XSL transformations using the Cluster
- Transformation of XML documents (generated with OpenMonet)
- XSL selection based on the document type, browser, etc.: mod_xsl
- Adaptation of Sablotron, a C++ XSLT library: sablotron_adapter
- Load balancing: erlatron
[Figure: erlatron architecture. Labels: client; xslt; result; erlatron server; erlatron master; ready; erlatron slaves (pool of workers); port; C adapter; sablotron library.]
[Plot: requests per second versus number of slaves (P), for ab -c 10, ab -c 15, ab -c 25 and ab -c 50, each with -n 500.]
[Plot: execution time (s) versus concurrency level (C), for ab -c C -n 250.]
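The master/slave labels in the erlatron figure ("ready", "xslt", "pool of workers") suggest a dispatcher that hands each request to an idle worker. The following is a minimal sketch of that idea only, assuming FIFO order for both idle slaves and waiting requests. It is not erlatron's code, and the `Master` class and its method names are hypothetical.

```python
from collections import deque


class Master:
    """Toy dispatcher: pairs incoming requests with slaves that said 'ready'."""

    def __init__(self):
        self.ready = deque()     # ids of idle slaves, oldest first
        self.pending = deque()   # requests waiting for a free slave
        self.assigned = []       # log of (slave id, request) pairs

    def slave_ready(self, slave_id):
        """A slave reports it is idle: give it work or park it."""
        if self.pending:
            self._assign(slave_id, self.pending.popleft())
        else:
            self.ready.append(slave_id)

    def request(self, req):
        """A client request arrives: dispatch now or queue it."""
        if self.ready:
            self._assign(self.ready.popleft(), req)
        else:
            self.pending.append(req)

    def _assign(self, slave_id, req):
        # A real system would send the request to the slave here.
        self.assigned.append((slave_id, req))
```

With two idle slaves and three requests, the first two requests are dispatched immediately and the third waits until a slave reports ready again. In such a scheme no slave receives a second job before finishing its first, so adding slaves can raise throughput until some other resource saturates.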

Conclusions and ongoing work
- Need for flexibility
- Need for factorization
- Independence with regard to protocols and formats
- Bottleneck: disk access
- TCP/IP: heavy protocol
- No commodity massive off-line storage exists
- Prototype deployment on the university campus
- Study of the adaptation and deployment of a prototype within the R network

Funding
Related projects:
- FEDER 1FD97-1759 (VoD server development) (technological partner: R)
- XUNTA PGIDT99COM10502 (performance evaluation)
- University of A Coruña 2000-5050252026 (design patterns in distributed functional systems)
Infrastructure:
- Xunta de Galicia. Infrastructure 1998. Borg
- Xunta de Galicia. Infrastructure 2001. BorgTNG

Web resources
- LFCIA Home Page. http://www.lfcia.org
- VoDka Project Home Page. http://vodka.lfcia.org
- OpenMonet Home Page. http://monet.sourceforge.net
- mod_xsl Home Page. http://modxsl.sourceforge.net
A Coruña, September 12-14, 2001. http://www.lfcia.org/sit01