Scality RING
High-performance Storage Software for Email Platforms, StaaS and Cloud Applications
Friday, March 18, 2011
MARKET: Exponential Storage Demand
The Digital Universe: growing by a factor of 44 in the next 10 years (IDC).
Storage demand grows 50% per year.
Storage demand for unstructured data is growing at 70% per year.
There is no sign of slowing down.
Source: IDC Digital Universe Study, May 2010.
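As a quick sanity check on the IDC figure, a factor of 44 over 10 years can be converted into an equivalent compound annual growth rate g (the symbol g is introduced here only for this check):

```latex
\[
(1 + g)^{10} = 44
\quad\Longrightarrow\quad
g = 44^{1/10} - 1 = e^{\ln(44)/10} - 1 \approx e^{0.378} - 1 \approx 0.46
\]
```

That is roughly 46% per year, which is in line with the 50% per year quoted for storage demand.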
MARKET: Why Cloud Storage?
Amazon S3 is 0.15 USD/GB/month, that's $5.40 per GB over 3 years. With SAN, cost/TB increases beyond 300 TB. With data growth, there is no other choice!
Free up cash: For enterprises and service providers, storage represents 30% of IT investment. With cloud storage they invest and pay as they consume.
Reduce cost: By standardization and economies of scale. Same principles as virtualization of servers, applied to storage.
Increase data availability: Even the best systems do fail! Tape backup proved to be cumbersome and unreliable. Cloud storage offers a different approach: multiple live copies.
Be able to compete: Leverage the decreasing price of generic hardware. See «Storage bubble about to burst».
According to IDC, 70% of data growth comes from «unstructured content», an ideal use case for cloud storage.
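The 3-year figure on this slide is simple arithmetic: 0.15 USD per GB per month times 36 months. A minimal sketch, assuming the price stays flat for the whole period and that 1 TB is counted as 1000 GB (both are assumptions made here, and `storage_cost` is a hypothetical helper name):

```python
# Assumption: flat list price of 0.15 USD per GB per month, as quoted on the slide.
PRICE_PER_GB_MONTH = 0.15
MONTHS = 36  # 3 years


def storage_cost(gigabytes, price=PRICE_PER_GB_MONTH, months=MONTHS):
    """Total cost in USD of keeping `gigabytes` stored for `months` months."""
    return round(gigabytes * price * months, 2)


per_gb = storage_cost(1)      # cost of 1 GB over 3 years
per_tb = storage_cost(1000)   # cost of 1 TB (taken as 1000 GB) over 3 years
print(f"1 GB over 3 years: ${per_gb:.2f}")
print(f"1 TB over 3 years: ${per_tb:.2f}")
```

The same helper can be called with a different price or period to compare against an in-house cost per TB.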
HISTORY
Long-standing relationships with some of the world's most experienced service providers
HISTORY
Requirements came from customers: Comcast, Time Warner, Cox, Orange, Tiscali
Problem:
«Sharding of database» creates a hard association between application server and user.
Single point of failure: when a SAN / NAS / FC switch reboots, service is down for minutes or hours.
Amazon S3 is 0.15 USD/GB/month, that's $5.40 per GB over 3 years. With SAN, cost/TB increases beyond 300 TB.
Managing multiple SANs, volumes and tiering, or changing a disk on RAID, is complex, error-prone and costly.
Competition from Google.
Requirement:
A stateless system. Automatic index load distribution.
No component should ever cause a service loss.
Be able to compete. Leverage the decreasing price of generic hardware.
Ease of management: autonomic, policy-based, self-healing system.
Enabling new services: text search, photo recognition, transcoding.
Scality Ring
Scality Ring: Mission statement
Scality Ring is a unique cloud storage infrastructure software designed to provide:
The most scalable primary storage
The lowest cost of storage
The lowest cost of operation
Scality Ring Architecture
Scality Ring: US & PCT Utility patent
Multipurpose Storage System Based Upon a Distributed Hashing Mechanism with Transactional Support and Failover Capability
Utility Patent #20100162035, available online.
Scalable fault-tolerant distributed storage
No central point
Fault tolerant
Elastic & scalable
Scality Ring: Provisional US patents
Probabilistic tiered storage engine for distributed hierarchical object storage devices
Provisional Patent #61285019
Automated tiered storage based on object access times and disk utilization
Supports multi-datacenter policies for disaster scenarios
Scality Ring: Technology Overview
Connectors: Interface to the outside/applications. Zimbra/Dovecot mail system connector. Native web service interface (REST API). Amazon S3 compatible interface. C API for custom connectors.
Storage Nodes: Building block of the pool. Provide both computing and storage resources to the cloud. Fully distributed addressing using peer to peer.
IO Daemon: Low-level storage hypervisor. High-performance IO operations. Can use different types of storage hierarchically, for example SSD, local disks (SAS or SATA) and iSCSI at the same time.
Connectors
Typical multi-application deployment with tiered storage
Connectors: Open platform for easy integration
Native Ring storage access protocol is based on HTTP REST.
C SDK provides the best performance and is used internally to develop connectors.
File system emulation is optimized for blob-type access, not for random IO.
Scality Public/Private Cloud Solution
Scality Store: REST Storage Service (S3 compatible protocol) with access control and billing mechanisms.
A 2nd tier can add options and can use any cost-effective technology (private or public).
Connectors: Scality Open Source Program
$100,000 bounty for developers
Mission: promote the transition to object-based cloud storage by providing a structured API to simplify application developers' jobs and address key user concerns.
Improve interoperability of applications and cloud storage services.
Create a canonical set of API features with the help of the Open Source community.
Jump-start existing applications with a Bounty Program.
Learn more at http://scop.scality.com!
Tiered & Geo-Redundant Model
Synchronous or asynchronous redundancy
Latency control with caching and multi-tiering
Connectors
Policy driven: Per-object class of storage. Control the number of replicas. Control where an object is stored in a multi-ring scenario. Compress or encrypt an object transparently for the application.
Design: Based on a JavaScript language compiled down to LLVM bitcode. Hooks on all storage commands: GET, PUT, DELETE, LOCK.
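The idea of policy hooks on storage commands can be illustrated with a small sketch. This is not Scality's policy language (which, per the slide, is a JavaScript language compiled to LLVM bitcode); it is a hypothetical Python model in which `PolicyEngine`, the request fields, and the "mail" object class are all invented for illustration. Only PUT and GET are exercised here.

```python
import zlib

COMMANDS = ("GET", "PUT", "DELETE", "LOCK")  # commands named on the slide


class PolicyEngine:
    """Hypothetical in-memory store that runs registered hooks per command."""

    def __init__(self):
        self._hooks = {command: [] for command in COMMANDS}
        self._store = {}  # key -> request dict as stored

    def hook(self, command):
        """Decorator that registers a policy function for one command."""
        def register(fn):
            self._hooks[command].append(fn)
            return fn
        return register

    def put(self, key, data, object_class="default"):
        request = {"key": key, "data": data, "class": object_class,
                   "replicas": 1, "compressed": False}
        for fn in self._hooks["PUT"]:
            fn(request)  # policies may rewrite the request in place
        self._store[key] = request

    def get(self, key):
        request = dict(self._store[key])  # copy, so stored data is untouched
        for fn in self._hooks["GET"]:
            fn(request)
        return request["data"]


engine = PolicyEngine()


@engine.hook("PUT")
def choose_replicas(request):
    # Per-object class of storage: more replicas for the (made-up) "mail" class.
    request["replicas"] = 3 if request["class"] == "mail" else 1


@engine.hook("PUT")
def compress(request):
    # Compression is transparent: the application never sees compressed bytes.
    request["data"] = zlib.compress(request["data"])
    request["compressed"] = True


@engine.hook("GET")
def decompress(request):
    if request["compressed"]:
        request["data"] = zlib.decompress(request["data"])
        request["compressed"] = False
```

An application would call `engine.put("msg-1", payload, "mail")` and later `engine.get("msg-1")`, receiving the original bytes while the stored copy is compressed and tagged with a replica count of 3.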
Most scalable primary storage
Virtual Ring
20-byte key space
Each store node has an automatically assigned key
Peer-to-peer addressing
No central point!
Very efficient O(log(n)) algorithms
No inherent limit on the number of servers, by design
Store node
Simple key/value store
Very high performance
Autonomous
Truly fully distributed architecture: DHT & consistent hashing
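A 20-byte key space contains 2^160 possible keys. The sketch below assumes, purely for illustration, that keys are derived with SHA-1, which happens to produce 20-byte digests; the slides do not say how keys are actually derived, and `object_key` is a hypothetical helper.

```python
import hashlib

KEY_BYTES = 20
KEY_SPACE = 2 ** (8 * KEY_BYTES)  # 2**160 distinct keys


def object_key(name: str) -> int:
    """Map a name to an integer position in the 20-byte key space.

    Assumption: SHA-1 is used only because its digest is 20 bytes long.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


key = object_key("mailbox/alice/message-0001")
print(f"key space size: 2**{8 * KEY_BYTES}")
print(f"example key:    {key:040x}")  # 40 hex digits = 20 bytes
```

Because every node and every object lives in the same key space, any participant can compute where an object belongs without consulting a central index.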
Consistent hashing
Nodes are arranged in a 360-degree ring, also called a key space.
Each node is responsible for a piece of that ring, i.e. a key range.
When a node is added or lost, only about 1/n of the keys are affected.
Scalable design
Elastic clustering
Physical servers are present as multiple virtual instances, i.e. losing a node spreads its load to multiple physical servers.
Built-in load balancing
Any node can be queried for any key.
Redirection to the right node in no more than 1/2 log2(n) hops:
100 servers: 3 hops maximum
1,000 servers: 5 hops maximum
10,000 servers: 7 hops maximum
No bottleneck or central location in the architecture
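The properties on this slide can be demonstrated with a generic consistent-hashing ring. This is a textbook sketch, not Scality's implementation: the `Ring` class, the use of SHA-1, and the choice of 64 virtual instances per server are assumptions made for the example. The `max_hops` helper simply evaluates the slide's 1/2 log2(n) bound, rounded to the nearest integer.

```python
import bisect
import hashlib
import math


def key_of(label: str) -> int:
    """Position of a label on the ring (assumption: SHA-1 based)."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


class Ring:
    """Generic consistent-hashing ring with virtual instances per server."""

    def __init__(self, servers, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (ring position, server name)
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            bisect.insort(self._points, (key_of(f"{server}#{i}"), server))

    def remove(self, server):
        self._points = [p for p in self._points if p[1] != server]

    def lookup(self, label):
        """Return the server owning the first ring position at or after the key."""
        index = bisect.bisect_left(self._points, (key_of(label), ""))
        return self._points[index % len(self._points)][1]  # wrap around


def max_hops(n: int) -> int:
    """The slide's routing bound, 1/2 * log2(n), rounded to the nearest integer."""
    return round(0.5 * math.log2(n))


servers = [f"server-{i}" for i in range(10)]
labels = [f"object-{i}" for i in range(2000)]

ring = Ring(servers)
before = {label: ring.lookup(label) for label in labels}
ring.add("server-10")  # grow the ring from 10 to 11 servers
moved = [label for label in labels if ring.lookup(label) != before[label]]
print(f"keys moved after adding a server: {len(moved) / len(labels):.1%}")
print([max_hops(n) for n in (100, 1000, 10000)])  # [3, 5, 7]
```

Only the keys that now belong to the new server change owner; everything else stays where it was, which is why growing or shrinking the ring touches roughly 1/n of the data.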
Replication
Objects are replicated on separate physical servers, guaranteed.
Replica keys are simple projections and do not need to be stored in a central database.
Between 0 and 5 replicas per object
Fault tolerance and data safety
Data replication
Self healing
Balance misplaced objects (adding nodes)
Transparently proxy misplaced objects
Rebuild missing replicas
CRC checksum of all contents
Example keys:
Object: 11
Replica 1: 44
Replica 2: 77
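The example keys on this slide (11, 44, 77) are consistent with replicas being placed at equal offsets around the ring. The sketch below assumes that interpretation, along with an illustrative key space of 100 positions; the actual projection function is not given in the slides, and `replica_keys` is a hypothetical helper.

```python
MAX_REPLICAS = 5  # the slide allows between 0 and 5 replicas per object


def replica_keys(key: int, replicas: int, key_space: int) -> list:
    """Project an object key to its replica keys by equal steps around the ring.

    Assumption: replicas are spaced key_space // (replicas + 1) apart, so the
    object and its replicas are spread evenly. No lookup table is needed,
    because every replica key is recomputed from the object key alone.
    """
    if not 0 <= replicas <= MAX_REPLICAS:
        raise ValueError("replicas must be between 0 and 5")
    step = key_space // (replicas + 1)
    return [(key + i * step) % key_space for i in range(1, replicas + 1)]


print(replica_keys(11, 2, 100))  # [44, 77]
```

Since any node can recompute these keys, a missing replica can be detected and rebuilt without a central database of replica locations.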
Lowest cost of storage
Scality Ring: Lowest cost of storage
Hardware agnostic (standard x86 hardware or NAS/SAN technologies)
Benefit from latest disk technologies
Mix different hardware
Flexible software-based replication model
All-included solution
Tiering (old and unused data on slow/cheap disks)
Rack aware / Geo redundant
Scalable
Key Allocation
Storage node keys are pre-computed according to admin constraints:
Number of physical server failures to tolerate
Number of simultaneous hard drive failures to tolerate
Maximum number of copies
Ring topology: multi-site, rack aware, etc.
Provides a mathematically provable fault tolerance
No overhead at runtime and no central database
Effective use of replication where another system would require twice as many copies for the same server redundancy.
Lowest cost of operations
Scality Ring: Lowest cost of operation
Built-in central management platform
Automated self-healing tasks
Scalable operational processes, lowest ratio of admin/TB
Programmatic control and statistics (SNMP, web services and command line interface)
No single point of failure (you'll deal with outages the next business day)
Upgrades (can be staged, as every node is independent)
Stationary configuration (once in production, only basic maintenance tasks)
SLA possible for availability, performance, service
Supervisor snapshot
Administration
Supervisor: Central management platform. Monitor application connectors and storage nodes, down to individual disk drives. Passive component.
RingSH: Command line interface. Easy to script with. Manage platform, store/retrieve/delete objects. List all keys.
RING Supervisor
Experience from live systems
Plan well, don't react.
System has a lot of inertia: things don't break easily.
Housekeeping task planning is your biggest leverage.
Monthly KPI review & watch trends vs. single events.
Cap, grow, monitor, optimize, step & repeat.
Hardware failures: MTBF is short (to be expected with commodity systems) but does not impact service.
Keep it simple: deliver business requirements but don't over-engineer.
Conclusion
Conclusion
Scality Ring, an object storage platform (built for unstructured content)
Cost effectiveness: delivering on economic and operational promises
Ring allows for flexible architectures (fits existing deployment model and allows non-disruptive growth and architecture evolution)
Availability (99.99% to 99.999999..%)
Performance (tiering, caching, geo redundancy)
Scalability to exabyte platforms
Migration from any kind of platform
No hardware lock-in
Scality has a team experienced in operating systems at scale in this industry
Ring is proven in production today (mail environment platforms, multiple cloud storage providers)
CUSTOMERS: Our customers voted for us!
Email Providers: Delivering 50% TCO reduction
First customer operational with more than 2 million mailboxes and billions of objects processed
Telenet, full deployment of messaging platform based on Zimbra
Complete ROI study, we deliver 50% reduction in TCO
Full support of Zimbra, OpenXchange, Dovecot and others...
Cloud Storage Providers: Enabling a new business
Flexible business models, low initial investment
Full S3 compatible API
Amazon holds 40% market share and other large providers 20%, which leaves 40% of a $5 billion market in 2015 available
Production cost is half of Amazon's price
Droplets to generate more applications compatible with cloud storage