Scalable Internet/Scalable Storage Seif Haridi KTH/SICS
Interdisk: The Big Idea
  Internet is global data communication. Can it be used by all, including my mum?
All my data in one place
  - Sharing an Excel file
  - Watching a movie
  - Sharing photos
  - Working with Word
ALL MY USUAL APPS: VIDEO DISTRIBUTION, CLOUD SERVICES, STORAGE & BACKUP
A great interface (you won't even notice it)
  Spot the difference
The Interdisk advantages
  - Interdisk lowers the cost to store and transport data
  - Interdisk significantly improves data throughput
  - Interdisk uses standard interfaces
The ideal solution is a hybrid
  Best use of centralized and distributed resources
Google GDrive, Microsoft SkyDrive...
  Centralized storage: high cost & low performance
Interdisk
  Distributed hard drive: low cost & high performance
Example: Cost of Cloud Services
  Amazon S3 pricing, April 2009. Storage: 100 TB. Transport: from 10 TB to 500 TB.
  [Chart: total cost per month ($ thousands, 0 to 90), split into cost of storage and cost of transport, ranging from low data intensity services (e.g. backup) to high data intensity services (e.g. network PVR)]
  Keeping data close to apps and users makes a lot of sense!
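
The arithmetic behind this kind of chart can be sketched as below. This is only an illustration: the function name and the flat per-GB prices are hypothetical placeholders, not Amazon's April 2009 price list, and real pricing is tiered.

```python
def monthly_cost(storage_tb, transport_tb, storage_price_per_gb, transport_price_per_gb):
    """Return (storage_cost, transport_cost, total_cost) in dollars per month.

    Assumes flat per-GB prices and decimal terabytes (1 TB = 1000 GB).
    """
    gb_per_tb = 1000
    storage_cost = storage_tb * gb_per_tb * storage_price_per_gb
    transport_cost = transport_tb * gb_per_tb * transport_price_per_gb
    return storage_cost, transport_cost, storage_cost + transport_cost


# Hypothetical placeholder prices (dollars per GB), chosen only for illustration.
STORAGE_PRICE = 0.10
TRANSPORT_PRICE = 0.10

# Storage fixed at 100 TB, transport swept from 10 TB to 500 TB as on the slide.
for transport_tb in (10, 100, 250, 500):
    storage, transport, total = monthly_cost(100, transport_tb, STORAGE_PRICE, TRANSPORT_PRICE)
    share = transport / total
    print(f"transport={transport_tb:>3} TB  total=${total:>9,.0f}  transport share={share:.0%}")
```

With storage held constant, the transport term grows linearly with data intensity, so under these assumptions it ends up dominating the bill for high-intensity services.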
Distribute the Head, centralize the Tail
  [Figure: data intensity across products & services, divided into HEAD and TAIL]
The result: 40%-60% cost reduction, plus better performance
  Reducing cost of storage and cost of transport
  [Figure: cost of the centralized solution vs. cost of the hybrid solution, plotted against data intensity across products & services (HEAD and TAIL)]
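
One way to read "distribute the Head, centralize the Tail" is as a placement rule: rank services by data intensity and serve the top ones from distributed resources. The sketch below is a minimal illustration of that idea. The function name, the `head_fraction` cut-off, and all intensity values are hypothetical; the slides do not say how the boundary between head and tail is chosen.

```python
def place_services(intensity_by_service, head_fraction=0.2):
    """Map each service to 'distributed' (head) or 'centralized' (tail).

    intensity_by_service: dict of service name -> data intensity (any consistent unit).
    head_fraction: hypothetical share of services treated as the head.
    """
    ranked = sorted(intensity_by_service, key=intensity_by_service.get, reverse=True)
    head_size = max(1, round(len(ranked) * head_fraction)) if ranked else 0
    return {
        name: "distributed" if rank < head_size else "centralized"
        for rank, name in enumerate(ranked)
    }


# Illustrative intensities only; "network_pvr" and "backup" echo the slide's examples.
services = {
    "network_pvr": 5.0,
    "video": 3.0,
    "photo_sharing": 0.8,
    "documents": 0.3,
    "backup": 0.1,
}

for name, placement in place_services(services, head_fraction=0.4).items():
    print(f"{name:>14}: {placement}")
```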
Approach
  The right structured overlay networks (P2P) radically improve the performance of high-volume storage and transport services such as:
  - Online storage
  - CDN
  - Video services
  - Cloud services
Unstructured vs. Structured overlays
  Running streaming services on P2P networks designed for file sharing, such as BitTorrent, is a bad idea. They are designed for a different purpose and create, from a network perspective, unstructured overlays with characteristics that are unsuitable for commercial streaming:
  - Download first, watch later
  - Increased network traffic
  - Latency issues resulting in poor QoS
  Unstructured overlay, e.g. BitTorrent:
  - Random peers contribute: poor utilization
  - Long chains, many hops: bad latency
  - Topology unaware: increased traffic
  Structured overlay:
  - All local peers contribute and chains are short
  - Good QoS
  - Reduced network traffic
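
The contrast between topology-unaware and topology-aware peer choice can be sketched as follows. This is not the overlay algorithm from the talk, only a toy comparison: the `Peer` fields, the ISP labels, and the round-trip times are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    peer_id: str
    isp: str        # hypothetical label for the peer's network
    rtt_ms: float   # hypothetical measured round-trip time


def pick_random(peers, k, rng):
    """Topology-unaware choice: any k peers, regardless of location."""
    return rng.sample(peers, min(k, len(peers)))


def pick_local(peers, k, my_isp):
    """Topology-aware choice: same-ISP peers first, then lowest round-trip time."""
    ranked = sorted(peers, key=lambda p: (p.isp != my_isp, p.rtt_ms))
    return ranked[:k]


peers = [
    Peer("p1", "isp_a", 8.0),
    Peer("p2", "isp_b", 95.0),
    Peer("p3", "isp_a", 12.0),
    Peer("p4", "isp_c", 140.0),
    Peer("p5", "isp_a", 20.0),
    Peer("p6", "isp_b", 60.0),
]

rng = random.Random(0)
for label, chosen in (
    ("random", pick_random(peers, 3, rng)),
    ("local", pick_local(peers, 3, "isp_a")),
):
    cross_isp = sum(p.isp != "isp_a" for p in chosen)
    mean_rtt = sum(p.rtt_ms for p in chosen) / len(chosen)
    print(f"{label:>6}: cross-ISP links={cross_isp}  mean RTT={mean_rtt:.1f} ms")
```

Preferring nearby peers keeps traffic inside one network and shortens round trips, which is the effect the slide attributes to topology awareness.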
The ideal solution is a hybrid
  Best use of centralized and distributed resources
Autonomic Properties
  Self-Healing
  - Maintains the structure of the overlay network and makes sure nodes receive/store the requested content
  Self-Optimization
  - The network self-organizes in such a way that the following aspects of the distribution are optimized:
    - Bandwidth utilization
    - Connectivity
    - Data locality
    - Delay minimization
    - ISP friendliness (peering costs)
Techniques
  - DHTs (distributed hash tables)
  - Gossip algorithms
  - Decentralized optimization techniques
  - Decentralized auction-based optimizations
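
As a rough illustration of the DHT idea, the sketch below places nodes and keys on a hash ring (consistent hashing) and assigns each key to the first node clockwise from it. It is a single-process toy, not a networked DHT, and not the system described in the talk. The class name, node names, and key names are hypothetical.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Hash a node name or key to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hashing ring: a key belongs to the first node clockwise."""

    def __init__(self, nodes=()):
        self._positions = []   # sorted node positions
        self._owner = {}       # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def remove(self, node):
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def lookup(self, key):
        if not self._positions:
            raise LookupError("ring has no nodes")
        # First node at or after the key's position, wrapping around at the end.
        i = bisect.bisect_left(self._positions, ring_position(key)) % len(self._positions)
        return self._owner[self._positions[i]]


demo_ring = HashRing(["node-a", "node-b", "node-c"])
demo_keys = [f"file-{i}" for i in range(10)]
owners_before = {k: demo_ring.lookup(k) for k in demo_keys}

# Simulate a node leaving: only the keys it owned should move.
demo_ring.remove("node-b")
moved = [k for k in demo_keys if demo_ring.lookup(k) != owners_before[k]]
print(f"{len(moved)} of {len(demo_keys)} keys changed owner after node-b left")
```

When a node leaves, only the keys it owned are reassigned, so the rest of the mapping stays stable. That property is one reason this structure suits an overlay that has to heal itself as nodes come and go.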