Online Storage and Content Distribution System at a Large-scale: Peer-assistance and Beyond Bo Li Email: bli@cse.ust.hk Department of Computer Science and Engineering Hong Kong University of Science & Technology IEEE CCGrid @ Shanghai, May 20, 2009
Outline Online Storage and Content Distribution Objectives & Challenges Conceptual Design Space FS2You: Architecture & Mechanisms Measurement Results & Discussion Conclusion & Future Work
Online Storage and Content Distribution Online hosting services allow users to upload files, both small and large, onto dedicated servers, to be shared among a potentially large group of interested users
Online Storage and Content Distribution A new type of content distribution service: online storage and file sharing are becoming increasingly popular. Alexa: 1-click hosting ranks 17th in the world and accounts daily for 3.18% of global Internet users
Online Storage and Content Distribution Features compared with conventional P2P file sharing such as BitTorrent: better reliability and service guarantee; ease of use (a simple URL shared with others, one-click service); little or no software download and configuration
Online Storage and Content Distribution Files hosted in either CDNs or dedicated large data centers. Rapidshare: 1500 TB of storage in its data centers, 110 Gb/s. Skyrocketing server bandwidth costs (15~20 million USD yearly) lead providers to impose usage restrictions and/or paid services
Peer-Assisted Online Storage and Distribution Peer-assistance natural but non-trivial in design Balance two extremes - cost-performance tradeoff Server-based Distribution Guarantee file availability at the prohibitive cost of server bandwidth & storage P2P File Sharing Good scalability No guarantees on file availability A Seamless Integration Peer-assisted Online Storage and Distribution
Design Objectives Couple peer upload contribution & strategic server provisioning in a complementary manner. Improve file availability & users' downloading performance, while conserving substantial server costs. Conceptual Design: Design Space. Practical Implementation: FS2You
Challenges Large number of files with highly diverse popularity and different sizes Performance (availability) and user experience No or less restriction on user access Uploading (bandwidth) and downloading (storage and availability) Peer-assistance integration Limited or restricted server storage and bandwidth Semi-persistent file availability peer assistance to conserve server bandwidth costs maintain adequate levels of service quality & user experience
General Model & Performance Metrics Important performance metrics to characterize good online storage and distribution systems from different perspectives. Model: multiple files of diverse popularity & sizes; limited server storage; limited server bandwidth; peer assistance effectiveness; peer upload/download capacity (µ, c). File availability: attract & serve as many users as possible. System throughput: maintain as high downloading performance as possible
Design Space: Storage & Replacement Given a constrained server storage capacity, a server storage & replacement strategy determines which set of files is stored on the server. Problem abstraction: a classical 0-1 knapsack problem with respect to different objective functions
Design Space: Storage & Replacement To attract & serve as many users as possible To achieve the maximum system-wide throughput NP-complete can be solved using a dynamic programming algorithm with a complexity of The static nature not efficient to be used in practical systems Not suitable to be used for the eviction or replacement operation not only the dynamic evolution of user interests on currently stored files but also a continuous flow of newly uploaded files from users
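The dynamic programming approach mentioned on this slide can be illustrated with the textbook 0-1 knapsack recurrence. This is a minimal sketch, not the formulation used in the talk: it assumes integer file sizes (for example, in GB) and a hypothetical per-file profit such as expected request count; the function name `select_files` is illustrative only.

```python
from typing import List, Tuple


def select_files(sizes: List[int], profits: List[float],
                 capacity: int) -> Tuple[float, List[int]]:
    """Textbook 0-1 knapsack: choose files maximizing total profit
    without exceeding the server storage capacity."""
    n = len(sizes)
    # best[i][c] = maximum profit using the first i files within capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        size, profit = sizes[i - 1], profits[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip file i-1
            if size <= c:
                candidate = best[i - 1][c - size] + profit  # option 2: store it
                if candidate > best[i][c]:
                    best[i][c] = candidate
    # Walk the table backwards to recover which files were stored.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= sizes[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen
```

The table has one row per file and one column per unit of capacity, which is why this static formulation becomes expensive when files are continuously uploaded and popularity keeps shifting.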
Design Space: Storage & Replacement Simplicity & efficiency are more of a concern in practical system implementations and operations, at a cost of acceptable sub-optimal solution This provides a simple framework for server storage & replacement strategy each file with a profit-to-weight index: files are ranked in descending order by their indices obeying a greedy algorithm to determine those files with relatively high ranks are preferentially stored alternatively, can simply & efficiently identify those with lower ranks, and perform evictions/replacements whenever necessary
Design Space: Storage & Replacement Unify important aspects with tunable design knobs. K=0: unpopular-first-eviction strategy, maximize system throughput. K=1: balanced consideration between file popularity & size, maximize system-wide file availability. K in (0,1): various degrees of throughput & availability. Flexibly applied in practical systems: H_i is dynamically updated to adapt to the evolution of user interests; file ranking is performed periodically in either a fine- or coarse-grained manner; for eviction/replacement, either start from the files with the lowest ranks until a certain volume of files is evicted, or customize a threshold of H_i below which files are candidates for eviction
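The ranking-and-eviction procedure described above can be sketched as follows. The exact form of the index H_i is not recoverable from the slides; the sketch assumes H_i = popularity / size^K, which is consistent with the two endpoints described (K=0 ranks by popularity alone, K=1 weighs popularity against size) but is an assumption. The names `rank_index` and `pick_evictions` are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_index(popularity: float, size: float, k: float) -> float:
    """Assumed index: popularity / size**k (not taken from the source)."""
    return popularity / (size ** k)


def pick_evictions(files: Dict[str, Tuple[float, float]], k: float,
                   volume_to_free: float) -> List[str]:
    """Evict lowest-ranked files until at least volume_to_free is released.

    files maps a file name to (popularity, size).
    """
    ranked = sorted(files, key=lambda name: rank_index(*files[name], k))
    evicted: List[str] = []
    freed = 0.0
    for name in ranked:
        if freed >= volume_to_free:
            break
        evicted.append(name)
        freed += files[name][1]
    return evicted
```

With K=0 the small but unpopular file is evicted first; with K=1 a large file of moderate popularity becomes the first candidate, because it serves few users per unit of storage.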
Illustration: Applicability & Flexibility [Figure: file availability and system throughput (GB/second) versus k, with unpopular-first eviction at k = 0] Opportunity to achieve both high availability & throughput. [Figure: file availability versus server storage capacity (GB) for k = 0, 0.4, 1, 2] With more emphasis on file availability, a real-world system with this customization will be demonstrated later
Design Space: Bandwidth Allocation What is the optimal server bandwidth allocation across files to achieve the upper bound of system-wide average downloading rate? To maximize Problem abstraction: a classical continuous knapsack problem with bounded variables
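For reference, a continuous knapsack problem with bounded variables and a linear objective is solved exactly by a greedy fill in descending order of per-unit gain. The sketch below shows that generic technique only; the objective function from the talk is not recoverable here, so `gains` (benefit per unit of server bandwidth) and `caps` (upper bound per file) are hypothetical inputs.

```python
from typing import List


def allocate_greedy(gains: List[float], caps: List[float],
                    budget: float) -> List[float]:
    """Greedy solution of a bounded continuous knapsack.

    gains[i]: assumed benefit per unit of bandwidth given to file i
    caps[i]:  assumed maximum bandwidth file i can usefully absorb
    budget:   total server bandwidth available
    """
    alloc = [0.0] * len(gains)
    remaining = budget
    # Serve files in descending order of per-unit gain.
    for i in sorted(range(len(gains)), key=lambda i: gains[i], reverse=True):
        if remaining <= 0:
            break
        alloc[i] = min(caps[i], remaining)
        remaining -= alloc[i]
    return alloc
```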
Design Space: Bandwidth Allocation How to design a near-optimal allocation strategy, that is simple enough to be implemented in practical systems? Follow the guideline conveyed by the optimal strategy Allocate more server bandwidth to less popular files with lower peer assistance effectiveness, while allowing popular ones to largely rely on peer assistance rather than server
Design Space: Bandwidth Allocation A simple framework of server bandwidth allocation. Each file has a priority index P_i, inversely proportional to file popularity, implying peer assistance awareness. Relative weighting: allocate server bandwidth across files according to their relative weighting
Server-side design: bandwidth allocation Tunable design knobs: wide design spectrum. l=-1: request-driven strategy; popular files are provisioned with more server bandwidth; typically used in traditional server-based systems without peer assistance. l=0: water-leveling strategy; can practically work well in peer-assisted online storage and distribution systems. l > 0: mimics the optimal strategy. Easily applied in practical systems: P_i is periodically updated to adapt to the evolution of user interests; file popularity is simply captured by recording the file request count over a certain period; various degrees of file popularity awareness, implying peer assistance awareness
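The knob l can be illustrated with a small sketch. The weighting formula itself did not survive on the slides; the sketch assumes P_i = 1 / (request count) and a share proportional to P_i^l, which reproduces the behaviours listed (l=-1 favours popular files, l=0 splits evenly, larger l favours unpopular files). The function name `allocate_by_weight` is hypothetical.

```python
from typing import List


def allocate_by_weight(request_counts: List[int], total_bandwidth: float,
                       l: float) -> List[float]:
    """Split server bandwidth in proportion to P_i ** l.

    Assumes P_i = 1 / request_count (counts must be positive); the
    normalization by the sum of weights is also an assumption.
    """
    priorities = [1.0 / count for count in request_counts]
    weights = [p ** l for p in priorities]
    total = sum(weights)
    return [total_bandwidth * w / total for w in weights]
```

For two files with 100 and 10 requests and 110 units of bandwidth, l=-1 gives roughly (100, 10), l=0 gives (55, 55), and l=1 gives roughly (10, 100).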
Roxbeam Corp. Peer-to-Peer live streaming experiment - 2004 Coolstreaming (Google 1,000,000 entries in 2008) Xinyan Zhang, Jiangchuan Liu, Bo Li, and Peter Yum, Coolstreaming/DONet: a data-driven overlay network for peer-to-peer live media streaming, Proc. of IEEE Infocom 2005. Credited as the first large-scale Internet P2P live streaming system Roxbeam Inc. 2005 onward Softbank (Japan), VC, Xinyan Zhang, co-founder Wall Street Journal incident, Oct 2005, and PPLive Inc Legal content P2P streaming, Japan Yahoo BB (2006), Phoenix TV Online hosting service: FS2You system (2006-2007), Google 800,000 entries in 2009
FS2You: Architecture Tracking Server: channels (files) info & MD5; bootstrapping; list of peers in channels. Hosting Servers: upload/hosting, download; 60 servers. Peers: upload, download. A real-world large-scale peer-assisted online storage system. One of the most popular online hosting services in China
Peer Partnership & Content delivery Combine coarse-grained tracking servers & decentralized gossip protocol Periodic partnership update, resilient to peer dynamics Periodic status-report (Peer ID & IP) Content Periodic exchange of Block Maps (BMs) among peers enables them to locate the needed blocks Retrieve distinct blocks from multiple partners Request-from-server conditions No partners (unpopular file or connection fail) None of the partners hold the desired block Aggregate downloading rate from partners < 10 KB/second empirically determined to prevent peers from aggressively consuming server bandwidth
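The three request-from-server conditions listed on this slide can be sketched as a single predicate. The 10 KB/second threshold comes from the slide; the `Partner` record, its field names, and the function name are hypothetical and only illustrate the decision logic, not the FS2You client code.

```python
from dataclasses import dataclass
from typing import List, Set

RATE_THRESHOLD_KBPS = 10.0  # empirically determined threshold from the slide


@dataclass
class Partner:
    peer_id: str
    block_map: Set[int]   # indices of blocks this partner holds
    rate_kbps: float      # current download rate from this partner


def should_request_from_server(partners: List[Partner], block: int) -> bool:
    """Return True when the peer should fall back to the hosting server."""
    # Condition 1: no partners (unpopular file or connection failure).
    if not partners:
        return True
    # Condition 2: none of the partners holds the desired block.
    if not any(block in p.block_map for p in partners):
        return True
    # Condition 3: aggregate rate from partners is below the threshold.
    return sum(p.rate_kbps for p in partners) < RATE_THRESHOLD_KBPS
```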
Server-side Strategies: Uploading Service Not only provide online storage, but also cooperate with content distribution Uploading and storage services No size/format limitations attract millions of users 500GB~1TB content routinely uploaded per day One single copy of a file stored in one of the servers
Server-side Strategies: Downloading Service Complement peers to supply file blocks, especially to those suffering poor downloading rates How to properly satisfy a potentially large number of requests without incurring prohibitively high bandwidth costs? 1st block user experience Probabilistic service based on file popularity File popularity index inversely proportional to No. requests Peers in popular channels to largely rely on peer assistance rather than servers Allocate more server resources to unpopular files A specific design instance with the knob l >= 0
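A probabilistic serving rule of the kind described here might look like the sketch below. The slides state only that the index is inversely proportional to the number of requests; the cap at 1.0, the `scale` constant, and both function names are assumptions introduced for illustration.

```python
import random


def serve_probability(request_count: int, scale: float = 5.0) -> float:
    """Assumed rule: probability = min(1, scale / request_count).

    `scale` is a hypothetical tuning constant, not a value from the source.
    """
    if request_count <= 0:
        return 1.0  # no recorded demand: always serve
    return min(1.0, scale / request_count)


def server_accepts(request_count: int, rng: random.Random,
                   scale: float = 5.0) -> bool:
    """Draw once to decide whether the server answers this block request."""
    return rng.random() < serve_probability(request_count, scale)
```

Under this rule a file with a handful of requests is always served by the server, while requests in a popular channel are mostly declined and left to peer assistance.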
Trace Collection To evaluate the performance of FS2You, we have implemented a detailed logging mechanism 350 GB traces from 3.3 million users, from June 21 to July 18, 2008 Each peer reports activities & status to the log servers Download Event Summary (event-driven) Peer ID, Channel IDs, File size Time of open/close/completion Total downloaded volume, downloaded volume from servers File Source Snapshot (periodically, overhead/accuracy)
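The Download Event Summary fields listed above map naturally onto a small record type. This is a sketch of how such a log entry could be represented and summarized; the field names and the `peer_fraction` helper are hypothetical, and the ratio it computes is one plausible reading of peer contribution, not the metric definition used in the study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DownloadEvent:
    peer_id: str
    channel_ids: List[str]
    file_size: int                 # bytes
    opened_at: float               # timestamps in seconds
    closed_at: float
    completed_at: Optional[float]  # None if the download never completed
    total_bytes: int               # total downloaded volume
    server_bytes: int              # volume downloaded from servers

    def peer_fraction(self) -> float:
        """Share of the downloaded volume that came from peers."""
        if self.total_bytes == 0:
            return 0.0
        return (self.total_bytes - self.server_bytes) / self.total_bytes
```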
Measurement Study Peer assistance What are the typical peer dynamics & behaviors of both short & long period, and the implications on peer resource utilization? Which set of peers contributes most to the system? Peer dynamics and behavior Reflect user demand & Fine tune server strategies File Characteristics Service Quality User Experience File availability & downloading rate File size & type preferences File popularity & request/replica distribution Correlation with peer assistance effectiveness
Overall Scale & Performance: A large number of users Weekend Pattern Crash failures of log server
Overall Scale & Performance: Huge traffic volumes Up to 80% contributed by P2P alleviate server load Even during calm period, Conserve > 70% server bandwidth The architectural & protocol designs in FS2You can scale to a large number of peers, and can withstand the test of a tremendous volume of traffic (in the order of terabytes per day) over a long period of time The cost of server capacity has been substantially saved by peer assistance
File Characteristics: Popularity 47% compressed archives (e.g., zip/rar) most multimedia content 30% videos, 12% audio, 11% others Flatter than Zipf prediction Immutability of files, and the fetch-at-most-once behavior Well fitted with the stretched exponential distribution Useful for workload synthesis
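For workload synthesis, a rank-ordered stretched exponential curve can be generated directly. The sketch assumes one common parameterization, y_i^c = b - a ln(i) with b = 1 + a ln(N) so that the least popular file receives one request; the parameters `a` and `c` below are placeholders, not values fitted to the FS2You traces.

```python
import math
from typing import List


def stretched_exponential_counts(num_files: int, a: float,
                                 c: float) -> List[float]:
    """Request count per popularity rank under y_i**c = b - a*ln(i).

    b is chosen as 1 + a*ln(num_files), so the last rank gets ~1 request.
    a and c are hypothetical shape parameters.
    """
    b = 1.0 + a * math.log(num_files)
    return [(b - a * math.log(rank)) ** (1.0 / c)
            for rank in range(1, num_files + 1)]
```

Plotting these counts against rank on log-log axes gives a curve that bends below a straight Zipf line at the head, matching the "flatter than Zipf" observation.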
Correlation: File Popularity & Peer Assistance Effectiveness Adjacent-averaging smoothing In general, popular files enjoy higher peer assistance effectiveness Highly popular ones 80%~90% peer assistance effectiveness encouraging! Increasing noise variations in peer assistance as popularity decreases Interestingly, some less popular ones can also enjoy high peer assistance effectiveness, as some used to be popular with sufficient replicas among peers
Server Involvement & Service Quality Valley potential negative effects of the collaboration between current design of request-from-server threshold & server-side probabilistic serving strategy Both files that are completely supplied by servers and those that are mainly supported by P2P enjoy high average downloading performance Most experienced favorable downloading performance: Avg. 66 KB/s; Lowest > 40 KB/s
Further Exploration beyond FS2You's Customization Experiment using real-world data sets. [Figure: average downloading rate (KB/second) versus server bandwidth (GB/second) for the optimal strategy and l = 0, 0.5, 1, 4] There is still room for improvement in downloading performance
Conclusion Within the design space, the FS2You system works well in practice at a large scale, with evidence from an extensive measurement study. Stress-testing: the architectural & protocol designs in FS2You can scale to a large number of peers and withstand a tremendous volume of traffic over a long period of time. The cost of server capacity can be substantially reduced by peer assistance. The system provides high file availability & a satisfactory download experience to a large number of users with cost-effective server involvement
Conclusion Significant cost savings Server storage capacity vs. daily volumes (50-60 TB) Aggregate server-side bandwidth While FS2You represents a practical instance, the design space is not just restricted to this Conceptual design (guideline) vs. practical implementation Simplicity and engineering issues Tunable knobs offer the flexibility to adapt to a wide range of design preferences for service providers & designers
Future Work (1) Unveil Inefficiencies & explore the causes as the system scales up to a large population, there still exist channels (files), times & scenarios where & when the download experience is unsatisfactory discover issues that are counter-intuitive or hidden apply statistical tools/data analysis/mining techniques to traces Different design with both theoretical analysis & practical implementation Rate or throughput optimization vs. file completion per unit time
Future Work (2) Empirically determined parameters need to be fine-tuned based on both theoretical optimization & real-world experience. Interaction: the peer-side request-from-server threshold (a high threshold improves peer downloading rates, but may incur excessive load on the servers) and the server-side probabilistic supply strategy (helps to reduce such server load, but may sometimes leave peers that really need server help out in the cold). How to find an optimal strategy to balance both sides in large-scale systems?
Thanks! Q&A