Distributed Caching Algorithms for Content Distribution Networks
Sem Borst, Varun Gupta, Anwar Walid
Alcatel-Lucent Bell Labs, CMU
BCAM Seminar, Bilbao, September 30, 2010
Introduction

Scope: personalized/on-demand delivery of high-definition video through the service provider
- CatchUp TV / PauseLive TV features
- NPVR (Network Personal Video Recorder) capabilities
- Movie libraries / VoD (Video-on-Demand)
- User-generated content

The unicast nature defies the conventional broadcast TV paradigm.
Caching strategies

Focus: hierarchical network architecture. Store popular content closer to the network edge to reduce traffic load, capital expense, and performance bottlenecks.

Example placement across the hierarchy (fraction of content stored / traffic served):
- VHO: 40% stored, 5% served
- IO: 30% stored, 20% served
- CO: 20% stored, 30% served
- DSLAM: 9% stored, 35% served
- STB: 1% stored, 10% served
Caching strategies (cont'd)

Typically there are caches installed at only one or two levels:
- VHO: 60% of content stored (or 100%), 25% of traffic served
- IO: no cache
- CO: 30% stored, 30% served
- DSLAM: 10% stored, 45% served
- STB: no cache
Caching strategies (cont'd)

Two interrelated problems:
- Design: optimal cache locations and sizes (joint work with Marty Reiman)
- Operation: efficient (dynamic) placement of content items
Popularity statistics

Cache effectiveness strongly depends on locality / commonality in user requests.

[Figure: request frequencies versus content item ranks 1, 2, 3, ..., N]
Popularity statistics (cont'd)

Empirical data suggest that rank statistics resemble a Zipf-Mandelbrot distribution. The relative frequency of the n-th most popular item is

    p_n = H / (q + n)^α,   n = 1, ..., N,

with
- α ≥ 0: shape parameter
- q ≥ 0: shift parameter
- H = [Σ_{n=1}^{N} 1/(q + n)^α]^{-1}: normalization constant

The ideal hit ratio for a cache of size B ≤ N is

    R = Σ_{n=1}^{B} H / (q + n)^α
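As an illustration (not part of the talk), the popularities and the ideal hit ratio can be computed directly from these formulas; the parameter values below are arbitrary:

```python
def zipf_mandelbrot(N, alpha, q=0.0):
    """Zipf-Mandelbrot popularities p_1 >= p_2 >= ... >= p_N."""
    weights = [1.0 / (q + n) ** alpha for n in range(1, N + 1)]
    H = 1.0 / sum(weights)            # normalization constant
    return [H * w for w in weights]

def ideal_hit_ratio(p, B):
    """Ideal hit ratio of a cache storing the B most popular items."""
    return sum(p[:B])

p = zipf_mandelbrot(N=10_000, alpha=0.8, q=10.0)
R = ideal_hit_ratio(p, B=1000)
```

Sweeping `alpha` in such a sketch reproduces the qualitative behavior discussed next: the steeper the distribution, the larger the hit ratio for a fixed cache size.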
Popularity statistics (cont'd)

Shape parameter α varies with content type, and strongly impacts cache effectiveness.

[Figure: hit ratio R as a function of the shape parameter α (from 0 to 2), for cache sizes B = 100, 500, 1000 and a population of N = 10,000 content items]
Popularity statistics (cont'd)

The Zipf-Mandelbrot distribution is inherently static, and difficult to reconcile with dynamic phenomena:
- dynamic content ingestion and removal
- time-varying popularity, request-at-most-once behavior

This has both adverse and favorable implications:
- it requires agile caching policies and (implicit) popularity estimation, negatively affecting caching performance
- it causes the popularity distribution to be steeper (higher α values over shorter time intervals), improving potential caching effectiveness
Optimal content placement

Consider a symmetric scenario (cache sizes, popularity distributions). For now, assume a strictly hierarchical topology: content can only be requested from the parent node.

Caches should then be filled with the most popular content items from the lowest level up (STB, DSLAM, CO, IO, VHO).
Greedy content placement strategy

Whenever a node receives a request for an item, its local popularity estimate for that item is updated. If the requested item is not stored in the local cache, then
- the request is forwarded to the parent node
- the popularity estimate of the requested item is compared with that of the marginal item (the least popular stored item), which may then be evicted and replaced

Provable convergence to the optimal content placement.
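A minimal single-node sketch of this update rule (class and method names are ours; the forwarding to the parent is not modeled):

```python
from collections import defaultdict

class GreedyCache:
    """One node's cache under the greedy placement strategy,
    with request counts as the local popularity estimates."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.count = defaultdict(int)   # local popularity estimates
        self.stored = set()

    def request(self, item):
        """Handle a request; return True on a local hit."""
        self.count[item] += 1
        if item in self.stored:
            return True
        # Miss: the request would be forwarded to the parent node.
        # Compare the requested item's estimate with the marginal
        # (least popular) stored item, evicting it if beaten.
        if len(self.stored) < self.capacity:
            self.stored.add(item)
        else:
            marginal = min(self.stored, key=lambda m: self.count[m])
            if self.count[item] > self.count[marginal]:
                self.stored.remove(marginal)
                self.stored.add(item)
        return False
```

Over many requests the stored set drifts toward the locally most popular items, which is the intuition behind the convergence claim.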
Optimal content placement (cont'd)

The result relies on two strong (though reasonable) assumptions:
- symmetric popularity distributions and cache sizes
- strictly hierarchical topology

What if popularity distributions are spatially heterogeneous? Or what if content can be requested from peers as well?
Optimal content placement (cont'd)

Assume there are caches installed at only two levels.

[Figure: VHO / IO / CO / DSLAM / STB hierarchy, with the two cached levels highlighted]
Optimal content placement (cont'd)

Consider a cluster of M nodes at the same level in the hierarchy. Cluster nodes are either directly connected, or indirectly via a common parent node.

[Figure: root node above a parent node, with leaf nodes 1, 2, ..., M below]
Optimal content placement (cont'd)

Some notation:
- c_0: transfer cost from root node -1 to parent node 0
- c_i: transfer cost from parent node 0 to leaf node i
- c_ij < c_0 + c_i: transfer cost from leaf node j to leaf node i

Then

    f_ij := { c_0 + c_i              if j = i
            { c_0                    if j = 0
            { 0                      if j = -1
            { c_0 + c_i - c_ij > 0   otherwise (j ≠ -1, 0, i)

represents the transport cost savings achieved by transferring data to leaf node i from node j instead of from the root node.
Optimal content placement (cont'd)

The problem of maximizing the cost savings may be formulated as

    max  Σ_{i=1}^{M} Σ_{n=1}^{N} s_n d_in Σ_{j=0}^{M} f_ij x_jin            (1)
    sub  Σ_{n=1}^{N} s_n x_in ≤ B_i,    i = 0, 1, ..., M                    (2)
         x_jin ≤ x_jn,    i = 1, ..., M, j = 0, 1, ..., M, n = 1, ..., N    (3)
         Σ_{j=0}^{M} x_jin ≤ 1,    i = 1, ..., M, n = 1, ..., N             (4)

with B_i denoting the cache size of the i-th node, s_n the size of the n-th item, and d_in the demand for the n-th item at the i-th node.
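A small evaluation sketch for this objective (function names are ours). It builds the savings matrix f_ij from the slide's definition and scores a given placement, under the assumption that each leaf fetches every item from the available source with the largest savings:

```python
def make_f(c0, c, c_peer):
    """Savings f[i, j]: c[i-1] is the parent->leaf i cost,
    c_peer[i-1][j-1] the leaf j -> leaf i cost.
    Nodes: root = -1, parent = 0, leaves 1..M."""
    M = len(c)
    f = {}
    for i in range(1, M + 1):
        f[i, i] = c0 + c[i - 1]       # item held locally
        f[i, 0] = c0                  # fetched from parent
        f[i, -1] = 0.0                # fetched from root: no savings
        for j in range(1, M + 1):
            if j != i:
                f[i, j] = c0 + c[i - 1] - c_peer[i - 1][j - 1]
    return f

def total_savings(f, placement, s, d, M):
    """Objective (1) for placement[n] = set of nodes (0..M) storing
    item n, with each leaf using its best available source."""
    total = 0.0
    for n, nodes in placement.items():
        for i in range(1, M + 1):
            best = max((f[i, j] for j in nodes), default=0.0)
            total += s[n] * d[i, n] * max(best, 0.0)   # root is the fallback
    return total
```

This is an evaluator only; finding the maximizing placement is the optimization problem itself.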
Inter-level cache cooperation

Allow for heterogeneous demands, but assume c_ij = ∞, i.e., content can only be fetched from the parent node and not from peers. For compactness, denote c_min := min_{i=1,...,M} c_i.

Proposition. For arbitrary demands, the greedy content placement strategy is guaranteed to achieve at least a fraction

    [(M-1) c_min + M c_0] / [(M-1) c_min + (2M-1) c_0]  ≥  M / (2M-1)

of the maximum achievable cost savings.
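A quick numeric check of the proposition's bound (our sketch, exact rational arithmetic); note that the fraction tends to M/(2M-1) as c_min → 0 and to 1 as c_min grows:

```python
from fractions import Fraction

def greedy_guarantee(M, c_min, c0):
    """The proposition's fraction and its M/(2M-1) relaxation."""
    frac = Fraction((M - 1) * c_min + M * c0,
                    (M - 1) * c_min + (2 * M - 1) * c0)
    return frac, Fraction(M, 2 * M - 1)

frac, relax = greedy_guarantee(M=10, c_min=1, c0=2)
```

For M = 10, c_min = 1, c_0 = 2 this gives 29/47 ≈ 0.617, comfortably above the relaxed bound 10/19 ≈ 0.526.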
Intra-level cache cooperation

Now suppose content can be requested from peers as well.
- Intra-level connectivity allows distributed caches to cooperate and act as a single logical cache, and makes caching at lower levels more cost-effective.
- Greedy optimization of the local hit rate would lead to complete replication of cache content.
- Cache cooperation improves the aggregate hit rate across the cache cluster, at the expense of a lower local hit rate.
- The optimal trade-off and degree of replication depend on the cost of intra-level transfers relative to transfers from the parent or root node.
Intra-level cache cooperation (cont'd)

Assume symmetric transport costs, cache sizes and demands: B_i ≡ B, c_i ≡ c, c_ij ≡ c', and d_in ≡ d_n. For compactness, denote ĉ := M(c + c_0) - (M-1)c' (> c).

Problem (1)-(4) may then be simplified to

    max  Σ_{n=1}^{N} s_n d_n (ĉ p_n + (M-1) c' q_n + M c_0 x_0n)   (5)
    sub  Σ_{n=1}^{N} s_n x_0n ≤ B_0                                 (6)
         Σ_{n=1}^{N} s_n (p_n + (M-1) q_n) ≤ M B                    (7)
         p_n + x_0n ≤ 1,   n = 1, ..., N                            (8)
         q_n + x_0n ≤ 1,   n = 1, ..., N                            (9)

where p_n indicates that a copy of item n is stored in some leaf node, q_n that it is replicated in all leaf nodes, and x_0n that it is stored in the parent node. The problem has a knapsack-type structure.
Intra-level cache cooperation (cont'd)

The optimal solution of the content placement problem has a relatively simple structure. Distinguish between two cases:
- Mc ≥ (M-1)c': it is more advantageous to store un-replicated content in the leaf nodes than in the parent node
- Mc ≤ (M-1)c': it is more attractive to store un-replicated content in the parent node than in the leaf nodes

with c the transfer cost between parent and leaf node, and c' the transfer cost between two leaf nodes.
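The case split can be seen by comparing the savings per cached copy; a sketch (function name ours) using the terms of objective (5):

```python
def savings_per_copy(M, c, c_peer, c0, d_n):
    """Savings rates for the three placement options of item n:
    a single leaf copy (the ĉ d_n term), a parent copy (M c_0 d_n),
    and each extra leaf replica once one copy exists (c' d_n)."""
    single_leaf = (M * (c + c0) - (M - 1) * c_peer) * d_n
    parent = M * c0 * d_n
    extra_replica = c_peer * d_n
    return single_leaf, parent, extra_replica
```

Subtracting, single_leaf ≥ parent exactly when Mc ≥ (M-1)c', which is the condition separating the two cases above.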
Case Mc ≥ (M-1)c'

[Figure: root node, parent node, leaf nodes 1, 2, ..., M]

Four popularity tiers:
- Highly popular (red): replicated in all leaves, p_n = 1, q_n = 1
- Fairly popular (pink): stored in a single leaf, p_n = 1
- Mildly popular (yellow): stored in the parent node, x_0n = 1
- Hardly popular (green): stored in the root node only
Case Mc ≥ (M-1)c' (cont'd)

[Figure: threshold structure of the optimal solution, showing q_n, p_n and x_0n as functions of the item rank, with breakpoints n_0, n_1, n_2 marking the tier boundaries]
Case Mc ≤ (M-1)c'

[Figure: root node, parent node, leaf nodes 1, 2, ..., M]

Four popularity tiers:
- Highly popular (red): replicated in all leaves, p_n = 1, q_n = 1
- Fairly popular (pink): stored in the common parent, x_0n = 1
- Mildly popular (yellow): stored in a single leaf, p_n = 1
- Hardly popular (green): stored in the root node only
Case Mc ≤ (M-1)c' (cont'd)

[Figure: threshold structure of the optimal solution, showing q_n, p_n and x_0n as functions of the item rank, with breakpoints n_0, n_1, n_2 marking the tier boundaries]
Local-Greedy algorithm

For convenience, assume B_0 = 0 and s_n = 1 for all n = 1, ..., N. If a requested item is not stored in the local cache, then
- the item is fetched from a peer if cached elsewhere in the cluster, and otherwise from the root node
- the value of the requested item is compared with the marginal cache value, i.e., the value provided by the marginal item in the local cache, which may then be evicted and replaced

Value of item n:

    value(n) = { c' d_n          if stored elsewhere in cluster
               { (c + c_0) d_n   otherwise
Local-Greedy algorithm (cont'd)

The algorithm may get stuck in a suboptimal configuration.

[Figure: a globally optimal configuration versus a local optimum]

- Duplicating the red item is less valuable than storing a single yellow item.
- Duplicating the yellow item is less valuable than storing a single green item.
Local-Greedy algorithm (cont'd)

Performance guarantees (competitive ratios):
- Symmetric demands: within a factor 4/3 of optimal
- Arbitrary demands: within a factor 2 of optimal
Numerical experiments

- M = 10 leaf nodes, each with a cache of size B = 1 TB
- Unit transport costs c_0 = 2, c = 1, c' = 1
- N = 10,000 content items, with common size S = 2 GB
- Each leaf node can store K = B/S = 500 content items
Numerical experiments (cont'd)

Each leaf receives on average 1 request every 160 sec, i.e., the total request rate per leaf is ν = 0.00625 sec^{-1}. Popularities follow a Zipf-Mandelbrot distribution with shape parameter α and shift parameter q, i.e.,

    p_n = H / (q + n)^α,   n = 1, ..., N,

with normalization constant

    H = [Σ_{n=1}^{N} 1/(q + n)^α]^{-1}

The request rate for the n-th item at each leaf node is d_n = p_n ν.
Gains from cooperative caching

Compare the minimum bandwidth cost with that in two other scenarios:
- Full replication: each leaf node stores the K most popular items
- No replication: only a single copy of the M·K most popular items is stored, spread over the leaf nodes

Without caching, the bandwidth cost would be

    M ν S (c + c_0) = 10 × 0.00625 × 2 × 3 = 0.375 GB/s = 3 Gbit/s
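The no-caching figure can be checked mechanically; a quick sketch using exact rational arithmetic for the request rate:

```python
from fractions import Fraction

M, S = 10, 2              # leaf nodes; item size in GB
nu = Fraction(1, 160)     # requests per second per leaf (= 0.00625)
c, c0 = 1, 2              # unit transport costs

# Every request travels root -> parent -> leaf at cost c + c_0
cost_no_cache = M * nu * S * (c + c0)   # in GB/s
```

This evaluates to 3/8 GB/s, i.e., 0.375 GB/s or 3 Gbit/s, matching the slide.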
[Figure: bandwidth cost as a function of the shape parameter α for the various scenarios and cache sizes]
Some observations

- Caching effectiveness improves as the popularity distribution gets steeper: bandwidth costs markedly decrease with increasing values of α.
- Even when the collective cache space can only hold 10% of the total content, bandwidth costs reduce to a fraction of those without caching, as long as the value of α is not too low.
- The better of full and zero replication is often not much worse than the optimal content placement; however, neither full nor zero replication performs well across the entire range of α values.
- It is critical to adjust the degree of replication to the steepness of the popularity distribution; the Local-Greedy algorithm does just that.
Performance of Local-Greedy algorithm

The various leaf nodes receive requests over time, sampled from the Zipf-Mandelbrot popularity distribution. If a requested item is not presently stored, the node decides whether to cache it, and if so, which currently stored item to evict.

Three scenarios for the initial placement are distinguished:
- Full replication: each leaf node stores the 500 top items
- No replication: only a single copy of the 5000 top items is stored, spread over the leaf nodes
- Random: each leaf stores 500 randomly selected items

In the optimal placement, items 1 through 165 are fully replicated, and single copies of items 166 through 3515 are stored.
[Figure: performance ratio as a function of the number of requests, with static or dynamic popularities]
[Figure: bandwidth savings as a function of the number of requests, with inaccurate popularity estimates]
Some observations

- The Local-Greedy algorithm gets progressively closer to the optimum as the system receives more requests and replaces items over time.
- After only 3000 requests (against a total population of 10,000 items), the Local-Greedy algorithm has come to within 1% of the optimum, and stays there. It performs markedly better than the worst-case ratio of 3/4 might suggest.
- While the algorithm seems to converge for all three initial states, the scenario with no replication appears to be the most favorable one, due to the fact that in the optimal placement only items 1 through 165 are fully replicated.