Online Code Rate Adaptation in Cloud Storage Systems with Multiple Erasure Codes
Rui Zhu 1, Di Niu 1, Zongpeng Li 2
1 Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada. Emails: {rzhu3, dniu}@ualberta.ca
2 Department of Computer Science, University of Calgary, Calgary, Alberta, Canada. [email protected]

Abstract: Erasure codes have been adopted in cloud storage systems. While achieving higher reliability at lower storage overhead than replication, erasure codes usually incur a high reading cost when recovering an unavailable block. Although local reconstruction code constructions have been proposed to reduce recovery cost, additional benefits can be achieved by adopting erasure codes with different code rates for data blocks of different popularity. In this paper, we study the problem of code rate selection and adaptation in cloud storage systems that adopt multiple erasure codes via online learning. Unlike offline optimization, which requires knowledge or estimation of future demands, online learning algorithms make decisions based only on past observations and dynamically adapt to demand changes. To avoid solving a hard integer program, we perform a stochastic relaxation of the formulated online learning problem and solve it using an exponentiated gradient algorithm, resulting in sparse solutions. We show a regret bound of $O(\sqrt{T})$ for the proposed algorithm by showing that it is a special case of the FTRL online learning framework. Through trace-driven simulations based on real request traces from Windows Azure Storage, we show that our online algorithm performs close to the best fixed offline policy and trades off recovery cost during degraded reads against storage overhead.

I. INTRODUCTION

Cloud storage systems, e.g., the Hadoop Distributed File System (HDFS) [1], the Google File System (GFS) [2], and Windows Azure Storage (WAS) [3], store large amounts of data for both personal and enterprise-level users on commodity hardware in datacenters. However, these systems may suffer from node failures and frequent data unavailability due to software glitches, I/O hotspots, or local congestion on specific data nodes. To introduce fault tolerance, the first generation of cloud storage systems (e.g., HDFS) adopts 3-way replication, which replicates each data block on three different data nodes. Recently, erasure coding, e.g., a (k, r) Reed-Solomon (RS) code, has been adopted by many production systems, e.g., Windows Azure Storage, Google's ColossusFS and Facebook's HDFS, to offer higher reliability than replication at a lower storage cost [4], [5]. However, one major weakness of erasure-coded storage systems is the high reconstruction or recovery cost, which may lead to long request latency tails, as has been widely reported in previous studies [6]-[10]. For example, to expedite normal reads, most production systems today adopt a systematic (k, r) RS code, where a single original copy of each data block is stored in the system. As a result, if the data node storing the original copy of a requested block B becomes a hotspot and is temporarily unavailable due to congestion, a degraded read must be performed, reading k other (original or coded) blocks in the same coded group to reconstruct the content of block B. A degraded read can therefore incur k times the traffic of a normal read.
Due to the reasons above, most storage systems still adopt 3-way replication to store hot content, and convert data to erasure-coded storage only when it becomes cold, i.e., when the access demand for the data drops. A large number of prior studies have focused on designing new coding structures to reduce the number of reads required during reconstruction and degraded reads, e.g., [5], [6], [8], [11], [12]. For example, Local Reconstruction Codes (LRC) [6], [11], [12] have been proposed to reduce the number of blocks that must be read during reconstruction, while still keeping a low storage overhead. While all the solutions above trade storage overhead for lower recovery cost, it has been observed [7] that it is fundamentally inefficient to store all data blocks using a single erasure code: the storage modes of different data blocks should be differentiated depending on their demands. Based on this idea, a storage system based on two erasure codes, HACFS [7], has been proposed, where hot data are stored using a lower-rate code for lower recovery cost during degraded reads (yet with higher storage overhead), and cold data are stored using a higher-rate code for lower storage overhead (yet with higher recovery cost). It is shown that with two erasure codes, HACFS can further improve recovery performance over single-code storage systems at a comparable storage cost. However, what is lacking in the literature is an intelligent decision system, beyond simple threshold-based rules, that selects the right erasure code for different data blocks based on dynamically changing demands, in order to optimize the overall system cost. We generalize the two-erasure-coded system HACFS [7] to storage systems with multiple codes (including any erasure code, and replication as an extreme case), where each code is characterized by a certain storage overhead and a certain recovery cost. The objective is to select the best code for each coded group of data blocks on the fly to minimize an aggregate cost of degraded reads and
storage overhead, as the demand changes. In this paper, we propose an online learning framework to select the best code for each coded group based only on the demands observed in the past. In contrast to offline optimization, an online algorithm does not rely on knowledge or estimates of future demands and can make dynamic decisions in adaptation to demand changes. As selecting the best out of a finite number of codes involves integer programming and is a hard combinatorial problem, we perform a stochastic relaxation of the problem and propose an online learning algorithm based on exponentiated gradient descent that effectively pushes solutions toward discrete choices. We show that our algorithm is a special case of the Follow-the-Regularized-Leader (FTRL) online learning framework and has a regret bound of $O(\sqrt{T})$, i.e., the gap between the online cost and the cost of the best fixed code selection policy up to time period T, the latter of which makes the unrealistic assumption that all demands up to time T are known. We evaluate the proposed online code selection algorithm through trace-driven simulations based on real request rates of 252 blocks collected from Windows Azure Storage over a one-day period. We observe highly skewed demands across different blocks, which echo similar observations made in prior work [7], [9], [10], [13] and justify differentiating code rates between data blocks of different hotness to optimize the overall recovery and storage cost. Simulations based on the selection between two LRC codes show that our proposed online learning algorithm achieves a total degraded-read traffic cost close to that of the impractical offline optimal algorithm, while achieving an even lower storage overhead than the best fixed offline code selection policy.

II. THE ONLINE CODE SELECTION PROBLEM

In this section, we describe our model and formulate the online learning problem of dynamic code selection for different coded groups. We follow some conventions for mathematical notation throughout this paper. Vectors and matrices are mostly denoted by capital letters. A vector $V$ is a column vector by default, with its i-th entry denoted by $v[i]$. The entry at position $(i, j)$ of a matrix $A$ is denoted by $a[i, j]$.

A. Model Description

The content in a typical cloud storage system is stored in data blocks. In an erasure-coded storage system, a coded group $i$ consists of $k$ original data blocks and $r_i$ parity blocks which are encoded from the $k$ original blocks according to a certain coding construction, e.g., RS, LRC, etc. We assume that the storage system can support a finite set of $n$ erasure codes (possibly of different types), denoted as $\mathcal{C} = \{c_1, \ldots, c_n\}$. A degraded read occurs when the requested data block is unavailable and the storage system must reconstruct the content of this block from other available blocks. For each code $c_j \in \mathcal{C}$, we define the per-block recovery cost $l[j]$ as the minimum number of reads needed to retrieve an original block during a degraded read in a coded group adopting code $c_j$. It is worth noting that the per-block recovery cost of $c_j$ is not always $k$; it depends on the code construction. For example, the (12, 6, 2) and (12, 2, 2) LRC codes have per-block recovery costs of 2 and 6 [6], respectively, while both having the same $k = 12$ yet different numbers of parity blocks. Furthermore, define the per-group storage overhead of each code $c_j$ as $s[j] = r_j$. Define the vectors $L := (l[1], l[2], \ldots, l[n])$ and $S := (s[1], s[2], \ldots, s[n])$.
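To make these quantities concrete, the following is a minimal Python sketch of the code catalogue $\mathcal{C}$, instantiated with the two LRC constructions used later in Section IV; the number of coded groups is an assumption (252 blocks with k = 12), and the variable names are ours.

```python
import numpy as np

# Illustrative catalogue C = {c_1, c_2}: the two LRCs used in Section IV.
# l[j]: per-block recovery cost; s[j] = r_j: per-group storage overhead.
L = np.array([2.0, 6.0])   # (12, 6, 2) LRC recovers with 2 reads; (12, 2, 2) LRC with 6
S = np.array([8.0, 4.0])   # 6 + 2 = 8 parity blocks vs. 2 + 2 = 4 parity blocks
n = len(L)                 # number of available codes
m = 21                     # number of coded groups (assumed: 252 blocks / k = 12)
```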
Suppose there are $m$ coded groups. In each time slot $t = 1, 2, \ldots$, we observe a list $D_t = (d_t[1], \ldots, d_t[m])$ of block demands, where $d_t[i]$ denotes the total number of requests for blocks in coded group $i$. In each time slot $t$, we need to determine a code selection policy $\pi_t = (\pi_t[1], \ldots, \pi_t[m])$ for all coded groups, where $\pi_t[i] = j$ if and only if coded group $i$ encodes data blocks according to the $j$-th code $c_j \in \mathcal{C}$. Thus, assuming a degraded read rate of $\epsilon_t[i]$ in each coded group $i$ (which can be measured), the total traffic for degraded reads in the system in time slot $t$ is given by
$$\sum_{i=1}^{m} \epsilon_t[i]\, d_t[i] \sum_{j=1}^{n} l[j]\, \mathbb{I}(\pi_t[i] = j), \qquad (1)$$
where $\mathbb{I}(\cdot)$ is the indicator function that outputs one or zero. In the meantime, we require that the total storage overhead at time $t$ cannot exceed $M$, i.e.,
$$g_t = \sum_{i=1}^{m} \sum_{j=1}^{n} s[j]\, \mathbb{I}(\pi_t[i] = j) \le M. \qquad (2)$$
Since $\epsilon_t[i] d_t[i]$ denotes the number of degraded reads in coded group $i$ at time $t$ and can always be monitored as a single quantity, in the following, to simplify notation, we will simply use $d_t[i]$ to represent the degraded reads $\epsilon_t[i] d_t[i]$. If $D_t$ were known a priori, code selection in each time slot $t$ would amount to determining $\pi_t$ given the demands $D_t$ and the properties $L$ and $S$ of the available codes, by minimizing (1) subject to (2). However, since we do not yet know the demands $D_t$ when making decisions about $\pi_t$, in this paper we use online learning to select codes for each coded group in adaptation to demand changes.
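As an illustrative sketch (reusing numpy and the arrays L and S from the earlier sketch), the traffic in (1) and the constraint in (2) for a deterministic policy can be evaluated as follows; the function names are ours and not part of any system API.

```python
import numpy as np

def degraded_read_traffic(pi, d_t, L):
    """Total degraded-read traffic, eq. (1): sum_i d_t[i] * l[pi[i]].
    pi[i] is the index of the code used by coded group i, and d_t[i]
    already absorbs the degraded-read rate eps_t[i]."""
    return float(np.sum(d_t * L[pi]))

def storage_feasible(pi, S, M):
    """Storage constraint, eq. (2): the total overhead sum_i s[pi[i]] must not exceed M."""
    return float(np.sum(S[pi])) <= M
```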
B. Online Learning Formulation

We formulate our online learning problem as follows: (1) in time slot $t$, the system decides the code selection policy $\pi_t$ based on data up to $t - 1$; (2) the system receives and serves the current user demands $D_t$ (the amounts of degraded reads in all coded groups); (3) the system incurs a cost $f_t(\pi_t)$ for the given demands $D_t$ in time slot $t$; (4) $t := t + 1$, and the process repeats from step (1). In the terminology of the online learning literature, the regret is defined as the gap between the cumulative cost $\sum_t f_t(\pi_t)$ of online learning and the cumulative cost of the best fixed policy over all time periods, i.e.,
$$\mathrm{Regret}_T(\mathcal{A}) := \sum_{t=1}^{T} f_t(\pi_t) - \min_{\pi \in \mathcal{A}} \sum_{t=1}^{T} f_t(\pi),$$
where $\mathcal{A}$ is the set of all possible code selection policies. The objective is to select the sequence $\pi_1, \ldots, \pi_T$ so that the regret defined above grows sublinearly as $T$ goes to infinity. Since the code selection $\pi_t[i]$ for group $i$ is a discrete decision, minimizing the regret usually involves integer programming and is a hard combinatorial optimization problem. To tackle the integer programming issue, we relax the decisions to be stochastic; that is, we let each coded group $i$ choose each code $c_j$ with probability $\mathbb{P}(\pi[i] = j) := \pi_t[i, j]$, where $\sum_j \pi_t[i, j] = 1$. For a stochastic policy $\pi_t$, represented as a matrix with entries $\pi_t[i, j]$, we define the total system cost $f_t(\pi_t)$ at time $t$ as the expected total degraded read traffic penalized by the expected storage cost, i.e.,
$$f_t(\pi_t) = \sum_{i=1}^{m} d_t[i] \sum_{j=1}^{n} \pi_t[i, j]\, l[j] + \frac{\rho}{2}\big(g_t(\pi_t) - M\big)^2, \qquad (3)$$
where
$$g_t(\pi_t) = \sum_{i=1}^{m} \sum_{j=1}^{n} \pi_t[i, j]\, s[j] \qquad (4)$$
is the total expected storage overhead at time $t$. The regret for stochastic decisions is then given by
$$\mathrm{Regret}_T(\mathcal{A}) := \sum_{t=1}^{T} f_t(\pi_t) - \min_{\pi \in \mathcal{A}} \sum_{t=1}^{T} f_t(\pi), \qquad (5)$$
where $\mathcal{A}$ is now the set of all possible stochastic policies. Unlike deterministic policies, which require integer decisions, a stochastic policy only involves fractional values, which are easier to handle. In the next section, we propose to use the exponentiated gradient descent algorithm to drive each decision $\pi_t$ toward a sparse matrix, as an approximate solution to the integer code selection problem.

III. AN EXPONENTIATED GRADIENT DESCENT ALGORITHM

We present an online algorithm based on exponentiated gradient descent that yields sparse stochastic code selection policies $\pi_t$ with a regret bound of $O(\sqrt{T})$. The basic idea is to maintain a preference matrix $H_t$ such that a higher value of $H_t[i, j]$ indicates a stronger preference of coded group $i$ for code $c_j$. The probability that coded group $i$ selects code $c_j$ is then determined by a soft-max distribution, i.e.,
$$\pi_t[i, j] = \frac{\exp(H_t[i, j])}{\sum_{j'=1}^{n} \exp(H_t[i, j'])}. \qquad (6)$$
Initially, all coded groups have the same preference toward all codes, i.e., $H_1[i, j] = 0$ for all $i$ and $j$. We then update the preference matrix $H_t$ in each time slot $t$. According to (3), if coded group $i$ chooses the codebook $c_j$, it incurs a loss of
$$L_{ij}(D_t) := d_t[i]\, l[j] + \rho\, s[j]\big(g_t(\pi_t) - M\big). \qquad (7)$$
Since this entry-wise loss provides feedback on how good the action was, we can set $H_t[i, j]$ as follows:
$$H_t[i, j] \leftarrow -\eta \sum_{\tau=1}^{t-1} L_{ij}(D_\tau), \qquad (8)$$
or, in a recursive way,
$$H_{t+1}[i, j] \leftarrow H_t[i, j] - \eta L_{ij}(D_t), \quad t = 1, 2, \ldots. \qquad (9)$$
The procedure is summarized in Algorithm 1. Note that in each time slot $t$, the total system cost $f_t$ is calculated directly from the stochastic policy $\pi_t$ and the block demands $D_t$, not from the actual code selection.

Algorithm 1 The Exponentiated Gradient Descent Algorithm for Code Selection.
1: Input: parameters $\eta > 0$, $\rho > 0$.
2: Initialize $H_1[i, j] = 0$ for all $i$ and $j$.
3: for $t = 1$ to $T$ do
4:   Determine $\pi_t$ by (6).
5:   Receive demands $D_t$ and observe the cost $f_t(\pi_t)$.
6:   Update the preference matrix $H_{t+1}$ according to (9).
7: end for
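The following is a minimal Python sketch of Algorithm 1 (the demand trace and the values of rho, M, and eta are assumptions; it reuses numpy and the arrays L and S from the earlier sketches). Each slot computes the soft-max policy (6), evaluates the cost (3)-(4), and applies the multiplicative update (9).

```python
import numpy as np

def softmax_rows(H):
    """Per-group soft-max distribution over codes, eq. (6), stabilized per row."""
    Z = np.exp(H - H.max(axis=1, keepdims=True))
    return Z / Z.sum(axis=1, keepdims=True)

def run_algorithm_1(demands, L, S, rho, M, eta):
    """Run Algorithm 1; demands is a T x m array with demands[t, i] = d_t[i]."""
    T, m = demands.shape
    n = len(L)
    H = np.zeros((m, n))                      # line 2: uniform initial preference
    costs, policies = [], []
    for t in range(T):
        pi = softmax_rows(H)                  # line 4: eq. (6)
        d_t = demands[t]
        g = float((pi @ S).sum())             # expected storage overhead, eq. (4)
        f_t = float(d_t @ (pi @ L)) + 0.5 * rho * (g - M) ** 2   # cost, eq. (3)
        costs.append(f_t)
        policies.append(pi)
        # Entry-wise loss (7), which equals the gradient of (3):
        # L_ij = d_t[i] * l[j] + rho * (g_t - M) * s[j]
        grad = np.outer(d_t, L) + rho * (g - M) * S[None, :]
        H = H - eta * grad                    # line 6: eq. (9)
    return np.array(costs), policies
```

A synthetic trace such as `demands = np.random.default_rng(0).poisson(5.0, size=(500, 21))` can be used to exercise the loop; the concrete choices of rho, M, and eta are tuning knobs and any values shown here would be arbitrary.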
A. Regret Analysis

We now analyze the regret bound of Algorithm 1. We first note that the loss $L_{ij}(D_t)$ in (7) is exactly the gradient $\nabla_t := \nabla f_t(\pi_t)$, since
$$\frac{\partial f_t}{\partial \pi_t[i, j]} = d_t[i]\, l[j] + \rho\big(g_t(\pi_t) - M\big) s[j] = L_{ij}(D_t).$$
Thus, we can rewrite the update of the preference matrix as follows:
$$H_t[i, j] \leftarrow H_{t-1}[i, j] - \eta \nabla_{t-1}[i, j]. \qquad (10)$$
We introduce an auxiliary variable $X_t$ with $X_t[i, j] := \exp(H_t[i, j])$, and the policy update rule becomes
$$\pi_t[i, \cdot] = \frac{X_t[i, \cdot]}{\|X_t[i, \cdot]\|_1}, \qquad (11)$$
where $\pi_t[i, \cdot]$ is the policy distribution vector for coded group $i$, $X_t[i, \cdot]$ is the corresponding row of the matrix $X_t$, and $\|X_t[i, \cdot]\|_1$ denotes the $L_1$ norm of this row. We now show that our algorithm is a special case of the Follow-The-Regularized-Leader (FTRL) algorithm. In FTRL, the policy $\pi_t$ is updated as follows:
$$\pi_{t+1} = \arg\min_{\pi \in \mathcal{A}} \left\{ \eta \sum_{\tau=1}^{t} \nabla_\tau^\top \pi + R(\pi) \right\}. \qquad (12)$$
Let $R(\pi)$ be the negative entropy function, i.e.,
$$R(\pi) = \sum_{i=1}^{m} \sum_{j=1}^{n} \pi[i, j] \log \pi[i, j],$$
whose derivative is $\nabla R(\pi)[i, j] = 1 + \log \pi[i, j]$ for all $i$ and $j$. Using a Lagrange multiplier $\lambda$ for the constraint that
$\sum_{j=1}^{n} \pi[i, j] = 1$ for each coded group $i$, we have
$$F_{t+1}(\pi, \lambda) := \eta \sum_{\tau=1}^{t} \nabla_\tau^\top \pi + R(\pi) + \sum_{i=1}^{m} \lambda[i]\Big(\sum_{j=1}^{n} \pi[i, j] - 1\Big). \qquad (13)$$
Taking the partial derivatives with respect to $\pi$ and setting them to zero, we have
$$\frac{\partial F_{t+1}}{\partial \pi[i, j]} = \eta \sum_{\tau=1}^{t} \nabla_\tau[i, j] + 1 + \log \pi[i, j] + \lambda[i] = 0. \qquad (14)$$
To simplify the analysis, we use the auxiliary variable $X_{t+1}$ such that
$$\log X_{t+1}[i, j] = -\eta \sum_{\tau=1}^{t} \nabla_\tau[i, j] - 1. \qquad (15)$$
Then, (14) becomes $\log \pi[i, j] = \log X_{t+1}[i, j] - \lambda[i]$. We first evaluate the recursive property of $X_{t+1}$. Since (15) holds for every time slot, we can consider the previous time slot $t - 1$, for which $X_t$ satisfies:
$$\log X_t[i, j] = -\eta \sum_{\tau=1}^{t-1} \nabla_\tau[i, j] - 1. \qquad (16)$$
Now we have
$$\log X_{t+1}[i, j] = \log X_t[i, j] - \eta \nabla_t[i, j], \qquad (17)$$
and
$$X_{t+1}[i, j] = X_t[i, j] \exp(-\eta \nabla_t[i, j]). \qquad (18)$$
Now we evaluate the value of $\lambda$. For a coded group $i$, summing the relation $\pi[i, j] = X_{t+1}[i, j] \exp(-\lambda[i])$ over $j = 1, \ldots, n$ yields
$$1 = \sum_{j=1}^{n} \pi[i, j] = \sum_{j=1}^{n} X_{t+1}[i, j] \exp(-\lambda[i]). \qquad (19)$$
Therefore, the update of $\pi_t$ is equivalent to (6), showing that Algorithm 1 is a special case of FTRL. Based on the following theorem, we can find the regret bound of Algorithm 1.

Theorem 1 ([14]). For every policy $\pi \in \mathcal{A}$, the FTRL algorithm attains the following bound on the regret:
$$\mathrm{Regret}_T(\mathcal{A}) \le 2\sqrt{2 D_R G^2 T}, \qquad (20)$$
where $D_R$ is the diameter of the set $\mathcal{A}$ relative to the regularizer function $R$,
$$D_R = \max_{\pi_1, \pi_2 \in \mathcal{A}} \{R(\pi_1) - R(\pi_2)\},$$
$G$ is an upper bound on the gradient, i.e., $\|\nabla_t\| \le G$ for all $t$, and the learning rate is set to
$$\eta = \sqrt{\frac{D_R}{2 G^2 T}}.$$

Fig. 1. Two LRC codes used in the simulation, with the Upcode and Downcode operations [7]. It shows how to change the code rate with a few simple and fast operations. The gray blocks are local parity blocks, and the black blocks are global parity blocks.

Now let us derive the specific values of the constants $D_R$ and $G$ for the proposed Algorithm 1. We can bound the diameter $D_R$ for Algorithm 1 as follows:
$$R(\pi_1) - R(\pi_2) \le -R(\pi_2) \qquad (21)$$
$$= -\sum_{i=1}^{m} \sum_{j=1}^{n} \pi_2[i, j] \log \pi_2[i, j] \qquad (22)$$
$$\le m \log n, \qquad (23)$$
where (21) follows from the fact that $\pi_1[i, j] \log \pi_1[i, j]$ is nonpositive for all $i$ and $j$, and (23) follows from the entropy inequality, which states that the uniform distribution has the maximum entropy. The next step is to determine the gradient upper bound $G$. We assume that the demand in any coded group is upper bounded by $D$. For any codebook $c_j$, its recovery cost and storage overhead are known immediately. Denote $L_{\max} := \max_j l[j]$ and $S_{\max} := \max_j s[j]$. Therefore, we have
$$G := D L_{\max} + \rho S_{\max} (m n S_{\max} + M) \ge \|\nabla_t\| \qquad (24)$$
for Algorithm 1. Both (23) and (24) are constant in $T$, so a regret bound of $O(\sqrt{T})$ for Algorithm 1 follows from Theorem 1.
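As a sketch, the constants above can be turned into a concrete learning rate; the expression for G below mirrors the reconstructed bound (24), and the inputs D, L_max, S_max, rho, and M are deployment-specific assumptions rather than prescribed values.

```python
import numpy as np

def ftrl_learning_rate(m, n, D, L_max, S_max, rho, M, T):
    """eta = sqrt(D_R / (2 G^2 T)), with D_R = m log n (eq. (23)) and G as in eq. (24)."""
    D_R = m * np.log(n)
    G = D * L_max + rho * S_max * (m * n * S_max + M)
    return np.sqrt(D_R / (2.0 * G ** 2 * T))
```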
IV. PERFORMANCE EVALUATION

We evaluate our online learning algorithm based on real workload traces collected from a production cluster in the Windows Azure Storage (WAS) system. Our trace dataset contains per-second request counts for 252 equal-sized original data blocks over a 22-hour period. We use two erasure codes, both from the family of Local Reconstruction Codes [6], with different coding parameters. One of them is the (12, 2, 2) LRC with k = 12 data blocks, 2 local parity blocks and 2 global parity blocks, achieving a per-block recovery cost of 6 [6] and a storage overhead of 4. The other is the (12, 6, 2) LRC with k = 12 data blocks, 6 local parity blocks and 2 global parity blocks, achieving a per-block recovery cost of 2 [6] and a larger storage overhead of 8. Fig. 1 shows the code structures of these two codes and an efficient scheme to change code rates between them.

Fig. 2. Statistics of the trace data collected from Windows Azure Storage (WAS): (a) distribution of the total number of requests per block (sorted block index); (b) average requests per block over time (hours).

Fig. 2(a) shows the distribution of the total number of requests for each block in the trace data. We can see that the distribution of data access is highly skewed: the majority of data blocks are cold, with only a few read accesses, while a small fraction of data blocks have high request demands. Fig. 2(b) shows the average number of requests per block at different times; the demand trend varies as time evolves.

A. Performance of the Exponentiated Gradient Algorithm

We conduct our performance evaluation with each time slot representing one minute, and set the entire evaluation period to T = 500 minutes. In each minute, we run Algorithm 1, determine the policy matrix $\pi_t$, and, for each coded group $G_i$, pick a code $c_j$ according to the probability distribution $\pi_t[i, \cdot]$. We compare the performance of Algorithm 1 with the following approaches:
- Best fixed policy (offline): it learns the best fixed policy $\pi^*$ by minimizing $\sum_{t=1}^{T} f_t(\pi)$ assuming all demands $D_1, \ldots, D_T$ are known. Note that this policy is also stochastic, and in each minute, each coded group $G_i$ chooses one code according to the probability $\pi^*[i, \cdot]$.
- A single fast code: all coded groups choose the fast code LRC(12, 6, 2).
- A single compact code: all coded groups choose the compact code LRC(12, 2, 2).

Fig. 3. Performance of the Exponentiated Gradient Algorithm compared with other approaches: (a) total degraded read traffic in each minute; (b) total storage overhead in each minute.

Fig. 3(a) and Fig. 3(b) show the performance comparison from two perspectives, namely total degraded read traffic and total storage overhead. We assume that 5% of all requests encounter block unavailability at random and thus trigger degraded reads. During each degraded read, the request is randomly directed to one of the combinations of blocks that can reconstruct the original block, contributing to the total degraded read traffic. Fig. 3(a) illustrates that after about 100 minutes, the online algorithm performs close to the best fixed code selection policy, which shows the ability of our algorithm to closely track the demand. It is perhaps surprising that Algorithm 1 can even save more storage than the best fixed policy. We also show the performance of the two single-code schemes in Fig. 3 for comparison. The single compact code has the highest recovery cost, while the single fast code has the lowest recovery cost. However, this does not demonstrate a net benefit of the single fast code, since its performance comes at the expense of storage overhead. In other words, neither of the two single codes achieves a proper tradeoff between recovery cost during degraded reads and storage overhead.
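For completeness, a minimal sketch of the per-minute evaluation loop described above follows; it paraphrases this section (the 5% unavailability model and the single-code baselines), assumes the trace is already loaded as per-group request counts, and relies on `run_algorithm_1` from the earlier sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

def degraded_demand(requests_per_group, unavailable_rate=0.05):
    """Assumed model: 5% of requests hit an unavailable block and become degraded reads."""
    return rng.binomial(requests_per_group.astype(int), unavailable_rate)

def sample_codes(pi):
    """Draw one concrete code index per coded group from its row of the policy pi_t."""
    return np.array([rng.choice(len(row), p=row) for row in pi])

def single_code_policy(m, n, j):
    """Baseline policy: every coded group deterministically uses code j (fast or compact)."""
    pi = np.zeros((m, n))
    pi[:, j] = 1.0
    return pi
```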
V. RELATIONSHIP TO PRIOR WORK

A number of recent studies [6]-[8] show that, aside from node failures, the majority of degraded reads are due to the temporary unavailability of single original blocks. For example, over 98% of all failure modes in Facebook's data warehouse and other production HDFS clusters require recovery of a single temporarily unavailable block [8], [15] rather than
node failures, and less than 0.05% of all failures involve three or more blocks simultaneously. It is thus important to optimize the latency and cost associated with the reconstruction of individual data blocks, especially hot ones. Extensive recent studies have sought to reduce the repair cost (i.e., the number of reads required) for degraded reads when the original data is unavailable, through both coding theory and system optimization. Local Reconstruction Codes (LRC) [6] were proposed to reduce the I/Os required for reconstruction reads compared to Reed-Solomon (RS) codes, while still allowing a significant reduction in storage overhead as compared to 3-way replication. Similar locally recoverable codes have been presented in [11], [12]. HitchHiker [8] is another erasure-coded storage system that reduces both network traffic and disk I/O during reconstruction of unavailable data, riding on top of RS codes with new encoding and decoding techniques. [16] presents an algorithm that finds the optimal number of codeword symbols needed for recovery with any XOR-based erasure code and produces recovery schedules that read a minimum amount of data. [5] proposes FastDR, a system that addresses node heterogeneity and exploits I/O parallelism to enhance the performance of degraded reads. [7] presents HACFS, which uses two different erasure codes: a fast code to encode frequently accessed data for low recovery cost, and a compact code to encode the majority of data to maintain a low overall storage overhead. HACFS [7] is similar to our work in the sense that it uses two erasure codes to differentiate the coding schemes for hot and cold blocks. However, what is lacking in [7] is an intelligent scheme to determine code selection based on demand. In this paper, we propose a data-driven online learning framework that performs code selection for different coded groups given any finite number of erasure codes (possibly of different types or constructions) in a storage system. We show that our algorithm is fast and achieves a regret bound of $O(\sqrt{T})$ as $T$ grows. Optimized code selection based on the demand of each coded group can significantly reduce recovery traffic cost while maintaining a similar storage overhead. Such a benefit is especially salient for skewed demands, which have also been observed in other studies [7], [9], [10], [13].

VI. CONCLUDING REMARKS

Erasure codes have been widely deployed in cloud storage systems. Although erasure codes can provide higher reliability with less storage overhead, they incur higher read traffic when reconstructing an unavailable block. Since degraded reads may happen frequently in a coded storage system due to software glitches and server congestion, the higher reconstruction cost is a major reason for long latency tails in erasure-coded storage systems. In this paper, we study the problem of code rate adaptation and codebook selection in storage systems that adopt multiple erasure codes for service differentiation between hot and cold content. Since it is impossible to know the block demands before the code selection decisions are made, we propose an online learning algorithm that makes dynamic decisions based only on past observations. To avoid solving a hard integer program, we perform a stochastic relaxation of code selection and solve the online learning problem using an exponentiated gradient descent algorithm, leading to sparse solutions.
We prove a regret bound of $O(\sqrt{T})$ for our algorithm by showing that it is a special case of the FTRL online learning framework. Through trace-driven simulations based on real request traces from Windows Azure Storage, we show that our online learning algorithm performs close to the best fixed offline code selection policy and can achieve a fine-tuned tradeoff between degraded read cost and storage overhead.

REFERENCES

[1] D. Borthakur, "HDFS architecture guide," Hadoop Apache Project, apache.org/common/docs/current/hdfs design.pdf.
[2] S. Ghemawat, H. Gobioff, and S.-T. Leung, "The Google file system," in ACM SIGOPS Operating Systems Review, vol. 37, no. 5, ACM, 2003.
[3] B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold, S. McKelvie, Y. Xu, S. Srivastav, J. Wu, H. Simitci et al., "Windows Azure Storage: a highly available cloud storage service with strong consistency," in Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, ACM, 2011.
[4] H. Weatherspoon and J. D. Kubiatowicz, "Erasure coding vs. replication: A quantitative comparison," in Peer-to-Peer Systems, Springer, 2002.
[5] O. Khan, R. C. Burns, J. S. Plank, W. Pierce, and C. Huang, "Rethinking erasure codes for cloud file systems: minimizing I/O for recovery and degraded reads," in FAST, 2012, p. 20.
[6] C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li, S. Yekhanin et al., "Erasure coding in Windows Azure Storage," in USENIX Annual Technical Conference, Boston, MA, 2012.
[7] M. Xia, M. Saxena, M. Blaum, and D. A. Pease, "A tale of two erasure codes in HDFS," to appear in Proceedings of the 13th USENIX Conference on File and Storage Technologies.
[8] K. Rashmi, N. B. Shah, D. Gu, H. Kuang, D. Borthakur, and K. Ramchandran, "A hitchhiker's guide to fast and efficient data reconstruction in erasure-coded data centers," in Proceedings of the 2014 ACM Conference on SIGCOMM, ACM, 2014.
[9] Y. Chen, S. Alspaugh, and R. Katz, "Interactive analytical processing in big data systems: A cross-industry study of MapReduce workloads," Proceedings of the VLDB Endowment, vol. 5, no. 12.
[10] K. Ren, Y. Kwon, M. Balazinska, and B. Howe, "Hadoop's adolescence: an analysis of Hadoop usage in scientific workloads," Proceedings of the VLDB Endowment, vol. 6, no. 10.
[11] M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, "XORing elephants: Novel erasure codes for big data," in Proceedings of the VLDB Endowment, vol. 6, no. 5, VLDB Endowment, 2013.
[12] I. Tamo and A. Barg, "A family of optimal locally recoverable codes," IEEE Transactions on Information Theory, vol. 60, no. 8.
[13] C. L. Abad, N. Roberts, Y. Lu, and R. H. Campbell, "A storage-centric analysis of MapReduce workloads: File popularity, temporal locality and arrival patterns," in IEEE International Symposium on Workload Characterization (IISWC), IEEE, 2012.
[14] S. Shalev-Shwartz, "Online learning and online convex optimization," Foundations and Trends in Machine Learning, vol. 4, no. 2.
[15] K. Rashmi, N. B. Shah, D. Gu, H. Kuang, D. Borthakur, and K. Ramchandran, "A solution to the network challenges of data recovery in erasure-coded distributed storage systems: A study on the Facebook warehouse cluster," Proc. USENIX HotStorage.
[16] Y. Zhu, J. Lin, P. P. Lee, and Y. Xu, "Boosting degraded reads in heterogeneous erasure-coded storage systems."
