Enhancing MapReduce via Asynchronous Data Processing


Marwa Elteir, Heshan Lin and Wu-chun Feng
Department of Computer Science, Virginia Tech
{aelteir, hlin2,

Abstract: The MapReduce programming model simplifies large-scale data processing on commodity clusters by having users specify a map function that processes input key/value pairs to generate intermediate key/value pairs, and a reduce function that merges and converts intermediate key/value pairs into final results. Typical MapReduce implementations such as Hadoop enforce barrier synchronization between the map and reduce phases, i.e., the reduce phase does not start until all map tasks are finished. This synchronization requirement can cause inefficient utilization of computing resources and can adversely impact performance. We therefore present and evaluate two approaches to cope with the synchronization drawback of existing MapReduce implementations. The first approach, hierarchical reduction, starts a reduce task as soon as a predefined number of map tasks completes; it then aggregates the results of different reduce tasks following a tree structure. The second approach, incremental reduction, starts a predefined number of reduce tasks from the beginning and has each reduce task incrementally reduce records collected from map tasks. Together with our performance modeling, we evaluate the different reducing approaches with two real applications on a 32-node cluster. The experimental results show that incremental reduction generally outperforms hierarchical reduction. Incremental reduction can also speed up the original Hadoop implementation by up to 35.33% for the wordcount application and 57.98% for the grep application. In addition, incremental reduction outperforms the original Hadoop in an emulated cloud environment with heterogeneous compute nodes.

Index Terms: Distributed Computing, Cloud Computing, MapReduce, Hadoop, Asynchronous processing.

I. INTRODUCTION

Today, many commercial and scientific applications require the processing of large amounts of data, and thus demand compute resources far beyond what can be provided by a single commodity processor. To address this, various high-end computing platforms, including supercomputers, graphics processors, clusters of workstations, and cloud computing environments, have been architected to facilitate parallel processing at different scales. However, efficiently exploiting the hardware parallelism provided by these platforms requires parallel programming skills that are difficult for common application developers to master. Even for well-trained experts, the engineering and debugging of parallel applications can be time consuming.

MapReduce [4] is an effort that seeks to democratize parallel programming. Originally proposed by Google, MapReduce has been used to process massive amounts of data in web search engines on a daily basis. With the maturity of Hadoop, an open-source MapReduce implementation, the MapReduce programming model has been widely adopted and found effective in many scientific and commercial application areas such as machine learning [2], bioinformatics [8], astrophysics [7] and cyber-security [5]. Because of its ease of use and built-in fault tolerance, MapReduce is also becoming one of the most effective programming models in cloud computing environments, e.g., Amazon EC2 [1].

A MapReduce job consists of two user-provided functions: map and reduce. As shown in Figure 1, the run-time system creates concurrent instances of map tasks that split and process the input data. The intermediate results generated by map tasks are then copied to the reduce tasks, which in turn reduce the intermediate results and produce the final output. In a MapReduce system, all of the above data is stored as key/value pairs for efficient data indexing and partitioning.

Fig. 1. MapReduce framework.

Existing MapReduce implementations (e.g., Hadoop) enforce a barrier synchronization between the map phase and the reduce phase, i.e., the reduce phase only starts when all map tasks are completed. There are two cases where this barrier synchronization can result in serious resource underutilization. First, in heterogeneous environments, the faster compute nodes will finish their assigned map tasks earlier, but these nodes cannot proceed to the reduce processing until all the map tasks are finished, thus wasting resources. The resource heterogeneity can originate from different hardware configurations or from resource sharing through virtualization (see footnote 1). Second, even in homogeneous environments, a compute node may not be fully utilized by the map processing because a map task alternates between computation and I/O. Also, there is additional scheduling overhead between the execution of different tasks. In this case, overlapping map and reduce processing can considerably improve job response time, especially when the number of map tasks is relatively large compared to the cluster size. For instance, the number of map tasks of a MapReduce job is typically configured to be 100 times the number of compute nodes at Google [4].

In this paper, we focus on addressing the above synchronization problem for a specific class of MapReduce jobs, called recursively reducible MapReduce jobs. For this type of job, a portion of the map results can be reduced independently, and the partially reduced results can be recursively aggregated to produce global reduce results. One example of such a job is word counting. More details about recursively reducible MapReduce jobs are discussed in Section II-B. The unique characteristic of recursively reducible MapReduce jobs is that there is no inherent synchronization requirement between the map phase and the reduce phase.

Consequently, we propose and compare two asynchronous data-processing techniques to enhance the resource utilization and performance of MapReduce for recursively reducible jobs. The first approach, hierarchical reduction (HR), overlaps map and reduce processing at the inter-task level. This approach starts a reduce task as soon as a certain number of map tasks complete and aggregates partially reduced results following a tree hierarchy. The second approach, incremental reduction (IR), exploits the potential of overlapping data processing and communication within each reduce task. It starts a designated number of reduce tasks from the beginning and incrementally applies the reduce function to the intermediate results accumulated from map tasks.
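The notion of a recursively reducible job can be made concrete with a small sketch. The Python below is illustrative only: the function names and the toy input are assumptions, not part of the paper or of Hadoop. It checks that word counts obtained by reducing each split separately and then aggregating the partial results equal the counts obtained by one barrier-style reduce, and it contrasts this with the square-of-the-sum counterexample discussed in Section II-B.

```python
from collections import Counter
from functools import reduce


def map_wordcount(split):
    """Map: emit one (word, 1) pair per word in an input split."""
    return [(word, 1) for word in split.split()]


def reduce_counts(pairs):
    """Reduce: sum the values that share a key."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def merge_partials(left, right):
    """Recursive step: aggregate two partially reduced results."""
    return left + right


splits = ["a b a", "b c", "a c c"]  # toy input splits (assumed data)

# Barrier style: reduce every intermediate pair in one go.
all_pairs = [pair for s in splits for pair in map_wordcount(s)]
global_result = reduce_counts(all_pairs)

# Recursive style: reduce each split alone, then aggregate the partials.
partials = [reduce_counts(map_wordcount(s)) for s in splits]
recursive_result = reduce(merge_partials, partials, Counter())

# Counterexample from Section II-B: the square of the sum.
values = [1, 2, 3, 4]
direct_square = sum(values) ** 2                            # (1+2+3+4)^2
naive_square = sum(values[:2]) ** 2 + sum(values[2:]) ** 2  # (1+2)^2 + (3+4)^2
# One possible fix (an assumption, not the paper's method): keep partial
# sums, which are recursively reducible, and square once at the very end.
transformed_square = (sum(values[:2]) + sum(values[2:])) ** 2
```

Summation is associative and commutative, so the order and grouping of partial reductions do not matter for word counting; squaring does not distribute over the grouping, which is why the naive split gives a different value.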
We evaluate the proposed approaches with analytical models and experiments on a 32-node cluster. The experimental results demonstrate that both approaches can effectively improve the MapReduce execution time, with the incremental reduction approach consistently outperforming hierarchical reduction. In particular, incremental reduction can outperform the original Hadoop implementation by 35.33% for the wordcount application and 57.98% for the grep application.

The rest of the paper is organized as follows. Section II provides background information about Hadoop and recursively reducible MapReduce jobs. Section III presents the design of the proposed approaches, namely hierarchical reduction and incremental reduction. Section IV evaluates the performance of the proposed approaches using an analytical model. The experimental results are discussed in Section V. Section VI presents the related work. Finally, Section VII concludes the paper.

Footnote 1: Zaharia et al. reported that there can be a 2.5-fold performance difference among various virtual machine instances on Amazon EC2 [13].

II. BACKGROUND

Here we describe Hadoop, an open-source Java implementation of the MapReduce framework, as well as the notion of recursively reducible jobs.

A. Hadoop

Hadoop can be logically divided into two subsystems: a distributed file system called HDFS and a MapReduce run-time system. The MapReduce run-time system follows a master-slave design. The master node is responsible for managing submitted jobs, i.e., assigning the map and reduce tasks of every job to the available workers. By default, each worker can run two map tasks and two reduce tasks simultaneously. At the beginning of a job execution, the input data is split and assigned to individual map tasks. When a worker finishes executing a map task, it stores the map results as intermediate key/value pairs locally. The intermediate results of each map task are partitioned and assigned to the reduce tasks according to their keys.
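As an illustration of key-based partitioning, the sketch below assigns each intermediate key/value pair to one of r reduce tasks. The hash-modulo rule, the use of `zlib.crc32`, and the function names are assumptions chosen for a deterministic, self-contained example; they are not taken from the paper and do not reproduce Hadoop's partitioner code.

```python
import zlib
from collections import defaultdict


def partition(key, num_reduce_tasks):
    """Pick the reduce task for a key (hypothetical hash-modulo rule)."""
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks


def partition_map_output(pairs, num_reduce_tasks):
    """Group one map task's intermediate pairs by destination reduce task."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, num_reduce_tasks)].append((key, value))
    return dict(buckets)


# Toy intermediate output of a single map task (assumed data).
map_output = [("apple", 1), ("pear", 1), ("apple", 1), ("fig", 1)]
buckets = partition_map_output(map_output, num_reduce_tasks=4)
```

Because the destination depends only on the key, every value of a given key reaches the same reduce task no matter which map task produced it.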
A reduce task begins by retrieving its corresponding intermediate results from all map outputs (called the shuffle phase). The reduce task then sorts the collected intermediate results and applies the reduce function to the sorted results. To improve performance, Hadoop overlaps the retrieving and sorting of finished map outputs with the execution of newly scheduled map tasks.

B. Recursively Reducible Jobs

Word counting is a simple example of a recursively reducible job. The occurrences of a word can be counted first on different splits of an input file, and those partial counts can then be aggregated to produce the number of word occurrences in the entire file. Other recursively reducible MapReduce applications include association rule mining, outlier detection, commutative and associative statistical functions, and so on. In contrast, the square of the sum of values is an example of a reduce function that is not recursively reducible, because $(a+b)^2 + (c+d)^2$ does not equal $(a+b+c+d)^2$. However, there are some mathematical approaches that can transform such functions so that they benefit from our solution.

It is worth mentioning that typical MapReduce implementations, including Hadoop, provide a combiner function. The combiner function is used to reduce key/value pairs generated by a single map task. The partially reduced results, instead of the raw map output, are delivered to the reduce tasks for further reduction. Our proposed asynchronous data-processing techniques are applicable to all applications that can benefit from the combiner function. The fundamental difference between our techniques and the combiner function is that our techniques optimize the reducing of key/value pairs from multiple map tasks.

III. ASYNCHRONOUS MAPREDUCE DATA PROCESSING

In this section, we present the design details of our two proposed asynchronous data-processing techniques: hierarchical reduction and incremental reduction.
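Before the detailed designs, a rough sketch may help fix the two ideas. The Python below is a single-process illustration under stated assumptions: `sigma_h` stands for the HR aggregation level, `threshold` stands for the number of map outputs that triggers an incremental reduce, and all function names are hypothetical. It is simplified in two ways: the hierarchical version proceeds level by level, whereas the implementation described in this paper does not wait for a whole level to finish, and the incremental version counts map outputs directly rather than on-disk files.

```python
from collections import Counter


def reduce_fn(partials):
    """Reduce function: sum counts per key across partial results."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def hierarchical_reduce(map_outputs, sigma_h):
    """Tree reduction: each reduce consumes up to sigma_h earlier results."""
    if sigma_h < 2:
        raise ValueError("aggregation level must be at least 2")
    level = list(map_outputs)
    if not level:
        return Counter(), 0
    stages = 0
    while len(level) > 1:
        level = [reduce_fn(level[i:i + sigma_h])
                 for i in range(0, len(level), sigma_h)]
        stages += 1
    return level[0], stages


def incremental_reduce(map_outputs, threshold):
    """One reduce task: fold buffered outputs every `threshold` arrivals."""
    running = Counter()
    buffer = []
    incremental_steps = 0
    for output in map_outputs:  # arrival order of finished map tasks
        buffer.append(output)
        if len(buffer) == threshold:
            running = reduce_fn([running] + buffer)
            buffer = []
            incremental_steps += 1
    # Outputs still buffered when the map phase ends are reduced last.
    return reduce_fn([running] + buffer), incremental_steps


# Eight toy map outputs (assumed data).
map_outputs = [Counter({"a": 1, "b": i}) for i in range(1, 9)]
barrier_result = reduce_fn(map_outputs)
hr_result, hr_stages = hierarchical_reduce(map_outputs, sigma_h=2)
ir_result, ir_steps = incremental_reduce(map_outputs, threshold=3)
```

Both variants return the same counts as the barrier-style reduce; they differ only in when the reduce work happens, which is what allows it to overlap with the map phase.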

A. Hierarchical Reduction (HR)

1) Design and Implementation: Hierarchical reduction seeks to overlap the map and reduce processing by dynamically issuing reduce tasks to aggregate partially reduced results along a tree-like hierarchy. As shown in Figure 2, as soon as a certain number of map tasks (defined by the aggregation level $\sigma_H$) are successfully completed, a new reduce task is created and assigned to one of the available workers. This reduce task is responsible for reducing the output of the $\sigma_H$ map tasks that have just finished. When all map tasks are successfully completed and assigned to reduce tasks, another stage of the reduce phase is started. In this stage, as soon as $\sigma_H$ reduce tasks are successfully completed, a new reduce task is created to reduce the output of those $\sigma_H$ reduce tasks. This process repeats until there is only one remaining reduce task, i.e., when all intermediate results are reduced.

Fig. 2. Hierarchical reduction with an aggregation level of two.

Although conceptually the reduce tasks are organized as a balanced tree, in our implementation a reduce task at a given level does not have to wait for all of the tasks at the previous level to finish. In other words, as soon as a sufficient number of tasks (i.e., $\sigma_H$) from the previous level becomes available, a reduce task from the subsequent level can begin. Such a design can reduce the associated scheduling overhead of HR.

2) Discussion: One advantage of HR is that it can parallelize the reduction of a single reducing key across multiple workers, whereas in the original MapReduce framework, the reduction of a key is always handled by one worker. Therefore, this approach is suitable for applications with significant reduce computation per key. However, HR incurs extra communication overhead in transferring the intermediate key/value pairs to reduce tasks at different levels of the tree hierarchy, which can adversely impact performance as the depth of the tree hierarchy increases. Other overheads include the scheduling cost of reduce tasks generated on the fly.

The fault-tolerant design of the original Hadoop needs to be modified to accommodate HR. In particular, the JobTracker should keep track of all created reduce tasks, in addition to the tasks assigned to be reduced by these reduce tasks. Whenever a reduce task fails, another copy of this task should be created, and the appropriate tasks should be assigned again for reduction.

B. Incremental Reduction (IR)

1) Design and Implementation: Incremental reduction aims to start the reduce phase as early as possible within a reduce task. Specifically, the number of reduce tasks is defined at the beginning of the job, as in the original MapReduce framework. Within a reduce task, as soon as a certain amount of map output is received, the reduction of this output starts and the results are stored locally. The same process repeats until all map outputs are retrieved.

Fig. 3. Incremental reduction with a reduce granularity of two.

In Hadoop, a reduce task consists of three stages. The first stage, named shuffling, copies the task's own portion of intermediate results from the output of all map tasks. The second stage, named sorting, sorts and merges the retrieved intermediate results according to their keys. Finally, the third stage applies the reduce function to the values associated with each key. To enhance the performance of the reduce phase, the shuffling stage is overlapped with the sorting stage. More specifically, when the number of in-memory map outputs reaches a certain threshold, mapred.inmem.merge.threshold, these outputs are merged and the results are stored on disk.
When the number of on-disk files reaches another threshold, io.sort.factor, another on-disk merge is performed. After all map outputs are retrieved, all on-disk and in-memory files are merged, and then the reduction stage begins.

In our implementation, we make use of io.sort.factor and mapred.inmem.merge.threshold. When the number of in-memory outputs reaches the mapred.inmem.merge.threshold threshold, they are merged and the merging results are stored on disk. When the number of on-disk outputs reaches the io.sort.factor threshold, the incremental reduction of these outputs begins and the reducing results are stored instead of the merging results. When all map outputs are retrieved, the in-memory map outputs are reduced along with the stored reducing results. The final output data is written to the distributed file system. The entire process is depicted in Figure 3.

2) Discussion: IR incurs less overhead than HR for two reasons. First, the intermediate key/value pairs are transmitted once from the map tasks to the reduce tasks instead of several times along the hierarchy. Second, all of the reduce tasks are created at the start of the job, and hence the scheduling overhead is reduced. In addition to the communication cost, the number of writes to the local and distributed file systems is the same (assuming the same number of reduce tasks) for both the original MapReduce and IR. Therefore, IR can outperform the original MapReduce when there is sufficient overlap between the map and reduce processing.

The main challenge of IR is to select the right threshold that triggers an incremental reduce operation. Too low a threshold will result in unnecessarily frequent I/O operations, while too high a threshold will not deliver noticeable performance improvements. Interestingly, a similar decision, i.e., the merging threshold, has to be made in the original Hadoop implementation as well. Currently we provide a runtime option for users to control the incremental reduction threshold. In the future, we plan to investigate self-tuning of this threshold for long-running MapReduce jobs.

It is worth noting that since the map and reduce tasks in this approach are created in the same manner as in Hadoop, the fault-tolerance scheme of Hadoop works in IR as well.

IV. ANALYTICAL MODELS

Here we derive analytical models to compare the performance of the original MapReduce (MR) implementation of Hadoop and the augmented implementations with the hierarchical reduction (HR) and incremental reduction (IR) enhancements. Table I presents all the parameters used in the models.

Without loss of generality, our modeling assumes that the number of reduce tasks r is smaller than the number of execution slots 2n. In fact, the Hadoop documentation recommends 95% of the execution slots as a good number of reduce tasks for typical applications. However, our analysis can be easily generalized to model the cases where there are more reduce tasks than execution slots.
Furthermore, the model assumes that the number of map tasks m is greater than the number of available execution slots in the cluster, 2n (recall that there are two execution slots per node by default). When the number of map tasks is less than the number of execution slots, all map tasks are executed in parallel and complete simultaneously, so there is no way to overlap the map and reduce phases.

For simplicity, we consider two different cases. The first case assumes that there is a high degree of overlap between the map and reduce procedures, where the processing (including retrieving, merging, and sorting) of almost all intermediate results in a reduce task can be overlapped with map computations. The second case, representing a low degree of overlap between the map and reduce procedures, assumes that the processing of only a portion of the intermediate results in a reduce task can be overlapped with map computations.

TABLE I. PARAMETERS USED IN THE PERFORMANCE MODEL

Parameter    Meaning
m            Number of map tasks
n            Number of nodes
k            Total number of intermediate key/value pairs
r            Number of reduce tasks of the MR framework
t_m          Average map task execution time
t_rk         Average execution time of reducing values of a single key
σ_H          Aggregation level used in HR
C            Communication cost per key/value pair
C_MR         Communication cost from m map tasks to r reduce tasks in MR
C_HR         Communication cost from the assigned σ_H map tasks to a reduce task in HR

A. MR

The execution of the original Hadoop (MR) is illustrated by the left part of Figure 4. When the overlapping degree is high, the merging phase of MR can be eliminated. So the total time becomes Map time + Reduce time, assuming the final-stage merging is neglected.

Fig. 4. Execution of MR and IR.

For a low degree of overlap, if the reduce tasks occupy all nodes, i.e., the number of reduce tasks r is larger than or equal to n, only a portion of the merging of the intermediate results can be overlapped with the map computation, and the total time can be expressed by Equation (1), where $o$ represents the reduction in the merging time.

$$T_{MR} = \text{Map time} + (\text{Merge time} - o) + \text{Reduce time} \tag{1}$$

However, when the reduce tasks do not occupy all nodes, more merging can be overlapped with the map computations due to the load balancing effect, i.e., the nodes with running reduce tasks process a smaller number of map tasks compared to the other nodes. This is because Hadoop uses greedy task scheduling: the nodes without running reduce tasks process map tasks faster and, in turn, are assigned more map tasks. As a result, the Map time is increased and the merging time is decreased, as shown in Equation (2), where $l$ represents the load balancing effect, $o'$ represents the overlapping effect, and $o' > o$. As r increases, $l$ and $o'$ keep decreasing until they reach 0 and $o$, respectively, when r = n (Equation (1)).

$$T_{MR} = (\text{Map time} + l) + (\text{Merge time} - o') + \text{Reduce time} \tag{2}$$
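The load balancing effect can be illustrated with a small simulation of greedy scheduling. This is a sketch under stated assumptions, not Hadoop's scheduler: each node runs one map task at a time, the next map task always goes to the node that becomes free first, and a node that also hosts a reduce task is assumed to need 1.5 time units per map task instead of 1.0. The function name and all numbers are hypothetical.

```python
import heapq


def greedy_map_schedule(num_map_tasks, task_times):
    """Assign map tasks one at a time to whichever node frees up first.

    task_times[i] is the assumed time node i needs per map task.
    Returns the number of map tasks each node ends up executing.
    """
    # Heap of (time at which the node becomes free, node id).
    free_at = [(0.0, node) for node in range(len(task_times))]
    heapq.heapify(free_at)
    executed = [0] * len(task_times)
    for _ in range(num_map_tasks):
        now, node = heapq.heappop(free_at)
        executed[node] += 1
        heapq.heappush(free_at, (now + task_times[node], node))
    return executed


# Node 0 also runs a reduce task, so its map tasks are assumed to be slower.
executed = greedy_map_schedule(num_map_tasks=40,
                               task_times=[1.5, 1.0, 1.0, 1.0])
print(executed)  # [7, 11, 11, 11]
```

In this toy setting the node hosting the reduce task executes fewer map tasks than the others, which is the behavior the term $l$ in Equation (2) accounts for.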

B. HR

For hierarchical reduction, map and reduce processing at different stages can be overlapped, as shown in Figure 5. To compare the performance of HR with that of MR, we consider more detailed modeling. For HR, when all map tasks are finished, the remaining computations are to reduce the un-reduced map tasks, in addition to combining the results of this reducing stage with other partially reduced results. Specifically, the total execution time of MR and HR can be represented by the following equations, where $C_{MR}$ is the communication required to retrieve the remaining map outputs (since MR's communication is overlapped with the map computations), and $s$ is the remaining number of reduce stages in HR's hierarchy:

$$T_{MR} = \text{Map time} + C_{MR} + \frac{k}{r}\log\left(\frac{k}{r}\right) + t_{rk}\,\frac{k}{r} \tag{3}$$

$$T_{HR} = \text{Map time} + \left(C_{HR} + \frac{\sigma_H k}{m}\log\left(\frac{\sigma_H k}{m}\right) + t_{rk}\,\frac{\sigma_H k}{m}\right) \times s \tag{4}$$

Fig. 5. Execution of the HR framework when m = 8n.

Assuming every map task produces a value for each given key, $C_{MR}$ is $\frac{2nk}{mr}\,C$, $C_{HR}$ is $\frac{\sigma_H k}{m}\,C$, and therefore $C_{HR}$ equals $\frac{\sigma_H r}{2n}\,C_{MR}$, where $C$ is the communication cost per key/value pair. Substituting this value of $C_{HR}$ into Equation (4) gives:

$$T_{HR} = \text{Map time} + \left(\frac{\sigma_H r}{2n}\,C_{MR} + \frac{\sigma_H}{m}\left(k\log\left(\frac{\sigma_H k}{m}\right) + k\,t_{rk}\right)\right) \times s \tag{5}$$

When the overlapping degree is high, $s$ can be replaced by $\log_{\sigma_H}(2n) + 1$ in Equation (5), which represents the reduction of the map tasks of the final stage. Moreover, the merging time can be eliminated from Equation (3). So, for significantly large m, the reducing part of Equation (5) is smaller than that of Equation (3). If this term occupies a significant portion of the total time of MR, and the communication overhead is small, then HR will behave better than MR, as we will see in the experimental evaluation. However, when the overlapping degree is low, the remaining hierarchy can be very deep (large $s$), and the performance of HR can be worse than that of MR.

C. IR

The execution of IR relative to MR is represented by Figure 4.
In particular, when the overlapping degree is high, the final merging and reducing phases of IR can be eliminated, so the total execution time can be represented by the Map time. By comparing this to MR, we can conclude that IR behaves better than MR, especially when the reducing time of MR is significant.

For a low overlapping degree, if r is larger than n, then the merging and reducing of only a portion of the intermediate results can be overlapped with the map computations, and the total time can be expressed by Equation (6), where $o_m$ and $o_r$ represent the reduction in the merging and reducing time, respectively.

$$T_{IR} = \text{Map time} + (\text{Merge time} - o_m) + (\text{Reduce time} - o_r) \tag{6}$$

To compare this with Equation (2), we consider the details of the overlapping computations in MR and IR. The main difference is that IR performs reducing after merging and writes the results of the reduce, rather than the results of the merge, to disk. Assuming the reduce function changes the size of the input by a factor of $x$ and the reduce function is linear, the overlapped computations of MR ($O_{MR}$) and IR ($O_{IR}$) can be represented by the following equations, where $I$ is the size of the intermediate key/value pairs to be merged and reduced during the map phase, $I \log I$ is the average number of compare operations executed during the merge, $p_s$ is the processor speed, and $d_s$ is the disk write speed.

$$O_{MR} = \frac{I \log I}{p_s} + \frac{I}{d_s} \tag{7}$$

$$O_{IR} = \frac{I \log I + I}{p_s} + \frac{I \times x}{d_s} \tag{8}$$

Given the same overlapping degree, if $x < 1$, which is valid for applications like wordcount, grep, and linear regression, then IR is able to conduct more merging, in addition to reducing, overlapped with the map computations. So the merging and reducing terms in Equation (6) are less than the same terms in Equation (1), and IR can behave better than MR provided that the reduce computation is significant, as illustrated by Figure 4. On the other hand, if $x \geq 1$, as in the sort application, then the performance of IR highly depends on the complexity of the reduce function compared to the merging, in addition to the size of the intermediate key/value pairs.
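To get a feel for Equations (7) and (8), the sketch below evaluates the two overlapped-computation costs for sample values. It relies on a reading of the two equations as $O_{MR} = I \log I / p_s + I / d_s$ and $O_{IR} = (I \log I + I)/p_s + I x / d_s$; the base-2 logarithm, the units (operations per second for $p_s$, records per second for $d_s$), the function names, and every number are assumptions for illustration, not measurements from the paper.

```python
import math


def overlapped_cost_mr(I, p_s, d_s):
    """Eq. (7) as read here: merge I records, then write I records to disk."""
    return I * math.log2(I) / p_s + I / d_s


def overlapped_cost_ir(I, x, p_s, d_s):
    """Eq. (8) as read here: merge, linear reduce, write x * I records."""
    return (I * math.log2(I) + I) / p_s + I * x / d_s


I = 2 ** 20   # intermediate records handled during the map phase (assumed)
p_s = 1e8     # compare/reduce operations per second (assumed)
d_s = 1e6     # records written to disk per second (assumed)

cost_mr = overlapped_cost_mr(I, p_s, d_s)
cost_ir_shrinking = overlapped_cost_ir(I, 0.1, p_s, d_s)  # x < 1, e.g. wordcount
cost_ir_same_size = overlapped_cost_ir(I, 1.0, p_s, d_s)  # x = 1, e.g. sort
```

Under this reading, the difference is $O_{IR} - O_{MR} = I/p_s - I(1 - x)/d_s$: the extra reduce computation is traded against the disk writes saved. With a shrinking reduce function (x = 0.1) and a disk much slower than the processor, IR spends less time per overlapped batch and can therefore fit more merging and reducing into the map phase; with x = 1 there is no disk saving and the extra reduce work is pure overhead.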
By applying the previous analysis to the case where r < n, we can conclude that IR can behave better than MR in this case as well, provided that the reducing time occupies a significant portion of the total execution time.

V. EXPERIMENTAL ANALYSIS

In this section, we present a rigorous performance evaluation of our proposed techniques. The details of the system configurations are given along with the conducted experiments.

A. Experimental Platform

We ran our experiments on System X at Virginia Tech, comprising Apple Xserve G5 compute nodes with dual 2.3GHz PowerPC 970FX processors, 4GB RAM, and 80GB hard drives. The compute nodes are connected with a Gigabit Ethernet interconnect. Each node runs the GNU/Linux operating system with kernel version [...]. The proposed approaches are developed based on Hadoop 0.17.0.

B. Overview

We use two applications in the experiments: wordcount and grep. Wordcount parses a document or a number of documents and produces, for every word, the number of its occurrences. Grep accepts a document or a number of documents and an expression as input; it matches this expression across the documents and produces the number of occurrences for every match.

We first aim at studying the scalability of the different reducing approaches in terms of the dataset size. Next, we seek to understand the reasons for the performance differences between the approaches by profiling the behavior of wordcount and grep under various configurations. Finally, we evaluate the reducing approaches in emulated cloud environments.

The major performance metric in all experiments is the total execution time in seconds. Moreover, for a fair comparison, we follow the guidance given in the Hadoop documentation regarding the number of map and reduce slots per node as well as the thresholds that control the frequency of merging and sorting intermediate results (i.e., mapred.inmem.merge.threshold and io.sort.factor, as discussed in Section III-B). Furthermore, we enable the combiner function in all experiments. In addition, the aggregation level of the hierarchical reduction approach is set to 4, which produces the best results on the 32-node cluster.

Before executing an experiment, we flush the Linux file system cache by having each node read a dummy file that is larger than the size of the memory. This avoids the I/O performance inconsistency caused by Linux file caching.

C. Scalability with the Dataset Size

In this experiment, we aim at studying the scalability of the three reducing approaches with regard to the size of the input dataset. We run wordcount and grep using datasets of two different sizes, i.e., 16GB and 64GB. For wordcount, the number of reduce tasks was set to 4 and 8 (a broader range is used in Section V-D). For grep, we use a query that produces results of moderate size. The performance of queries generating various sizes of output is investigated in Section V-E.

As shown in Figure 6, the performance improvement of IR over MR generally increases as the size of the input dataset increases. Specifically, for wordcount, as the size of the input increases to 64GB, IR outperforms MR by 34.5% and 9.48%, instead of 21% and 5.97% for the 16GB input, using 4 and 8 reduce tasks, respectively. In addition, for grep, increasing the input size to 64GB improves IR's performance gain over MR: IR is better than MR by 16.2% (for the 64GB input) instead of 7.1% (for the 16GB input).

Fig. 6. Scalability with dataset size using wordcount and grep (y-axis: normalized execution time; grep query: [a-c]+[a-z.]+).

The scalability of IR is attributed to two factors that provide more room for overlapping map processing and reduce computations. First, as the dataset size increases, the map phase has to perform more I/O. Second, for 64GB, the number of map tasks increases from 256 to 1024, thus increasing the scheduling overhead.

Similarly, the performance improvement of HR over MR increases by 9.96% for wordcount as the size of the input dataset increases. However, the speedup of HR over MR decreases (by 4.13%) for grep when the larger input is used. This is because the extra communication cost caused by the increase in input data can be compensated by the corresponding increase in the map processing time in wordcount but not in grep.

Since typical MapReduce jobs process large amounts of input data, all of the subsequent experiments use input datasets of 64GB.

D. Wordcount Performance

In a cluster of 32 nodes, we run wordcount on a dataset of size 64GB. The number of map tasks is set to 1024, and the number of reduce tasks is varied from 1 to 64. As shown in Figure 7, as the number of reduce tasks increases, IR's improvement over MR decreases. Specifically, for one reduce task, IR behaves better than MR by 35.33%. When the number of reduce tasks is increased to 4, IR behaves better by 34.49%. As the number of reduce tasks increases, the processing time of a reduce task decreases, thus providing little room for overlapping the map and reduce processing. Specifically, when the number of reduce tasks is 32, a reduce task consumes a mere 6.83% of the total execution time, as shown in Table II.

Fig. 7. Performance of MR vs. IR using wordcount (x-axis: number of reduce tasks and HR aggregation level; y-axis: total execution time in seconds).

TABLE II. MR AND IR EXECUTION PROFILE. Columns: Reduce Tasks, Incremental merges, Reduce Time, Map Time.

TABLE III. NUMBER OF MAP TASKS EXECUTED ON A NODE WITH REDUCE TASK(S) RUNNING. S MEANS A SPECULATIVE TASK.

Furthermore, IR achieves its best performance at 4 reduce tasks because this provides the best compromise between the level of parallelism, controlled by the number of reduce tasks, and the overlapping of map with reduce. Specifically, in this case, IR conducts 55 incremental merges overlapped with the map computations, compared to [...] merges with 32 reduce tasks, as shown in Table II. As a result, the nodes executing a reduce task execute 26 map tasks instead of the 32 map tasks in the case of 32 reduce tasks; recall the load balancing effect discussed in Section IV. Specifically, when only a small portion of the nodes are executing reduce tasks, these nodes have fewer resources to spend on the map processing, and thus more map tasks are pushed off to the other nodes without reduce tasks.

The best performance is achieved at 32 and 4 reduce tasks for MR and IR, respectively. With the best performance for both MR and IR, IR is better by 5.86%. On the other hand, using an aggregation level of 4, HR behaves better than MR with 8 reduce tasks by 5.13%. We changed the aggregation level from 2 to 8, as shown in Figure 7. The best performance is achieved at aggregation level 4, where the best balance is reached between the overlap of map and reduce processing and the reduce overhead. For example, with an aggregation level of 2, a reduce operation can be triggered sooner, but the reducing hierarchy is also deeper.

Fig. 8. CPU utilization throughout the whole job using wordcount (x-axis: number of reduce tasks; y-axis: overall CPU utilization).

To better understand the benefits of the incremental reduction approach, we measured the CPU utilization throughout the job execution and the number of disk transfers per second during the map phase for both MR and IR. As shown in Figures 8 and 9, the CPU utilization of IR is greater than that of MR by 5% on average. In addition, the average number of disk transfers of IR is less than that of MR by 2.95 transfers per second. This is attributed to the smaller amount of data written to disk by IR, because it reduces the intermediate data before writing it back to disk. In doing so, IR also reduces the size of the data read from disk at the final merging and reducing stage.

Fig. 9. Number of disk transfers per second through the map phase using wordcount (x-axis: number of reduce tasks; y-axis: transfers per second).

To conclude, for any number of reduce tasks, IR achieves either better or the same performance as MR, and the best performance for IR is achieved using only 4 reduce tasks. This means that IR is more efficient in utilizing the available resources. So we expect IR to achieve better performance when several jobs are running at the same time, or with larger amounts of reduce processing. In particular, when running three concurrent wordcount jobs, the best configuration of IR behaves better than the best configuration of MR by 8.1% instead of 5.86%, as shown in Table IV.

TABLE IV. MR AND IR PERFORMANCE WITH CONCURRENT JOBS. Columns: Concurrent jobs, MR Execution Time (seconds), IR Execution Time (seconds).

E. Grep Performance

In a cluster of 32 nodes, we run grep on a dataset of 64GB. The number of map tasks is set to 1024, and the number of reduce tasks is set to the default value, i.e., one. Grep runs two consecutive jobs: one returns, for each match, the number of its occurrences, and the other is a short job that inverts the output of the previous job so that the final output is sorted by the number of occurrences of the matches instead of alphabetically. In this experiment, we focus on the first, longer job. We used five different queries, each of which produces a different number of matches and hence different numbers of intermediate and final key/value pairs.

As shown in Figure 10, IR and HR deliver very similar performance. In addition, for the first query, all reducing approaches have the same performance.
For subsequent queries, HR and outperfor, and the perforance iproveent of HR/ over increases along with the nuber of atches.
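The two-job structure of grep described above can be sketched in a few lines. This is a minimal single-process Python illustration under stated assumptions, not Hadoop code: the function names `count_matches` and `sort_by_occurrence` and the sample input are hypothetical, and the regular expression is the second query from the experiment.

```python
import re
from collections import Counter


def count_matches(lines, pattern):
    """Job 1: emit (match, 1) for every regex match, then sum per match."""
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        # findall returns the full matched strings because the
        # pattern contains no capturing groups.
        for match in regex.findall(line):
            counts[match] += 1
    return counts


def sort_by_occurrence(counts):
    """Job 2: invert (match, count) to (count, match) and sort by count,
    highest first, breaking ties alphabetically."""
    inverted = [(count, match) for match, count in counts.items()]
    return sorted(inverted, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical sample input, for illustration only.
lines = ["ab abc", "abc b", "zzz"]
counts = count_matches(lines, r"[a-b]+[a-z.]+")
result = sort_by_occurrence(counts)
print(result)
```

A broader character class such as `[a-i]+[a-z.]+` matches more strings, so the first job carries more intermediate key/value pairs into the reduce side, which is where the reducing approaches differ.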

Fig. 10. Performance of the original Hadoop, incremental reduction (IR), and HR using grep (x-axis: query; y-axis: total execution time in seconds).

TABLE V. Characteristics of different queries. Columns: query, reduce time (seconds), intermediate data size (records). Queries: a+[a-z.]+, [a-b]+[a-z.]+, [a-c]+[a-z.]+, [a-d]+[a-z.]+, [a-i]+[a-z.]+.

F. Heterogeneous Environment Performance

Data centers are becoming increasingly heterogeneous, due to the use of virtualization and/or machines from different generations. In this experiment, we study the robustness of the original Hadoop, HR, and IR to the heterogeneity of the target cluster. On a cluster of 32 nodes, we manually slow down several nodes, i.e., 10 nodes, to mimic a heterogeneous cluster. To slow down a given node, we continuously run the dd command to convert and write a large file (e.g., 5.7 GB) to disk. This approach was used in a related study by Zaharia et al. [13].

We expect that in these environments the map phase takes longer due to the effects of the slow nodes. So, if the reduce tasks are appropriately assigned to the fast nodes, then using the extra map time for reduce computations could improve the performance of the proposed approaches. Using wordcount, we run the original Hadoop and IR with the best configurations found in Section V-D, i.e., 32 reduce tasks for the original Hadoop and 4 reduce tasks for IR. As shown in Figure 11, when the reduce tasks are assigned to the fast nodes, IR becomes better than the original Hadoop by 10.33% instead of 5.86%. However, when they are randomly assigned, IR becomes better than the original Hadoop by only 2.32%. This is expected because the I/O and computing resources available for reduce tasks in this case are limited, preventing IR from taking advantage of the overlap between map and reduce processing. We argue that if the heterogeneity originates from different generations of hardware or from virtualization, it is possible to identify the fast nodes and assign more reduce tasks to these nodes.

Fig. 11. Wordcount performance in homogeneous and heterogeneous environments (x-axis: setting; y-axis: total execution time in seconds). In the second setting, 10 nodes are slowed down, and the reduce tasks are scheduled on the fast nodes. Currently, where reduce tasks will be run in HR is not controllable.

Note that HR's performance drops significantly when running in the heterogeneous environment. This can be attributed to the large number of reduce tasks generated in HR. In addition, it is nondeterministic where these tasks will run, so it is not possible to avoid the effect of the slow nodes.

To simulate a cloud environment with slower I/O performance because of virtualization, we slow down all 32 nodes and repeat the experiment. As we can see, IR still outperforms the original Hadoop by 7.14%.

VI. RELATED WORK

Several research efforts have sought to enhance the MapReduce framework. Sawzall is a programming language built on top of MapReduce [9]. It aims at automatically analyzing huge distributed data files. The main difference between Sawzall and the standalone MapReduce framework is that Sawzall distributes the reduction in a hierarchical, topology-based manner.

In [10], the authors improved the performance of Google's MapReduce framework by pipelining disk and network I/O. In particular, they aimed at streaming intermediate data as it is generated, using local storage as a write-ahead log for network transfer.

In [12], the authors argue that the original MapReduce framework is limited in supporting applications such as relational data processing. They therefore presented a modified version of MapReduce named Map-Reduce-Merge, where the reduce workers produce a list of key/value pairs that are transmitted to a set of merge workers for further processing.

Moreover, Valvag et al. developed a high-level declarative programming model and its underlying runtime, Oivos, which aims at handling applications that require running several MapReduce jobs [11].
This framework has two main advantages compared with MapReduce. First, it reduces the overheads associated with this type of application, which include monitoring the status and progress of each job, determining when to re-execute a failed job or start the next one, and specifying a valid execution order for the MapReduce jobs. Second, it removes the extra synchronization imposed when these applications are executed using the traditional MapReduce framework, i.e., that every reduce task in one job must complete before any of the map tasks in the next job can start.

Ko et al. observed that the loss of intermediate map outputs may result in significant performance degradation [6]. Although using HDFS (Hadoop Distributed File System) improves reliability, it considerably increases the job completion time. As a result, they proposed some design ideas for a new intermediate data storage system.

Zaharia et al. [13] proposed a speculative task scheduling algorithm named LATE (Longest Approximate Time to End) to cope with several limitations of the original Hadoop scheduler in heterogeneous environments such as Amazon EC2 [1].

Finally, Condie et al. [3] recently extended the MapReduce architecture to work efficiently for online jobs in addition to batch jobs. Instead of materializing the intermediate key/value pairs within every map task, they proposed pipelining these data directly to the reduce tasks. They further extended this pipelined MapReduce to support interactive data analysis through online aggregation, and continuous query processing.

To conclude, our proposed solutions are orthogonal to the above extensions to the MapReduce framework and can complement their improvements.

VII. CONCLUSIONS AND FUTURE WORK

In this paper, we designed and implemented two approaches to reduce the overhead of the barrier synchronization between the map and reduce phases of Hadoop, an open-source implementation of MapReduce. In addition, we evaluated the performance of these approaches against Hadoop using an analytical model and experiments on a 32-node cluster.

The first proposed approach is hierarchical reduction, which overlaps map and reduce processing at the inter-task level. It starts a reduce task as soon as a certain number of map tasks complete and aggregates partially reduced results following a tree hierarchy. This approach can be effective when there is enough overlap between map and reduce processing. However, it has some limitations due to the overhead of creating reduce tasks on the fly, in addition to the extra communication cost of transferring the intermediate results along the tree hierarchy.

To cope with these overheads, we proposed the incremental reduction approach, where all reduce tasks are created at the start of the job, and every reduce task incrementally reduces the received map outputs.
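The two approaches summarized above can be contrasted on wordcount with a short sketch. This is a single-process Python illustration under stated assumptions, not the Hadoop implementation: all function names and the sample chunks are hypothetical, and it relies on wordcount's reduce (summing counts) being recursively reducible, so that partial results can be merged in any grouping.

```python
from collections import Counter


def map_task(chunk):
    """Map: emit per-word counts for one input chunk."""
    return Counter(chunk.split())


def barrier_reduce(map_outputs):
    """Baseline: reduce only once every map output is available."""
    total = Counter()
    for output in map_outputs:
        total.update(output)  # Counter.update adds counts
    return total


def hierarchical_reduce(map_outputs, level):
    """HR sketch: merge every `level` partial results, then repeat on the
    merged results until one remains. Returns (result, tree_depth)."""
    if level < 2:
        raise ValueError("aggregation level must be at least 2")
    partials = list(map_outputs)
    depth = 0
    while len(partials) > 1:
        partials = [barrier_reduce(partials[i:i + level])
                    for i in range(0, len(partials), level)]
        depth += 1
    return (partials[0] if partials else Counter()), depth


def incremental_reduce(map_output_stream):
    """Incremental reduction sketch: one long-lived reduce task folds each
    map output into its running result as soon as that output arrives."""
    running = Counter()
    for output in map_output_stream:
        running.update(output)
    return running


# Hypothetical input: eight map tasks, one chunk each.
chunks = ["a b a", "b c", "a c c", "d", "a d", "b b", "c", "a"]
outputs = [map_task(c) for c in chunks]

expected = barrier_reduce(outputs)
hr2, depth2 = hierarchical_reduce(outputs, 2)
hr4, depth4 = hierarchical_reduce(outputs, 4)
ir = incremental_reduce(iter(outputs))
print(depth2, depth4, ir == expected)
```

With eight map outputs, an aggregation level of 2 can start merging after only two maps finish but needs a deeper tree than level 4, while the incremental version needs no tree at all because a single reduce task absorbs outputs as they arrive.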
The experimental results have shown that this approach consistently outperforms the hierarchical reduction approach and the original Hadoop implementation.

In the future, we plan to evaluate our techniques on a broader class of applications. We will also investigate generalizing our techniques to support applications that are not recursively reducible in their original form. Finally, we will evaluate the performance of the proposed techniques in real cloud environments.

ACKNOWLEDGEMENT

This research is supported in part by NSF grant CNS

REFERENCES

[1] Amazon.com. Amazon Elastic Compute Cloud.
[2] S. Chen and S. Schlosser. Map-reduce meets wider varieties of applications. Technical Report IRP-TR-08-05, Intel Research, 2008.
[3] Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy, and Russell Sears. MapReduce online. Technical Report UCB/EECS, University of California at Berkeley, 2009.
[4] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. In 6th Symposium on Operating Systems Design & Implementation, OSDI, 2004.
[5] M. Grant, S. Sehrish, J. Bent, and J. Wang. Introducing map-reduce to high end computing. 3rd Petascale Data Storage Workshop, Nov. 2008.
[6] S. Ko, I. Hoque, B. Cho, and I. Gupta. On availability of intermediate data in cloud computations. In 12th Workshop on Hot Topics in Operating Systems (HotOS XII), 2009.
[7] G. Mackey, S. Sehrish, J. Bent, J. Lopez, S. Habib, and J. Wang. Introducing MapReduce to high end computing. In Petascale Data Storage Workshop, held in conjunction with SC08, 2008.
[8] A. Matsunaga, M. Tsugawa, and J. Fortes. CloudBLAST: Combining MapReduce and virtualization on distributed resources for bioinformatics. Microsoft eScience Workshop, 2008.
[9] Rob Pike, Sean Dorward, Robert Griesemer, and Sean Quinlan. Interpreting the data: Parallel analysis with Sawzall. Sci. Program., 13(4), 2005.
[10] Alex Rasmussen and Maximus Daehan Choi. Improving performance in MapReduce through pipelining. 2006.
[11] S.V. Valvag and D. Johansen. Oivos: Simple and efficient distributed data processing. In High Performance Computing and Communications, 2008. HPCC '08. 10th IEEE International Conference on, Sept. 2008.
[12] Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, and D. Stott Parker. Map-reduce-merge: Simplified relational data processing on large clusters. In SIGMOD '07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 2007. ACM.
[13] M. Zaharia, A. Konwinski, A. Joseph, R. Katz, and I. Stoica. Improving MapReduce performance in heterogeneous environments. In OSDI: USENIX Symposium on Operating Systems Design and Implementation, 2008.

DISCLAIMER

This paper is based on version 0.17.0 of Hadoop. We completed the major design and development of our approaches in June 2009. Recently, we found that, concurrently with our project development, Hadoop extended the implementation of the original combiner function, using a design similar to the incremental reduction (IR) approach proposed in our paper. In the most recent version of Hadoop, the combiner function is triggered in a reduce task during the in-memory merge of map outputs. In our design, the incremental reduction occurs at both the in-memory merge and the on-disk merge. We also present an alternative design, i.e., hierarchical reduction. Finally, in addition to our proposed designs, we presented a rigorous performance evaluation and analytical analysis of incremental reduction (IR) and hierarchical reduction (HR).


More information

Sensors as a Service Oriented Architecture: Middleware for Sensor Networks

Sensors as a Service Oriented Architecture: Middleware for Sensor Networks Sensors as a Service Oriented Architecture: Middleware for Sensor Networks John Ibbotson, Christopher Gibson, Joel Wright, Peter Waggett, IBM U.K Ltd, Petros Zerfos, IBM Research, Boleslaw K. Szyanski,

More information

arxiv:0805.1434v1 [math.pr] 9 May 2008

arxiv:0805.1434v1 [math.pr] 9 May 2008 Degree-distribution stability of scale-free networs Zhenting Hou, Xiangxing Kong, Dinghua Shi,2, and Guanrong Chen 3 School of Matheatics, Central South University, Changsha 40083, China 2 Departent of

More information

Storing and Accessing Live Mashup Content in the Cloud

Storing and Accessing Live Mashup Content in the Cloud Storing and Accessing Live Mashup Content in the Cloud Krzysztof Ostrowski Cornell University Ithaca, NY 14853, USA krzys@cs.cornell.edu Ken Biran Cornell University Ithaca, NY 14853, USA ken@cs.cornell.edu

More information

Reliability Constrained Packet-sizing for Linear Multi-hop Wireless Networks

Reliability Constrained Packet-sizing for Linear Multi-hop Wireless Networks Reliability Constrained acket-sizing for inear Multi-hop Wireless Networks Ning Wen, and Randall A. Berry Departent of Electrical Engineering and Coputer Science Northwestern University, Evanston, Illinois

More information

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing A Study on Workload Imbalance Issues in Data Intensive Distributed Computing Sven Groot 1, Kazuo Goda 1, and Masaru Kitsuregawa 1 University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Abstract.

More information

Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications

Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications Ahmed Abdulhakim Al-Absi, Dae-Ki Kang and Myong-Jong Kim Abstract In Hadoop MapReduce distributed file system, as the input

More information

INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE SYSTEMS

INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE SYSTEMS Artificial Intelligence Methods and Techniques for Business and Engineering Applications 210 INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE

More information

Evaluating the Effectiveness of Task Overlapping as a Risk Response Strategy in Engineering Projects

Evaluating the Effectiveness of Task Overlapping as a Risk Response Strategy in Engineering Projects Evaluating the Effectiveness of Task Overlapping as a Risk Response Strategy in Engineering Projects Lucas Grèze Robert Pellerin Nathalie Perrier Patrice Leclaire February 2011 CIRRELT-2011-11 Bureaux

More information

An improved TF-IDF approach for text classification *

An improved TF-IDF approach for text classification * Zhang et al. / J Zheiang Univ SCI 2005 6A(1:49-55 49 Journal of Zheiang University SCIECE ISS 1009-3095 http://www.zu.edu.cn/zus E-ail: zus@zu.edu.cn An iproved TF-IDF approach for text classification

More information

Lecture L9 - Linear Impulse and Momentum. Collisions

Lecture L9 - Linear Impulse and Momentum. Collisions J. Peraire, S. Widnall 16.07 Dynaics Fall 009 Version.0 Lecture L9 - Linear Ipulse and Moentu. Collisions In this lecture, we will consider the equations that result fro integrating Newton s second law,

More information

Survey on Scheduling Algorithm in MapReduce Framework

Survey on Scheduling Algorithm in MapReduce Framework Survey on Scheduling Algorithm in MapReduce Framework Pravin P. Nimbalkar 1, Devendra P.Gadekar 2 1,2 Department of Computer Engineering, JSPM s Imperial College of Engineering and Research, Pune, India

More information

Method of supply chain optimization in E-commerce

Method of supply chain optimization in E-commerce MPRA Munich Personal RePEc Archive Method of supply chain optiization in E-coerce Petr Suchánek and Robert Bucki Silesian University - School of Business Adinistration, The College of Inforatics and Manageent

More information

The individual neurons are complicated. They have a myriad of parts, subsystems and control mechanisms. They convey information via a host of

The individual neurons are complicated. They have a myriad of parts, subsystems and control mechanisms. They convey information via a host of CHAPTER 4 ARTIFICIAL NEURAL NETWORKS 4. INTRODUCTION Artificial Neural Networks (ANNs) are relatively crude electronic odels based on the neural structure of the brain. The brain learns fro experience.

More information

Markovian inventory policy with application to the paper industry

Markovian inventory policy with application to the paper industry Coputers and Cheical Engineering 26 (2002) 1399 1413 www.elsevier.co/locate/copcheeng Markovian inventory policy with application to the paper industry K. Karen Yin a, *, Hu Liu a,1, Neil E. Johnson b,2

More information

A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems

A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Aysan Rasooli Department of Computing and Software McMaster University Hamilton, Canada Email: rasooa@mcmaster.ca Douglas G. Down

More information

The AGA Evaluating Model of Customer Loyalty Based on E-commerce Environment

The AGA Evaluating Model of Customer Loyalty Based on E-commerce Environment 6 JOURNAL OF SOFTWARE, VOL. 4, NO. 3, MAY 009 The AGA Evaluating Model of Custoer Loyalty Based on E-coerce Environent Shaoei Yang Econoics and Manageent Departent, North China Electric Power University,

More information

CPU Animation. Introduction. CPU skinning. CPUSkin Scalar:

CPU Animation. Introduction. CPU skinning. CPUSkin Scalar: CPU Aniation Introduction The iportance of real-tie character aniation has greatly increased in odern gaes. Aniating eshes ia 'skinning' can be perfored on both a general purpose CPU and a ore specialized

More information

Partitioned Elias-Fano Indexes

Partitioned Elias-Fano Indexes Partitioned Elias-ano Indexes Giuseppe Ottaviano ISTI-CNR, Pisa giuseppe.ottaviano@isti.cnr.it Rossano Venturini Dept. of Coputer Science, University of Pisa rossano@di.unipi.it ABSTRACT The Elias-ano

More information

IMPLEMENTING MAP-REDUCE TO ENHANCE THE COMPUTATIONAL ABILITIES OF CLOUD COMPUTING PLATFORM

IMPLEMENTING MAP-REDUCE TO ENHANCE THE COMPUTATIONAL ABILITIES OF CLOUD COMPUTING PLATFORM International Journal of Information Technology and Knowledge Management July-December 2012, Volume 5, No. 2, pp. 285-290 IMPLEMENTING MAP-REDUCE TO ENHANCE THE COMPUTATIONAL ABILITIES OF CLOUD COMPUTING

More information

ADJUSTING FOR QUALITY CHANGE

ADJUSTING FOR QUALITY CHANGE ADJUSTING FOR QUALITY CHANGE 7 Introduction 7.1 The easureent of changes in the level of consuer prices is coplicated by the appearance and disappearance of new and old goods and services, as well as changes

More information

CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS

CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS 641 CRM FACTORS ASSESSMENT USING ANALYTIC HIERARCHY PROCESS Marketa Zajarosova 1* *Ph.D. VSB - Technical University of Ostrava, THE CZECH REPUBLIC arketa.zajarosova@vsb.cz Abstract Custoer relationship

More information

RECURSIVE DYNAMIC PROGRAMMING: HEURISTIC RULES, BOUNDING AND STATE SPACE REDUCTION. Henrik Kure

RECURSIVE DYNAMIC PROGRAMMING: HEURISTIC RULES, BOUNDING AND STATE SPACE REDUCTION. Henrik Kure RECURSIVE DYNAMIC PROGRAMMING: HEURISTIC RULES, BOUNDING AND STATE SPACE REDUCTION Henrik Kure Dina, Danish Inforatics Network In the Agricultural Sciences Royal Veterinary and Agricultural University

More information

Introduction to Unit Conversion: the SI

Introduction to Unit Conversion: the SI The Matheatics 11 Copetency Test Introduction to Unit Conversion: the SI In this the next docuent in this series is presented illustrated an effective reliable approach to carryin out unit conversions

More information

Markov Models and Their Use for Calculations of Important Traffic Parameters of Contact Center

Markov Models and Their Use for Calculations of Important Traffic Parameters of Contact Center Markov Models and Their Use for Calculations of Iportant Traffic Paraeters of Contact Center ERIK CHROMY, JAN DIEZKA, MATEJ KAVACKY Institute of Telecounications Slovak University of Technology Bratislava

More information

Information Processing Letters

Information Processing Letters Inforation Processing Letters 111 2011) 178 183 Contents lists available at ScienceDirect Inforation Processing Letters www.elsevier.co/locate/ipl Offline file assignents for online load balancing Paul

More information

EFFICIENCY BY DESIGN STORIES OF BEST PRACTICE IN PUBLIC BODIES

EFFICIENCY BY DESIGN STORIES OF BEST PRACTICE IN PUBLIC BODIES EFFICIENCY BY DESIGN STORIES OF BEST PRACTICE IN PUBLIC BODIES Acknowledgeents We would like to extend a special thank you to ebers of the Public Chairs Foru (PCF) and the Association of Chief Executives

More information

Problem Set 2: Solutions ECON 301: Intermediate Microeconomics Prof. Marek Weretka. Problem 1 (Marginal Rate of Substitution)

Problem Set 2: Solutions ECON 301: Intermediate Microeconomics Prof. Marek Weretka. Problem 1 (Marginal Rate of Substitution) Proble Set 2: Solutions ECON 30: Interediate Microeconoics Prof. Marek Weretka Proble (Marginal Rate of Substitution) (a) For the third colun, recall that by definition MRS(x, x 2 ) = ( ) U x ( U ). x

More information

Botnets Detection Based on IRC-Community

Botnets Detection Based on IRC-Community Botnets Detection Based on IRC-Counity Wei Lu and Ali A. Ghorbani Network Security Laboratory, Faculty of Coputer Science University of New Brunswick, Fredericton, NB E3B 5A3, Canada {wlu, ghorbani}@unb.ca

More information

AutoHelp. An 'Intelligent' Case-Based Help Desk Providing. Web-Based Support for EOSDIS Customers. A Concept and Proof-of-Concept Implementation

AutoHelp. An 'Intelligent' Case-Based Help Desk Providing. Web-Based Support for EOSDIS Customers. A Concept and Proof-of-Concept Implementation //j yd xd/_ ' Year One Report ":,/_i',:?,2... i" _.,.j- _,._".;-/._. ","/ AutoHelp An 'Intelligent' Case-Based Help Desk Providing Web-Based Support for EOSDIS Custoers A Concept and Proof-of-Concept Ipleentation

More information

How To Balance Over Redundant Wireless Sensor Networks Based On Diffluent

How To Balance Over Redundant Wireless Sensor Networks Based On Diffluent Load balancing over redundant wireless sensor networks based on diffluent Abstract Xikui Gao Yan ai Yun Ju School of Control and Coputer Engineering North China Electric ower University 02206 China Received

More information

The Velocities of Gas Molecules

The Velocities of Gas Molecules he Velocities of Gas Molecules by Flick Colean Departent of Cheistry Wellesley College Wellesley MA 8 Copyright Flick Colean 996 All rights reserved You are welcoe to use this docuent in your own classes

More information

A Study on the Chain Restaurants Dynamic Negotiation Games of the Optimization of Joint Procurement of Food Materials

A Study on the Chain Restaurants Dynamic Negotiation Games of the Optimization of Joint Procurement of Food Materials International Journal of Coputer Science & Inforation Technology (IJCSIT) Vol 6, No 1, February 2014 A Study on the Chain estaurants Dynaic Negotiation aes of the Optiization of Joint Procureent of Food

More information

Considerations on Distributed Load Balancing for Fully Heterogeneous Machines: Two Particular Cases

Considerations on Distributed Load Balancing for Fully Heterogeneous Machines: Two Particular Cases Considerations on Distributed Load Balancing for Fully Heterogeneous Machines: Two Particular Cases Nathanaël Cheriere Departent of Coputer Science ENS Rennes Rennes, France nathanael.cheriere@ens-rennes.fr

More information

Performance Evaluation of Machine Learning Techniques using Software Cost Drivers

Performance Evaluation of Machine Learning Techniques using Software Cost Drivers Perforance Evaluation of Machine Learning Techniques using Software Cost Drivers Manas Gaur Departent of Coputer Engineering, Delhi Technological University Delhi, India ABSTRACT There is a treendous rise

More information

AN ALGORITHM FOR REDUCING THE DIMENSION AND SIZE OF A SAMPLE FOR DATA EXPLORATION PROCEDURES

AN ALGORITHM FOR REDUCING THE DIMENSION AND SIZE OF A SAMPLE FOR DATA EXPLORATION PROCEDURES Int. J. Appl. Math. Coput. Sci., 2014, Vol. 24, No. 1, 133 149 DOI: 10.2478/acs-2014-0011 AN ALGORITHM FOR REDUCING THE DIMENSION AND SIZE OF A SAMPLE FOR DATA EXPLORATION PROCEDURES PIOTR KULCZYCKI,,

More information

Protecting Small Keys in Authentication Protocols for Wireless Sensor Networks

Protecting Small Keys in Authentication Protocols for Wireless Sensor Networks Protecting Sall Keys in Authentication Protocols for Wireless Sensor Networks Kalvinder Singh Australia Developent Laboratory, IBM and School of Inforation and Counication Technology, Griffith University

More information

Study on the development of statistical data on the European security technological and industrial base

Study on the development of statistical data on the European security technological and industrial base Study on the developent of statistical data on the European security technological and industrial base Security Sector Survey Analysis: Poland Client: European Coission DG Migration and Hoe Affairs Brussels,

More information