A Data Placement Strategy in Scientific Cloud Workflows

Dong Yuan, Yun Yang, Xiao Liu, Jinjun Chen
Faculty of Information and Communication Technologies, Swinburne University of Technology
Hawthorn, Melbourne, Australia 3122
{dyuan, yyang, xliu, jchen}@swin.edu.au

ABSTRACT

In scientific cloud workflows, large amounts of application data need to be stored in distributed data centres. To store these data effectively, a data manager must intelligently select the data centres in which these data will reside. This is, however, not the case for data which must have a fixed location. When one task needs several datasets located in different data centres, the movement of large volumes of data becomes a challenge. In this paper, we propose a matrix based k-means clustering strategy for data placement in scientific cloud workflows. The strategy contains two algorithms: one groups the existing datasets into k data centres during the workflow build-time stage, and the other dynamically clusters newly generated datasets to the most appropriate data centres, based on dependencies, during the runtime stage. Simulations show that our algorithm can effectively reduce data movement during workflow execution.

Keywords: data management; scientific workflow; cloud computing

1. INTRODUCTION

Running scientific workflow applications usually requires not only high performance computing resources but also massive storage [8]. In many scientific research fields, like astronomy [7], high-energy physics [35] and bio-informatics [39], scientists need to analyse terabytes of data either from existing data resources or collected from physical devices. During these processes, similar amounts of new data might also be generated as intermediate or final products [8]. Workflow technologies are used to automate these scientific applications. Scientific workflows are typically very complex. They usually have a large number of tasks and need a long time for execution.
Nowadays, popular scientific workflows are deployed in grid systems [35] because they have high performance and massive storage. However, building a grid system is extremely expensive and it is not available for scientists all over the world to use. The emergence of cloud computing technologies offers a new way to develop scientific workflow systems. The concept of cloud computing was proposed in late 2007 [47] and it has since been utilised in many areas with some success [8] [5] [] [38]. Cloud computing is deemed the next generation of IT platforms that can deliver computing as a kind of utility []. Foster et al. made a comprehensive comparison of grid computing and cloud computing [3].

Some features of cloud computing also meet the requirements of scientific workflow systems. First, cloud computing systems can provide the high performance and massive storage required for scientific applications in the same way as grid systems, but with a lower infrastructure construction cost among many other features, because cloud computing systems are composed of data centres which can be clusters of commodity hardware. Second, cloud computing systems offer a new paradigm in which scientists from all over the world can collaborate and conduct their research together. Cloud computing systems are based on the Internet, and so are the scientific workflow systems deployed on the cloud. Dispersed computing facilities (like clusters) at different institutions can be viewed as data centres in the cloud computing platform. Scientists can upload their data and launch their applications on scientific cloud workflow systems from anywhere in the world via the Internet. As all the data are managed on the cloud, it is easy to share data among scientists. Research into doing science on the cloud has already commenced, with early experiences such as the Nimbus [3] and Cumulus [46] projects. The work by Deelman et al. [] shows that cloud computing offers a cost-effective solution for data-intensive applications, such as scientific workflows [9].
By taking advantage of cloud computing, scientific workflow systems could gain wider utilisation; however, they will also face some new challenges, of which data management is one. Scientific applications are data intensive and usually need collaborations of scientists from different institutions [6], hence application data in scientific workflows are usually distributed and very large. When one task needs to process data from different data centres, moving data becomes a challenge [8]. Some application data are too large to be moved efficiently, some may have fixed locations from which they cannot feasibly be moved, and some may have to be located at fixed data centres for processing, but these are only one aspect of this challenge. Even the application data that are flexible to move cannot be moved whenever and wherever we want, since in the cloud computing platform data centres may belong to different cloud service providers, so that data movement would result in costs. Furthermore, the infrastructure of cloud computing systems is hidden from their users. The systems just offer the computation and storage resources required by users for their applications. The users do not know the exact physical locations where their data are stored. This kind of model is very convenient for users, but remains a big challenge for data management in scientific cloud workflow systems.

In this paper, we propose a matrix based k-means clustering strategy for data placement in scientific cloud workflow systems. Scientific workflows can be very complex: one task might require many datasets for execution, and one dataset might also be required by many tasks. If some datasets are always used together by many tasks, we say that these datasets are dependent on each other. In our strategy, we try to keep these datasets in one data centre, so that when tasks are scheduled to this data centre, most, if not all, of the data they need are stored locally.

Our data placement strategy has two algorithms, one for the build-time stage and one for the runtime stage of scientific workflows. In the build-time stage algorithm, we construct a dependency matrix for all the application data, which represents the dependencies between all the datasets, including the datasets that may have fixed locations. Then we use the BEA algorithm [37] to cluster the matrix and partition it so that the datasets in every partition are highly dependent upon each other. We distribute the partitions into k data centres, where the partitions that have fixed location datasets are also placed in the appropriate data centres. These k data centres serve as the initial partitions of the k-means algorithm at the runtime stage. At runtime, our clustering algorithm deals with the newly generated data that will be needed by other tasks. For every newly generated dataset, we calculate its dependencies with all k data centres, and move the data to the data centre that has the highest dependency with it.
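The runtime rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the dependency between a new dataset and a data centre is the sum of its pairwise dependencies (number of shared tasks) with the datasets already stored there, and it ignores storage limits. The names `centre_dependency` and `choose_data_centre` and the sample task identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def centre_dependency(new_tasks: Set[str], stored: List[Set[str]]) -> int:
    """Assumed measure: sum of shared-task counts between the new dataset
    and every dataset already stored in one data centre."""
    return sum(len(new_tasks & tasks) for tasks in stored)


def choose_data_centre(
    new_tasks: Set[str], centres: Dict[str, List[Set[str]]]
) -> Tuple[str, Dict[str, int]]:
    """Return the data centre with the highest dependency, plus all scores.

    `centres` maps a data centre name to the task sets of its datasets.
    """
    scores = {
        name: centre_dependency(new_tasks, stored)
        for name, stored in centres.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    centres = {
        "dc1": [{"t1", "t2"}, {"t2", "t3"}],
        "dc2": [{"t4"}],
    }
    # A newly generated dataset that will be used by tasks t2 and t3.
    print(choose_data_centre({"t2", "t3"}, centres))
```

A full version would also have to check that the chosen data centre has enough free storage before moving the dataset there.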
By placing data according to their dependencies, our strategy attempts to minimise the total data movement during the execution of workflows. Furthermore, by pre-allocating data to other data centres, our strategy can prevent data from gathering at one data centre and reduces the time spent waiting for data by ensuring that relevant data are stored locally.

The remainder of the paper is organised as follows. Section 2 presents the related work. Section 3 gives an example and analyses the research problems. Section 4 introduces the basic strategy of our algorithms. Section 5 presents the detailed steps of the algorithms in our data placement strategy. Section 6 demonstrates the simulation results and the evaluation. Finally, Section 7 addresses our conclusions and future work.

2. RELATED WORK

Data placement of scientific workflows is a very important and challenging issue. In traditional distributed computing systems, much work on data placement has been conducted. In [49], Xie proposed an energy-aware strategy for data placement in RAID-structured storage systems. Stork [33] is a scheduler in the Grid that guarantees that data placement activities can be queued, scheduled, monitored and managed in a fault tolerant manner. In [5], Cope et al. proposed a data placement strategy for urgent computing environments to guarantee data robustness. At the infrastructure level, NUCA [8] is a data placement and replication strategy for distributed caches that can reduce data access latency. However, none of them focuses on reducing data movement between data centres on the Internet.

As cloud computing has become more and more popular, new data management systems have also appeared, such as Google File System [4] and Hadoop [3]. They all have hidden infrastructures that can store the application data independent of users' control. Google File System is designed mainly for Web search applications, which are different from workflow applications. Hadoop is a more general distributed file system, which has been used by many companies, such as Amazon and Facebook.
When you push a file to a Hadoop File System, it will automatically split this file into chunks and randomly distribute these chunks in a cluster. Furthermore, the Cumulus project [46] introduced a scientific cloud architecture for a data centre, and the Nimbus [3] toolkit can directly turn a cluster into a cloud and has already been used to build a cloud for scientific applications. Within a small cluster, data movement is not a big problem, because there are fast connections between nodes, e.g. Ethernet. However, the scientific cloud workflow system is designed for scientists to collaborate, where large scale and distributed applications need to be executed across several data centres. The data movement between data centres may cost a lot of time, since data centres are spread around the Internet with limited bandwidth. In this work, we try to place the application data based on their dependencies in order to reduce the data movement between data centres.

Data transfer is a big overhead for scientific workflows [4]. Though popular scientific workflow systems have their data management strategies, they did not focus on reducing data movement. For the build-time stage, these systems mainly focus on data modelling methods. For example, Kepler [35] has an actor-oriented data modelling method that works for large data in a grid environment, while Taverna [39] and ASKALON [48] have their own process definition languages to represent their data flows. For the runtime stage, most of the scientific workflow systems adopt some data grid systems for their data management. For example, Kepler uses the SRB [7] system, while Pegasus [7] and Triana [4] adopt the RLS system [], and Gridbus [9] has a grid service broker [45] where all data are deemed important resources. Data grids primarily deal with providing services and infrastructure for distributed data-intensive applications that need to access, transfer, and modify massive datasets stored in distributed storage resources [44]. However, these systems do not consider the dependencies between data in scientific workflows at either build-time or runtime, and so they cannot reduce data movement.

Some research in grid computing has addressed the importance of data dependency for large-scale scientific applications, although it did not focus on workflow data management. The Filecules project [] groups files based on their dependencies. Using data from real workload experiments, the authors demonstrate that filecule grouping is a reliable and useful abstraction for data management in science Grids. BitDew [] is a distributed data management system for desktop Grids. Unlike data centres in the cloud, which aim to provide services to users, desktop Grids aim to make use of the idle computing and storage resources in desktop computers. In BitDew, the data placement dependency is denoted by a data attribute called affinity, which is pre-defined by users. However, in cloud computing, all the application data are hosted in the data centres, where anyone can use the cloud services and upload their data. Letting users define the data dependencies for scientific cloud workflows is clearly impractical.

The closest workflow research to ours is the Pegasus workflow system, which has proposed some data placement strategies [3] [4] based on the RLS system. The strategies are: first, pre-allocate the required data to the computation resource where the task will execute; second, dynamically delete the data that will no longer be used by tasks.
These strategies are only for the runtime stage of scientific workflows and can effectively reduce the overall execution time and the storage usage of the workflows. Furthermore, in [3], the authors proposed a data placement scheduler for distributed computing systems. It guarantees reliable and efficient data transfer with different protocols. These works mainly focus on how to move the application data, and they cannot reduce the total data movement of the whole system. In contrast, our work aims to reduce data movement. Our strategy covers both the build-time and runtime stages of scientific workflows and we design specific algorithms to automatically place and move the application data.

In cloud computing systems, the infrastructure is hidden from users. Hence, for most of the application data, the system will decide where to store them. Dependencies exist among these data. In this paper, we adapt clustering algorithms for data movement based on data dependency. Clustering algorithms, which can classify patterns into groups without supervision, have been used in pattern recognition since the 1980s [3]. Today they are widely used to process data streams [7]. In many scientific workflow applications, the intermediate data movement is in data stream format and the newly generated data must be moved to the destination in real-time. We adapt the k-means clustering algorithm for data placement. When new data are generated by a task, we dynamically calculate the dependencies of the new data with the k data centres, and move the new data to the centre with the highest dependency. The simulation results of this paper show that with our data placement strategy, the data movement between data centres is significantly reduced compared to random data placement.

3. SCIENTIFIC CLOUD WORKFLOW DATA MANAGEMENT

3.1. A Motivating Example

Scientific applications often need to process terabytes of data.
For example, the ATNF (Australia Telescope National Facility) Parkes Swinburne Recorder (APSR) [] is a next-generation baseband data recording and processing system currently under development in collaboration between Swinburne University of Technology and ATNF. The data from the APSR stream at a rate of one gigabyte per second. The researchers at Parkes process the data with a local cluster of servers and do their research. All the data are stored locally at Parkes and they are not available to other institutions. If researchers at other institutions need the data resources from the Parkes Radio Telescope, they have to contact the researchers at Parkes and request the data. Researchers at Parkes will check the local repositories to see if the existing data resources could fulfil the requirements. In this situation communications often suffer from low efficiency because researchers are from different projects and the requirements are usually complex. Sometimes researchers even have to go to Parkes and bring back the data that they need on hard disks. Sharing data resources in this manner is obviously inefficient and hence not desirable.

With cloud computing technologies, we can turn the Parkes cluster into a data centre on a cloud computing platform that can offer services to researchers all over the world. The cloud computing platform is built on the Internet, which is how the data centres are connected to each other. All the data are managed by the cloud data management system. The researchers can access the existing data resources, upload application data and launch their applications via the cloud service. By doing this, the resources at Parkes will be fully utilised, since data can be sent to other data centres for different applications as needed. On the other hand, researchers at Parkes will be able to do more scientific research by retrieving useful data from other data centres around the world. All these data sending and retrieving operations are hidden from the researchers. In other words, via the cloud computing platform, researchers can utilise data resources from other institutions without knowing where the data are physically stored. Hence, on a cloud computing platform, data centres should have the ability to host each other's data. For example, if some particular data at Parkes are frequently retrieved by another data centre, the system will store these data on that data centre instead. Furthermore, if many applications at Parkes need the same data from another data centre, the system will also move those data to Parkes for storage.

The Parkes Radio Telescope was set up in 1961. For over 40 years, the Parkes cluster has accumulated a large amount of data resources in different formats and sizes. Normally, data can be moved to other data centres, but if the size of the data is very large, moving them via the Internet will be inefficient. To transport terabytes of data, the most efficient way is for a delivery company to ship the hard disks [5]. If an application needs the majority of its data from Parkes, it is preferable that it is executed locally at Parkes and retrieves the remaining data from elsewhere. For example, some research projects may need to process the raw data recorded from the telescope by APSR, in order to get some specific results.

3.2. Problem Analysis

Scientific cloud workflows run on the cloud platform, which is composed of many distributed data centres on the Internet (like the Parkes cluster), and each connection between data centres has limited bandwidth. Tasks sometimes need to process more than one dataset, and these datasets may be stored in different data centres.
Because of the bandwidth constraints, the movement of datasets between data centres would be the bottleneck of the system. In [6], the authors proposed a new protocol for data transportation that could provide gigabits of bandwidth. However, it has not been widely supported on the Internet. Popular cloud systems, such as Amazon EC2 [], still have limited bandwidth [34]. Amazon charges $0.10 to $0.15 per gigabyte to move data into and out of Amazon Web Services over the Internet.

Another approach to dealing with the bottleneck of large data transfer is to divide the tasks, i.e. for the tasks that need to process many distributed datasets, we split them into many smaller, parallel sub-tasks and schedule them to the different datasets. Map-Reduce technology [6] is a typical and successful paradigm. It has had great success in the Google File System and Hadoop, as well as in scientific applications [36]. However, Map-Reduce is better suited to use within one data centre, since it needs huge interconnect bandwidth, for example in the shuffle step that occurs between the Map procedure and the Reduce procedure. Furthermore, in scientific applications, many tasks must use more than one dataset together and cannot be further divided, such as in the All-Pairs problem [38]. Therefore, data movement is inevitable. In light of this, we have to place the datasets that are needed by the same task in the same data centre as much as possible, so as to minimise data movement when the task is executed.

The placement of datasets among data centres is not trivial. Normally, a cloud computing system needs to decide in which data centres the application data are stored. Most datasets are flexible about where they are stored since they are independent of users. The cloud computing system can automatically store the application data based on some data placement strategy. However, in scientific cloud workflow systems, some data are not so flexible. They have to be stored in particular data centres for various reasons. Some common scenarios are described below.
First, some data may need to be processed by special equipment. In some scientific projects, many special types of equipment are utilised. Some data can only be processed by particular equipment since they are in certain formats, e.g. the signal from the Parkes Radio Telescope can only be processed by the equipment at Parkes, such as the APSR. These data have to be stored where the required equipment is located. Second, some data are naturally distributed and too large to be moved efficiently. For example, the raw data files recorded by APSR are usually terabytes or even petabytes in size. They are naturally stored at Parkes, and are impossible to move to other locations via the Internet. Another reason that some data must be placed at a particular data centre is ownership. Data are considered an important and valuable resource in many scientific projects. The cloud computing platform offers a new paradigm for cooperation in which institutions can easily share their valuable data resources by placing a charge on them. So the data with limited access rights have to be stored in particular data centres.

Whatever the reason that the data must be stored in a particular data centre, we call these datasets fixed location datasets in general. Correspondingly, we call the datasets for which the system can flexibly decide the storage location flexible location datasets. The data placement strategy not only has to place the flexible location datasets, but also has to take into account the impact of the fixed location datasets. Some challenges for the data placement strategy are discussed below.

First, in scientific workflows, both tasks and datasets could be numerous and make up a complicated many-to-many relationship. One task might need many datasets and one dataset might be needed by many tasks. Furthermore, new datasets will be generated during the workflow execution. One dataset generated by a task might be used by several later tasks. So the data placement strategy should be based on these data dependencies.

Second, the scientific cloud workflow system is a dynamic computing environment. Many workflow instances will run in the system simultaneously. Some instances might need a long execution time and some might be short. New workflow instances could be deployed to the system and completed instances could be removed from the system at any time. So the relationships between datasets and tasks will change often and the placement of datasets has to be changed accordingly.

Third, data management in scientific cloud workflow systems is opaque to users, which means that users do not know where and how the data are stored. In the cloud environment, users only pay for the computation and storage resources that they need and give the application data to the system for processing. Because cloud systems are built on the service oriented architecture (SOA), the users just use the dynamic cloud services and do not know the infrastructure of the system. Hence, the data placement has to be automatic.

4. BASIC STRATEGIES FOR DATA PLACEMENT

For scientific workflow data management, there are two types of data we have to deal with. The first is the existing data that exist before the workflow execution starts. This type of data mainly includes the resource data from the existing file systems or databases and the application data from users as input for processing or analysis. The second is the generated data that are generated during the workflow execution.
This type of data mainly includes the newly generated intermediate and result data, as well as the streaming data dynamically collected from scientific devices during the workflow execution. We propose this taxonomy because we will treat these two types of data at the workflow build-time and runtime respectively with different algorithms. This taxonomy only indicates the generation time of the datasets. When the generated data move to a data centre and are stored, they become existing data. The most important common feature is that both types of data might be very large. They cannot and should not be stored and moved wherever and whenever we want, since the cloud system has bandwidth constraints. The application data of scientific workflows could also have a variety of formats (e.g. XML data, complex objects, raw data files, tables in relational databases). However, in this paper we do not consider the structure of the data, since it is not our main focus, and we treat all data in the same way.

In scientific workflows, moving data to one data centre will cost more than scheduling tasks to that centre [3]. Hence, our basic strategy is to have a reasonable placement of data in distributed data centres first, so that when tasks are scheduled to the appropriate data centres, almost all the datasets they need are in local storage. In this work we analyse the dependencies between datasets. Based on this dependency, we adapt the k-means clustering algorithm to cluster datasets to the proper data centres.

In scientific cloud workflow systems, many workflow instances will run simultaneously, each of which has a complex structure. Large numbers of tasks will access large numbers of datasets and produce large output data. In order to execute a task, all required datasets must be located in the same data centre, and this may require some movement of datasets. Furthermore, if two datasets are always used together by many tasks, they should be stored together in order to reduce the frequency of data movement.
Here, we say that these two datasets have a dependency. In other words, two datasets are said to be dependent on each other if they are both used by the same task. The more tasks there are that use the same datasets, the higher the dependency between those datasets. We denote the set of datasets as $D$ and the set of tasks as $T$. To represent this dependency, we give every dataset a task set in addition to its size. So, every dataset $d_i \in D$ has two attributes denoted as $\langle T_i, s_i \rangle$, where $T_i \subset T$ is the set of tasks that will use dataset $d_i$, and $s_i$ denotes the size of $d_i$. Furthermore, we use $dependency_{ij}$ to denote the dependency between datasets $d_i$ and $d_j$. We say that the datasets $d_i$ and $d_j$ have a dependency if there are tasks that will use $d_i$ and $d_j$ together, and the quantity of this dependency is the number of tasks that use both $d_i$ and $d_j$. All the denotations are listed at the end of the paper.
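As a small illustration of this definition, the sketch below computes the pairwise dependency as the size of the intersection of two task sets and collects the values for a list of datasets. The function name `dependency_matrix` and the three sample task sets are hypothetical and are not taken from the paper's figures.

```python
from typing import List, Set


def dependency_matrix(task_sets: List[Set[str]]) -> List[List[int]]:
    """Entry [i][j] is the number of tasks that use both dataset i and dataset j."""
    n = len(task_sets)
    return [
        [len(task_sets[i] & task_sets[j]) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical task sets T_1, T_2, T_3 for three datasets.
    task_sets = [{"t1", "t2"}, {"t2", "t3"}, {"t3"}]
    for row in dependency_matrix(task_sets):
        print(row)
```

Because set intersection is commutative, the resulting table is symmetric, and each diagonal entry equals the number of tasks that use that dataset.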

$$dependency_{ij} = Count(T_i \cap T_j)$$

In this work, our k-means clustering data placement strategy uses this dependency to cluster the datasets into different data centres. The strategy has two stages: build-time and runtime. At the build-time stage, the main goal of the algorithm is to set up k initial partitions for the k-means algorithm. We use a matrix based approach to cluster the existing datasets into k data centres as the initial partitions. At the runtime stage, the main goal of the algorithm is to cluster the newly generated datasets to one of the k data centres based on their dependencies, which will be calculated dynamically.

We have to design different algorithms for the build-time and runtime stages to treat the existing data and generated data respectively, mainly because of the dynamic nature of the cloud environment. Even though we know the size and related tasks of the datasets that will be generated during the workflow execution, it is not practical to calculate their dependencies and assign them a data centre at the build-time stage. This is because scientific workflows have a large number of tasks and need a long time for execution. It is very hard to predict when a certain dataset will be generated in a dynamic cloud environment. If we assign the generated data a data centre at the build-time stage, then when the data are actually generated the data centre might not have enough available storage to store them. Furthermore, it is impractical and inefficient to reserve the storage for the generated data at the build-time stage. This is because the data might not be generated until the end of the scientific workflow, and the reserved storage space would be wasted during this time.

5. MATRIX BASED K-MEANS CLUSTERING STRATEGY FOR DATA PLACEMENT

Figure 1. Example of data placement

In this section we discuss our data placement strategy in detail. Fig. 1 gives an example of a simple workflow instance and shows the two stages of our strategy.
The arrows in the workflow instance denote data flows. For example, a data flow from a dataset to two tasks means that the dataset will be used by both tasks, and data flows from one task to two later tasks mean that the dataset generated by the first task will be used by both later tasks. During the build-time stage, we partition the existing datasets into several partitions, denoted as $p_1, p_2, \ldots, p_n$, based on their dependencies, and distribute these partitions into different data centres. During the runtime stage, tasks may retrieve datasets from other data centres as needed, and we also pre-allocate generated datasets to the appropriate data centres.

5.1. Build-time Stage Algorithm

During the build-time stage, we use a matrix model to represent the existing data. We pre-cluster the datasets by transforming the matrix, and then distribute the datasets to different data centres as the initial partitions for the k-means clustering algorithm to be used during the runtime stage. The build-time stage algorithm has two steps and the pseudocode is shown in Fig. 4.

Step 1: Set up and cluster the dependency matrix. First, we calculate the data dependencies of all the datasets and build up a dependency matrix DM (Line 3 in Fig. 4), where DM's element $DM_{ij} = dependency_{ij}$. $dependency_{ij}$ is the dependency value between datasets $d_i$ and $d_j$, as defined in the previous section. It can be calculated by counting the tasks in common between the task sets of $d_i$ and $d_j$, which are denoted as $T_i$ and $T_j$. In particular, each element on the diagonal of DM is the number of tasks that will use the corresponding dataset. In our algorithm, DM is an $n \times n$ symmetric matrix, where n is the total number of existing datasets. If we take the simple workflow instance in Fig. 1 as an example (with only 5 datasets, namely $d_1$ to $d_5$, in the system initially), the dependency matrix DM is shown in Fig. 2.

Figure 2. Build up dependency matrix, where $DM_{ij} = Count(T_i \cap T_j)$

The dependency matrix DM is dynamically maintained at runtime. When new datasets are generated by tasks or added to the system by users, we calculate their dependencies with all the existing datasets and add them to DM.

Next, we use the BEA (Bond Energy Algorithm) to transform the dependency matrix DM (Line 4 in Fig. 4). BEA was proposed in 1972 [37] and has been widely utilised in distributed database systems for the vertical partitioning of large tables [4]. It is a permutation algorithm that groups similar items together in the matrix by permuting the rows and columns.
In our work, it takes the dependency matrix (DM) as input, and generates a clustered dependency matrix (CM). In CM, the items with similar values are grouped together (i.e. large values with other large values, and small values with other small values). We define a global measure (GM) of the dependency matrix:

$$GM = \sum_{i=1}^{n} \sum_{j=1}^{n} DM_{ij} \left( DM_{i,j-1} + DM_{i,j+1} \right)$$

The permutation is done in such a way as to maximise this measure. The detailed permutation algorithm can be found in [4]. Fig. 3 shows the CM of the example DM after the BEA transformation.

Figure 3. BEA transformation of dependency matrix

In this step, we do not consider the difference between fixed location datasets and flexible location datasets. If there are some fixed location datasets in the system, they will be arbitrarily scattered in the columns and rows of the dependency matrix, since we built up the matrix by calculating dependencies between all the datasets. After the BEA transformation, all the datasets, including the fixed location datasets, are clustered by their dependencies.
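The transformation described above can be sketched in Python. This is a common greedy formulation of the bond energy algorithm, not necessarily the exact procedure of the cited references: columns are inserted one at a time at the position with the largest bond contribution, and rows are then reordered identically. The sketch assumes that out-of-range neighbours in GM count as zero. The helper names and the 4 x 4 sample matrix are hypothetical.

```python
from typing import List, Optional

Matrix = List[List[int]]


def bond(m: Matrix, a: Optional[int], b: Optional[int]) -> int:
    """Inner product of columns a and b; a missing neighbour contributes 0."""
    if a is None or b is None:
        return 0
    return sum(row[a] * row[b] for row in m)


def bea_order(m: Matrix) -> List[int]:
    """Greedy column ordering: insert each column where it adds the most bond."""
    order = [0]
    for k in range(1, len(m)):
        best_pos, best_gain = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else None
            right = order[pos] if pos < len(order) else None
            gain = 2 * bond(m, left, k) + 2 * bond(m, k, right) - 2 * bond(m, left, right)
            if best_gain is None or gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, k)
    return order


def permute(m: Matrix, order: List[int]) -> Matrix:
    """Apply the same permutation to rows and columns (m is symmetric)."""
    return [[m[i][j] for j in order] for i in order]


def global_measure(m: Matrix) -> int:
    """GM: each entry times the sum of its left and right neighbours."""
    n = len(m)
    total = 0
    for i in range(n):
        for j in range(n):
            left = m[i][j - 1] if j > 0 else 0
            right = m[i][j + 1] if j < n - 1 else 0
            total += m[i][j] * (left + right)
    return total


if __name__ == "__main__":
    dm = [[3, 0, 2, 0], [0, 2, 0, 1], [2, 0, 2, 0], [0, 1, 0, 1]]
    order = bea_order(dm)
    cm = permute(dm, order)
    print(order, global_measure(dm), global_measure(cm))
```

In the sample matrix, datasets 0 and 2 share tasks, as do datasets 1 and 3, and the reordering places each pair in adjacent columns. The greedy insertion is a heuristic and does not guarantee the global maximum of GM.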

Build-time Stage Algorithm

Input:   D: set of existing datasets d_1, d_2, ..., d_n
         DC: set of data centres c_1, c_2, ..., c_m
Output:  K: set of data centres with initial datasets

1.  K = Ø; FP = Ø; NFP = Ø;   //Initialization. FP: set of partitions that have fixed location datasets
                              //NFP: set of partitions that have no fixed location datasets
2.  for (every c_i in DC) i_cs_i = cs_i * λ_ini;   //Calculate initial available storage of all data centres
3.  DM = {dependency_ij = Count(T_i ∩ T_j)};   //Step 1: set up DM
4.  CM = BEA(DM);   //Step 2: BEA transformation
5.  if (CM contains d_f)   //Step 3 starts. Check the existence of fixed location datasets
6.    Partition&Classify(CM)   //Sub-step 1: partition CM and classify the partitions into FP and NFP
7.      if (CM_T contains d_f & the d_f belong to different c)
8.        Partition&Classify(CM_T);   //Recursively partition and classify CM_T
9.      else if (CM_T contains d_f)
10.       add CM_T to FP;   //CM_T has fixed location datasets, add to FP
11.     else add CM_T to NFP;   //CM_T has no fixed location datasets, add to NFP
12.     if (CM_B contains d_f & the d_f belong to different c)
13.       Partition&Classify(CM_B);   //Recursively partition and classify CM_B
14.     else if (CM_B contains d_f)
15.       add CM_B to FP;   //CM_B has fixed location datasets, add to FP
16.     else add CM_B to NFP;   //CM_B has no fixed location datasets, add to NFP
17.   for (every data centre c_i in DC)   //Sub-step 2: distribute the partitions with fixed location datasets
18.     if (c_i has d_f)   //Choose the data centre c_i that has fixed location datasets
19.       for (every d_fj in FD_i)   //Go through all the fixed location datasets belonging to c_i
20.         find CM_j in FP;   //Pick out the partitions that contain these fixed location datasets from FP
21.         add CM_j to P_i;   //Set up the partition set P for c_i
22.       calculate ps_i = Σ_{CM_j ∈ P_i} s_j;   //The total size of the partitions in P
23.       while (ps_i > i_cs_i)   //Further partition if the size of P is too large for c_i
24.         find CM_k in P_i, where s_k = max_{CM_j ∈ P_i} s_j;   //Largest partition in P
25.         remove CM_k from P_i;
26.         BinaryPartition(CM_k);   //Partition CM_k and update the partition sets
27.         if (CM_kT contains d_f) add CM_kT to P_i;
28.         else add CM_kT to NFP;
29.         if (CM_kB contains d_f) add CM_kB to P_i;
30.         else add CM_kB to NFP;
31.         calculate ps_i = Σ_{CM_j ∈ P_i} s_j;   //New size of P after partition
32.       distribute all CM_j in P_i to c_i;   //Distribute datasets
33.       update c_i to K;
34.       i_cs_i = i_cs_i − ps_i;
35. else add CM to NFP;   //CM does not contain fixed location datasets
36. for (all the partitions CM_i in NFP)   //Sub-step 3: distribute the partitions without fixed location datasets
37.   Partition&Distribute(CM_i)   //Partition and distribute CM_i
38.     if (s_T < max_{j=1..m} cs_j)   //Size of CM_iT is small enough for some data centres
39.       find c_j from DC,   //Find the best data centre
40.         where cs_j = min_{j=1..m} (cs_j > s_T);
41.       distribute CM_iT to c_j;   //Distribute datasets
42.       update c_j to K;
43.       i_cs_j = i_cs_j − s_iT;
44.     else Partition&Distribute(CM_iT);   //Recursively partition and distribute CM_iT
45.     if (s_B < max_{j=1..m} cs_j)   //Size of CM_iB is small enough for some data centres
46.       find c_j from DC,   //Find the best data centre
47.         where cs_j = min_{j=1..m} (cs_j > s_B);
48.       distribute CM_iB to c_j;   //Distribute datasets
49.       update c_j to K;
50.       i_cs_j = i_cs_j − s_iB;
51.     else Partition&Distribute(CM_iB);   //Recursively partition and distribute CM_iB
52. Return K;

Figure 4. Build-time stage algorithm
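Lines 2 and 3 of Fig. 4 can be illustrated with a minimal Python sketch. The input format (one set of task identifiers per dataset) and the function names are assumptions made for this illustration only; they are not taken from SwinDeW-C.

```python
def build_dependency_matrix(task_sets):
    """Return DM where DM[i][j] = Count(T_i ∩ T_j).

    task_sets[i] is the set of task ids that use dataset d_i
    (a hypothetical input format chosen for this sketch).
    """
    n = len(task_sets)
    return [[len(task_sets[i] & task_sets[j]) for j in range(n)]
            for i in range(n)]


def initial_storage(capacities, lam_ini):
    """Line 2 of Fig. 4: i_cs_i = cs_i * lambda_ini for every data centre."""
    return {centre: cs * lam_ini for centre, cs in capacities.items()}


# Three datasets: d_1 and d_2 share task t2, d_3 is used only by t4.
task_sets = [{"t1", "t2"}, {"t2", "t3"}, {"t4"}]
dm = build_dependency_matrix(task_sets)
i_cs = initial_storage({"c1": 100, "c2": 200}, 0.5)
```

The matrix is symmetric, and each diagonal entry is simply the number of tasks that use the corresponding dataset.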

Step 3: Partition and distribute datasets. In this step we distribute the datasets to data centres as the initial k partitions for the k-means clustering algorithm at the runtime stage. We denote the set of data centres as DC. As shown in Fig., we partition the clustered dependency matrix and place the corresponding datasets in different data centres. However, each dataset $d_i$ has a size $s_i$ and each data centre $c_j$ also has a storage capacity denoted as $cs_j$. To find the best partitioning of datasets matching the data centres' storage is an NP-hard problem, since it could be reduced to the Knapsack Packing Problem. Here, we develop a recursive binary partitioning algorithm to find an approximate best solution.

First, we partition CM into two parts $\{d_1, \ldots, d_p\}$ and $\{d_{p+1}, d_{p+2}, \ldots, d_n\}$ so as to maximise the following measurement:

$$PM = \sum_{i=1}^{p}\sum_{j=1}^{p} CM_{ij} \cdot \sum_{i=p+1}^{n}\sum_{j=p+1}^{n} CM_{ij} - \left(\sum_{i=1}^{p}\sum_{j=p+1}^{n} CM_{ij}\right)^{2}$$

This measurement, PM, means that datasets in each partition have higher dependencies with each other and lower dependencies with the datasets in the other partition. Based on this measure we can simply calculate all PMs for $p = 1, 2, \ldots, n-1$, and choose the p with the maximum PM value as the partition point. After one partition, the CM forms two new clustered matrices. We denote the top one as $CM_T$, which contains the dependencies of datasets $D_T = \{d_1, \ldots, d_p\}$, and the bottom one as $CM_B$, which contains the dependencies of datasets $D_B = \{d_{p+1}, d_{p+2}, \ldots, d_n\}$. Every clustered matrix represents a partition of datasets and we denote the total size of the datasets it contains as $s = \sum_{i=1}^{n} s_i$. Hence the s for $CM_T$ and $CM_B$ are $s_T = \sum_{i=1}^{p} s_i$ and $s_B = \sum_{i=p+1}^{n} s_i$ respectively.

Next, we distribute datasets to data centres by recursively partitioning the clustered dependency matrix. For each of the data centres, we introduce a percentage parameter $\lambda_{ini}$ to denote the initial usage of its storage capacity, which means that the initial size of datasets in data centre $c_i$ cannot exceed $cs_i \cdot \lambda_{ini}$.
The reason we cannot fill the data centres to their maximum storage is that in scientific workflows, the generated data can also be very large. We have to reserve sufficient space in data centres to store those data during the workflow execution. $\lambda_{ini}$ is an experience parameter. Its value should depend on what kinds of applications are running on the system, because the generated data of different applications might have different sizes. Furthermore, we also assume that the data centres can host all the application data in the system, i.e. $\sum_{i=1}^{n} s_i < \sum_{i=1}^{m} (cs_i \cdot \lambda_{ini})$.

To distribute the datasets, we have to examine whether there are fixed location datasets in the system (Line 5 in Fig. 4). If the system does not have fixed location datasets (Line 35 in Fig. 4), we recursively partition the sub-matrices $CM_T$ and $CM_B$ until the size of the sub-matrix can fit into one of the data centres' initial storage size limits ($s \le cs_i \cdot \lambda_{ini}$). Then we distribute the datasets in this sub-matrix into this data centre, and add the reference of this data centre ($c_i$) to K, where K is a set of data centres. When the partitioning of CM finishes, all the initial datasets have been moved to proper data centres. We take the data centres in K as the initial partitions of the k-means clustering algorithm.

If there are fixed location datasets in the system, the distribution process is more complicated. For a fixed location dataset $d_{fi}$, we denote it as $\langle T_i, s_i, c \rangle$, where the additional attribute c is the data centre where this dataset has to be stored. We use FD to denote the set of the fixed location datasets a data centre has. For a data centre that does not have fixed location datasets, FD is empty. The distribution is conducted in the following three sub-steps.

Sub-step 1 (Line 6-16 in Fig. 4): we classify fixed location datasets and flexible location datasets into different partitions. We also need to recursively partition the sub-matrices $CM_T$ and $CM_B$.
The stop condition is that the sub-matrix does not have fixed location datasets or all the fixed location datasets it has belong to one data centre. We add the partitions that do not have fixed location datasets to a set named NFP and the partitions that have fixed location datasets to a set named FP.

Sub-step 2 (Line 17-34 in Fig. 4): we distribute the partitions with fixed location datasets in FP. We need to check the data centres' information. For each data centre that has fixed location datasets, we pick out the partitions that contain these fixed location datasets from FP, denoted as P. Then, we calculate the total size of these partitions, denoted as ps, where $ps = \sum_{CM_i \in P} s_i$. If these partitions can fit into this data centre, we store them. If not, we repeatedly pick the largest partition from P, binary partition it and move the part that does not have fixed location datasets to NFP, until these partitions can fit into the data centre.
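The sub-steps rely on the same primitive: choosing the partition point p that maximises PM and then splitting the clustered matrix at p. The following Python sketch illustrates that primitive for the simpler case without fixed location datasets. It is a simplified illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a partition is tested for fit before it is split, and `free` is assumed to hold the initial available storage $cs_i \cdot \lambda_{ini}$ of each data centre.

```python
def partition_measure(cm, p):
    """PM for splitting cm into rows/columns [0, p) and [p, n)."""
    n = len(cm)
    top = sum(cm[i][j] for i in range(p) for j in range(p))
    bottom = sum(cm[i][j] for i in range(p, n) for j in range(p, n))
    cross = sum(cm[i][j] for i in range(p) for j in range(p, n))
    return top * bottom - cross ** 2


def best_partition_point(cm):
    """Try every p = 1 .. n-1 and keep the one with the largest PM."""
    return max(range(1, len(cm)), key=lambda p: partition_measure(cm, p))


def partition_and_distribute(indices, cm, sizes, free, placement):
    """Recursively split `indices` (dataset positions in CM order) until
    each part fits into the smallest data centre that can still hold it."""
    total = sum(sizes[i] for i in indices)
    fitting = [c for c, cap in free.items() if cap >= total]
    if fitting:
        target = min(fitting, key=lambda c: free[c])
        placement[target].extend(indices)
        free[target] -= total
        return
    if len(indices) == 1:
        raise ValueError("dataset does not fit into any data centre")
    sub = [[cm[i][j] for j in indices] for i in indices]
    p = best_partition_point(sub)
    partition_and_distribute(indices[:p], cm, sizes, free, placement)
    partition_and_distribute(indices[p:], cm, sizes, free, placement)


# A clustered matrix with two visible blocks: {d_1, d_2} and {d_3, d_4}.
cm = [[2, 2, 0, 0],
      [2, 2, 1, 0],
      [0, 1, 3, 3],
      [0, 0, 3, 3]]
sizes = [4, 3, 5, 2]
free = {"c1": 8, "c2": 8}
placement = {c: [] for c in free}
partition_and_distribute([0, 1, 2, 3], cm, sizes, free, placement)
```

In this example the four datasets do not fit into one data centre, so the matrix is split at p = 2 and each block is placed in its own data centre.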

Sub-step 3 (Line 36-51 in Fig. 4): we distribute the partitions in NFP, which only contain flexible location datasets. We start with the largest one and go through all the partitions in NFP by their size. For every partition, we distribute it to the data centres by recursive binary partitioning.

5.2. Runtime Stage Algorithm

At the runtime stage, we use the k-means clustering algorithm to dynamically cluster the generated data to one of the k data centres based on their dependencies. When new workflows are deployed to the system or some data centres become overloaded, we also have to adjust the data placement among data centres. The pseudocode of the runtime stage algorithm is shown in Fig. 5.

Some of the generated data could be valuable resources that can be utilised by other workflows, but most of them are temporal data. They are generated by the preceding tasks in the workflows and will be used by the subsequent tasks. They do not need permanent storage and will be deleted after the workflows have finished execution. In many scientific applications, the temporal data are in large volumes [9]. Some research demonstrates that timely removal of these temporal data can save a lot of runtime storage space [4]. In our work, we dynamically check and delete the obsolete temporal data before every round of task scheduling.

The runtime stage algorithm contains the following two steps.

Step 1: Data pre-allocation by the clustering algorithm. In this step, the first thing we have to do is task scheduling (Line 1-3 in Fig. 5). Scheduling is a very important issue in scientific workflow systems, especially for computation intensive and/or data intensive applications. Much research has been done into scheduling workflows [43] [5]. However, task scheduling is not the main focus of this paper. Therefore, our scheduling strategy is quite straightforward. We simply follow the philosophy that moving data to a data centre costs more than scheduling tasks to that centre, and schedule tasks based on the placement of datasets.
We periodically monitor the state of all the workflow tasks and dynamically schedule the ready tasks to the data centre which has the most datasets they require. Here, a task is ready if all the datasets it needs are existing data (i.e. have been generated).

When tasks have been executed, new datasets will be generated. The system will then decide where to put these datasets: either store them locally or allocate them to other data centres. In our work, the system will cluster the newly generated datasets to the data centre that has the highest dependency with them (Line 4- in Fig. 5). We define the dependency between dataset $d_i$ and data centre $c_j$ as $c\_dep_{ij}$, which is the sum of the dependencies of $d_i$ with all the datasets in $c_j$. Suppose $d_u$ is a newly generated dataset and $T_u$ is the set of tasks that will use $d_u$. First, we calculate the dependencies of $d_u$ with all other datasets in the system and add the new row and column to DM for $d_u$, where

$$DM_{ui} = DM_{iu} = dependency_{ui} = Count(T_u \cap T_i), \quad i = 1, 2, \ldots, n$$

Then we calculate the dependencies of $d_u$ with all the k data centres, where

$$c\_dep_{uj} = \sum_{d_m \in c_j} dependency_{um}, \quad j = 1, 2, \ldots, k$$

With these dependencies, we select the data centre $c_h$ that has the highest dependency with $d_u$, where

$$c\_dep_{uh} = \max_{j=1}^{k} (c\_dep_{uj})$$

$c_h$ is the data centre in which we will store the dataset $d_u$. We check the available storage of $c_h$ before we move $d_u$ to it. Here we introduce a maximum storage usage parameter $\lambda_{max}$ for data centres, which is a percentage threshold indicating whether a data centre is overloaded or not. $\lambda_{max}$ is also an experience parameter, just like the initial storage usage parameter $\lambda_{ini}$. Hence, the storage of a data centre $c_i$ that the runtime data can use is $cs_i \cdot (\lambda_{max} - \lambda_{ini})$. The value of $\lambda_{max}$ depends on the overall workload of the system. If the system workload is heavy, $\lambda_{max}$ has to be set to a larger value. Likewise, if the system workload is light, $\lambda_{max}$ is set smaller to prevent too many datasets gathering in one data centre.
We move the newly generated dataset $d_u$ to the selected data centre $c_h$ if $cs_h \cdot \lambda + s_u < cs_h \cdot \lambda_{max}$ is true, where $s_u$ is the size of $d_u$ and $\lambda$ is the current storage usage percentage of $c_h$. Otherwise, we go to the next step to adjust the data placement.
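The pre-allocation rule of Step 1 can be sketched in a few lines of Python. The data structures (a dictionary of task sets per dataset and a dictionary of datasets per data centre) and the function names are assumptions chosen for this illustration; ties between data centres are broken by dictionary order, which the text does not specify.

```python
def choose_data_centre(new_tasks, task_sets, placement):
    """Return (c_h, c_dep) for a newly generated dataset d_u.

    new_tasks: set of tasks T_u that will use d_u
    task_sets: dataset id -> set of tasks T_i that use it
    placement: data centre id -> list of dataset ids stored there
    """
    c_dep = {
        centre: sum(len(new_tasks & task_sets[d]) for d in datasets)
        for centre, datasets in placement.items()
    }
    best = max(c_dep, key=c_dep.get)
    return best, c_dep


def can_store(capacity, usage, size, lam_max):
    """Storage check: cs_h * lambda + s_u < cs_h * lambda_max."""
    return capacity * usage + size < capacity * lam_max


task_sets = {"d1": {"t1", "t2"}, "d2": {"t2"}, "d3": {"t3", "t4"}}
placement = {"c1": ["d1", "d2"], "c2": ["d3"]}
best, c_dep = choose_data_centre({"t2", "t3"}, task_sets, placement)
fits = can_store(capacity=100, usage=0.5, size=20, lam_max=0.8)
overloaded = not can_store(capacity=100, usage=0.7, size=20, lam_max=0.8)
```

Here the new dataset shares task t2 with both datasets in c1 and task t3 with the single dataset in c2, so c1 is selected. When the storage check fails, the adjustment step described next would be triggered instead.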

Figure 5. Runtime stage algorithm

Step 2: Adjust data placement among data centres. During workflow execution, there are two situations that trigger the need to adjust the data placement among data centres. The first is when the selected destination data centre $c_h$ for the newly generated dataset does not have enough available storage. This means that $c_h$ is overloaded. Hence, we have to adjust the dataset placement to balance the overall workload of the system. The second is when new workflows are deployed to the system. Together with the new workflows, new datasets and tasks will be added to the system. The dependencies of the original datasets will change, since the new tasks might use the existing data in the system. In this situation, we calculate the dependencies between the new datasets and the existing datasets, and add them to the dependency matrix DM. If there are any new tasks which use existing data, they will be added to the task set of the appropriate existing dataset. For every new dataset, we find an appropriate data centre for it by following the procedure in Step 1. If the selected data centre is overloaded, we have to adjust the dataset placement to balance the overall workload of the system.

To adjust the data placement, we need to run some functions from the build-time stage algorithm (Line 15-16 in Fig. 5). First, we do the BEA transformation to cluster the updated dependency matrix (DM) and get a new clustered dependency matrix (CM'). Next, we run the algorithm in Step 3 of the build-time stage, but without actual data distribution. We just calculate the new placement of datasets in the data centres and save the references in a new set of data centres, denoted as K'. Then we can do the adjustment by comparing the old data placement with the new one in K' (Line 17-24 in Fig. 5).
We start the adjustment from the data centre that has the highest storage load and go through all the data centres by storage usage in decreasing order. For every data centre, we compare the datasets it currently has with its new datasets in K'. Then we send the datasets that no longer belong to this data centre to the ones they now belong to, and retrieve the datasets it should have from other data centres.
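The comparison between the old and the new placement can be sketched as a function that produces an ordered list of moves. This is an illustrative simplification with hypothetical names: every move is recorded once as a send from its current data centre, so the corresponding retrievals are implied, and datasets within one data centre are ordered alphabetically only to make the output deterministic.

```python
def plan_adjustment(old, new, usage):
    """Return a list of (dataset, source, destination) moves.

    old, new: data centre id -> set of dataset ids (current and target)
    usage:    data centre id -> current storage usage percentage
    Data centres are processed from the highest storage usage down.
    """
    target = {d: centre for centre, datasets in new.items() for d in datasets}
    moves = []
    for src in sorted(old, key=lambda c: usage[c], reverse=True):
        for d in sorted(old[src]):
            dst = target.get(d, src)
            if dst != src:
                moves.append((d, src, dst))
    return moves


old = {"c1": {"d1", "d2", "d3"}, "c2": {"d4"}}
new = {"c1": {"d1", "d4"}, "c2": {"d2", "d3"}}
usage = {"c1": 0.9, "c2": 0.3}
moves = plan_adjustment(old, new, usage)
```

Because c1 is the most heavily loaded data centre, its outgoing datasets are scheduled first, which frees space there before d4 is brought in.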

Since $\lambda_{max}$ represents a percentage of a data centre's total storage space, each data centre will still have some storage available ($100\% - \lambda_{max}$) to facilitate data movement during this redistribution. In the case that $\lambda_{max}$ is set to 100%, additional temporary storage space may need to be acquired to serve as a buffer before the adjustment process can be completed. However, this situation rarely happens in the system, for the following reasons: 1) in the adjustment process we always select the data centre with the highest storage usage to adjust first, and send its datasets to other data centres first; 2) the total size of the datasets in the system is smaller than the total size of the available storage of all the data centres ($\sum_{i=1}^{n} s_i < \sum_{i=1}^{m} (cs_i \cdot \lambda_{ini})$), because we assume that the data centres can host all the application data in the system; and 3) for every data centre we reserve some storage for the runtime generated datasets ($cs \cdot (\lambda_{max} - \lambda_{ini})$), and this storage space is not always highly utilised, because we delete obsolete datasets dynamically. In our system, for every data centre, we reserve runtime storage for generated datasets as 4% of the initial storage for existing datasets, i.e. $(\lambda_{max} - \lambda_{ini}) / \lambda_{ini} = 4\%$. As discussed later in Section 6, we have run tens of thousands of workflow instances for simulation, and a situation where we lacked storage for data reallocation did not occur.

The data placement strategy in this section ensures that when a task is scheduled to one data centre during workflow execution, that data centre will have most of the input datasets for that task. Then, only a small number of datasets have to be retrieved from remote data centres. The simulations in the next section will show that our data placement strategy can greatly reduce the total data movement during workflow execution.

6. SIMULATION

6.1. Simulation Environment: SwinDeW-C

SwinDeW-C (Swinburne Decentralised Workflow for Cloud) [5] is developed based on SwinDeW [5] and SwinDeW-G [5].
It is currently running at Swinburne University of Technology on infrastructure composed of servers and high-end PCs. To simulate the cloud computing environment, we set up VMware [4] software on the physical servers and create virtual clusters as data centres. Fig. 6 shows our simulation environment.

Figure 6. Simulation environment of SwinDeW-C

Every data centre created is composed of 8 virtual computing nodes with storage, and we deploy an independent Hadoop file system on each data centre. SwinDeW-C runs on these virtual data centres, which can send and retrieve data to and from each other. Through a user interface at the applications layer, which is a Web based portal, we can deploy workflows and upload application data.

SwinDeW-C is designed for large scale cloud applications. It has a novel architecture for the cloud computing environment. However, the presentation of the comprehensive system design of SwinDeW-C is not the main focus of this paper. In Fig. 7, we only illustrate the key system components of SwinDeW-C that relate to the data placement strategy.

User Interface Module: The cloud computing platform is built on the Internet and a Web browser is normally the only software needed at the client side. This interface is a Web portal by which users can visit the system and deploy their applications. The Uploading Component is for users to upload application data and workflows, and the Monitoring Component is for users, as well as system administrators, to monitor workflow execution.

Data Management Module: The Data Placement Component is the core component of data management in SwinDeW-C and facilitates the algorithms in our data placement strategy. The Data Catalogue is used to store the information of applications and, in a service oriented cloud platform, is a registry for the data services. By using the catalogue, the system can locate the data needed. Other components in this module, such as the Data Replication Component, Data Synchronisation Component, Meta-data Repository and Provenance Data Collection, are also essential for cloud data management. Since they are not directly related to the data placement strategy, we do not give their details here.

Other Modules: The Flow Management Module has a Process Repository that stores all the workflow instances running in the system. The Task Management Module has a Scheduler that schedules ready tasks to data centres during the runtime stage of the workflows. Furthermore, the Resource Management Module keeps the information of the data centres' usage, and can trigger the adjustment process in the data placement strategy.
For other components in these modules, as well as other modules in SwinDeW-C, we do not give the details, as the work presented here only focuses on workflow data management.

Figure 7. Related key system components of SwinDeW-C

6.2. Simulation Strategies

The algorithms in our data placement strategy are for the build-time and runtime stages respectively. To evaluate their performance, we run each workflow instance through 4 simulation strategies:

Random: In this simulation, we randomly place the existing data during the build-time stage and store the generated data in the local data centre (i.e. where they were generated) at runtime. This simulation represents the traditional data placement strategies in old distributed computing systems (i.e. clusters and early grid systems). At that time, data were usually stored in the local node naturally or in the nodes that had available storage. The temporal intermediate data, i.e. generated data, were also naturally stored where they were generated, waiting for the tasks to retrieve them.

Build-time only: This simulation shows the performance of our build-time algorithm. It is used to place the existing data at build-time. During the runtime stage we store the generated data in the local data centre, as with the Random simulation. In a cloud computing system, data are more flexible than they were in the past; this allows the system to decide where to store them. Our build-time algorithm places the application data based on their dependencies. This simulation will show the data movement reduction in workflow execution achieved by using this algorithm.

Runtime only: This simulation shows the performance of the runtime algorithm by randomly placing the existing data at build-time and by pre-allocating the generated data with our runtime algorithm. This simulation represents the strategy that some popular grid scientific workflows use [3]. Their work shows that pre-allocating data to the computing node where the tasks will execute can reduce the total execution time of the workflow. However, this simulation will show that only pre-allocating data at the runtime stage cannot reduce the data movement in workflow execution.

Build & Run: This simulation shows the overall performance of our algorithms both at build-time and runtime. Our algorithms are specifically designed for scientific cloud workflows. The strategy is based on data dependency and can automatically place existing data and cluster generated data to the appropriate data centres. Comparisons with other strategies will be made from different aspects to show the performance of our algorithms.

The traditional way to evaluate the performance of a workflow system is to record and compare the execution time [3] [4]. However, in our work we count the total data movement instead. The execution time could be influenced by other factors besides data management, such as bandwidth, scheduling strategy and I/O speed. Our data placement strategy aims to reduce the data movement between data centres on the Internet. So we directly take the number of datasets that are actually moved during the workflow execution as the measurement to evaluate the performance of the algorithms. In a cloud computing environment with limited bandwidth based on the Internet, if the total data movement has been reduced, the execution time will be reduced correspondingly.
Furthermore, the cost of data transfer will also decrease.

To make the evaluation as objective as possible, we generate test workflows randomly to run on SwinDeW-C. This makes the evaluation results independent of any specific applications. As we need to run the build-time and runtime algorithms separately, we set the number of existing datasets and generated datasets to be the same for every test workflow. That means that we have the same number of existing datasets and tasks for every test workflow, and we assume that each task will only generate one dataset. We can control the complexity of the test workflow by changing the number of datasets. Every dataset will be used by a random number of tasks, and tasks that use generated datasets must be executed after the task that generates their input. We can control the complexity of the relationships between the datasets and tasks by changing the range of this random number. Another factor that would have an impact on the algorithms is the number of fixed location datasets. We can randomly choose some percentage of datasets from the existing data and randomly select some data centres for them. We will run new simulations to show the impact on performance.

Here we have only included graphs of the simulation results. The detailed configuration and result reports of the simulations, as well as the source code can all be found at

6.3. Simulation Results

Fig. 8 shows the data movement when we run workflows with different complexity on different numbers of data centres. We can see the increases in data movement as the workflows become more complex and the number of data centres increases. All the values in the figure are the average of running test workflows with the same parameters. In Fig. 8 (a), we ran the test workflows with different complexity on 5 data centres. We used 4 types of test workflows with different numbers of datasets. In Fig. 8 (b), we fixed the test workflows' dataset count to 5, and ran them on different numbers of data centres.
Then we changed % of the input datasets to fixed location datasets and ran the same simulation again. The results are shown in Fig. 9. From the results, we can draw the conclusions that 1) the build-time algorithm can effectively reduce the total data movement of the workflow execution; 2) the runtime algorithm does not reduce the total data movement, and even causes more data movement if the existing datasets are placed randomly; and 3) with fixed location datasets added to the system, our algorithms can still work very well, with performance only degrading slightly.

The runtime algorithm does not decrease the data movement because it pre-allocates datasets before scheduling tasks based on their data dependencies. If the existing datasets are randomly placed, the differing dependencies of the data centres are not obvious. The increase in data movement is caused by pre-allocation of datasets to the wrong data centres. However, if the existing datasets were clustered by the build-time algorithm, the performance of the runtime algorithm would be better.
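The random test workflows used in these simulations can be pictured with the following Python sketch. It follows the stated rules (equal numbers of existing datasets and tasks, one generated dataset per task, a random number of consuming tasks per dataset, and consumers of a generated dataset running after its producer), but the function name, the dataset naming scheme, and the assumption that tasks execute in index order are ours, not details of the actual test generator.

```python
import random


def generate_test_workflow(n, min_use, max_use, seed=None):
    """Return a dict: dataset id -> set of task indices that use it.

    n existing datasets "e0".."e{n-1}" and n tasks 0..n-1 are created;
    task i generates dataset "g{i}". Tasks are assumed to run in index
    order, so a generated dataset may only be used by later tasks.
    """
    rng = random.Random(seed)
    uses = {}
    for i in range(n):
        k = min(rng.randint(min_use, max_use), n)
        uses[f"e{i}"] = set(rng.sample(range(n), k))
    for i in range(n):
        later = range(i + 1, n)
        k = min(rng.randint(min_use, max_use), len(later))
        uses[f"g{i}"] = set(rng.sample(later, k))
    return uses


workflow = generate_test_workflow(n=8, min_use=1, max_use=3, seed=0)
```

Raising `n` makes the workflow larger, while widening the range between `min_use` and `max_use` makes the relationships between datasets and tasks more complex, which mirrors the two controls described in the text.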

Figure 8. Data movements without runtime storage limit and without fixed location datasets (series: Random, BuildtimeOnly, RuntimeOnly, Build&Run; x-axes: Data Sets, Data Centres)

Figure 9. Data movements without runtime storage limit and with % of fixed location datasets (series: Random, BuildtimeOnly, RuntimeOnly, Build&Run; x-axes: Data Sets, Data Centres)

However, in the simulation described above, we did not limit the amount of storage that the data centres had available during runtime. The reason for this is that we wanted to see how the tasks and datasets were distributed, which indicates the workload balance among data centres. During the execution of every test workflow instance, we recorded the number of datasets that were moved to each data centre, as well as the tasks that were scheduled to that data centre. We also calculated the standard deviation of the data centres' usage. Fig. 10 shows the average standard deviation of running test workflows on 5 data centres each having 8 existing datasets and 8 tasks, both with and without fixed location datasets.

From Fig. 10 we can see relatively high deviations in the data centres' usage in the two simulations without the runtime algorithm. This means that tasks and datasets are allocated to one data centre more frequently. This leads to a data centre becoming a super node that has a high workload. By contrast, in the other two simulations that use the runtime algorithm to pre-allocate the generated data to other data centres, the deviation of data centre usage is low. This demonstrates that the runtime algorithm can make a more balanced distribution of the workload among data centres.

In a cloud computing environment, data centres normally have limited storage, especially in some storage constrained systems. When one data centre is overloaded, we need to reallocate the data to other data centres. The reallocation will not only cause extra data movement, but will also delay the execution of the workflow. To count the reallocated datasets, we ran the same test workflows as in Fig. 10 with a storage limit in every data centre.
We limited the runtime storage for generated datasets to 4% of the initial storage for existing datasets, i.e. $(\lambda_{max} - \lambda_{ini}) / \lambda_{ini} = 4\%$. In Fig. 11 we show the average data movement including the data reallocation.

Figure 10. Standard deviation of workload among data centres (panels: Datasets Movement, Tasks Scheduling; series: Random, BuildtimeOnly, RuntimeOnly, Build&Run)

Figure 11. Proportions of 3 types of data movements: Data Retrieved, Data Sent, Data Reallocated. (a) Without fixed location data; (b) With % fixed location data

From Fig. 11, we can see that a lot of data is reallocated in the simulations without the runtime algorithm. The least data reallocation occurred when we only used the runtime algorithm. However, the least data movement in total occurred when using the build-time and runtime algorithms together. In Fig. 11 (a), using both algorithms caused movements of datasets on average. Comparing this to the Random simulation, datasets movements on average, our algorithms reduced the data movement by 5.8%. On the other hand, the build-time algorithm and runtime algorithm caused movement of 7.6 and datasets on average. Compared to the Random situation, they reduced the data movements by 4.8% and 4.% respectively. In Fig. 11 (b), with % fixed location datasets in the system, our algorithms (Build&Run) can reduce the data movement by 47.4% compared to the Random simulation.

To better evaluate the performance of our algorithms, we gave every data centre a runtime storage limit and ran the same simulation workflows as in Fig. 8. The final results of data movement are shown in Fig. 12. From Fig. 12 we can see that as the number of data centres and datasets increases, the performance of the build-time algorithm decreases. This is because without the runtime algorithm the datasets and tasks gather on one data centre. This triggers the adjustment process more frequently, which costs extra data movements.

Furthermore, we ran the same simulation as in Fig. 12 under the condition that the system has fixed location datasets. Fig. 13 shows the data movements when we set the percentage of fixed location datasets to %. We can see our algorithms can still reduce the data movements significantly. Furthermore, with higher percentages of fixed location datasets in the system, our algorithms still work, and we will demonstrate this in the next simulation.

Figure 12. Data movements with runtime storage limit. (a) by Data Sets; (b) by Data Centres (series: Random, BuildtimeOnly, RuntimeOnly, Build&Run)

Figure 13. Data movements with runtime storage limit and with % fixed location datasets (series: Random, BuildtimeOnly, RuntimeOnly, Build&Run; x-axes: Data Sets, Data Centres)

Fig. 13 has consistent results with Fig. 9, in that the fixed location datasets have a negative impact on the algorithms' performance. In the algorithms, we try to place the datasets on data centres based on dependencies; however, the fixed location datasets have to be stored in particular data centres. This will decrease performance, as fixed location datasets prevent the algorithms from placing datasets with their dependencies. However, given the existence of fixed location datasets, our algorithms can still reduce data movement by placing the flexible location datasets with their dependencies.

To demonstrate the impact of fixed location datasets on the algorithms, we conducted another batch of simulations. We ran test workflows on 5 data centres each having 8 existing datasets and 8 tasks, but with different percentages of fixed location datasets. As the number of fixed location datasets increases, we can see their impact on data movement in Fig. 14.

From Fig. 14 (a) we can see that as the percentage of fixed location datasets goes up, the data movements of the Build-time only and Build & Run simulations go up accordingly; however, the Random and Runtime only simulations stay steady. This means the fixed location datasets primarily have an impact on the build-time algorithm. This is because all the fixed location datasets are existing data, which are placed by the build-time stage algorithm. When the percentage reaches 60%, the data movement of the Build & Run simulation even exceeds that of the Random simulation. This is because the pre-allocation of datasets in the runtime algorithm causes more data movements as the build-time algorithm gets worse. In Fig. 14 (b) it may seem slightly confusing that the data movements of all simulations go up and then drop as the percentage of fixed location datasets goes up. This is because when we set the runtime storage limit, many data movements are caused by data reallocation. However, the fixed location datasets are not involved in the overload adjustment process. Hence, the data movement decreases. In this figure we can also see that the fixed location datasets may have a negative impact on the build-time algorithm.

Figure 14. Data movements with different percentage of fixed location datasets (series: Random, BuildtimeOnly, RuntimeOnly, Build&Run; x-axis: Percentage of Fixed Datasets)

7. CONCLUSIONS AND FUTURE WORK

In this paper, we examined the unique features of scientific cloud workflows and proposed a clustering data placement strategy that can automatically allocate application data among data centres based on dependencies. Simulations in our cloud workflow system SwinDeW-C indicate that our data placement strategy can effectively reduce data movement during workflow execution. The build-time algorithm reduces the amount of data retrieved, and the runtime algorithm guarantees a balanced distribution of data and can reduce data movement incurred by data reallocation, even when fixed location data exist in the system.

In our current work, to guarantee the data reliability, we use Hadoop's replication mechanism within a data centre, and among data centres we did not use any replication strategies. The data used in scientific workflow applications are usually very large and as such it is not efficient to replicate all the application data in the system. However, replication of frequently used data could also reduce data movement. In future work, we will develop some efficient replication strategies for the data placement algorithm, which could balance the data movement and storage usage. Furthermore, in our current simulation we measure the reduction of dataset movements to evaluate our strategy. In the future, we will meter the execution time of the workflow as well, which can better demonstrate the effectiveness of our strategy.
To be more comprehensive, we will also incorporate the size of datasets to calculate the data dependency, and adapt some popular cloud service providers' pricing models to our simulation, which will show the cost effectiveness of our strategy.

ACKNOWLEDGMENT

The research work reported in this paper is partly supported by the Australian Research Council under Linkage Project LP. We are grateful to Bryce Gibson and Michael Jensen for the accomplishment of the simulation work, as well as the careful English proofreading.

REFERENCES

[1] "Amazon Elastic Computing Cloud", accessed on 5 November 2009.
[2] "ATNF Parkes Swinburne Recorder", accessed on 5 November 2009.
[3] "Hadoop", accessed on 5 November 2009.
[4] "VMware", accessed on 5 November 2009.
[5] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "Above the Clouds: A Berkeley View of Cloud Computing," University of California at Berkeley, Technical Report UCB/EECS-2009-28, accessed on 5 November 2009.

19 [6] R. Barga an D. Gannon, "Scientific versus Business Workflows," in Workflows for e-science, pp. 9-6, 7. [7] C. Baru, R. Moore, A. Rajasekar, an M. Wan, "The SDSC Storage Resource Broker," in IBM Centre for Avance Stuies Conference, Toronto, Canaa pp. -, 998. [8] M. Brantner, D. Florescuy, D. Graf, D. Kossmann, an T. Kraska, "Builing a Database on S3," in SIGMOD, Vancouver, BC, Canaa, pp. 5-63, 8. [9] R. Buyya an S. Venugopal, "The Gribus Toolkit for Service Oriente Gri an Utility Computing: An Overview an Status Report," in IEEE International Workshop on Gri Economics an Business Moels, Seoul, pp. 9-66, 4. [] R. Buyya, C. S. Yeo, an S. Venugopal, "Market-Oriente Clou Computing: Vision, Hype, an Reality for Delivering IT Services as Computing Utilities," in th IEEE International Conference on High Performance Computing an Communications (HPCC-8), Los Alamitos, CA, USA, 8. [] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, an I. Branic, "Clou computing an emerging IT platforms: Vision, hype, an reality for elivering computing as the 5th utility," Future Generation Computer Systems, vol. in press, pp. -8, 9. [] A. Chervenak, E. Deelman, I. Foster, L. Guy, W. Hoschek, A. Iamnitchi, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwartzkopf, H. Stockinger, K. Stockinger, an B. Tierney, "Giggle: A Framework for Constructing Scalable Replica Location Services," in ACM/IEEE conference on Supercomputing, Baltimore, Marylan, pp. -7,. [3] A. Chervenak, E. Deelman, M. Livny, M.-H. Su, R. Schuler, S. Bharathi, G. Mehta, an K. Vahi, "Data Placement for Scientific Applications in Distribute Environments," in 8th Gri Computing Conference, pp , 7. [4] D. Churches, G. Gombas, A. Harrison, J. Maassen, C. Robinson, M. Shiels, I. Taylor, an I. Wang, "Programming scientific an istribute workflow with Triana services," Concurrency an Computation: Practice an Experience, vol. 8, pp. -37, 6. [5] J. M. Cope, N. Trebon, H. M. Tufo, an P. 
Beckman, "Robust ata placement in urgent computing environments," in IEEE International Symposium on Parallel & Distribute Processing, IPDPS 9, pp. - 3, 9. [6] J. Dean an S. Ghemawat, "MapReuce: simplifie ata processing on large clusters," Commun. ACM, vol. 5, pp. 7-3, 8. [7] E. Deelman, J. Blythe, Y. Gil, C. Kesselman, G. Mehta, S. Patil, M.-H. Su, K. Vahi, an M. Livny, "Pegasus: Mapping Scientific Workflows onto the Gri," in European Across Gris Conference, pp. -, 4. [8] E. Deelman an A. Chervenak, "Data Management Challenges of Data-Intensive Scientific Workflows," in IEEE International Symposium on Cluster Computing an the Gri, pp , 8. [9] E. Deelman, D. Gannon, M. Shiels, an I. Taylor, "Workflows an e-science: An overview of workflow system features an capabilities," Future Generation Computer Systems, vol. In Press, Correcte Proof. [] E. Deelman, G. Singh, M. Livny, B. Berriman, an J. Goo, "The Cost of Doing Science on the Clou: the Montage example," in ACM/IEEE Conference on Supercomputing, Austin, Texas, pp. -, 8. [] S. Doraimani an A. Iamnitchi, "File grouping for scientific ata management: lessons from experimenting with real traces," in Proceeings of the 7th international symposium on High performance istribute computing Boston, MA, USA: ACM, 8, pp [] G. Feak, H. He, an F. Cappello, "BitDew: a programmable environment for large-scale ata management an istribution," in Proceeings of the 8 ACM/IEEE conference on Supercomputing, Austin, Texas, pp. -, 8. [3] I. Foster, Z. Yong, I. Raicu, an S. Lu, "Clou Computing an Gri Computing 36-Degree Compare," in Gri Computing Environments Workshop, GCE '8, pp. -, 8. [4] S. Ghemawat, H. Gobioff, an S.-T. Leung, "The Google file system," SIGOPS Oper. Syst. Rev., vol. 37, pp. 9-43, 3. [5] R. Grossman an Y. Gu, "Data Mining Using High Performance Data Clous: Experimental Stuies Using Sector an Sphere," in SIGKDD, pp. 9-97, 8. [6] R. Grossman, Y. Gu, M. Sabala, an W. 
Zhang, "Compute an storage clous using wie area high performance networks," Future Generation Computer Systems, pp , 8. [7] S. Guha, A. Meyerson, N. Mishra, R. Motwani, an L. O'Callaghan, "Clustering ata streams: Theory an practice," IEEE Transactions on Knowlege an Data Engineering, vol. 5, pp , 3. [8] N. Haravellas, M. Ferman, B. Falsafi, an A. Ailamaki, "Reactive NUCA: near-optimal block placement an replication in istribute caches," in Proceeings of the 36th annual International Symposium on Computer Architecture, ISCA '9, Austin, TX, USA, pp , 9. [9] C. Hoffa, G. Mehta, T. Freeman, E. Deelman, K. Keahey, B. Berriman, an J. Goo, "On the Use of Clou Computing for Scientific Workflows," in 4th IEEE International Conference on e-science, pp , 8. [3] A. K. Jain, M. N. Murty, an P. J. Flynn, "Data clustering: a review," ACM Comput. Surv., vol. 3, pp , 999. [3] K. Keahey, R. Figueireo, J. Fortes, T. Freeman, an M. Tsugawa, "Science Clous: Early Experiences in Clou Computing for Scientific Applications," in First Workshop on Clou Computing an its Applications (CCA'8), pp. -6, 8.

20 [3] T. Kosar an M. Livny, "A framework for reliable an efficient ata placement in istribute computing systems," Journal of Parallel an Distribute Computing, vol. 65, pp , 5. [33] T. Kosar an M. Livny, "Stork: making ata placement a first class citizen in the gri," in Proceeings of 4th International Conference on Distribute Computing Systems, ICDCS 4, pp , 4. [34] H. Liu an D. Orban, "GriBatch: Clou Computing for Large-Scale Data-Intensive Batch Applications," in Eighth IEEE International Symposium on Cluster Computing an the Gri, pp , 8. [35] B. Luascher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, an E. A. Lee, "Scientific workflow management an the Kepler system," Concurrency an Computation: Practice an Experience, pp , 5. [36] A. Matsunaga, M. Tsugawa, an J. Fortes, "ClouBLAST: Combining MapReuce an Virtualization on Distribute Resources for Bioinformatics Applications," in 4th IEEE International Conference on e-science, pp. -9, 8. [37] W. T. McCormick, P. J. Sehweitzer, an T. W. White, "Problem Decomposition an Data Reorganization by a Clustering Technique," Operations Research, vol., pp , 97. [38] C. Moretti, J. Bulosan, D. Thain, an P. J. Flynn, "All-Pairs: An Abstraction for Data-Intensive Clou Computing," in IEEE International Parallel & Distribute Processing Symposium, IPDPS'8, pp. -, 8. [39] T. Oinn, M. Ais, J. Ferris, D. Marvin, M. Senger, M. Greenwoo, T. Carver, K. Glover, M. R. Pocock, A. Wipat, an P. Li, "Taverna: A tool for the composition an enactment of bioinformatics workflows," Bioinformatics, vol., pp , 4. [4] M. T. Ozsu an P. Valuriez, Principles of istribute atabase systems: Prentice-Hall, Inc. Upper Sale River, NJ, USA, 99. [4] R. Proan an T. Fahringer, "Overhea Analysis of Scientific Workflows in Gri Environments," IEEE Transactions on Parallel an Distribute Systems, vol. 9, pp , 8. [4] G. Singh, K. Vahi, A. Ramakrishnan, G. Mehta, E. Deelman, H. Zhao, R. Sakellariou, K. Blackburn, D. Brown, S. Fairhurst, D. Meyers, G. B. 
Berriman, J. Goo, an D. S. Katz, "Optimizing Workflow Data Footprint," Scientific Programming, vol. 5, pp , 7. [43] S. Venugopal an R. Buyya, "An SCP-base heuristic approach for scheuling istribute ata-intensive applications on global gris," J. Parallel Distrib. Comput., vol. 68, pp , 8. [44] S. Venugopal, R. Buyya, an K. Ramamohanarao, "A Taxonomy of Data Gris for Distribute Data Sharing, Management, an Processing," ACM Comput. Surv., vol. 38, pp. -53, 6. [45] S. Venugopal, R. Buyya, an L. Winton, "A Gri Service Broker for Scheuling Distribute Data-Oriente Applications on Global Gris," in n Workshop on Mileware in Gri Computing, Toronto, Canaa, pp. 75-8, 4. [46] L. Wang, J. Tao, M. Kunze, A. C. Castellanos, D. Kramer, an W. Karl, "Scientific Clou Computing: Early Definition an Experience," in th IEEE International Conference on High Performance Computing an Communications, HPCC '8., pp , 8. [47] A. Weiss, "Computing in the Clou," ACM Networker, vol., pp. 8-5, 7. [48] M. Wieczorek, R. Proan, an T. Fahringer, "Scheuling of Scientific Workflows in the ASKALON Gri Environment," SIGMOD Recor, vol. 34, pp. 56-6, 5. [49] T. Xie, "SEA: A Striping-Base Energy-Aware Strategy for Data Placement in RAID-Structure Storage Systems," IEEE Transactions on Computers, vol. 57, pp , 8. [5] J. Yan, Y. Yang, an G. K. Raikunalia, "SwinDeW - A PP-Base Decentralize Workflow Management System," IEEE Transactions on Systems, Man an Cybernetics, Part A, vol. 36, pp , 6. [5] Y. Yang, K. Liu, J. Chen, J. Lignier, an H. Jin, "Peer-to-Peer Base Gri Workflow Runtime Environment of SwinDeW-G," in IEEE International Conference on e-science an Gri Computing, pp. 5-58, 7. [5] Y. Yang, K. Liu, J. Chen, X. Liu, D. Yuan, an H. Jin, "An Algorithm in SwinDeW-C for Scheuling Transaction-Intensive Cost-Constraine Clou Workflows," in 4th IEEE International Conference on e- Science, pp , 8.


More information

Sensor Network Localization from Local Connectivity : Performance Analysis for the MDS-MAP Algorithm

Sensor Network Localization from Local Connectivity : Performance Analysis for the MDS-MAP Algorithm Sensor Network Localization from Local Connectivity : Performance Analysis for the MDS-MAP Algorithm Sewoong Oh an Anrea Montanari Electrical Engineering an Statistics Department Stanfor University, Stanfor,

More information

Energy Cost Optimization for Geographically Distributed Heterogeneous Data Centers

Energy Cost Optimization for Geographically Distributed Heterogeneous Data Centers Energy Cost Optimization for Geographically Distribute Heterogeneous Data Centers Eric Jonari, Mark A. Oxley, Sueep Pasricha, Anthony A. Maciejewski, Howar Jay Siegel Abstract The proliferation of istribute

More information

Search Advertising Based Promotion Strategies for Online Retailers

Search Advertising Based Promotion Strategies for Online Retailers Search Avertising Base Promotion Strategies for Online Retailers Amit Mehra The Inian School of Business yeraba, Inia Amit Mehra@isb.eu ABSTRACT Web site aresses of small on line retailers are often unknown

More information

Low-Complexity and Distributed Energy Minimization in Multi-hop Wireless Networks

Low-Complexity and Distributed Energy Minimization in Multi-hop Wireless Networks Low-Complexity an Distribute Energy inimization in ulti-hop Wireless Networks Longbi Lin, Xiaojun Lin, an Ness B. Shroff Center for Wireless Systems an Applications (CWSA) School of Electrical an Computer

More information

Chapter 9 AIRPORT SYSTEM PLANNING

Chapter 9 AIRPORT SYSTEM PLANNING Chapter 9 AIRPORT SYSTEM PLANNING. Photo creit Dorn McGrath, Jr Contents Page The Planning Process................................................... 189 Airport Master Planning..............................................

More information

Calibration of the broad band UV Radiometer

Calibration of the broad band UV Radiometer Calibration of the broa ban UV Raiometer Marian Morys an Daniel Berger Solar Light Co., Philaelphia, PA 19126 ABSTRACT Mounting concern about the ozone layer epletion an the potential ultraviolet exposure

More information

MSc. Econ: MATHEMATICAL STATISTICS, 1995 MAXIMUM-LIKELIHOOD ESTIMATION

MSc. Econ: MATHEMATICAL STATISTICS, 1995 MAXIMUM-LIKELIHOOD ESTIMATION MAXIMUM-LIKELIHOOD ESTIMATION The General Theory of M-L Estimation In orer to erive an M-L estimator, we are boun to make an assumption about the functional form of the istribution which generates the

More information

SOFTWARE AND HARDWARE SOUND ANALYSIS TOOLS FOR FIELD WORK.

SOFTWARE AND HARDWARE SOUND ANALYSIS TOOLS FOR FIELD WORK. SOFTWARE AND HARDWARE SOUND ANALYSIS TOOLS FOR FIELD WORK. Pavan G. '' ^ Manghi M. \ Fossati C.'' ^ Centra Interisciplinare i Bioacustica e Ricerche Ambientali, Universita i Pavia, Via Taramelli 24, 27100

More information

A New Pricing Model for Competitive Telecommunications Services Using Congestion Discounts

A New Pricing Model for Competitive Telecommunications Services Using Congestion Discounts A New Pricing Moel for Competitive Telecommunications Services Using Congestion Discounts N. Keon an G. Ananalingam Department of Systems Engineering University of Pennsylvania Philaelphia, PA 19104-6315

More information

Digital barrier option contract with exponential random time

Digital barrier option contract with exponential random time IMA Journal of Applie Mathematics Avance Access publishe June 9, IMA Journal of Applie Mathematics ) Page of 9 oi:.93/imamat/hxs3 Digital barrier option contract with exponential ranom time Doobae Jun

More information

Owner s Manual. TP--WEM01 Performance Series AC/HP Wi-- Fi Thermostat Carrier Côr Thermostat TABLE OF CONTENTS

Owner s Manual. TP--WEM01 Performance Series AC/HP Wi-- Fi Thermostat Carrier Côr Thermostat TABLE OF CONTENTS TP--WEM01 Performance Series AC/HP Wi-- Fi Thermostat Carrier Côr Thermostat Fig. 1 - Carrier Côrt Thermostat TABLE OF CONTENTS Owner s Manual A14493 PAGE OVERVIEW... 2 Your Carrier Côrt Thermostat...

More information

Sage Match Terms and Conditions of Use (Last updated: 9 November 2015)

Sage Match Terms and Conditions of Use (Last updated: 9 November 2015) 1. Acknowlegement an Acceptance 1.1. This Agreement is between: (1) you, the person or organisation registere to use or using the Sage accountancy network service known as Sage Match ; an (2) us, as follows:

More information

Optimal Control Of Production Inventory Systems With Deteriorating Items And Dynamic Costs

Optimal Control Of Production Inventory Systems With Deteriorating Items And Dynamic Costs Applie Mathematics E-Notes, 8(2008), 194-202 c ISSN 1607-2510 Available free at mirror sites of http://www.math.nthu.eu.tw/ amen/ Optimal Control Of Prouction Inventory Systems With Deteriorating Items

More information

Different approaches for the equalization of automotive sound systems

Different approaches for the equalization of automotive sound systems Auio Engineering Society Convention Paper Presente at the 112th Convention 2002 May 10 13 Munich, Germany This convention paper has been reprouce from the author's avance manuscript, without eiting, corrections,

More information

A NATIONAL MEASUREMENT GOOD PRACTICE GUIDE. No.107. Guide to the calibration and testing of torque transducers

A NATIONAL MEASUREMENT GOOD PRACTICE GUIDE. No.107. Guide to the calibration and testing of torque transducers A NATIONAL MEASUREMENT GOOD PRACTICE GUIDE No.107 Guie to the calibration an testing of torque transucers Goo Practice Guie 107 Measurement Goo Practice Guie No.107 Guie to the calibration an testing of

More information

DIFFRACTION AND INTERFERENCE

DIFFRACTION AND INTERFERENCE DIFFRACTION AND INTERFERENCE In this experiment you will emonstrate the wave nature of light by investigating how it bens aroun eges an how it interferes constructively an estructively. You will observe

More information

SEC Issues Proposed Guidance to Fund Boards Relating to Best Execution and Soft Dollars

SEC Issues Proposed Guidance to Fund Boards Relating to Best Execution and Soft Dollars September 2008 / Issue 21 A legal upate from Dechert s Financial Services Group SEC Issues Propose Guiance to Fun Boars Relating to Best Execution an Soft Dollars The Securities an Exchange Commission

More information

hurni@ieee.org 1. INTRODUCTION ABSTRACT

hurni@ieee.org 1. INTRODUCTION ABSTRACT Deployment Issues of a VoIP Conferencing System in a Virtual Conferencing Environment R. Venkatesha Prasa i Richar Hurni ii H.S. Jamaagni iii H.N. Shankar iv i, iii {vprasa, hsjam}@cet.iisc.ernet.in i,

More information

Malawi Television White Spaces (TVWS) Pilot Network Performance Analysis

Malawi Television White Spaces (TVWS) Pilot Network Performance Analysis Journal of Wireless Networking an Communications 2014, 4(1): 26-32 DOI: 10.5923/j.jwnc.20140401.04 Malawi Television White Spaces (TVWS) Pilot Network Performance Analysis C. Mikeka 1,*, M. Thoi 1, J.

More information

Sustainability Through the Market: Making Markets Work for Everyone q

Sustainability Through the Market: Making Markets Work for Everyone q www.corporate-env-strategy.com Sustainability an the Market Sustainability Through the Market: Making Markets Work for Everyone q Peter White Sustainable evelopment is about ensuring a better quality of

More information

An Introduction to Event-triggered and Self-triggered Control

An Introduction to Event-triggered and Self-triggered Control An Introuction to Event-triggere an Self-triggere Control W.P.M.H. Heemels K.H. Johansson P. Tabuaa Abstract Recent evelopments in computer an communication technologies have le to a new type of large-scale

More information

USING SIMPLIFIED DISCRETE-EVENT SIMULATION MODELS FOR HEALTH CARE APPLICATIONS

USING SIMPLIFIED DISCRETE-EVENT SIMULATION MODELS FOR HEALTH CARE APPLICATIONS Proceeings of the 2011 Winter Simulation Conference S. Jain, R.R. Creasey, J. Himmelspach, K.P. White, an M. Fu, es. USING SIMPLIFIED DISCRETE-EVENT SIMULATION MODELS FOR HEALTH CARE APPLICATIONS Anthony

More information

Web Appendices to Selling to Overcon dent Consumers

Web Appendices to Selling to Overcon dent Consumers Web Appenices to Selling to Overcon ent Consumers Michael D. Grubb MIT Sloan School of Management Cambrige, MA 02142 mgrubbmit.eu www.mit.eu/~mgrubb May 2, 2008 B Option Pricing Intuition This appenix

More information

An Alternative Approach of Operating a Passive RFID Device Embedded on Metallic Implants

An Alternative Approach of Operating a Passive RFID Device Embedded on Metallic Implants An Alternative Approach of Operating a Passive RFID Device Embee on Metallic Implants Xiaoyu Liu, Ravi Yalamanchili, Ajay Ogirala an Marlin Mickle RFID Center of Excellence, Department of Electrical an

More information

Predicting Television Ratings and Its Application to Taiwan Cable TV Channels

Predicting Television Ratings and Its Application to Taiwan Cable TV Channels 2n International Symposium on Computer, Communication, Control an Automation (3CA 2013) Preicting Television Ratings an Its Application to Taiwan Cable TV Channels Hui-Ling Huang Department of Biological

More information

Reading: Ryden chs. 3 & 4, Shu chs. 15 & 16. For the enthusiasts, Shu chs. 13 & 14.

Reading: Ryden chs. 3 & 4, Shu chs. 15 & 16. For the enthusiasts, Shu chs. 13 & 14. 7 Shocks Reaing: Ryen chs 3 & 4, Shu chs 5 & 6 For the enthusiasts, Shu chs 3 & 4 A goo article for further reaing: Shull & Draine, The physics of interstellar shock waves, in Interstellar processes; Proceeings

More information

Lecture L25-3D Rigid Body Kinematics

Lecture L25-3D Rigid Body Kinematics J. Peraire, S. Winall 16.07 Dynamics Fall 2008 Version 2.0 Lecture L25-3D Rigi Boy Kinematics In this lecture, we consier the motion of a 3D rigi boy. We shall see that in the general three-imensional

More information

A Generalization of Sauer s Lemma to Classes of Large-Margin Functions

A Generalization of Sauer s Lemma to Classes of Large-Margin Functions A Generalization of Sauer s Lemma to Classes of Large-Margin Functions Joel Ratsaby University College Lonon Gower Street, Lonon WC1E 6BT, Unite Kingom J.Ratsaby@cs.ucl.ac.uk, WWW home page: http://www.cs.ucl.ac.uk/staff/j.ratsaby/

More information

Rural Development Tools: What Are They and Where Do You Use Them?

Rural Development Tools: What Are They and Where Do You Use Them? Faculty Paper Series Faculty Paper 00-09 June, 2000 Rural Development Tools: What Are They an Where Do You Use Them? By Dennis U. Fisher Professor an Extension Economist -fisher@tamu.eu Juith I. Stallmann

More information

MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT

MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT 1 SARIKA K B, 2 S SUBASREE 1 Department of Computer Science, Nehru College of Engineering and Research Centre, Thrissur, Kerala 2 Professor and Head,

More information