Using Decay Mechanism to Improve Regression Test Selection Techniques in Continuous Integration Development Environment
Using Decay Mechanism to Improve Regression Test Selection Techniques in Continuous Integration Development Environment

Jingjing Liang
Dept. of Computer Science & Engineering
University of Nebraska-Lincoln
Lincoln, NE

ABSTRACT
In a continuous integration development environment, changed code is committed and merged at frequent intervals. This approach reduces repetitive manual work and speeds up overall development time. To make sure that those changes will not break the integration build, it is necessary to test the changed code prior to submission and detect as many faults as possible. In this work, focusing on the pre-submit testing phase, we present a new regression test selection (RTS) technique that adopts a decay mechanism to select a subset of test suites for execution. Under the assumption that recent changes in a data stream provide more valuable information for the analysis of newly changed code, while the effect of old test suite results on the current testing outcome diminishes, decay-based RTS relies more heavily on the recent history of test suite records. To evaluate the performance of the technique, we conducted an empirical study on the Google Shared Dataset of Test Suite Results (GSDTSR) to simulate an industrial testing process. The results show that our new technique can detect a high percentage of failures while executing a very low percentage of test suites.

General Terms
Reliability, Experimentation

Keywords
Recent frequent failed tests, Regression test selection, Decay mechanism, Continuous integration development environment, Google Shared Dataset

1. INTRODUCTION
In a continuous integration (CI) development environment, developers commit and merge code at frequent intervals. The merged code is then regression tested to ensure that the changes do not break the integration build and that the codebase remains stable [3, 5]. This CI practice reduces repetitive manual work and speeds up overall development time. Although the CI process has many advantages, it also faces several challenges. For example, to achieve the benefits of CI, developers must commit code more frequently than before. Potential errors or collisions, which may result in broken integration builds, must be detected as soon as possible. Broken builds must be fixed immediately. Notification systems are needed to improve coordination and reduce conflicts. Tests should be executed as part of the automated build [5]. In addition to these challenges, cost-effectiveness is another important aspect of development in a CI environment. For example, in order to release products rapidly, code is merged more frequently, which requires faster feedback from each build. To address this, many large organizations use two testing phases [5]. (1) Post-submit testing phase. After developers have submitted their code to the version control repository, the CI server builds it and runs the related scripts to test it. In this phase, developers provide a change list along with the changed code.
In the change list, they indicate the modules directly relevant to building or testing, so that the number of executed test suites can be restricted. In this phase, the focus is on detecting failures as soon as possible, so that developers can fix those problems sooner. (2) Pre-submit testing phase. Even though the post-submit testing phase already restricts the number of test suites, the changed code may still interact with a large amount of dependent code and modules, so there may still be a high possibility that the build will fail. To reduce this risk, it is necessary for developers to test their code prior to submission. In this phase, the focus is on detecting as many failures as possible, so as to reduce the possibility of failing the build after submission. In addition, as an important part of a build, regression testing should be conducted cost-effectively. Regression testing is the process of testing existing software applications to make sure that a change or an addition does not break any existing functionality. In a CI development environment, regression testing is quite necessary. As one regression testing technique, regression test selection (RTS), which selects part of the test suite for execution, can be more cost-effective. However, traditional RTS is not applicable here: traditional analysis relies on code instrumentation and applies only to a discrete
and complete set of test suites [5]. In CI, however, testing requests arrive at frequent intervals, which makes such analysis very expensive, and the continuous arrival of test suites makes it hard to treat them as a discrete and complete set. Therefore, to make testing cost-effective in the pre-submit testing phase, we provide a new approach for selecting a subset of test suites for regression testing.

In CI, large numbers of test suites arrive very frequently during testing. Consequently, the knowledge embedded in the data stream is likely to change as time goes by, and the more recent changes of the data stream provide more valuable information for selecting among the incoming test suites. For example, if a certain test suite A frequently fails on recent versions of the code, it is more likely to indicate potential problems in the corresponding changed code modules. Conversely, if another test suite B has rarely failed recently, it is more likely to indicate that the corresponding code functions well. For both A and B, even if A always passed in previous versions or B always failed in previous versions, we care more about their recent performance. As time goes by, the effect of old fail/pass statuses of the test suites gradually decreases and the effect of recent fail/pass statuses gradually increases.

Under the assumption that recent changes in a data stream provide more valuable information for the analysis of newly changed code, while the effect of old test suite results on the current testing outcome diminishes, this work presents a new regression test selection (RTS) technique that adopts a decay mechanism to select a subset of test suites for execution. To evaluate the performance of the technique, we conducted an empirical study on the Google Shared Dataset of Test Suite Results (GSDTSR) to simulate an industrial testing process. The results show that our new technique can detect a high percentage of failures while executing a very low percentage of test suites. Thus, our technique contributes directly to the goals of the continuous integration process.

The remainder of this paper is organized as follows. Section 2 provides background and related work. Section 3 presents our new decay-based RTS technique. Section 4 presents the design and results of our study. Section 5 discusses the findings and limitations of this paper. Section 6 concludes.

2. BACKGROUND AND RELATED WORK
This section provides background for this paper. Section 2.1 discusses the CI environment. Section 2.2 discusses regression test selection and related work. Section 2.3 discusses the decay mechanism and related work. Section 2.4 discusses the Google Shared Dataset of Test Suite Results.

2.1 Continuous Integration Development Environment
Continuous integration is a software development practice in which the members of a team integrate their work at very frequent intervals. Each integration is verified by an automated build (including tests) to detect integration errors and give feedback as quickly as possible. This approach has been found to significantly reduce integration problems and to help a team develop cohesive software more rapidly [1]. A typical CI process, whose components are shown in Figure 1, proceeds as follows:
1. Each developer commits code to the version control repository. Meanwhile, the CI server on the integration build machine polls the repository to check whether any changes have occurred (e.g., every few minutes).
2. When a commit occurs, the CI server detects the changed code version, retrieves a copy of the changed code from the repository, and executes the related build script to test it and merge it into the codebase.
3. When the build completes, the CI server generates a report about the result and informs the developer through the feedback mechanism.
4. The CI server continues to poll for changes in the version control repository and repeats the previous steps.

Figure 1. Components of a Continuous Integration System

CI has many advantages: all of the processes are automated, which significantly reduces repetitive manual work, and the fast feedback on each integration build produced by the CI system can improve the quality of the project process and reduce overall development time. As a result, many organizations increasingly use continuous integration to improve their development environments and shorten overall development time so as to release new products rapidly [2, 3, 4, 5].
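To make the polling workflow concrete, the following minimal Python sketch outlines steps 1 through 4; the use of git, a build.sh script, and a notify callback are illustrative assumptions rather than details of any particular CI server.

import subprocess
import time

POLL_INTERVAL_SECONDS = 120  # poll "every few minutes"

def head_revision(repo):
    # Ask the version control repository for its latest revision.
    out = subprocess.run(["git", "-C", repo, "rev-parse", "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def build_and_test(repo):
    # Run the project's build script, which builds and tests the change.
    return subprocess.run(["./build.sh"], cwd=repo).returncode == 0

def ci_server_loop(repo, notify):
    # Steps 2-4: detect commits, build and test, report, keep polling.
    last_seen = head_revision(repo)
    while True:
        time.sleep(POLL_INTERVAL_SECONDS)
        current = head_revision(repo)
        if current != last_seen:
            succeeded = build_and_test(repo)
            notify(current, succeeded)  # feedback to the developer
            last_seen = current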
2.2 Regression Test Selection
In a CI development environment, code integration becomes more frequent than before, which makes it more necessary for developers to test the functionality of the changed code and to guarantee that all the old functionality still works. Regression testing is the process of testing existing software applications to make sure that a change or addition has not broken any existing functionality. For example, when a new version of a product is about to be released, the old test suites are still run against the new version to ensure that all the old capabilities still work. Regression testing can be approached with the following three classes of techniques (Figure 2):

Figure 2. Regression Testing

1. Retest all. All the tests in the existing test bucket or suite are re-executed. This is very expensive, as it requires huge amounts of time and resources.
2. Regression test selection (RTS). Instead of re-executing the entire test suite, selecting part of the test suite for execution is more cost-effective.
3. Test case prioritization (TCP). The test suites are prioritized according to some criteria.

In this paper, we focus on the pre-submit testing phase, aiming to cost-effectively detect as many failures as possible. We therefore choose regression test selection, which selects a subset of test suites related to the changed code and maximizes failure detection. Even though this technique may miss some test suites that could reveal potential faults, all the related test suites will still be executed in the post-submit testing phase, and those faults will be detected then [5].

There has been some recent work on techniques for testing programs on large farms of test servers or in the cloud [11, 12, 13]; however, those works did not specifically consider the continuous integration process or regression testing. Saff and Ernst focus on testing during the development effort itself, rather than after a set of changes has been completed and scheduled for submission and merging. Yoo et al. [18], also working with Google data sets, describe a search-based approach for using TCP techniques to improve the cost-effectiveness of testing in the pre-submit phase; their study does not, however, consider the use of RTS techniques. Elbaum et al. [5] worked on Google data sets and improved RTS and TCP techniques by using time windows to adapt to the CI development environment. They adopted two time windows: a failure window (Wf) and an execution window (We). If a test suite failed recently (within the failure window), it is selected for execution; likewise, if a test suite has not been executed for a long time (outside the execution window), it is selected for execution. In this paper, by contrast, we adopt a decay mechanism to improve the RTS technique.
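As a point of reference for the comparison in Section 4, the following sketch captures the windowed selection rule just described; the record structure and the window units (hours) are assumptions for illustration, not Elbaum et al.'s implementation.

from dataclasses import dataclass
from typing import Optional

@dataclass
class SuiteHistory:
    last_failure: Optional[float]    # hours; None if the suite never failed
    last_execution: Optional[float]  # hours; None if never executed

def window_select(hist, now, w_f=48.0, w_e=24.0):
    # Select a suite that failed within the failure window Wf, or that
    # has not been executed within the execution window We.
    recently_failed = (hist.last_failure is not None
                       and now - hist.last_failure <= w_f)
    stale = hist.last_execution is None or now - hist.last_execution > w_e
    return recently_failed or stale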
2.3 Decay Mechanism
In a continuous integration development environment, a large number of test suites are executed continuously at a rapid rate during testing. Consequently, the knowledge embedded in the stream of data is likely to change as time goes by. Identifying the recent changes of a data stream provides more valuable information for the analysis of newly changed code, since the effect of old test suites (data transactions) on the incoming data stream diminishes. However, most frequency-counting algorithms over data streams [7, 8], like the Lossy Counting algorithm [7], do not differentiate recently generated transactions from older information, which may no longer be useful or may even be invalid at present [6]. In terms of differentiating information, the SWF algorithm [9] uses a sliding window to find frequent itemsets in a fixed number of recent transactions. The sliding window is a sequence of partitions, and each partition maintains a number of transactions. Candidate 2-itemsets over the transactions in the window are maintained separately; when the window advances, the itemsets of the oldest partition are disregarded and new itemsets are generated. This paper adopts a decay mechanism to improve the traditional regression test selection technique in a CI development environment. The technique examines each test suite (transaction) one by one without any candidate generation. The effect of old test suites (transactions) on the current testing result is diminished by decaying the old occurrence count of each test suite as time goes by.

2.4 Google Shared Dataset of Test Suite Results (GSDTSR)
To conduct this work empirically, we cannot access a real industrial development process, so we use the records in GSDTSR to simulate the process of a real continuous testing environment. The Google Shared Dataset of Test Suite Results (GSDTSR) [15] provides the software testing and analysis community with a sample of 3.5 million test suite execution results from a fast and large-scale continuous testing infrastructure. The information in this dataset is shown in Table 1.

Table 1. Fields in GSDTSR

Field            Description
Test Suite       Test suite name.
Change Request   Rescaled change request number that led to the execution of the test suite.
Stage            Testing stage: Pre or Post.
Status           Test suite execution status: Failed or Passed.
Launch Time      The time when the test suite was executed.
Execution Time   Test suite execution time in milliseconds.
Size             The size of the test suite: Small, Medium, or Large.
Shard Number     Shards are needed when test suites are parallelized for execution; this is the number of shards used to execute the test suite.
Language         Language of the test suite.

The Google dataset covers two testing stages: pre-submit and post-submit. When a developer commits a changed module, the developer first tests prior to submission. In this phase, the developer provides a change list, which contains the modules directly relevant to building and testing; the testing request is then queued to execute all test suites relevant to the change list, and after the build the developer receives a report about the build and testing. If pre-submit testing succeeds, the developer submits the module for post-submit testing. In this phase, module dependency graphs are used to determine the test suites for execution: modules that are globally relevant to the changed module are all included, and all the test suites relevant to these modules are queued for testing.
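To replay these records as a stream, one can read them in arrival order; the following is a minimal sketch, assuming the dataset is available as a CSV file named gsdtsr.csv whose column headers match the field names in Table 1.

import csv

def stream_gsdtsr(path="gsdtsr.csv"):
    # Yield pre-submit records in arrival order; column names follow Table 1.
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["Stage"] == "Pre":  # pre-submit testing records only
                yield {
                    "suite": row["Test Suite"],
                    "status": row["Status"],  # "Failed" or "Passed"
                    "exec_ms": float(row["Execution Time"]),
                }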
3. APPROACH
It is advantageous to apply a continuous integration system for frequent system builds, together with regression testing of new and changed versions of the code, so that integration happens early, errors are detected sooner, and feedback on potential problems arrives faster. The pre-submit testing phase helps reduce the number of problems that reach the post-submit testing phase and fail the builds. However, the process still faces other challenges. For example, as the project grows, testing still has to deal with a large amount of dependent code and modules, which relate to large numbers of test suites. In this paper, we focus on making the pre-submit testing phase cost-effective by applying a decay mechanism. In this section, I first introduce the decay model I use; then I provide the algorithm I implemented on the Google dataset and explain it in detail.

3.1 Using a Decay Mechanism to Find Frequently Failing Test Suites
In a CI development environment, it is quite important to detect as many failing test suites as possible, so that developers can fix those problems prior to submission and reduce potential build breaks. To make the pre-submit testing phase cost-effective, we apply a regression test selection (RTS) technique. However, traditional RTS is not applicable: Google's codebase undergoes a large number of changes per minute [14], and traditional analysis relies on code instrumentation and applies only to a discrete and complete set of test suites [5]. In CI, testing requests arrive at frequent intervals, which makes such analysis very expensive, and the continuous arrival of test suites makes it hard to treat them as a discrete and complete set. We therefore provide a new approach for selecting a subset of test suites for regression testing.

In CI, large numbers of test suites arrive at very frequent intervals during testing. Consequently, the knowledge embedded in the data stream is likely to change as time goes by, and the more recent changes of the data stream provide more valuable information for selecting among the incoming test suites. For example, for a certain test suite A, even though A rarely failed previously, if it fails frequently in recent runs, it may indicate problems in the recently changed code. On the other hand, even if test suite B frequently failed previously, if it has not failed on the recent code changes, it may indicate that the related changed code functions well. From this example, we can see that as time goes by, the effect of old fail/pass statuses of the test suites gradually decreases and the effect of recent fail/pass statuses gradually increases. To capture this variation in the influence of frequently failing test suites, we use the decay mechanism:

(1) Let T = {t1, t2, ..., tn} be the set of test suites to be executed. Each test suite can be executed over and over again if it is selected.
(2) A selected set of test suites T' satisfies T' ∈ 2^T \ {∅}, where 2^T is the power set of T (each test suite ti in T has two possibilities: selected or not selected).
(3) A transaction is a subset of T, and each transaction has a unique transaction identifier T_id. The transaction generated at the k-th turn is denoted T_k.
(4) When a new transaction T_k is generated, the current data stream D_k is composed of all transactions that have been generated so far, i.e., D_k = <T_1, T_2, ..., T_k>, and the total number of transactions in D_k is denoted |D_k|.
(5) When a transaction T_k is generated, the current count C_k(T_i) of a subset T_i is the number of transactions among the k transactions that contain T_i.

Decay rate: the decay rate is the rate at which a weight is reduced per fixed decay-unit. A decay-unit determines the chunk of information to be decayed together [6].
A decay rate is defined by two factors:
(1) A decay-base b, which determines the amount of weight reduction per decay-unit, with b > 1.
(2) A decay-base-life h: when the weight of the current information is set to 1, h is the number of decay-units after which that weight becomes b^(-1), with h >= 1.

Predefined minimum support (transaction count) δ: not all arriving test suites are significant for finding potential faults. A test suite with much less support than a predefined minimum support need not be monitored, since it is unlikely to detect failures in the near future. When the estimated support of an arriving test suite is large enough, it is regarded as a significant test suite and is selected for execution; otherwise, the test suite is skipped.

The decay rate d is then defined as:

d = b^(-1/h)   (b > 1, h >= 1, b^(-1) <= d < 1)

Given a decay rate d, the total number of transactions |D_k| in the current data stream D_k is found as follows:

|D_k| = 1                    if k = 1
|D_k| = |D_(k-1)| * d + 1    if k >= 2

When the first transaction is generated, |D_1| is obviously 1, since there exists no previous transaction whose weight should be decayed. As the next transaction comes, |D_2| = |D_1| * d + 1, since the weight of the first transaction is decayed by the decay rate. When a new transaction is generated at the k-th turn (k >= 2), the total number of transactions is:

|D_k| = |D_(k-1)| * d + 1 = (|D_(k-2)| * d + 1) * d + 1 = ... = d^(k-1) + d^(k-2) + ... + d + 1 = (1 - d^k) / (1 - d)

Because b^(-1) <= d < 1, |D_k| converges to 1/(1 - d) as k increases infinitely. The count C_k(T_i) of a subset T_i (a selected set of tests) in the current data stream D_k is then maintained as:

C_k(T_i) = C_(k-1)(T_i) * d + W(T_i),   where W(T_i) = 1 if T_i ∈ T_k and 0 otherwise

(in our setting, T_i ∈ T_k when T_i failed at the k-th turn).
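As a quick numeric check of the closed form above (a sketch; the values b = 2 and h = 2 match the worked example in Section 3.2):

def total_transactions(k, d):
    # |D_k| via the recurrence |D_1| = 1, |D_k| = |D_(k-1)| * d + 1.
    total = 1.0
    for _ in range(k - 1):
        total = total * d + 1.0
    return total

d = 2 ** (-1 / 2)                  # b = 2, h = 2  =>  d ~ 0.7071
print(total_transactions(5, d))    # 2.8107..., matches (1 - d^5)/(1 - d)
print(total_transactions(50, d))   # ~3.4142, already near the limit
print(1 / (1 - d))                 # limit 1/(1 - d) ~ 3.4142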
3.2 Algorithm of Decay-Mechanism-Improved Regression Test Selection
The algorithm uses the decay model above to select, for regression testing, the test suites with more potential influence on the current result.

Algorithm: Decay-SelectionPreTests   // select test suites
C_1(T_i) = 1
T_k = ∅
for all T_i ∈ T do
  if C_k(T_i) >= δ or T_i is new then
    T_k ← T_k ∪ {T_i}   // execute it
    // update transaction count
    if T_i failed then
      C_k(T_i) = C_(k-1)(T_i) * d + 1
    else
      C_k(T_i) = C_(k-1)(T_i) * d + 0
    end if
  end if
end for
return T_k

This algorithm contains two main parts. The first part selects the test suites for execution; the second part updates the transaction count. At initialization, for each T_i, the transaction count is C_1(T_i) = 1, k = 1, and T_k = ∅. Then, as each test suite T_i arrives, we check:
(1) Whether the test suite is new. If yes, we execute it, add it to T_k, and update its transaction count.
(2) Whether the test suite's C_k(T_i) is at least the independent variable δ, which determines whether the test suite will be executed. If yes, we execute it, add it to T_k, and update its transaction count.
After deciding whether to execute the test suite, we update its transaction count according to the result of the test execution. Since we conjecture that test suites with failure records are more likely to indicate problematic code churn, if the test execution fails, the current transaction count becomes the decayed previous transaction count plus one; if the test execution passes, the current transaction count is simply the decayed previous transaction count. The formulas are as follows:
(1) If the test suite failed, then T_i ∈ T_k, and we update C_k(T_i) = C_(k-1)(T_i) * d + 1.
(2) If the test suite passed, we update C_k(T_i) = C_(k-1)(T_i) * d + 0.

Take T_i as an example. Suppose b = 2 and h = 2; then d = b^(-1/h) = 2^(-1/2) ≈ 0.707. Suppose δ = 0.7 and T_i is not new. If C_(k-1)(T_i) = 1, then since 1 > δ, T_i should be executed. If the execution result is failed, C_k(T_i) is updated to 1 * d + 1 ≈ 1.707; if the execution result is passed, C_k(T_i) is updated to 1 * d + 0 ≈ 0.707.
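The following is a runnable Python sketch of Decay-SelectionPreTests; a suite absent from the count table is treated as new, and the record format (suite name plus pass/fail flag) is an assumption for illustration.

class DecaySelectionPreTests:
    # Decay-based regression test selection (Section 3.2).
    # counts[suite] stores C_k(T_i); a suite not yet in counts is new.

    def __init__(self, b=2.0, h=2.0, delta=0.7):
        assert b > 1 and h >= 1
        self.d = b ** (-1.0 / h)  # decay rate d = b^(-1/h)
        self.delta = delta        # predefined minimum support
        self.counts = {}

    def select(self, suite):
        # Execute the suite if it is new or its count is at least delta.
        return suite not in self.counts or self.counts[suite] >= self.delta

    def update(self, suite, failed):
        # Called only for executed suites: decay the old count and
        # add 1 on a failure (W(T_i) = 1), 0 on a pass.
        prev = self.counts.get(suite, 1.0)  # C_1(T_i) = 1 for a new suite
        self.counts[suite] = prev * self.d + (1.0 if failed else 0.0)

With b = 2, h = 2, and delta = 0.7, this reproduces the worked example: a suite with count 1 is selected, and its count becomes about 1.707 after a failure or about 0.707 after a pass.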
4. EMPIRICAL STUDY
To evaluate the cost-effectiveness of my new algorithm, I set several dependent and independent variables and simulated the progress of Google's pre-submit testing phase to select and execute test suites.

RQ1: How cost-effective is decay-based RTS during pre-submit testing, and how does the cost-effectiveness vary with different settings of the decay rate d and δ?
RQ2: How does the cost-effectiveness of decay-based RTS compare with the previous time-window-based RTS?

In this section, I first provide a general introduction to the object of my study, GSDTSR. In Section 4.2, I present the independent and dependent variables for this study. In Section 4.3, I discuss the operation of my study. In Section 4.4, I consider the potential threats to external and internal validity, and in Section 4.5 I analyze the results.

4.1 Object of Analysis
As the object of this experiment, I use the Google Shared Dataset of Test Suite Results (GSDTSR), which contains a sample of 3.5 million test suite execution results from a fast and large-scale continuous testing infrastructure [15].

4.2 Variables and Measures
Independent variables. In my empirical study, the independent variables involve the technique, the decay rate, and the predefined minimum support.
Technique: the decay-mechanism-based RTS presented in Section 3.2.
Decay rate: as presented before, the decay rate is d = b^(-1/h) (b > 1, h >= 1, b^(-1) <= d < 1). I choose a fixed decay-base b = 2 and three decay-base-life values h = {6, 10, 50}, representing different numbers of decay-units that make the current weight become b^(-1).
Predefined minimum support: as presented before, the predefined minimum support is used to determine whether a test suite retains enough potential to detect failures in the future. I choose 11 predefined minimum supports δ = {0.5, 1, 2, 4, 8, 12, 16, 20, 24, 28, 32}, representing different transaction counts.

Dependent variables. As dependent variables, I measure the percentage of test suites that are selected, the percentage of execution time required, and the percentage of failures detected by the technique. I do this for each combination of h and δ.

4.3 Study Operation
To evaluate my proposed technique, I implemented the algorithm described in Section 3.2. As the object, I use the GSDTSR dataset to simulate a continuous testing environment. The Decay-SelectionPreTests implementation takes the GSDTSR data, a decay rate d, and a predefined minimum support δ, and selects the test suites with a higher possibility of detecting failures. The report of the results contains three parts: the number of test suites selected, the total execution time of the selected test suites, and the number of failures detected. For each arriving test suite, the program uses the proposed technique to check whether that test suite should be selected. If the test suite is to be executed, the number of selected test suites and the total execution time of the selected test suites are updated; if the result of the test suite is failed, the number of failures detected is also updated.
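Putting the pieces together, the study driver described above can be sketched as follows, computing the three dependent variables; stream_gsdtsr (Section 2.4) and DecaySelectionPreTests (Section 3.2) are the hypothetical helpers sketched earlier.

def run_study(records, selector):
    # Replay records through the selector and report the percentage of
    # test suites selected, execution time required, and failures detected.
    selected = sel_time = sel_failures = 0
    total = total_time = total_failures = 0
    for rec in records:
        total += 1
        total_time += rec["exec_ms"]
        failed = rec["status"] == "Failed"
        total_failures += failed
        if selector.select(rec["suite"]):
            selected += 1
            sel_time += rec["exec_ms"]
            sel_failures += failed
            selector.update(rec["suite"], failed)  # counts change only on execution
    pct = lambda part, whole: 100.0 * part / whole if whole else 0.0
    return (pct(selected, total), pct(sel_time, total_time),
            pct(sel_failures, total_failures))

For example, run_study(stream_gsdtsr(), DecaySelectionPreTests(b=2, h=10, delta=8)) would produce one point in Figures 3 through 5.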
4.4 Threats to Validity
External validity. Even though I have applied my implementation to the Google dataset, which contains 3.5 million records of test results, the dataset still represents only a small section of one industrial setting. I have used several combinations of decay rates and predefined minimum supports to evaluate my results; however, I have not considered factors related to the availability of computing infrastructure, such as the number of platforms available for use in testing.
Internal validity. Since I am using my own tool for the experimental study, any faults in my tool would make the results unpersuasive. To avoid this scenario, I carefully unit tested my tool on small portions of the dataset. In addition, for the 3.5 million test results, I did not account for possibly flaky test suites, which may produce different results on different runs.

4.5 Results and Analysis
In this section, I analyze the results of my study according to the two research questions.

RQ1: How cost-effective is decay-based RTS during pre-submit testing?
Results of the implementation are shown in Figure 3, Figure 4, and Figure 5. Each of those figures shows, for each δ (on the x-axis), the percentage of test suites selected (%TestSuite), the execution time of those test suites (%Execution Time), and the percentage of failures detected (%FailDetect). Note that the percentage of failures detected is relative to the total number of failures detected when all test suites are selected. From these three figures, we find that the number of failing test suites detected increases as δ decreases, and the percentage of failing test suites detected can reach 81.39% while executing only 19.24% of the test suites. Moreover, as the h in the decay rate increases, the number of failing test suites detected increases. Take δ = 32 as an example: when h = 6, the percentage of failing test suites detected is 41.82%; when h = 10, it is 45.7%; and when h = 50, it is 52.36%.

From the three figures, we can also see that the percentage of test suites selected for execution is very low, from 0.2% to 19.24% (except the one case with 79.7% test suite selection, whose δ is 0.5 and h is 50). As expected, using a smaller δ leads to more aggressive test suite selection. For example, in all three figures, the percentage of failures detected when δ = 0.5 is 35% more than when δ = 32. The reason is that as the predefined minimum support δ increases, each test suite T_i needs a higher C_k(T_i) to be selected. However, for each T_i, its previous transaction count C_k(T_i) is related to the number of its previous failures, which is fixed; so a higher δ causes fewer test suites to be selected, leading to fewer failing test suites being detected. As the decay-base-life h increases, the number of previous test suite results that affect the transaction count C_k(T_i) increases, and with it the number of failing results among them, so C_k(T_i) increases, leading to a higher percentage of failures detected.

Figure 3. Test Suite Selection: h = 6
Figure 4. Test Suite Selection: h = 10
Figure 5. Test Suite Selection: h = 50
(Each figure plots %TestSuite, %Execution Time, and %FailDetect against δ.)

RQ2: How does the cost-effectiveness of decay-based RTS compare with the previous time-window-based RTS?
Figure 6 and Figure 7 show the failure-detection trends against test suite selection. Each figure shows, for each percentage of test suites selected (on the x-axis), the corresponding percentage of failures detected (on the y-axis).
Figure 6 shows the results for the window-based test suite selection technique proposed by Elbaum et al. [5]. Both figures show that as the percentage of test suites selected increases, the percentage of failures detected also increases; this is because as the number of test suites selected increases, a larger percentage of the failing test suites is detected. Comparing Figure 6 with Figure 7, we find that in decay-based RTS all the failure-detection percentages are higher than 40%, whereas both We = 1 and We = 48 (We is the execution window, in hours) in window-based RTS contain several failure-detection percentages lower than 40%. In addition, in decay-based RTS, most of the points select a very low percentage (less than 3%) of test suites yet achieve more than 40% failure detection. In Figure 6, by contrast, when the execution window is 1 hour (We = 1), even though all of the failure-detection percentages are higher than 40%, the percentage of test suites selected is also very high, at more than 20%. From this comparison, we find that decay-based RTS is more cost-effective than window-based RTS.

Figure 6. Window-Based RTS (We = 1, 24, 48)
Figure 7. Decay-Based RTS (h = 6, 10, 50)

5. DISCUSSION
The RTS technique is influenced not only by the decay rate, as proposed in this paper, but also by many environmental factors, such as the resources available for test execution (the number of machines on which to run test suites), the rate at which change lists arrive, and the expense of executing test suites. In this section, I discuss some additional issues that are not considered in this paper.

Test suite arrival rate. In this paper, I did not consider the arrival rate of the test suites. For example, people are likely to submit more commits during weekdays and fewer during weekends, so more test suites are executed on weekdays than on weekends. With a decay mechanism, the effect of failure records on future testing results decreases over time, which means weekends also affect the determination of test suite selection. For example, test suite t1 may have a very high transaction count on Friday and should be selected for execution; however, after the weekend, the transaction count of t1 will have decreased by Monday even if t1 was not executed (or passed) during the weekend.

Running on parallel machines. In this paper, my program walks through each test suite to simulate the selection process, without considering the number of machines available; I assume all the test suites arrive and wait in a single queue. In a real industrial environment, these continuously waiting test suites must be managed according to resource availability to maximize testing throughput while minimizing overall execution time.

Selection of factors. The results rely on the choice of decay rate (through the decay-base-life h) and the predefined minimum support δ. This paper chose several values of h and δ to produce the results; if those values vary, the results may also vary. In future work, I will test more independent variables to analyze the relationships between the factors.

6. CONCLUSION
As continuous integration systems are increasingly adopted by large organizations for rapid release of new products, it is necessary to improve overall build efficiency. As an important part of the build process, regression testing should also be conducted cost-effectively. This paper proposed a decay-mechanism-based regression test selection technique for the pre-submit testing phase.
It adopts a decay rate to calculate each test suite's transaction count according to its failure records; if the transaction count is high enough, the test suite is selected for execution. In the empirical study conducted on the Google Shared Dataset of Test Suite Results (GSDTSR), the results show that this technique can select a very low percentage of test suites while detecting a high percentage of failures. For future work: (1) I will examine more independent variables to observe the trends and further improve the technique. (2) I will apply queuing models (M/M/1, M/M/k, etc.) to compare different combinations of queues and different numbers of server machines to support parallel execution scheduling. (3) I will improve the technique so that it can dynamically adjust the execution scheduling according to the arrival rate of test suites.

7. ACKNOWLEDGMENTS
My advisors, who helped me understand more about the previous work, supported this work, and they gave me many good ideas when I was preparing this implementation. I thank them very much.
8. REFERENCES
[1] P. M. Duvall, S. Matyas, and A. Glover. Continuous Integration: Improving Software Quality and Reducing Risk. Pearson Education, 2007.
[2] Atlassian. Atlassian software systems: Bamboo.
[3] Jenkins. Jenkins: An extendable open source continuous integration server. jenkins-ci.org, 2014.
[4] ThoughtWorks. Go: Continuous delivery.
[5] S. Elbaum, G. Rothermel, and J. Penix. Techniques for improving regression testing in continuous integration development environments. In FSE, 2014.
[6] J. H. Chang and W. S. Lee. Finding recent frequent itemsets adaptively over online data streams. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[7] G. S. Manku and R. Motwani. Approximate frequency counts over data streams. In Proceedings of the 28th International Conference on Very Large Data Bases. VLDB Endowment, 2002.
[8] M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. In Automata, Languages and Programming. Springer Berlin Heidelberg, 2002.
[9] C. H. Lee, C. R. Lin, and M. S. Chen. Sliding-window filtering: An efficient algorithm for incremental mining. In Proceedings of the Tenth International Conference on Information and Knowledge Management. ACM, 2001.
[10] G. Rothermel and M. J. Harrold. A safe, efficient regression test selection technique. ACM Transactions on Software Engineering and Methodology, 6(2), 1997.
[11] S. Bucur, V. Ureche, C. Zamfir, and G. Candea. Parallel symbolic execution for automated real-world software testing. In Proceedings of the Sixth Conference on Computer Systems, 2011.
[12] Y. Kim, M. Kim, and G. Rothermel. A scalable distributed concolic testing approach: An empirical evaluation. In Proceedings of the International Conference on Software Testing, Apr. 2012.
[13] M. Staats, P. Loyola, and G. Rothermel. Oracle-centric test case prioritization. In Proceedings of the International Symposium on Software Reliability Engineering, Nov. 2012.
[14] P. Gupta, M. Ivey, and J. Penix. Testing at the speed and scale of Google. the-google-test-and-development_21.html, 2014.
[15] S. Elbaum, J. Penix, and A. McLaughlin. Google shared dataset of test suite results. -shared-dataset-of-test-suite-results/, 2014.
[16] L. Zhang, S.-S. Hou, C. Guo, T. Xie, and H. Mei. Time-aware test-case prioritization using integer linear programming. In Proceedings of the International Symposium on Software Testing and Analysis, 2009.
[17] A. Orso, N. Shi, and M. J. Harrold. Scaling regression testing to large software systems. In Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2004.
[18] S. Yoo, R. Nilsson, and M. Harman. Faster fault finding at Google using multi-objective regression test optimisation. In 8th European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE '11), Szeged, Hungary, 2011.