Testing Automation for Distributed Applications

By Isabel Drost-Fromm, Software Engineer, Elastic

The challenge

When building distributed, large-scale applications, quality assurance (QA) gets increasingly complicated. The term QA often brings to mind the image of testers manually checking a piece of software for correctness prior to release. The problem with the test-before-release approach is obvious: each code modification made between releases can potentially introduce errors. The more time that passes before these errors are uncovered, the more changes accumulate until their discovery, making it harder and more time consuming to identify which change caused the wrong behaviour (or slowed processing down). When releasing software that is supposed to be integrated into many different environments, like Elasticsearch, tests need to cover as many configurations as possible. So how do we ensure quality checks are run often enough to speed up failure discovery, while keeping the runtime of the whole test suite low enough for developers to run it in their local development environment?

Our solution

Elastic employs an automated, multi-step QA process that helps uncover issues as soon as possible with as little manual work as possible. The testing deployment we use internally can serve as a good starting point for informing your own application-specific design. Developers connecting through the Java API can benefit tremendously from using our testing framework, described below. The goal of our testing framework is to check Elasticsearch performance in scenarios that approximate how it will be used in production by downstream users.

Automation infrastructure

The performance of a search engine like Elasticsearch (with Apache Lucene as a foundation) is heavily influenced by memory availability and disk I/O performance. Due to the distributed nature of the system, network I/O is a third factor that must be taken into account. Typical deployments of Elasticsearch run either on dedicated hardware or in the cloud. From Elasticsearch version 1.2 onwards, a large number of different Java virtual machine (JVM) versions and settings within various operating systems can be tested. As this amounts to a large number of test runs, we manage all our automated builds with Jenkins [1]. In order to be able to quickly and repeatedly set up the machine taking care of this management, we employ Puppet [2] for configuration management.

[1] http://jenkins-ci.org/
[2] http://puppetlabs.com/
As our test suite grows, we are able to add an increasing number of scenarios to our jobs. While initially only feature completeness was covered, we've now added support for backwards compatibility testing. Table 1 below gives a brief overview of the architectures and operating systems on which our tests run. Additionally, for each release we check a multitude of JVM versions and configurations (e.g., running with different garbage collection implementations).

Tested Branch | Tested OS  | Environment        | Specific Tests
1.1           | Ubuntu [3] | ES cloud instances | unit tests
1.2           | Ubuntu     | ES cloud instances | unit tests
1.3           | Ubuntu     | ES cloud instances | unit tests
1.4           | Ubuntu     | ES cloud instances | unit tests
1.x           | Ubuntu     | ES cloud           | unit tests, REST compatibility, backwards compatibility, coverage check, and static analysis
master        | Windows    |                    | unit tests, coverage check, and static analysis

Table 1. A summary of implemented test architectures and operating systems Elasticsearch is being checked against continuously.

[3] We keep up with the current stable (as in officially supported by the vendor) version of each operating system.
Similar tests are run for each feature that takes longer than a few days to implement. For each commit, a series of tests is executed: smoke testing checks whether the code compiles at all and results in a version that one can connect to and install plugins into. During the second stage, Java-level integration and unit tests check individual Elasticsearch features. In addition to these commit-triggered builds, there is a set of continuously running builds covering different hardware configurations, operating systems, JVM versions, and configurations. Plugins and specific clients tend to be tested on a commit-by-commit basis. One additional goal when testing clients is to check whether these clients can connect to Elasticsearch via the REST API.

Levels of Elasticsearch testing

Elasticsearch testing is done in multiple stages [4]. For each user-facing REST API endpoint, there is an API specification that is also used to define REST-based tests.

Figure 1. Elasticsearch testing schematic. Elasticsearch testing is performed in multiple stages (Java unit, integration, REST, and backwards compatibility tests) across multiple levels (REST API, Java client API, and Elasticsearch core).

[4] http://www.elastic.co/guide/en/elasticsearch/reference/current/testing-framework.html
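To make the Java-level integration stage concrete, the following is a minimal sketch of what such a test can look like with the Elasticsearch 1.x testing framework [4]. The index name, field, and query are illustrative choices for this example; the base class and helper methods are those described in the framework documentation.

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.test.ElasticsearchIntegrationTest;
import org.junit.Test;

import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertHitCount;

// The base class starts a randomized test cluster for the suite, so the
// test itself only deals with indexing and querying.
public class SmokeSearchTests extends ElasticsearchIntegrationTest {

    @Test
    public void matchQueryFindsIndexedDocument() throws Exception {
        createIndex("demo");          // create a test index
        ensureGreen("demo");          // wait until all shards are allocated

        client().prepareIndex("demo", "doc", "1")
                .setSource("title", "distributed testing")
                .get();
        refresh();                    // make the document visible to search

        SearchResponse response = client().prepareSearch("demo")
                .setQuery(QueryBuilders.matchQuery("title", "testing"))
                .get();
        assertHitCount(response, 1L);
    }
}

Because the cluster behind createIndex() is assembled with randomized settings, the same test exercises a different cluster configuration on each run, which is the point of the levels described above.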
To ensure backwards compatibility, we also run special Elasticsearch backwards compatibility tests. Those tests are implemented in much the same way as integration tests, except that queries are not sent to uniform clusters but to clusters with mixed-version nodes. Of course, at the lowest level, traditional unit tests ensure that individual methods and algorithms are implemented correctly (Figure 1). Testing coverage is monitored regularly, with line coverage values typically well over 70% on average and over 90% for crucial parts of the implementation, although these values tend to slightly underestimate actual testing coverage due to our test randomisation.

A note on randomized testing

With automated testing in place, most test definitions are written by the developers themselves instead of dedicated test engineers. Experience shows that developers tend to be unintentionally conservative when defining test specifications, often missing many edge cases. In the case of Elasticsearch, Java-level testing has been alleviated to some extent by relying heavily on the randomized testing framework [5] introduced by Apache Lucene [6]: instead of defining tests with just a couple of input parameter settings and configurations, developers are expected to define the range of allowed parameters and settings. On each test run, a different valid combination is then chosen automatically. Given enough runs, this increases the number of code paths covered tremendously. In case of test failure, the user is supplied with the seed value of the test configuration, ensuring reproducibility by making it possible to re-run the test with the exact same configuration. When using the Elasticsearch Java Testing Framework, the exact same test randomizations from Apache Lucene's test suite are applied, enabling downstream users to enjoy the same benefits.

However, we do not employ randomized testing only at the Java unit test level. In integration tests, the configuration settings of the Elasticsearch clusters to be started are permuted as well, meaning each test is run, for example, against clusters with different numbers of data or master nodes. Additionally, the general runtime environment is subject to change as well, one popular example being the Locale, for which a new default is chosen on each test run. As a result, the code becomes more resilient to configuration choices and runtime environment varieties.

[5] http://labs.carrotsearch.com/randomizedtesting.html
[6] http://lucene.apache.org
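As an illustration of this approach (not taken from the Elasticsearch code base), the sketch below uses the randomized testing framework [5] directly. The class name and the shard computation are made up for the example; what it demonstrates is the pattern described above: declaring ranges of valid inputs instead of fixed values, and replaying failures via the reported seed.

import com.carrotsearch.randomizedtesting.RandomizedRunner;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import org.junit.Test;
import org.junit.runner.RunWith;

// On every run the runner derives fresh random values from a new seed; when
// an assertion fails, the seed is printed and the failing combination can be
// replayed exactly by re-running with -Dtests.seed=<value>.
@RunWith(RandomizedRunner.class)
public class ShardRoutingTests extends RandomizedTest {

    @Test
    public void routingAlwaysPicksAValidShard() {
        // Declare the range of allowed inputs rather than one fixed case;
        // each run exercises a different combination drawn from these ranges.
        int numberOfShards = randomIntBetween(1, 10);
        String routingKey = randomAsciiOfLengthBetween(1, 64);

        // A stand-in routing computation for this example, not
        // Elasticsearch's actual routing function.
        int hash = routingKey.hashCode();
        int shard = ((hash % numberOfShards) + numberOfShards) % numberOfShards;

        assertTrue(shard >= 0 && shard < numberOfShards);
    }
}

The same idea scales up to the cluster-level permutations mentioned above: rather than hard-coding a two-node cluster, the test framework draws the number of data and master nodes from a configured range on each run.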
The same technique is applied to permute deployment parameters like JVM version and configuration options. As a result, we are able to give clear recommendations on which JVM version to use with a variety of configurations. For instance, in the past, the Apache Lucene committers (many of whom are also employed by Elastic) uncovered several bugs in the JVM and the Java compiler that could have led to index corruption had they not been fixed by Oracle [7]. When setting up test servers, Elasticsearch users can apply these exact same randomizations to harden their own software. All scripts and development tools needed have been published online [8].

[7] https://www.youtube.com/watch?v=pvrdlyqguxe
[8] https://github.com/elastic/elasticsearch/blob/master/dev-tools/build_randomization.rb