Testing Automation for Distributed Applications
By Isabel Drost-Fromm, Software Engineer, Elastic




The challenge

When building distributed, large-scale applications, quality assurance (QA) gets increasingly complicated. The term QA often brings to mind the image of testers manually checking a piece of software for correctness prior to release. The problem with this test-before-release approach is obvious: each code modification made between releases can potentially introduce errors. The more time that passes before these errors are uncovered, the more changes accumulate until their discovery, making it harder and more time consuming to identify which change caused the wrong behaviour (or slowed processing down). When releasing software that is supposed to be integrated into many different environments, as Elasticsearch is, tests need to cover as many configurations as possible. So how do we ensure quality checks run often enough to speed up failure discovery, while keeping the runtime of the whole test suite low enough for developers to run it on their local development environments?

Our solution

Elastic employs an automated, multi-step QA process that helps uncover issues as early as possible with as little manual work as possible. The testing deployment we use internally can serve as a good starting point for informing your own application-specific design. Developers connecting through the Java API can benefit tremendously from using the testing framework described below. The goal of our testing framework is to check Elasticsearch performance in scenarios that approximate how it will be used in production by downstream users.

Automation infrastructure

The performance of a search engine like Elasticsearch (with Apache Lucene as its foundation) is heavily influenced by memory availability and disk I/O performance. Due to the distributed nature of the system, network I/O is a third factor that must be taken into account.
Typical deployments of Elasticsearch run either on premise or in the cloud. From Elasticsearch version 1.2 onwards, a large number of different Java virtual machine (JVM) versions and settings within various operating systems can be tested. As this amounts to a large number of test runs, we manage all our automated builds with Jenkins [1]. To be able to quickly and repeatably set up the machine taking care of this management, we employ Puppet [2] for configuration management.

[1] http://jenkins-ci.org/
[2] http://puppetlabs.com/
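To illustrate the idea, a minimal Puppet manifest for provisioning such a build machine might look like the sketch below. The class, package, and service names are hypothetical, chosen for illustration only; they are not Elastic's actual manifests.

```puppet
# Hypothetical sketch: provision a Jenkins build machine with Puppet.
# Package and service names are illustrative assumptions.
class jenkins_build_server {
  package { 'openjdk-7-jdk':
    ensure => installed,
  }

  package { 'jenkins':
    ensure  => installed,
    require => Package['openjdk-7-jdk'],
  }

  service { 'jenkins':
    ensure  => running,
    enable  => true,
    require => Package['jenkins'],
  }
}
```

Declaring the machine state this way means a replacement build master can be stood up repeatably with a single Puppet run.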

As our test suite grows, we are able to add an increasing number of scenarios to our jobs. While initially only feature completeness was covered, we've now added support for backwards compatibility testing. Table 1 below gives a brief overview of the architectures and operating systems on which our tests run. Additionally, for each release we check a multitude of JVM versions and configurations (e.g., running with different garbage collection implementations).

Tested Branch   Tested OS    Environment                          Specific Tests                                            Static Analysis
1.1             Ubuntu [3]   Elasticsearch (ES) cloud instances   unit tests                                                -
1.2             Ubuntu [3]   ES cloud instances                   unit tests                                                -
1.3             Ubuntu [3]   ES cloud instances                   unit tests                                                -
1.4             Ubuntu [3]   ES cloud instances                   unit tests                                                -
1.x             Ubuntu [3]   ES cloud instances                   unit tests, REST compatibility, backwards compatibility   coverage check and static analysis
master          Windows      -                                    unit tests                                                coverage check and static analysis

Table 1. A summary of implemented test architectures and operating systems Elasticsearch is being checked against continuously.

[3] We keep up with the current stable (as in "officially supported by the vendor") version of each operating system.

Similar tests are run for each feature that takes longer than a few days to implement. For each commit, a series of tests is executed: smoke testing checks whether the code compiles at all and results in a version that one can connect to and install plugins into. During the second stage, Java-level integration and unit tests check individual Elasticsearch features. In addition to these commit-triggered builds, there is a set of continuously running builds on different hardware configurations, operating systems, JVM versions, and configurations. Plugins and specific clients tend to be tested on a commit-by-commit basis. One additional goal when testing clients is to check whether they can connect to Elasticsearch via the REST API.

Levels of Elasticsearch testing

Elasticsearch testing is done in multiple stages [4]. For each user-facing REST API endpoint, there is an API specification that is also used to define REST-based tests.

Figure 1. Elasticsearch Testing Schematic. Elasticsearch testing is performed in multiple stages (Java unit, integration, REST, and backwards compatibility tests) across multiple levels (REST API, Java client API, and Elasticsearch core).

[4] http://www.elastic.co/guide/en/elasticsearch/reference/current/testing-framework.html
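To give a flavour of what such specification-driven REST tests look like, the sketch below follows the YAML format used by the Elasticsearch REST test suite; the index, type, and field names are illustrative placeholders.

```yaml
---
"Index a document and fetch it back":
  - do:
      index:
        index: test_index
        type:  test_type
        id:    1
        body:  { title: "hello" }

  - do:
      get:
        index: test_index
        type:  test_type
        id:    1

  - match: { _source.title: "hello" }
```

Each "do" step invokes an API call defined in the REST specification, and "match" assertions compare the response against expected values, so the same test file can be run against any client or cluster configuration.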

To ensure backwards compatibility, we also run special Elasticsearch backwards compatibility tests. Those tests are implemented in much the same way as integration tests, except that queries are sent not to uniform clusters but to clusters with mixed-version nodes. Of course, at the lowest level, traditional unit tests ensure that individual methods and algorithms are implemented correctly (Figure 1). Test coverage is monitored regularly, with line coverage values typically well over 70% on average and over 90% for crucial parts of the implementation, although these values tend to slightly underestimate actual coverage due to our test randomisation. When using the Elasticsearch Java Testing Framework, the exact same test randomizations from Apache Lucene's test suite are applied, enabling downstream users to enjoy the same benefits.

A note on randomized testing

With automated testing in place, most test definitions are written by the developers themselves instead of dedicated test engineers. Experience shows that developers tend to be unintentionally conservative when defining test specifications, often missing edge cases. In the case of Elasticsearch, this has been alleviated to some extent at the Java level by relying heavily on the randomized testing framework [5] introduced by Apache Lucene [6]: instead of defining tests with just a couple of input parameter settings and configurations, developers define the range of allowed parameters and settings. On each test run, a different valid combination is then chosen automatically. Given enough runs, this increases the number of code paths covered tremendously. In case of test failure, the user is supplied with the seed value of the test configuration, ensuring reproducibility by making it possible to re-run the test with the exact same configuration.
However, we do not employ randomized testing only at the Java unit test level. In integration tests, the configuration settings of the Elasticsearch clusters being started are permuted as well, meaning each test is run, for example, against clusters with different numbers of data or master nodes. Additionally, the general runtime environment is subject to change as well; one popular example is the Locale, for which a new default is chosen on each test run. As a result, the code becomes more resilient to configuration choices and runtime environment varieties.

[5] http://labs.carrotsearch.com/randomizedtesting.html
[6] http://lucene.apache.org
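The seed-based reproducibility at the heart of this approach can be sketched in plain Java with java.util.Random. The permuted parameters (data nodes, master nodes, Locale) mirror the examples above, while the class name and value ranges are made up for illustration; the real framework derives far more of the test setup from the seed.

```java
import java.util.Locale;
import java.util.Random;

public class RandomizedConfigSketch {

    // Derive an entire test configuration from a single seed, so that a
    // failing run can be reproduced exactly by re-using the reported seed.
    static String pickConfiguration(long seed) {
        Random random = new Random(seed);
        int dataNodes = 1 + random.nextInt(4);    // 1..4 data nodes (illustrative range)
        int masterNodes = 1 + random.nextInt(2);  // 1..2 master nodes (illustrative range)
        Locale[] locales = Locale.getAvailableLocales();
        Locale locale = locales[random.nextInt(locales.length)];
        return "dataNodes=" + dataNodes
                + " masterNodes=" + masterNodes
                + " locale=" + locale.toLanguageTag();
    }

    public static void main(String[] args) {
        long seed = new Random().nextLong();
        String config = pickConfiguration(seed);
        // On failure, a real framework would report this seed so the run
        // can be repeated with the exact same configuration.
        System.out.println("[seed: " + seed + "] " + config);
        boolean reproducible = pickConfiguration(seed).equals(config);
        System.out.println("reproducible=" + reproducible);
    }
}
```

Because every random choice flows from one seed, printing that seed on failure is all that is needed to make an otherwise non-deterministic test fully repeatable.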

The same technique is applied to permute deployment parameters like the JVM version and configuration options. As a result, we are able to give clear recommendations on which JVM version to use with a variety of configurations. For instance, in the past, the Apache Lucene committers (many of whom are also employed by Elastic) uncovered several bugs in the JVM and the Java compiler that could have led to index corruption had they not been fixed by Oracle [7]. When setting up test servers, Elasticsearch users can apply these exact same randomizations to harden their own software. All scripts and development tools needed have been published online [8].

[7] https://www.youtube.com/watch?v=pvrdlyqguxe
[8] https://github.com/elastic/elasticsearch/blob/master/dev-tools/build_randomization.rb