Continuous Integration in the Cloud with Hudson
Kohsuke Kawaguchi, Jesse Glick
Sun Microsystems, Inc.
Hudson committers

Rise of Continuous Integration
- Offload from people, push to computers
[Chart: axes "$" and "time"; series "computers" and "us"]

Spend more CPU power to help you, even if it only helps a little
- Ever-bloating IDEs
- Static code analysis tools
- More frequent build/test executions (AKA Continuous Integration)

Hudson
http://hudson-ci.org/
- Open-source CI server at java.net
- Emphasis on ease of installation and use
- Extensibility
- GUI for human users
- REST API for program users
- 130+ community-developed public plugins, by 120+ contributors
- Estimated 20,000 installations

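As an illustration of "REST API for program users", the sketch below reads the JSON form of the remote API and lists failing jobs. It assumes the top-level document is served at `/api/json` and carries a `jobs` list whose entries have `name` and `color` fields (with `red` meaning a failed build); the server URL you pass to `fetch_jobs_json`, the helper names, and the sample data are all hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_jobs_json(base_url: str) -> str:
    """Fetch the top-level remote API document as text (needs a reachable server)."""
    with urlopen(base_url.rstrip("/") + "/api/json") as response:
        return response.read().decode("utf-8")


def failing_jobs(api_json: str) -> list:
    """Return the names of jobs whose status color starts with 'red'."""
    data = json.loads(api_json)
    return [
        job["name"]
        for job in data.get("jobs", [])
        if (job.get("color") or "").startswith("red")
    ]


# Hypothetical sample response, so the parsing can be tried without a server.
SAMPLE = (
    '{"jobs": ['
    '{"name": "wombat", "color": "blue"}, '
    '{"name": "glassfish", "color": "red_anime"}'
    "]}"
)

if __name__ == "__main__":
    print(failing_jobs(SAMPLE))
```

Against a live server, `failing_jobs(fetch_jobs_json(url))` would do the same with real data, assuming the fields match.
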
It basically does builds and tests
- Check out the source code: Subversion, Perforce, Git, Mercurial, CVS, ...
- Do builds and/or tests: Java, .NET, shell script, ...
- Record results: binaries, test results, code coverage, static analysis
- Notify people: e-mail, IM, RSS, tray apps, IDEs

Localized to 8 languages

Adoption in all kinds of businesses

Before we talk about clouds
- When I talk to people, they have computers
  - Lots of them, lying around, under-utilized
  - Just lacking software to use them effectively
- Let's use lots of computers effectively first
- Then we'll talk about EC2

Going distributed
You need to use multiple computers because:
- You need different environments
- You need isolation
- One computer can't keep up with all the load

Distributed builds with Hudson
- Master
  - Serves HTTP requests
  - Stores all important info
- Slaves
  - 170KB single jar
  - Assumed to be unreliable
  - Scale to at least 100
- Link
  - Single bidirectional byte stream
  - No other requirements

How master and slaves start talking: via sshd
- Master talks to sshd on a slave
- Master sends slave.jar and runs java -jar slave.jar
- The SSH session becomes the bidirectional byte stream

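The idea that a launched process's standard input and output can serve as the single bidirectional byte stream can be sketched as follows. This is not Hudson's protocol: a local Python child process stands in for the remote agent, and the function names and the line-based echo exchange are hypothetical. In a real setup the command would be something like an ssh invocation that runs java -jar slave.jar on the slave.

```python
import subprocess
import sys

# Stand-in for the remote agent: reads lines on stdin, answers on stdout.
ECHO_AGENT = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line.upper())\n"
    "    sys.stdout.buffer.flush()\n",
]


def open_channel(command):
    """Start `command`; its stdin/stdout pair is the two-way byte stream."""
    return subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )


def round_trip(proc, payload: bytes) -> bytes:
    """Send one line to the agent and read one line back."""
    proc.stdin.write(payload + b"\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip(b"\n")


if __name__ == "__main__":
    agent = open_channel(ECHO_AGENT)
    print(round_trip(agent, b"ping"))
    agent.stdin.close()  # closing the stream ends the agent
    agent.wait()
```

Because the only requirement is one byte stream in each direction, the same master-side code works regardless of how the process was started.
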
How master and slaves start talking: via JNLP
- Java Web Start on the slave initiates the session
- Hudson sends the JNLP file and jar files
- A separate socket connection is made

How master and slaves start talking
- Once started, the slave agent can be installed as a Windows service

How master and slaves start talking: specifically for Windows
- Hudson speaks DCOM to talk to Windows
- Remotely installs a service and starts it
- No manual intervention needed

Heterogeneous Cluster Challenge
- Your builds/tests need to run in a specific environment
- Dependency on individual nodes hurts utilization
[Diagram: jobs (Wombat Windows test, GlassFish Windows test, Hudson Windows test, Hudson Solaris test) tied to individual slaves (Windows #1, Windows #2, Solaris #1)]

Labels to the rescue
- A label is a group of slaves
- Tie jobs to labels
[Diagram: jobs (Wombat Windows test, GlassFish Windows test, Hudson Windows test, Hudson Solaris test) tied to labels (Windows, Solaris), which group the slaves (Windows #1, Windows #2, Solaris #1)]

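The label indirection can be sketched as two small lookups: a job names a label, and any slave carrying that label may run it. The job and slave names come from the slide; which slaves carry which label is assumed from their names, and the function name is hypothetical.

```python
# Assumed grouping: each slave carries the label matching its OS name.
SLAVE_LABELS = {
    "Windows #1": {"Windows"},
    "Windows #2": {"Windows"},
    "Solaris #1": {"Solaris"},
}

# Jobs are tied to a label, not to an individual slave.
JOB_LABEL = {
    "Wombat Windows test": "Windows",
    "GlassFish Windows test": "Windows",
    "Hudson Windows test": "Windows",
    "Hudson Solaris test": "Solaris",
}


def eligible_slaves(job: str) -> list:
    """Return every slave that carries the label the job is tied to."""
    label = JOB_LABEL[job]
    return sorted(
        slave for slave, labels in SLAVE_LABELS.items() if label in labels
    )


if __name__ == "__main__":
    for job in JOB_LABEL:
        print(job, "->", eligible_slaves(job))
```

With this arrangement, adding a third Windows slave only requires giving it the Windows label; no job definition changes.
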
Setting up slaves
Keeping slaves looking alike is a good thing
- General system administration tasks
  - Network configuration
  - Package installations for native tools
  - Tools like Puppet or cfengine are supposed to help
- Install build tools in the cluster
  - Particularly hard in a heterogeneous environment
  - Prepare tools on one file system, rsync to everywhere
  - This part of Hudson needs improvement

Forecasting failures
- Hudson monitors key health metrics of slaves
  - Low disk space, insufficient swap
  - Clock out of sync
  - Extensible
- Slaves are put offline automatically
- Catch problems before they break builds

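A health check of this kind can be sketched as a set of threshold tests over per-slave metrics. The metrics are the ones named on the slide; the thresholds, field names, and function names are hypothetical and are not Hudson's actual values or API.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, chosen only for illustration.
MIN_FREE_DISK_MB = 1024
MIN_FREE_SWAP_MB = 256
MAX_CLOCK_SKEW_SECONDS = 60.0


@dataclass
class SlaveHealth:
    name: str
    free_disk_mb: int
    free_swap_mb: int
    clock_skew_seconds: float  # slave clock minus master clock


def detect_problems(health: SlaveHealth) -> List[str]:
    """Return a description of every metric that is outside its threshold."""
    problems = []
    if health.free_disk_mb < MIN_FREE_DISK_MB:
        problems.append("low disk space")
    if health.free_swap_mb < MIN_FREE_SWAP_MB:
        problems.append("insufficient swap")
    if abs(health.clock_skew_seconds) > MAX_CLOCK_SKEW_SECONDS:
        problems.append("clock out of sync")
    return problems


def should_go_offline(health: SlaveHealth) -> bool:
    """A slave is taken offline as soon as any check fails."""
    return bool(detect_problems(health))


if __name__ == "__main__":
    sick = SlaveHealth("Windows #1", free_disk_mb=200,
                       free_swap_mb=512, clock_skew_seconds=-300.0)
    print(detect_problems(sick), should_go_offline(sick))
```

Keeping each check as a separate test mirrors the "extensible" point: a new metric is one more condition.
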
Installing new slaves
- For the first 20 or so slaves, we did it manually
  - Insert CD, click, type, click, type, click, ...
- But that doesn't scale
- Then we automated

Automated System Installations
[Diagram: Hudson + pxe-install plugin (BOOTP proxy, TFTP, pxelinux); your corporate IT guy & his DHCP server; slaves]
- Power on, hit F12
- PC boots from network (PXE)
- Choose OS from menu
- Chain boot into OS
- Installs non-interactively

Automated System Installations
- Trivial with most Linux and Solaris
- Works with Windows, too
  - Called Windows Deployment Services
  - Needs a Windows 2003 server
  - Vista (easy) or XP (hard) deployment
- Turns out quite useful outside Hudson, too
  - No more broken CD drives
  - No more CD-Rs

System Utilization Monitoring
- Showing about 25% utilization

When it's time to add more slaves
- There's almost always something in the queue

Hudson made this extensible
- Hudson detects excessive workload
  - With exponential decay to filter out noise
- Hudson notifies plugins
- Plugins can provision more slaves

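Exponential decay as a noise filter can be sketched with an exponentially weighted moving average of the build queue length: a brief spike barely moves the average, while a sustained backlog does. The smoothing factor, the threshold, and the function names are hypothetical; this is not Hudson's actual formula.

```python
import math
from typing import Iterable


def smoothed_queue_length(samples: Iterable[float], alpha: float = 0.1) -> float:
    """Exponentially weighted moving average; older samples decay by (1 - alpha)."""
    ema = 0.0
    for sample in samples:
        ema = alpha * sample + (1.0 - alpha) * ema
    return ema


def slaves_to_provision(samples: Iterable[float],
                        alpha: float = 0.1,
                        threshold: float = 1.0) -> int:
    """Ask for more slaves only when the smoothed backlog reaches the threshold."""
    ema = smoothed_queue_length(samples, alpha)
    return math.floor(ema) if ema >= threshold else 0


if __name__ == "__main__":
    spike = [0, 0, 5, 0, 0]   # one noisy sample
    sustained = [3] * 50      # backlog that persists
    print(slaves_to_provision(spike), slaves_to_provision(sustained))
```

A plugin notified with this number could then decide whether and how to provision that many slaves.
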
Amazon EC2: The Good
- Pay as you go (15¢/h or so)
- Programmable API
- Instances launch pretty quickly (esp. Linux)
- EC2 instances are forgetful
- Good fit with Hudson
  - Loads on Hudson tend to be spiky
  - Tests are embarrassingly parallel (at least in theory)

Amazon EC2: The Bad
- Your data is still inside your firewall
  - Takes time to check out code or to archive build artifacts
  - Some data just can't be moved
- EC2 instances are forgetful
- Your build/test may depend on your environment
- Can your tests run in parallel?

Hudson EC2 plugin
- Built on top of typica*
- What does it do?
  - Automatically provision slaves on EC2 on demand
  - Pick the right AMI depending on demand
  - Connect and install JDK on demand
  - Shut down unused instances
* http://code.google.com/p/typica/

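The decisions listed above (which AMI to launch for the waiting work, which idle instances to shut down) can be sketched as a pure planning function. It makes no EC2 calls and does not reflect the plugin's real code or typica's API; the label-to-AMI table, the AMI identifiers, the idle limit, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from a job's label to the AMI that can serve it.
AMI_FOR_LABEL = {
    "linux": "ami-example-linux",
    "windows": "ami-example-windows",
}

IDLE_LIMIT_MINUTES = 30  # hypothetical retention time for unused instances


def plan(queued_labels: List[str],
         idle_minutes: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Return (AMIs to launch, instance ids to terminate)."""
    # One launch per queued job whose label has a configured AMI.
    launches = [AMI_FOR_LABEL[label]
                for label in queued_labels if label in AMI_FOR_LABEL]
    # Terminate instances that have been idle for at least the limit.
    terminations = sorted(instance
                          for instance, idle in idle_minutes.items()
                          if idle >= IDLE_LIMIT_MINUTES)
    return launches, terminations


if __name__ == "__main__":
    to_launch, to_stop = plan(
        queued_labels=["linux", "solaris", "windows"],
        idle_minutes={"i-aaa": 45, "i-bbb": 5},
    )
    print(to_launch, to_stop)
```

Separating the plan from the cloud calls keeps the policy easy to test without an AWS account.
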
Hudson EC2 plugin usage
- Tell Hudson your AWS account information

Hudson EC2 plugin usage
- Tell Hudson what AMIs you want to start

Putting it all together
[Chart: # of executors over time; series "capacity" and "usage"]

Hudson Appliance on EC2
- Run the master in the cloud too, if you like
- Hudson on stock OpenSolaris AMI
- Data stored persistently in Elastic Block Store
  - Dynamically expandable thanks to ZFS (online, too)
- Packaged as a wizard

Hudson Hadoop plugin
- Distributed file system
  - Exactly two mouse clicks to install
  - Turn every Hudson slave into a Hadoop node
  - Automatic data replication (fault tolerant)
  - Nice for storing old artifacts, logs, test records, ...
- Map/reduce framework
  - Large-scale test results analysis / data mining
  - More interesting work to be done in the future

Selenium Grid
- Use Hudson slaves as Selenium RC nodes
[Diagram: Hudson slaves; Hudson master (Selenium hub)]

Why Selenium Grid & Hudson?
- Hudson wants a heterogeneous cluster
  - Selenium wants that, too
- Centralized management
- Automatic Selenium installation

Kohsuke Kawaguchi: kohsuke.kawaguchi@sun.com
Jesse Glick: jesse.glick@sun.com
http://hudson-ci.org/