Wednesday 27 April 2016

Happy and Healthy Heterogeneous Build Slaves in Jenkins

After moving off the CloudBees platform, one thing that quickly became apparent was that an OpenShift Jenkins build slave simply runs out of resources when asked to build moderately-complex Scala software, on two fronts - the 500Mb RAM hard limit is quickly hit during SBT builds (particularly during tests) and the 1Gb of disk space is also very limiting once a few dependencies have been pulled into the Ivy cache.

So a second slave was brought on line - my old Dell Inspiron 9300 laptop from 2006 - which (after an upgrade to 2Gb of RAM for a handful of dollars online) has done a sterling job. Running Ubuntu 14.04 Desktop edition seems to not tax the Intel Pentium M too badly, and it seemed crazy to get rid of that amazing 17" 1920x1200 screen for a pittance on eBay. Now at this point I had two slaves on line, with highly different capabilities.
Horses for Courses
The OpenShift node (slave1) has low RAM, slow CPU, very limited persistent storage but exceptionally quick network access (being located in a datacenter somewhere on the US East Coast), while the laptop (slave2) has a reasonable amount of RAM, moderate CPU, tons of disk but relatively slow transfer rates to the outside world, via ADSL2 down here in Australia. How to deal with all these differences when running jobs that could be farmed out to either node?

The solution is of course the classic layer of indirection that allows the different boxes to be addressed consistently. Here is the configuration for my slave1 Redhat box on OpenShift:


Note the -mem argument in the SBT_COMMAND which sets the -Xmx and -Xms to this number and PermGen to 2* this number, keeping a lid on resource usage. And here's slave2, the Ubuntu laptop, with no such restriction needed:


And here's what a typical build job looks like:


Caring for Special-needs Nodes
Finally, my disk-challenged slave1 node gets a couple of Jenkins jobs to tend to it. The first periodically runs a git gc in each .git directory under the Jenkins workspace (as per a Stack Overflow answer) - it runs quota before-and-after to show how much (if anything) was cleared up:



The second job periodically removes the target directory wherever it is found - SBT builds leave a lot of stuff in here that can really add up. Here's what it looks like: