The Millhouse Group Blog

Monday, 4 May 2015

Strongly-Typed Time. Part 2: Design

Following on from my lightbulb moment, I tried to sketch out what I wanted from a strongly-typed system for representing timezoned instants in time.

The Look

Ever since Martin Odersky gave us Generics in Java 5, we've become comfortable with reading parameterized types like Set<String> ("Set of String") for container classes. Unsurprisingly, with the move to Odersky's Scala has come further use of parameterization; for example Try[Double] and Future[User]. Essentially, I wanted a type that looked like this. Hence:

  val pierreTime: TimeInZone[Paris]

  val johnTime: TimeInZone[Melbourne]

  val canonicalTime: TimeInZone[UTC]

Behaviour

I want wrong code to look wrong, but more than that, I want the compiler to consider it wrong too:

  val pierreTime: TimeInZone[Paris]

  def doSomethingInMelbourne(mTime: TimeInZone[Melbourne]) ...

  
  // later...
  
  doSomethingInMelbourne(pierreTime)
                         ^
[error]  type mismatch;
[error]  found   : TimeInZone[Paris]
[error]  required: TimeInZone[Melbourne]
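
One way to get this behaviour is with phantom types. What follows is a minimal sketch, not the finished design. It assumes java.time in place of Joda-Time so that it needs no extra dependencies, and the names TimeZone, PhantomDemo and the zone traits are illustrative only:

```scala
import java.time.{Instant, ZoneId, ZonedDateTime}

// Phantom types: never instantiated, they exist only to tag a TimeInZone.
sealed trait TimeZone
sealed trait Paris extends TimeZone
sealed trait Melbourne extends TimeZone

// Invariant in Z, so TimeInZone[Paris] and TimeInZone[Melbourne] are unrelated types.
final case class TimeInZone[Z <: TimeZone](time: ZonedDateTime)

object PhantomDemo {
  private val instant = Instant.ofEpochMilli(1430291749739L)

  val pierreTime: TimeInZone[Paris] =
    TimeInZone[Paris](instant.atZone(ZoneId.of("Europe/Paris")))
  val johnTime: TimeInZone[Melbourne] =
    TimeInZone[Melbourne](instant.atZone(ZoneId.of("Australia/Melbourne")))

  def doSomethingInMelbourne(mTime: TimeInZone[Melbourne]): String =
    s"Melbourne local time: ${mTime.time.toLocalTime}"

  val ok: String = doSomethingInMelbourne(johnTime) // compiles
  // doSomethingInMelbourne(pierreTime)             // rejected by the compiler: type mismatch
}
```

Note that this bare version still lets a caller wrap a Melbourne-zoned value in a TimeInZone[Paris]; closing that hole means tying each tag to a real zone and restricting construction.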

Functional Familiarity

I'm only a tiny way down the path to true functional-programming enlightenment. Hell, I've only just started looking at Scalaz, mainly thanks to eed3si9n's excellent tutorials.

But the following patterns seem pretty sensible to me:

`map` from one timezone to another

The instant-in-time is unchanged, but the type changes, and the local time changes too:

  val pierreTime: TimeInZone[Paris] 
  // Local: 2015-04-29T09:15:49.739+02:00
  // Millis (UTC): 1430291749739

  val johnTime: TimeInZone[Melbourne] = pierreTime.map[Melbourne]  
  // Local: 2015-04-29T17:15:49.739+10:00
  // Millis (UTC): 1430291749739
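
A rough sketch of how such a `map` might be implemented, assuming java.time rather than Joda-Time and a hypothetical ZoneOf type class that links each phantom type to a real zone ID (neither comes from the design above):

```scala
import java.time.{Instant, ZoneId, ZonedDateTime}

sealed trait TimeZone
sealed trait Paris extends TimeZone
sealed trait Melbourne extends TimeZone

// Hypothetical type class: evidence of which real zone a phantom type stands for.
final case class ZoneOf[Z <: TimeZone](id: ZoneId)
object ZoneOf {
  implicit val paris: ZoneOf[Paris] = ZoneOf[Paris](ZoneId.of("Europe/Paris"))
  implicit val melbourne: ZoneOf[Melbourne] = ZoneOf[Melbourne](ZoneId.of("Australia/Melbourne"))
}

final case class TimeInZone[Z <: TimeZone](time: ZonedDateTime) {
  // Same instant, new zone: only the type tag and the local rendering change.
  def map[Z2 <: TimeZone](implicit zone: ZoneOf[Z2]): TimeInZone[Z2] =
    TimeInZone[Z2](time.withZoneSameInstant(zone.id))
}

object MapDemo {
  val pierreTime: TimeInZone[Paris] =
    TimeInZone[Paris](Instant.ofEpochMilli(1430291749739L).atZone(ZoneId.of("Europe/Paris")))

  // The compiler finds ZoneOf[Melbourne] in the ZoneOf companion object.
  val johnTime: TimeInZone[Melbourne] = pierreTime.map[Melbourne]
}
```

Asking for a zone that has no ZoneOf instance fails at compile time, which is in keeping with the goal of letting the compiler do the checking.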

`transform` the time inside the container

I can make adjustments* to the DateTime contained within the TimeInZone[T] using any Joda-Time method that returns another DateTime:

 
  val pierreTime: TimeInZone[Paris]
  // Local: 2015-04-29T09:15:49.739+02:00 

  val pierreWakeUpTime: TimeInZone[Paris] = pierreTime.transform(_.withTime(7,0,0,0))
  // Local: 2015-04-29T07:00:00.000+02:00 

  val pierreLunchTime: TimeInZone[Paris] = pierreTime.transform(_.plusHours(4))
  // Local: 2015-04-29T13:15:49.739+02:00

(*) Everything is immutable (including within Joda-Time) so "adjustments" naturally result in a new object being returned. Do we have a word yet for "A modified copy of an immutable thing"?
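
As a sketch only, `transform` could be little more than function application on the wrapped value. This version assumes java.time's ZonedDateTime instead of Joda-Time's DateTime, and re-pins the result to the original zone as a guard (a design choice of this sketch, not something stated above):

```scala
import java.time.{ZoneId, ZonedDateTime}

sealed trait TimeZone
sealed trait Paris extends TimeZone

final case class TimeInZone[Z <: TimeZone](time: ZonedDateTime) {
  // Apply any ZonedDateTime => ZonedDateTime function. If f changes the zone,
  // the result is converted back so that the tag Z stays truthful.
  def transform(f: ZonedDateTime => ZonedDateTime): TimeInZone[Z] =
    TimeInZone[Z](f(time).withZoneSameInstant(time.getZone))
}

object TransformDemo {
  val pierreTime: TimeInZone[Paris] = TimeInZone[Paris](
    ZonedDateTime.of(2015, 4, 29, 9, 15, 49, 739000000, ZoneId.of("Europe/Paris")))

  val pierreWakeUpTime: TimeInZone[Paris] =
    pierreTime.transform(_.withHour(7).withMinute(0).withSecond(0).withNano(0))

  val pierreLunchTime: TimeInZone[Paris] = pierreTime.transform(_.plusHours(4))
}
```

Because every java.time value is immutable, pierreTime itself is untouched by either call.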

Companion-object for construction

I should be able to get a TimeInZone[T] via its companion object for every conceivable scenario:

  // "Now" in whatever TZ my JVM is running in: 
  val myLocalTime: TimeInZone[TimeZone] = TimeInZone.now
  // -> TimeInZone[Melbourne] UTC: '2015-04-28T12:37:00.000Z'

  // "Now" in the given TZ: 
  val myParisTime: TimeInZone[TimeZone] = TimeInZone.now("Europe/Paris")
  // -> TimeInZone[Paris] UTC: '2015-04-28T12:37:23.000Z'

  // "Now" in UTC: 
  val myUTCTime: TimeInZone[UTC] = TimeInZone.nowUTC
  // -> TimeInZone[UTC] UTC: '2015-04-28T12:37:28.000Z'

  // If I pass millis, UTC is implied: 
  val myUTCTimeFromMillis: TimeInZone[UTC] = TimeInZone.fromUTCMillis(1430703466430L)
  // -> TimeInZone[UTC] UTC: '2015-05-04T01:37:46.430Z'

  // Reflective methods where the desired [TimeZone] affects the result:

  // Give me "now" on the West Coast:
  val paloAlto = TimeInZone[PST]
  // -> TimeInZone[PST] UTC: '2015-05-04T01:37:56.430Z'

  // Give me "then" on the West Coast:
  val paloAltoLastWeek = TimeInZone[PST](new DateTime().minusDays(7))
  // -> TimeInZone[PST] UTC: '2015-04-28T01:37:59.430Z'
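
A subset of these constructors could be sketched as follows. The assumptions are java.time instead of Joda-Time, a hypothetical ZoneOf type class, a hypothetical `at` method standing in for the "then" constructor, and PST mapped to "America/Los_Angeles":

```scala
import java.time.{Instant, ZoneId, ZoneOffset, ZonedDateTime}

sealed trait TimeZone
sealed trait UTC extends TimeZone
sealed trait PST extends TimeZone

// Hypothetical type class linking each phantom type to a real zone.
final case class ZoneOf[Z <: TimeZone](id: ZoneId)
object ZoneOf {
  implicit val utc: ZoneOf[UTC] = ZoneOf[UTC](ZoneOffset.UTC)
  implicit val pst: ZoneOf[PST] = ZoneOf[PST](ZoneId.of("America/Los_Angeles")) // assumed mapping
}

// Private constructor: the companion object is the only way to obtain an instance.
final class TimeInZone[Z <: TimeZone] private (val time: ZonedDateTime) {
  def utcMillis: Long = time.toInstant.toEpochMilli
  override def toString: String = s"TimeInZone($time)"
}

object TimeInZone {
  // "Now" in zone Z, so that TimeInZone[PST] reads as in the examples above.
  def apply[Z <: TimeZone](implicit zone: ZoneOf[Z]): TimeInZone[Z] =
    new TimeInZone[Z](ZonedDateTime.now(zone.id))

  // A given instant, viewed in zone Z.
  def at[Z <: TimeZone](instant: Instant)(implicit zone: ZoneOf[Z]): TimeInZone[Z] =
    new TimeInZone[Z](instant.atZone(zone.id))

  def nowUTC: TimeInZone[UTC] = apply[UTC]

  // Millis since the epoch carry no zone of their own, so UTC is implied.
  def fromUTCMillis(millis: Long): TimeInZone[UTC] = at[UTC](Instant.ofEpochMilli(millis))
}

object ConstructionDemo {
  val fromMillis: TimeInZone[UTC] = TimeInZone.fromUTCMillis(1430703466430L)
  val paloAlto: TimeInZone[PST] = TimeInZone[PST]
  val paloAltoThen: TimeInZone[PST] = TimeInZone.at[PST](Instant.ofEpochMilli(1430703466430L))
}
```

With the constructor private, a TimeInZone[PST] can only ever hold a value in the zone that ZoneOf[PST] names, so the tag and the data cannot drift apart.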

Tuesday, 28 April 2015

Strongly-Typed Time. Part 1: Rationale

I've quite recently become involved in an after-hours project that has a strong temporal component to it. Basically every interaction with the system will need to be labelled with a time, and they will constantly need to be compared and converted. Add to this the fact that the first beta customers are located on opposite sides of the Pacific, and that events can occur in a further 3 European countries, and a way to safely and unambiguously represent the time of something happening in a time and a place seems paramount.

While Joda-Time has undoubtedly made date/calendar/timezone manipulation a happier task for the JVM developer, I'm looking for something stronger. I can pass around a DateTime all day long (no pun intended) but until I inspect its TimeZone I can't be sure where it originated, or if it is in fact the canonical UTC.

As a result, there is nothing at compile time to stop me doing something like:

  def displayAllEventsBefore(allEvents:Seq[Event], threshold:DateTime) = {
 
    // allEvents have been normalized to UTC. But there's no way of knowing this:
    allEvents.filter(_.isBefore(threshold)).display    
  } 

  // ... much later, miles away
  val myTime = new DateTime() // Happens to be in "Europe/Paris"

  displayAllEventsBefore(events, myTime)

Which will work just fine most of the time, except when there's been an event in the last hour, when we won't see it. Or is it the other way around? Tricky, isn't it?

There's nothing in the type system to prevent these kinds of runtime problems. It comes down to developer diligence in naming/commenting/testing all the things - literally, every thing that uses a representation of time - to ensure correctness.

But hang on, aren't compilers really, really good at ensuring correctness?

Friday, 12 December 2014

Walking away from CloudBees Part 5 - Publishing and Fine-Tuning

Publishing private artefacts to a private Nexus repository

As per my new world order diagram, I decided to use my third and final free OpenShift node as a Nexus box, and what a great move that turned out to be. Without a doubt the easiest setup of a Nexus box I've ever experienced:

Log in to OpenShift
Click the Add Application... button
Scroll down to the Code Anything heading, and paste http://nexuscartridge-openshiftci.rhcloud.com/ into the URL textbox
Click Next, nominate the URL for the box, and wait a few minutes

Wow. More detail (if you need it) from OpenShift.

Publishing open-source artefacts to a public repository

As all of my open-source efforts are now written in Scala with SBT as the build tool, it was a simple matter to add the bintray-sbt plugin to each of them, allowing publication to BinTray, or more specifically, The Millhouse Group's little corner of it.

The only trick here was SSHing into the Jenkins Build slave (one time) and adding an ${OPENSHIFT_DATA_DIR}/.bintray/.credentials file so that an sbt publish would succeed.

Deployment of webapps to Heroku

As with most things open and/or free, someone has been here before - this blog post, together with the Heroku Jenkins Plugin README were a very good starting point for getting this all working.

In brief, the steps are:

Install the Heroku and Git Publisher Jenkins plugins
Grab your Heroku API key from your Account Settings page, and put it into Manage Jenkins -> Configure System -> Heroku -> API Key
Grab the details of the Heroku remote from your .git/config in your local repo, or from the "Git URL" in the Info on your app's Settings page on Heroku.
Set this up as an additional Git repo in your Jenkins build, and name it heroku. For safety, I like to name my other repo (i.e. the one holding the source that triggers builds) appropriately as well; it avoids confusion.
- Actual example:
- I name my source repo bitbucket
- Thus my Branch Specifier is bitbucket/master
Add a new Git Publisher Post-Build Action, that pushes to heroku/master when the build succeeds

Fine-tuning the OpenShift build setup

Having to do "Layer-8" timezone conversion when reading build logs is just annoying so put the slave node into your local time zone by navigating to (Manage Jenkins -> Manage Nodes -> Slave -> Configure icon -> Launch Method -> Advanced -> JVM Options) (phew!) and setting it to:

-Duser.home=${OPENSHIFT_DATA_DIR} -Duser.timezone="Australia/Melbourne" -XX:MaxPermSize=1M -Xmx2M -Xss128k

(You might need to consult the list of Java timezone ids)

The final pieces of the puzzle were configuring the "final destinations" of my private artifacts (my open-source work gets sent to BinTray courtesy of the bintray-sbt plugin). Details follow.

After that, a little bit of futzing around to get auto-triggered builds working from both GitHub and BitBucket, and I had everything back to normal, or possibly, even better - I now have unlimited app slots on Heroku versus four on CloudBees - and I'm somewhat insulated from outages of a single provider. Happy!

Tuesday, 4 November 2014

Walking away from CloudBees Episode 4: A New Hope

With CloudBees leaving the free online Jenkins scene, I was unable to Google up any obvious successors. Everyone seems to want cash for builds-as-a-service. It was looking increasingly likely that I would have to press some of my own hardware into service as a Jenkins host. And then I had an idea. As it turns out, one of those cloudy providers that I had previously dismissed, OpenShift, actually is exactly what is needed here!

The OpenShift Free Tier gives you three Small "gears" (OpenShift-speak for "machine instance"), and there's even a "cartridge" (OpenShift-speak for "template") for a Jenkins master!

There are quite a few resources to help with setting up a Jenkins master on OpenShift, so I won't repeat them, but it was really very easy, and so far, I haven't had to tweak the configuration of that box/machine/gear/cartridge/whatever at all. Awesome stuff. The only trick was that setting up at least one build-slave is compulsory - the master won't build anything for you. Again, there are some good pages to help you with this, and it's nothing too different to setting up a build slave on your own physical hardware - sharing SSH keys etc.

The next bit was slightly trickier; installing SBT onto an OpenShift Jenkins build slave. This blog post gave me 95 percent of the solution, which I then tweaked to get SBT 0.13.6 from the official source. This also introduced me to the Git-driven configuration system of OpenShift, which is super-cool, and properly immutable unlike things like Puppet. The following goes in .openshift/action_hooks/start in the Git repository for your build slave, and once you git push, the box gets stopped, wiped, and restarted with the new start script. If you introduce an error in your push, it gets rejected. Bliss.

cd $OPENSHIFT_DATA_DIR
if [[ -d sbt ]]; then
  echo "SBT installed"
else
  SBT_VERSION=0.13.6
  SBT_URL="https://dl.bintray.com/sbt/native-packages/sbt/${SBT_VERSION}/sbt-${SBT_VERSION}.tgz"
  echo Fetching SBT ${SBT_VERSION} from $SBT_URL
  echo Installing SBT ${SBT_VERSION} to $OPENSHIFT_DATA_DIR
  curl -L $SBT_URL  -o sbt.tgz
  tar zxvf sbt.tgz sbt
  rm sbt.tgz
fi

The next hurdle was getting SBT to not die because it can't write into $HOME on an OpenShift node, which was fixed by setting -Duser.home=${OPENSHIFT_DATA_DIR} when invoking SBT. (OPENSHIFT_DATA_DIR is the de-facto writeable place for persistent storage in OpenShift - you'll see it mentioned a few more times in this post)

But an "OpenShift Small gear" build slave is slow and severely RAM-restricted - so much so that at first, I was getting heaps of these during my builds:

...
Compiling 11 Scala sources to /var/lib/openshift//app-root/data/workspace//target/scala-2.11/test-classes... 
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
 at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
 at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
 at hudson.remoting.Request.call(Request.java:174)
 at hudson.remoting.Channel.call(Channel.java:742)
 at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:168)
 at com.sun.proxy.$Proxy45.join(Unknown Source)
 at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:956)
 at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:137)
 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
 at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
 at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
 at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:756)
 at hudson.model.Build$BuildExecution.build(Build.java:198)
 at hudson.model.Build$BuildExecution.doRun(Build.java:159)
 at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
 at hudson.model.Run.execute(Run.java:1706)
 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
 at hudson.model.ResourceController.execute(ResourceController.java:88)
 at hudson.model.Executor.run(Executor.java:232)
...

which is actually Jenkins losing contact with the build slave, because the slave has exceeded the 512MB memory limit and been forcibly terminated. The fact that it did this while compiling Scala - specifically while compiling Specs2 tests - reminds me of an interesting investigation done about compile time that pointed out how Specs2's trait-heavy style blows compilation times (and I suspect, resources) out horrendously compared to other frameworks - but that is for another day!

If you are experiencing these errors on OpenShift, you can actually confirm that it is a "memory limit violation" by reading a special counter that increments when the violation occurs. Note this count never resets, even if the gear is restarted, so you just need to watch for changes.

A temporary fix for these issues seemed to be running sbt test rather than sbt clean test; obviously this was using just slightly less heap space and getting away with it, but I felt very nervous at the fragility of not just this "solution" but also of the resulting artifact - if I'm going to the trouble of using a CI tool to publish these things, it seems a bit stupid to not build off a clean foundation.

So after a lot of trawling around and trying things, I found a two-fold solution to keeping an OpenShift Jenkins build slave beneath the fatal 512MB threshold.

Firstly, remember that while a build slave is executing a job there are actually two Java processes running - the "slave communication channel" (for want of a better phrase) and the job itself. The JVM for the slave channel can safely be tuned to consume very few resources, leaving more for the "main job". So, in the Jenkins node configuration for the build slave, under the "Advanced..." button, set the "JVM Options" to:

-Duser.home=${OPENSHIFT_DATA_DIR} -XX:MaxPermSize=1M -Xmx2M -Xss128k

Secondly, set some more JVM options for SBT to use - for SBT > 0.12.0 this is most easily done by providing a -mem argument, which will force sensible values for -Xms, -Xmx and -XX:MaxPermSize. Also, because "total memory used by the JVM" can be fairly-well approximated with the equation:

Max memory = [-Xmx] + [-XX:MaxPermSize] + number_of_threads * [-Xss]

it becomes apparent that it is very important to clamp down the Stack Size (-Xss), as a Scala build/test cycle can spin up a lot of threads. So each of my OpenShift Jenkins jobs now does this in an "Execute Shell":

export SBT_OPTS="-Duser.home=${OPENSHIFT_DATA_DIR} -Dbuild.version=$BUILD_NUMBER"
export JAVA_OPTS="-Xss128k"

# the -mem option will set -Xmx and -Xms to this number and PermGen to 2* this number
../../sbt/bin/sbt -mem 128 clean test

This combination seems to work quite nicely in the 512MB OpenShift Small gear.
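
To put rough numbers on the memory equation above, here is a small worked example. The thread count of 100 is purely hypothetical, and the 1 MB figure is the stack size a 64-bit Linux JVM typically uses when -Xss is not set; the heap and PermGen values are what `-mem 128` is described as setting:

```scala
object JvmMemoryEstimate {
  // Max memory = [-Xmx] + [-XX:MaxPermSize] + number_of_threads * [-Xss]
  // Everything is converted to kilobytes to keep the arithmetic in whole numbers.
  def maxMemoryKb(xmxMb: Int, maxPermMb: Int, threads: Int, xssKb: Int): Long =
    xmxMb * 1024L + maxPermMb * 1024L + threads.toLong * xssKb

  val threads = 100 // hypothetical thread count, for illustration only

  // sbt -mem 128 => -Xmx128M, PermGen at twice that (256M)
  val withDefaultStack: Long = maxMemoryKb(128, 256, threads, 1024) // assumed 1 MB stacks
  val withSmallStack: Long   = maxMemoryKb(128, 256, threads, 128)  // -Xss128k

  val gearLimitKb: Long = 512 * 1024L
}
```

Under these assumptions the job JVM tops out at 495,616 KB (484 MB) with 1 MB stacks, leaving only 28 MB of the gear for everything else, but at 406,016 KB (about 396.5 MB) with -Xss128k.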

Saturday, 1 November 2014

Walking away from Run@Cloud Part 3: Pause and Reflect

As a happy free-tier CloudBees user, my "build ecosystem" looked like this:

As CloudBees seem to have gone "Enterprise" in the worst possible way (from my perspective) and don't have any free offerings any more, I was now looking for:

Git repository hosting (for private repos - my open-source stuff is on GitHub)
A private Nexus instance to hold closed-source library artifacts
A public Nexus instance to hold open-source artifacts for public consumption
A "cloud" Jenkins instance to build both public- and private-repo-code when it changes;
- pushing private webapps to Heroku
- publishing private libs to the private Nexus
- pushing open-source libs to the public Nexus

... and all for as close to $0 as possible. Whew!

I did a load of Googling, and the result of this is an ecosystem that is far more "diverse" (a charitable way to say "dog's breakfast") but still satisfies all of the above criteria, and it's all free. More detail in blog posts to come, but here's what I've come up with:

Tuesday, 28 October 2014

Walking away from Run@Cloud. Part 2: A Smooth Transition

So, having selected Heroku as my new runtime platform, how to move my stuff on there?

On the day of their announcement, CloudBees provided an FAQ and a Migration Guide for their current customers.

In addition, Heroku most considerately have a CloudBees-to-Heroku migration guide (updated on the day of the CloudBees announcement, nice).

Setting up on Heroku proved delightfully simple, and with a git push heroku master from my machine, my first app was "migrated". Up and running, and actually (according to my simple metrics) responding more quickly than when it was hosted on CloudBees. Epic win, amirite?

Well, not entirely. The git push deploy method is all very well, but I dislike the implied trust it puts in the "pusher". How does anybody know what is in that push? Does it pass the tests? Does it even compile? When CloudBees was my end-to-end platform, I had the whole CI/CD chain thing happening so only verified, test-passing code actually made it through the gate. But Heroku doesn't offer such a thing - they just run what you push to them.

Well, if CloudBees wants to become the cloud Jenkins instance, and they continue to have a free offering, I will continue to use it. So let's get CloudBees building and testing my stuff, and then fire it over to Heroku to run it, all from a Jenkins instance on CloudBees.

Oh dear. CloudBees are no longer offering a free Jenkins service.

Back to the drawing-board!

Wednesday, 8 October 2014

Walking away from Run@Cloud. Part 1: Finding A Worthy Successor

In very disappointing news, last month CloudBees announced that they would be discontinuing their Run@Cloud service, a facility I have been happily using for a number of years.

I was using a number of CloudBees services, namely:

Dev@Cloud Repos - Git repositories
Dev@Cloud Builds - Jenkins in the cloud
Run@Cloud Apps - PaaS for hosted Java/Scala apps
Run@Cloud Database Service - for some MySQL instances
MongoHQ/Compose.io ecosystem service - MongoDB in the cloud

In short, a fair bit of stuff:

... the best bit being of course, that it was all free. So now I'm hunting for a new place for most, if not all, of this stuff to live, and run, for $0.00 or as close to it as conceivably possible. And before you mention it, I've done the build-it-yourself, host-it-yourself thing enough times to know that I do not ever want to do it again. It's most definitely not free for a start.

After a rather disappointing and fruitless tour around the block, it seemed there was only one solution that encompassed what I consider to be a true "run in the cloud" offering, for JVM-based apps, at zero cost. Heroku.