Thursday, 18 August 2016

Deep-diving into the Google pb= embedded map format

Building off the excellent work of Andrew Whitby, I wanted to go further in understanding this unusual format. Specifically because when trying to parse the lat-long of the marker out of it, the least couple of significant digits were always "off", and frustratingly, by a seemingly-random amount.

Let's take a look at a Google "Embed map" URL for a random lat-long. You can obtain one of these by clicking a random point on a Google Map, then clicking the lat-long hyperlink on the popup that appears at the bottom of the page. From there the map sidebar swings out; choose Share -> Embed map - that's your URL.
"https://www.google.com/maps/embed?pb=
!1m18!1m12!1m3!1d3152.8774048836685
!2d145.01352231578036
!3d-37.792912740624445!2m3!1f0!2f0!3f0!3m2!1i1024!2i768
!4f13.1!3m3!1m2!1s0x0%3A0x0!2zMzfCsDQ3JzM0LjUiUyAxNDXCsDAwJzU2LjYiRQ
!5e0!3m2!1sen!2sau!4v1471218824160"
Well, it's not pretty, but with the help of Andrew Whitby's cheat sheet and the comments from others, it turns out we can actually render it as a nested structure knowing that the format [id]m[n] means a structure (multi-field perhaps?) with n children in total - my IDE helped a lot here with indentation:
"https://www.google.com/maps/embed?pb=" +
    "!1m18" +
        "!1m12" +
            "!1m3" +
                "!1d3152.8774048836685" +
                "!2d145.01352231578036" +
                "!3d-37.792912740624445" +
            "!2m3" +
                "!1f0" +
                "!2f0" +
                "!3f0" +
            "!3m2" +
                "!1i1024" +
                "!2i768" +
            "!4f13.1" +
        "!3m3" +
            "!1m2" +
                "!1s0x0%3A0x0" +
                "!2zMzfCsDQ3JzM0LjUiUyAxNDXCsDAwJzU2LjYiRQ" +
        "!5e0" +
    "!3m2" +
        "!1sen" +
        "!2sau" +
    "!4v1471218824160"
It all (kinda) makes sense! You can see how a decoder could quite easily be able to count ! characters to decide that a bang-group (or could we call it an m-group?) has finished. I'm going to take a stab and say the e represents an enumerated type too - given that !5e0 is "roadmap" (default) mode and !5e1 forces "satellite" mode.

So this is all very well but it doesn't explain why URLs that I generate using the standard method don't actually put the lat-long I selected into the URL - yet they render perfectly! What do I mean? Well, the lat-long that I clicked on (i.e. the marker) for this example is actually:
               -37.792916,  145.015722
And yet in the URL it appears (kinda) as:
               -37.792912,  145.013522
Which is enough to be slightly, visibly, annoyingly, wrong if you're trying to use it as-is by parsing the URL. What I thought I needed to understand now was this section of the URL:

                "!1d3152.8774048836685" +
                "!2d145.01352231578036" +
                "!3d-37.792912740624445" +

Being the "scale" and centre points of the map. Then I realised - it's quite subtle, but for (one presumes) aesthetic appeal, Google doesn't put the map marker in the dead-centre of the map. So these co-ordinates are just the map centre. The marker itself is defined elsewhere. And there's only one place left. The mysterious z field:
   !2zMzfCsDQ3JzM0LjUiUyAxNDXCsDAwJzU2LjYiRQ
Sure enough, substituting the z-field from Mr. Whitby's example completely relocates the map to put the marker in the middle of Iowa. So now; how to decode this? Well, on a hunch I tried base64decode-ing it, and bingo:
% echo MzfCsDQ3JzM0LjUiUyAxNDXCsDAwJzU2LjYiRQ | base64 --decode
37°47'34.5"S 145°00'56.6"E
So there we have it. I can finally parse out the lat-long of the marker when given an embed URL. Hope it helps someone else out there...

Saturday, 30 July 2016

Vultr Jenkins Slave GO!

I was alerted to the existence of VULTR on Twitter - high-performance compute nodes at reasonable prices sounded like a winner for Jenkins build boxes. After the incredible flaming-hoop-jumping required to get OpenShift Jenkins slaves running (and able to complete builds without dying) it was a real pleasure to have the simplicity of root-access to a Debian (8.x/Jessie) box and far-higher limits on RAM.

I selected a "20Gb SSD / 1024Mb" instance located in "Silicon Valley" for my slave. Being on the opposite side of the US to my OpenShift boxes feels like a small, but important factor in preventing total catastrophe in the event of a datacenter outage.

Setup Steps

(All these steps should be performed as root):

User and access

Create a jenkins user:
addgroup jenkins
adduser jenkins --ingroup jenkins

Now grab the id_rsa.pub from your Jenkins master's .ssh directory and put it into /home/jenkins/.ssh/authorized_keys. In the Jenkins UI, set up a new set of credentials corresponding to this, using "use a file from the Jenkins master .ssh". (which by the way, on OpenShift will be located at /var/lib/openshift/{userid}/app-root/data/.ssh/jenkins_id_rsa).
I like to keep things organised, so I made a vultr.com "domain" container and then created the credentials inside.

Install Java

echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu xenial main" | tee /etc/apt/sources.list.d/webupd8team-java.list
echo "deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu xenial main" | tee -a /etc/apt/sources.list.d/webupd8team-java.list
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys EEA14886
apt-get update
apt-get install oracle-java8-installer

Install SBT

apt-get install apt-transport-https
echo "deb https://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823
apt-get update
apt-get install sbt

More useful bits

Git
apt-get install git
NodeJS
curl -sL https://deb.nodesource.com/setup_4.x | bash -
apt-get install nodejs

This machine is quite dramatically faster and has twice the RAM of my usual OpenShift nodes, making it extra-important to have those differences defined per-node instead of hard-coded into a job. One thing I was surprised to have to define was a memory limit for the JVM (via SBT's -mem argument) as I was getting "There is insufficient memory for the Java Runtime Environment to continue" errors when letting it choose its own upper limit. For posterity, here are the environment variables I have configured for my Vultr slave:

Wednesday, 29 June 2016

ReactJS - Early Thoughts

It is very-much a recruiter's term but I, like many who started out with "back-end" skills, am endeavouring to be a proper Full Stack Developer.

I have always been comfortable with HTML, and first applied a CSS rule (inline) waaay back in the Netscape 3.0 days:
  <a href="..." style="text-decoration: none;">Link</a>
I'm pretty sure there were HTML frames involved there too. Good times. Anyway, the Javascript world had always been a little too wild-and-crazy for me to get deeply into; browser support sucked, the "standard library" kinda sucked, and people had holy wars about how to structure even the simplest code.

Up to about 2015, I'd been happy enough with the basic tools - JQuery for DOM-smashing and AJAX, Underscore/Lodash for collection-manipulation, and bringing in Bootstrap's JS library for a little extra polish. The whole thing based on full HTML being rendered from a traditional multi-tiered server, ideally written in Scala. I got a lot done this way.

I had a couple of brushes with Angular (1.x) along the way and didn't really see the point; the Angular code was always still layered on top of "perfectly-good" HTML from the server. It was certainly better-structured than the usual JQuery mess but the hundreds of extra kilobytes to be downloaded just didn't seem to justify themselves.

Now this year, I've been working with a Real™ Front End project - that is, one that stands alone and consumes JSON from a back-end. This is using Webpack, Babel, ES6, ReactJS and Redux as its principal technologies. After 6 weeks of working on this, here are some of my first thoughts:
  • Good  ES6 goes a long way to making Javascript feel grown-up
  • Bad The whole Webpack-Babel-bundling thing feels really rough - so much configuration, so hard to follow
  • Good React-Hot-Reloading is super, super-nice. Automatic browser reloads that keep state are truly magic
  • Bad You can still completely, silently toast a React app with a misplaced comma ...
  • Good  ... but ESLint will probably tell you where you messed up
  • Bad It's still Javascript, I miss strong typing ...
  • Good  ... but React PropTypes can partly help to ensure arguments are there and (roughly) the right type
  • Good  Redux is a really neat way of compartmentalising state and state transitions - super-important in front-ends
  • Good  The React Way of flowing props down through components really helps with code structure


So yep, there were more Goods than Bads - I'm liking React, and I'm finally feeling like large apps can be built with JavaScript, get that complete separation between back- and front-ends, and be maintainable and practical for multiple people to work on concurrently. Stay tuned for more!

Thursday, 12 May 2016

Cloudy Continuous Integration Part 2 - Trigger-Happy

In Part 1 of this highly-sporadic series, I specified No Polling as a must-have for your build box. I have seen countless examples where otherwise-great toolchains are let down by dumb polling on behalf of the build server. Even worse is when a heap of jobs all go polling at the same time (e.g. a */5 * * * * cron expression or similar), resulting in terrible load spiking, and unfair (and possibly even wrong) build order.

Why do otherwise-excellent and smart engineers end up doing the kind of dumb polling in Jenkins that would keep them up at night if it was their code? Mainly because historically, it's been substantially harder to get properly-triggered job execution going in Jenkins. But things are getting better. The Jenkins GitHub Plugin does a terrific job of simplifying triggering, thanks to its convention-over-configuration approach - once you've nominated where your GitHub repo is, getting triggering is as simple as checking a box. Lovely.



Now, finally, it seems BitBucket have almost caught up in this regard. Naturally, as they offer (free) private repositories, there is a little bit more configuration required on the SCM side, but I can confirm that in May 2016, it works. There seem to have been a lot of changes going on under the hood at BitBucket, and the reliability of their triggering has suffered from week-to-week at times, but hopefully things will be solid now.

The 2016 Way to trigger Jenkins from BitBucket
  1. Firstly, there is now no need to configure a special user for triggering purposes
  2. Install the Jenkins BitBucket plugin. For your reference, I have 1.1.5
  3. In jobs that you want to be triggered, note there is a new "BitBucket" option under Build Triggers. You want this. If it was checked, uncheck the "polling" option and feel clean
  4. That's the Jenkins side done. Now flip to your BitBucket repo, and head to the Settings
  5. Under Integrations -> Webhooks add a new one, and fill it out something like this:
    Where jjj.rhcloud.com is your (in this case imaginary-OpenShift) Jenkins URL.
  6. Make sure you've included that trailing slash, and then you're done! Push some code to test
One of the nicest features of this Webhook-powered way of triggering is that you can actually view the details of each request and the response that came back from your Jenkins. This was completely opaque when using the Services integration that was until-recently the best option.
Hat-Tips to the following (but sadly outdated) bloggers

Wednesday, 27 April 2016

Happy and Healthy Heterogeneous Build Slaves in Jenkins

After moving off the CloudBees platform, one thing that quickly became apparent was that an OpenShift Jenkins build slave simply runs out of resources when asked to build moderately-complex Scala software, on two fronts - the 500Mb RAM hard limit is quickly hit during SBT builds (particularly during tests) and the 1Gb of disk space is also very limiting once a few dependencies have been pulled into the Ivy cache.

So a second slave was brought on line - my old Dell Inspiron 9300 laptop from 2006 - which (after an upgrade to 2Gb of RAM for a handful of dollars online) has done a sterling job. Running Ubuntu 14.04 Desktop edition seems to not tax the Intel Pentium M too badly, and it seemed crazy to get rid of that amazing 17" 1920x1200 screen for a pittance on eBay. Now at this point I had two slaves on line, with highly different capabilities.
Horses for Courses
The OpenShift node (slave1) has low RAM, slow CPU, very limited persistent storage but exceptionally quick network access (being located in a datacenter somewhere on the US East Coast), while the laptop (slave2) has a reasonable amount of RAM, moderate CPU, tons of disk but relatively slow transfer rates to the outside world, via ADSL2 down here in Australia. How to deal with all these differences when running jobs that could be farmed out to either node?

The solution is of course the classic layer of indirection that allows the different boxes to be addressed consistently. Here is the configuration for my slave1 Redhat box on OpenShift:


Note the -mem argument in the SBT_COMMAND which sets the -Xmx and -Xms to this number and PermGen to 2* this number, keeping a lid on resource usage. And here's slave2, the Ubuntu laptop, with no such restriction needed:


And here's what a typical build job looks like:


Caring for Special-needs Nodes
Finally, my disk-challenged slave1 node gets a couple of Jenkins jobs to tend to it. The first periodically runs a git gc in each .git directory under the Jenkins workspace (as per a Stack Overflow answer) - it runs quota before-and-after to show how much (if anything) was cleared up:



The second job periodically removes the target directory wherever it is found - SBT builds leave a lot of stuff in here that can really add up. Here's what it looks like:

Wednesday, 16 March 2016

Building reusable, type-safe Twirl components

I've been doing quite a lot of work on a Play Framework 2.4.x app recently, a hit upon a little problem that others have noted as well. I'm trying to make the view layer as nice a place to be as the "main" codebase - after all, it's all Scala - and so I'm extracting out anything re-usable into a components package.

Here's a simple example. I'm using Bootstrap (of course), and I'm using the table-striped class to add a little bit of interest to tabular data. The setup of an HTML table is quite verbose and definitely doesn't need to be repeated, so I started with the following basic structure:
@(items:Seq[_], headings:Seq[String] = Nil)
  <table class="table table-striped">
      @if(headings.nonEmpty) {
      <thead>
          <tr>
            @for(heading <- headings) {
                <th>@heading</th>
            }
          </tr>
      </thead>
      }
      <tbody>
        @for(item <- items) {
            <tr>
                ???
            </tr>
          }
        }
      </tbody>
  </table>
Which neatens up the call-site from 20-odd lines to one:
  @stripedtable(userList, Seq("Name", "Age")

Except. How do I render each row in the table body? That differs for every use case!
What I really wanted was to be able to map over each of the items, applying some client-provided function to render a load of <td>...</td> cells for each one. Basically, I wanted stripedtable to have this signature:
@(items:Seq[T], headings:Seq[String] = Nil)(fn: T => Html)
With the body simply being:
   @for(item <- items) {
      <tr>
        @fn(item)
      </tr>
   }
and client code looking like this:
  @stripedtable(userList, Seq("Name", "Age") { user:User =>
    <td>@user.name</td><td>@user.age</td>
  }
...aaaaand we have a big problem. At least at time of writing, Twirl templates cannot be given type arguments. So those [T]'s just won't work. Loosening off the types like this:
@(items:Seq[_], headings:Seq[String] = Nil)(fn: Any => Html)
will compile, but the call-site won't work because the compiler has no idea that the _ and the Any are referring to the same type. Workaround solutions? There are two, depending on how explosively you want type mismatches to fail:
Option 1: Supply a case as the row renderer
  @stripedtable(userList, Seq("Name", "Age") { case user:User =>
    <td>@user.name</td><td>@user.age</td>
  }
This works fine, as long as every item in userList is in fact a User - if not, you get a big fat MatchError.
Option 2: Supply a case as the row renderer, and accept a PartialFunction
The template signature becomes:
@(items:Seq[_],hdgs:Seq[String] = Nil)(f: PartialFunction[Any, Html])
and we tweak the body slightly:
   @for(item <- items) {
      @if(fn.isDefinedAt(item)) {
        <tr>
          @fn(item)
        </tr>
      }
   }
In this scenario, we've protected ourselves against type mismatches, and simply skip anything that's not what we expect. Either way, I can't currently conceive of a more succinct, reusable, and obvious way to drop a consistently-built, styled table into a page than this:
  @stripedtable(userList, Seq("Name", "Age") { case user:User =>
    <td>@user.name</td><td>@user.age</td>
  }

Friday, 4 March 2016

Unbreaking the Heroku Jenkins Plugin

TL;DR: If you need a Heroku Jenkins Plugin that doesn't barf when you Set Properties, here you go.

CI Indistinguishable From Magic

I'm extremely happy with my OpenShift-based Jenkins CI setup that deploys to Heroku. It really does do the business, and the price simply cannot be beaten.

Know Thy Release

Too many times, at too many workplaces, I have faced the problem of trying to determine Is this the latest code? from "the front end". Determined not to have this problem in my own apps, I've been employing a couple of tricks for a few years now that give excellent traceability.

Firstly, I use the nifty sbt-buildinfo plugin that allows build-time values to be injected into source code. A perfect match for Jenkins builds, it creates a Scala object that can then be accessed as if it contained hard-coded values. Here's what I put in my build.sbt:

buildInfoSettings

sourceGenerators in Compile <+= buildInfo

buildInfoKeys := Seq[BuildInfoKey](name, version, scalaVersion, sbtVersion)

// Injected via Jenkins - these props are set at build time: 
buildInfoKeys ++= Seq[BuildInfoKey](
  "extraInfo" -> scala.util.Properties.envOrElse("EXTRA_INFO", "N/A"),
  "builtBy"   -> scala.util.Properties.envOrElse("NODE_NAME", "N/A"),
  "builtAt"   -> new java.util.Date().toString)

buildInfoPackage := "com.themillhousegroup.myproject.utils"
The Jenkins Wiki has a really useful list of available properties which you can plunder to your heart's content. It's definitely well worth creating a health or build-info page that exposes these.

Adding Value with the Heroku Jenkins Plugin

Although Heroku works spectacularly well with a simple git push, the Heroku Jenkins Plugin adds a couple of extra tricks that are very worthwhile, such as being able to place your app into/out-of "maintenance mode" - but the most pertinent here is the Heroku: Set Configuration build step. Adding this step to your build allows you to set any number of environment variables in the Heroku App that you are about to push to. You can imagine how useful this is when combined with the sbt-buildinfo plugin described above!

Here's what it looks like for one of my projects, where the built Play project is pushed to a test environment on Heroku:

Notice how I set HEROKU_ENV, which I then use in my app to determine whether key features (for example, Google Analytics) are enabled or not.

Here are a couple of helper classes that I've used repeatedly (ooh! time for a new library!) in my Heroku projects for this purpose:

import scala.util.Properties

object EnvNames {
  val DEV   = "dev"
  val TEST  = "test"
  val PROD  = "prod"
  val STAGE = "stage"
}

object HerokuApp {
  lazy val herokuEnv = Properties.envOrElse("HEROKU_ENV", EnvNames.DEV)
  lazy val isProd = (EnvNames.PROD == herokuEnv)
  lazy val isStage = (EnvNames.STAGE == herokuEnv)
  lazy val isDev = (EnvNames.DEV == herokuEnv)
 
  def ifProd[T](prod:T):Option[T] = if (isProd) Some(prod) else None

  def ifProdElse[T](prod:T, nonProd:T):T = {
    if (isProd) prod else nonProd
  }
}

... And then it all went pear-shaped

I had quite a number of Play 2.x apps using this Jenkins+Heroku+BuildInfo arrangement to great success. But then at some point (around September 2015 as far as I can tell) the Heroku Jenkins Plugin started throwing an exception while trying to Set Configuration. For the benefit of any desperate Google-trawlers, it looks like this:

  at com.heroku.api.parser.Json.parse(Json.java:73)
  at com.heroku.api.request.releases.ListReleases.getResponse(ListReleases.java:63)
  at com.heroku.api.request.releases.ListReleases.getResponse(ListReleases.java:22)
  at com.heroku.api.connection.JerseyClientAsyncConnection$1.handleResponse(JerseyClientAsyncConnection.java:79)
  at com.heroku.api.connection.JerseyClientAsyncConnection$1.get(JerseyClientAsyncConnection.java:71)
  at com.heroku.api.connection.JerseyClientAsyncConnection.execute(JerseyClientAsyncConnection.java:87)
  at com.heroku.api.HerokuAPI.listReleases(HerokuAPI.java:296)
  at com.heroku.ConfigAdd.perform(ConfigAdd.java:55)
  at com.heroku.AbstractHerokuBuildStep.perform(AbstractHerokuBuildStep.java:114)
  at com.heroku.ConfigAdd.perform(ConfigAdd.java:22)
  at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
  at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:761)
  at hudson.model.Build$BuildExecution.build(Build.java:203)
  at hudson.model.Build$BuildExecution.doRun(Build.java:160)
  at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:536)
  at hudson.model.Run.execute(Run.java:1741)
  at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
  at hudson.model.ResourceController.execute(ResourceController.java:98)
  at hudson.model.Executor.run(Executor.java:374)
Caused by: com.heroku.api.exception.ParseException: Unable to parse data.
  at com.heroku.api.parser.JerseyClientJsonParser.parse(JerseyClientJsonParser.java:24)
  at com.heroku.api.parser.Json.parse(Json.java:70)
  ... 18 more 
Caused by: org.codehaus.jackson.map.JsonMappingException: Can not deserialize instance of java.lang.String out of START_OBJECT token
 at [Source: [B@176e40b; line: 1, column: 473] (through reference chain: com.heroku.api.Release["pstable"])
  at org.codehaus.jackson.map.JsonMappingException.from(JsonMappingException.java:160)
  at org.codehaus.jackson.map.deser.StdDeserializationContext.mappingException(StdDeserializationContext.java:198)
  at org.codehaus.jackson.map.deser.StdDeserializer$StringDeserializer.deserialize(StdDeserializer.java:656)
  at org.codehaus.jackson.map.deser.StdDeserializer$StringDeserializer.deserialize(StdDeserializer.java:625)
  at org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeserializer.java:235)
  at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:165)
  at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:25)
  at org.codehaus.jackson.map.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:230)
  at org.codehaus.jackson.map.deser.SettableBeanProperty$MethodProperty.deserializeAndSet(SettableBeanProperty.java:334)
  at org.codehaus.jackson.map.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:495)
  at org.codehaus.jackson.map.deser.BeanDeserializer.deserialize(BeanDeserializer.java:351)
  at org.codehaus.jackson.map.deser.CollectionDeserializer.deserialize(CollectionDeserializer.java:116)
  at org.codehaus.jackson.map.deser.CollectionDeserializer.deserialize(CollectionDeserializer.java:93)
  at org.codehaus.jackson.map.deser.CollectionDeserializer.deserialize(CollectionDeserializer.java:25)
  at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2131)
  at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1481)
  at com.heroku.api.parser.JerseyClientJsonParser.parse(JerseyClientJsonParser.java:22)
  ... 19 more 
Build step 'Heroku: Set Configuration' marked build as failure
Effectively, it looks like Heroku has changed the structure of their pstable object, and that the baked-into-a-JAR definition of it (Map<String, String> in Java) will no longer work.

Open-Source to the rescue

Although the Java APIs for Heroku have been untouched since 2012, and indeed the Jenkins Plugin itself was announced deprecated (without a suggested replacement) only a week ago, fortunately the whole shebang is open-source on Github so I took it upon myself to download the code and fix this thing. A lot of swearing, further downloading of increasingly-obscure Heroku libraries and general hacking later, and not only is the bug fixed:
- Map<String, String> pstable;
+ Map<String, Object> pstable;
But there are new tests to prove it, and a new Heroku Jenkins Plugin available here now. Grab this binary, and go to Manage Jenkins -> Manage Plugins -> Advanced -> Upload Plugin and drop it in. Reboot Jenkins, and you're all set.