Friday, 4 May 2018

Raspberry Pi 3 Model B+

My Synology NAS is coming up to 10 years of age, and asking it to do all its usual functions, plus run a few solid Java apps: ... was all a bit much for its 700MHz ARM processor, and particularly its 256Mb of RAM. Jenkins was the final straw, so I was looking around for other low-power devices that could run these apps comfortably. One gigabyte of RAM being a definite requirement. My Googling came up with Raspberry Pi devices, which surprised me as I'd always considered them a little "weak" as general purpose servers, more for doing single duties or as clients.

But that was before I knew about the Raspberry Pi 3, Model B+. This little rocket boots up its Raspbian (tweaked Debian) OS in a few seconds, has 1Gb of RAM and a quad-core 1.4GHz ARM processor that does a great job with the Java workloads I'm throwing at it. And look at the thing - it's about the size of a pack of cards:
A quad-core server with 1Gb of RAM, sitting on 3TB of storage. LEGO piece for scale. I wonder what 1998-me would have made of that!

With wired and wireless Ethernet, scads of USB 3.0 ports and interesting GPIO pin possibilities, this thing is ideal for my home automation projects. And priced so affordably that (should it become necessary) running a fleet of these little guys is quite plausible. If like me, you had thought the Raspberry Pi was a bit of a toy, take another look!

Monday, 23 April 2018

Green Millhouse - Fixing the OpenHAB BroadLink Binding (part 2)

Development of the Broadlink OpenHAB binding has been going very well, but with just one scenario (and arguably the most-common one) not working reliably. With the OpenHAB server up-and-running and the Broadlink binding installed and configured for a particular disconnected device, turn the device on.

The copious logging I've added to the binding told me that this was resulting in an error with code -7 when the device was first detected, and I was a little stumped as to how to proceed. As many before me have done, I started entering random semi-related strings into Google to see what might turn up. And I hit gold. The author of the Python Broadlink library did some great work documenting the Broadlink protocol, and while perusing it a particular sentence jumped out at me:

You must obtain an authorisation key from the device before you can communicate.

Of course! When the binding first boots it calls authenticate for each device it knows about, and it does the same when a device setting is changed. But it does not when, after finding a newly-booted device, tries to get its status!

I added an authenticated boolean in the base Handler class, only setting it when authenticate() succeeds and clearing it whenever we lose contact with the device. Any attempt to communicate with the device first checks the boolean and does the authenticate() dance if necessary. And it works like a charm. We're very nearly there now - I just want to DRY up some of the payload-handling methods which seem to have been copy-pasted, add a 'presence' channel that mirrors the internal state of the binding, and remove the heavy dependence on IP addresses, and I think it's good to go.

Saturday, 31 March 2018

Green Millhouse - Fixing the OpenHAB BroadLink Binding (part 1)

You can follow along at Github, but my rebuilding of the Broadlink OpenHAB binding is nearing completion.

I've been building and testing locally with my A1 Air Quality Sensor, and since fixing some shared-state issues in the network layer, haven't yet experienced any of the reliability problems that plagued the original binding.

For reasons that aren't clear (because I'm working from a decompiled JAR file), the original binding was set up like this in the base Thing handler (which all Broadlink Things inherit from):
public class BroadlinkBaseThingHandler extends BaseThingHandler {
   private static DatagramSocket socket = null;
   static boolean commandRunning = false;
   ...
}

As soon as I saw those static members, alarms started ringing in my head - especially when combined with an inheritance model, you've got a definite "fragile base class" problem at compile-time, and untold misery at runtime when multiple subclasses start accessing the socket like it's their exclusive property!

An attempt to mitigate the race-conditions which must have abounded, the `commandRunning` boolean only complicated matters:
    public boolean sendDatagram(byte message[])
    {
        try
        {
            if(socket == null || socket.isClosed())
            {
                socket = new DatagramSocket();
                socket.setBroadcast(true);
            }
            InetAddress host = InetAddress.getByName(thingConfig.getIpAddress());
            int port = thingConfig.getPort();
            DatagramPacket sendPacket = new DatagramPacket(message, message.length, new InetSocketAddress(host, port));
            commandRunning = true;
            socket.send(sendPacket);
        }
        catch(IOException e)
        {
            logger.error("IO error for device '{}' during UDP command sending: {}", getThing().getUID(), e.getMessage());
            commandRunning = false;
            return false;
        }
        return true;
    }

    public byte[] receiveDatagram()
    {
        try {
            socket.setReuseAddress(true);
            socket.setSoTimeout(5000);
        } catch (SocketException se) {
            commandRunning = false;
            socket.close();
            return null;
        }

        if(!commandRunning) {
            logger.error("No command running - device '{}' should not be receiving at this time!", getThing().getUID());
            return null;
        }

        try
        {
            if(socket != null)
            {
                byte response[] = new byte[1024];
                DatagramPacket receivePacket = new DatagramPacket(response, response.length);
                socket.receive(receivePacket);
                response = receivePacket.getData();
                commandRunning = false;
                socket.close();
                return response;
            }
        }
        catch (SocketTimeoutException ste) {
            if(logger.isDebugEnabled()) {
                logger.debug("No further response received for device '{}'", getThing().getUID());
            }
        }

        catch(Exception e)
        {
            logger.error("IO Exception: '{}", e.getMessage());
        }

        commandRunning = false;
        return null;
    }

So we got a pseudo-semaphore that is trying to detect getting into a bad state (because of shared-state), but itself is shared-state, thereby experiencing the same unreliability.

Here's what the new code looks like:
public class BroadlinkBaseThingHandler extends BaseThingHandler {
    private DatagramSocket socket = null;
    ...

    public boolean sendDatagram(byte message[], String purpose) {
        try {
            logTrace("Sending " + purpose);
            if (socket == null || socket.isClosed()) {
                socket = new DatagramSocket();
                socket.setBroadcast(true);
                socket.setReuseAddress(true);
                socket.setSoTimeout(5000);
            }
            InetAddress host = InetAddress.getByName(thingConfig.getIpAddress());
            int port = thingConfig.getPort();
            DatagramPacket sendPacket = new DatagramPacket(message, message.length, new InetSocketAddress(host, port));
            socket.send(sendPacket);
        } catch (IOException e) {
            logger.error("IO error for device '{}' during UDP command sending: {}", getThing().getUID(), e.getMessage());
            return false;
        }
        logTrace("Sending " + purpose + " complete");
        return true;
    }

    public byte[] receiveDatagram(String purpose) {
        logTrace("Receiving " + purpose);

        try {
            if (socket == null) {
                logError("receiveDatagram " + purpose + " for socket was unexpectedly null");
            } else {
                byte response[] = new byte[1024];
                DatagramPacket receivePacket = new DatagramPacket(response, response.length);
                socket.receive(receivePacket);
                response = receivePacket.getData();
//                socket.close();
                logTrace("Receiving " + purpose + " complete (OK)");
                return response;
            }
        } catch (SocketTimeoutException ste) {
            logDebug("No further " + purpose + " response received for device");
        } catch (Exception e) {
            logger.error("While {} - IO Exception: '{}'", purpose, e.getMessage());
        }

        return null;
    }    


A lot less controversial I'd say. The key changes:
  • Each subclass instance (i.e. Thing) gets its own socket
  • No need to track commandRunning - an instance owns its socket outright
  • The socket gets configured just once, instead of being reconfigured between Tx- and Rx-time
  • Improved diagnostic logging that always outputs the ThingID, and the purpose of the call


The next phase is now stress-testing the binding with a second heterogeneous device (sadly I don't have another A1, that would be great for further tests), my RM3 Mini IR-blaster. I'll be trying adding and removing the devices at various times, seeing if I can trip the binding up. The final step will be making sure the Thing discovery process (which is the main reason to upgrade to OpenHAB 2, and is brilliant) is as good as it can be. After that, I'll be tidying up the code to meet the OpenHAB guidelines and hopefully getting this thing into the official release!

Sunday, 25 February 2018

Green Millhouse - Temp Monitoring 2 - Return of the BroadLink A1 Sensor!

So after giving up on the BroadLink A1 Air Quality Sensor a year ago, I'm delighted to report that is back in my good books after some extraordinary work from some OpenHAB contributors. Using some pretty amazing techniques they have been able to reverse-engineer the all-important crypto keys used by the Broadlink devices, thus "opening up" the protocol to API usage.

Here's the relevant OpenHAB forum post - it includes a link to a Beta-quality OpenHAB binding, which I duly installed to my Synology's OpenHAB 2 setup, and it showed itself to be pretty darn good. Both my A1 and my new BroadLink RM3 Mini (wifi-controlled remote blaster) were discovered immediately and worked great "out of the box".

However, I discovered that after an OpenHAB reboot (my Synology turns itself off each night and restarts each morning to save power) the BroadLink devices didn't come back online properly; it was also unreliable at polling multiple devices, and there were other niggly little issues identified by other forum members in the above thread. Worst of all, the original developer of the binding (one Cato Sognen) has since gone missing from the discussion, with no source code published anywhere!

Long story short, I've decided to take over the development of this binding - 90% of the work has been done, and thanks to the amazing JAD Decompiler, I was able to recover the vast majority of the source as if I'd written it myself. At the time of writing I am able to compile the binding and believe I have fixed the multiple-device problems (the code was using one shared static Socket instance and a shared mutable Boolean to try and control access to it...) and am looking at the bootup problems. And best of all, I'm doing the whole thing "in the open" over on Github - everyone is welcome to scrutinise and hopefully improve this binding, with a view to getting it included in the next official OpenHAB release.

Thursday, 25 January 2018

OpenShift - the 'f' is silent

So it's come to this.

After almost-exactly four years of free-tier OpenShift usage for Jenkins purposes, I have finally had to throw up my hands and declare it unworkable.

The first concern was earlier in 2017 when, with minimal notice, they announced the end-of-life of the OpenShift 2.0 platform which was serving me so well. Simultaneously they dropped the number of nodes available to free-tier customers from 3 to 1. A move I would have been fine with if there had been any way for me to pay them down here in Australia - a fact I lamented about almost 2 years ago.

Then, in the big "upgrade" to version 3, OpenShift disposed of what I considered to be their best feature - having the configuration of a node held under version control in Git; push a change, the node restarts with the new config. Awesome. Instead, version 3 handed us a complex new ecosystem of pods, containers, services, images, controllers, registries and applications, administered through a labyrinth of somewhat-complete and occasionally-buggy web pages. Truly a downgrade from my perspective.

The final straw was the extraordinarily fragile and flaky nature of the one-and-only node (or is it "pod"? Or "application"? I can't even tell any more) that I have running as a Jenkins master. Now this is hardly a taxing thing to run - I have a $5-per-month Vultr instance actually being a slave and doing real work - yet it seems to be unable to stay up reliably while doing such simple tasks as changing a job's configuration. It also makes "continuous integration" a bit of a joke if pushing to a repository doesn't actually end up running tests and building a new artefact because the node was unresponsive to the webhook from Github/Bitbucket. Sigh.

You can imagine how great it is to see this page when you've just hit "save" on the meticulously-detailed configuration for a brand new Jenkins job...

So, in what I hope is not a taste of things to come, I'm de-clouding my Jenkins instance and moving it back to the only "on-premises" bit of "server hardware" I still own - my Synology DS209 NAS. Stay tuned.

Friday, 29 December 2017

The best feeling in software development?

I've been spending a lot of time in Javascript-land these days and while it's pretty exciting, I sure do miss some of the luxuries that really emphasise the maturity and elegance of statically-typed Scala.
Here's an example. It's incredibly-simple, but one of the most satisfying things in software development in my opinion. First, a quick definition:
trait Insertable {
  val insertedAt: Long
} 

And now, the source of all the joy:
def findMostRecentlyInserted(xs:Set[Insertable]):Option[Insertable] = {

}

Yep, that's it. An empty function. It won't even compile.

But this moment, right here, with type signatures locked down and the cursor flashing in the right spot, is where your brain can finally shift up from boilerplate mode into full power. How am I going to get from that set of candidates to the right one? What will be the fastest way? Will it read well? How should I test this?

If I've got my "good boy" hat on I'll stop here and write a few tests, which will of course fail. But sometimes, I'll just let the Scala compiler guide me - a function this simple is a great example of where strong typing really helps prevent errors. Anyone else with me on this?

Thursday, 30 November 2017

bloop - Initial Thoughts

So the Scala Center just announced bloop - a tool completely focused on making the Scala edit/compile/test cycle as fast as possible.

This is awesome

SBT is a beast, but most of the time, its immense powers lie idle. I am totally happy to fire up one tool for checking/fetching dependencies, publishing to repositories and other such infrequent operations, and a different that is totally focused on coding.

bloop completely rings true with the UNIX philosophy (a tool should do one thing and do it well) which has time and again shown to be the best way to build systems; the key thing being composability of elements. I'm very excited about this new development which shows that Scala is truly a developer-focused language. Go Scala!