Thursday, 6 December 2018

Green Millhouse: OK Google, turn on the living room air conditioner

My Broadlink RM3 IR blaster has been working pretty well, so I thought I'd share how I've been using it with a couple of IR-controlled air conditioners in my house, to control them with the Google Home Assistant via OpenHAB.

The RM3 sits in a little niche that has line-of-sight to both these devices (a Daikin in the living room, and a Panasonic in the dining area). Using the RM-Bridge Android app, the fun2code website and the method I've documented over on the OpenHAB forums), I've learnt the ON and OFF codes for each device, and put them into transforms/broadlink.map:
PANASONIC_AIRCON_ON=D3ADB33F...

PANASONIC_AIRCON_OFF=13371337...

DAIKIN_AIRCON_ON=F00F00FAAFAA...

DAIKIN_AIRCON_OFF=F00D00FAAFAA...

The "basic" way of invoking the commands is by using a String Item for your IR-blaster device, and a Switch in your sitemap, like this:

items/remotecontrollable.items:
String RM3_MINI {channel="broadlink:rm3:34-ea-34-58-9d-5b:command"}

sitemaps/default.sitemap:
sitemap default label="My house" {
  Frame label="Aircon" {
    Switch
      item=RM3_MINI 
      label="Dining Area"
      mappings=[PANASONIC_AIRCON_ON="On", PANASONIC_AIRCON_OFF="Off"]
  }
  ...
}

Which gives you this:

... which is completely fine for basic UI control. But if you want the Google Home Assistant (aka "OK Google") to be able to operate it, it won't work. The reason for this is that the Switchable trait that you have to give the item can only take the simple values ON and OFF, not a string like PANASONIC_AIRCON_ON. So while it *might* work if you named your remote control commands ON and OFF, you're hosed if you want to add a second switchable device.
The best solution I could find was to set up a second Item, which is literally the most basic Switch you can have. You can also give it a label that makes it easier to remember and say when commanding your Google Home device. You then use a rule to issue the "command" to match the desired item's state. I'll demonstrate the difference by configuring the Living Room aircon in this Google-friendly way:

items/remotecontrollable.items:
String RM3_MINI {channel="broadlink:rm3:34-ea-34-58-9d-5b:command"}
Switch AC_LIVING_ONOFF "Living Room Air Conditioner" [ "Switchable" ] 

Notice the "label" in the Switch is the one that will be used for Google voice commands.

rules/remotecontrollable.rules:
rule "Translate Living Room ON/OFF to aircon state"
when
  Item AC_LIVING_ONOFF changed 
then 
  val isOn = (AC_LIVING_ONOFF.state.toString() == "ON")
  val daikinState = if(isOn) "DAIKIN_AIRCON_ON" else "DAIKIN_AIRCON_OFF"
  RM3_MINI.sendCommand(daikinState)
end

This rule keeps the controlling channel (RM3_MINI) in line with the human-input channel (the OpenHAB UI or the Google Home Assistant input). Finally the sitemap:

sitemaps/default.sitemap:
sitemap default label="My house" {
  Frame label="Aircon" {
    Switch item=AC_LIVING_ONOFF label="Living Room On/Off" 
  }
  ...
}

I quite like the fact that the gory detail of which command to send (the DAIKIN_AIRCON_ON stuff) is not exposed in the sitemap by doing it this way. You also get a nicer toggle switch as a result:



The final step is to ensure the AC_LIVING_ONOFF item is being exposed via the MyOpenHAB connector service, which in turn links to the Google Assistant integration. And now I can say "Hey Google, turn on the living room air conditioner" and within 5 seconds it's spinning up.

Wednesday, 7 November 2018

Green Millhouse - The Broadlink Binding part 3

Apologies to people who were interested in my Broadlink binding for OpenHAB 2.0 ... I migrated my code to the 2.4.0 family and things apparently stopped working. I was short on spare time so assumed something major had changed in the OpenHAB binding API and didn't have time to investigate. Turns out it was actually my own stupid fault, refactoring to clean up the code, I'd managed to cut out the "happy path" that actually updated OpenHAB with values from the device <facepalm />.

Anyway, back into it now with a couple of extra things rectified as well:
2.4.0-BETA-2.jar
This version also keeps track of the ScheduledFuture that is used to periodically poll the Broadlink device, and correctly calls cancel() on it if the Broadlink binding is getting disposed. This should put an end to those
...the handler was already disposed
errors flooding the logs.

2.4.0-BETA-3.jar
This version finally addresses the "reconnect" issues; i.e. when a device unexpectedly falls off the network, re-authentication with it would, more often than not, fail. The fix was to reset most of the state associated with the device once we lose contact with it. In particular, the packet counter, deviceId and deviceKey variables that we obtained from the previous authentication response.

As a result of this, my A1 Environmental Sensor seems to be working in all scenarios. Next up is long-term robustness testing while sorting out device discovery, making sure my RM3 Mini works, and checking the multiple heterogeneous devices will all play nicely together.

2.4.0-BETA-4.jar
This version reflects my testing with my RM3 Mini IR blaster. This device seems more sensitive to certain parameters in the authentication procedure - and it turned out that resetting the packet counter (as introduced in BETA-3) is not needed and can actually cause re-authentication to fail for this device. So there's actually less code in this release, but more functionality - which is always nice.
Device discovery is also working now too. From the OpenHAB Paper UI, go to
Configuration -> Things -> '+' -> Broadlink Binding and wait 10 seconds for the network broadcast to complete. If you have several Broadlink devices active on your network it may take a couple of scans to pick them all up.

2.4.0-BETA-5.jar
Small refactors for code cleanliness, e.g. grouping all networking functions together in NetworkUtils.java. Also added specific polling code for the SP3 smart switch device; prior to now this was using the same polling code as an SP2 device, but based on comments on this post from Jorg, it has its own function now, with extra diagnostic logging to hopefully pinpoint what is going on. I suspect that it doesn't reply in the same way as an SP2, so this is an "investigative" release; expect BETA-6 to follow with (hopefully) fixed support for the SP3. I may have to resort to buying one myself if we can't sort it out over the internet ;-)

2.4.0-BETA-6.jar
Fixed misidentification of an SP3 switch as an SP2 (appears to have been a typo in the original JAR). After investigating Jorg's incorrect polling issue, I took a look over at the python-broadlink module code for inspiration, and lo and behold, there was more going on there. This module seems to have had a lot of love (42 contributors!) and seems to have addressed all the quirks of these devices, so I had no hesitations in implementing the changes; namely that testing payload[4] == 1 is not sufficient. The Python code also checks against 3 and 0xFD. Looks like the LS bit of that byte is the one that matters, but if this works, that's fine!

2.4.0-BETA-7.jar
Added a ton more weird-and-wonderful variants of the RM2, courtesy of the aforementioned Python library. Broadlink seem to be constantly updating their device names (RM2, RM2 Pro Plus, RM2 Pro Plus R1, RM2 Pro Plus 2, etc etc), giving a new identification code to each one. I can't fathom a method to their coding scheme, so for the moment, we just have to play catch-up. In the case where we can't identify what seems to be a Broadlink device, I'm now logging (at ERROR level) the identification code we found, so that people out there can feature-request the support for their device.

2.4.0-BETA-8.jar
With thanks to excellent Github contributor FreddyFox, querying SP2/SP3 switch state should now be working. The old code attempted to decode the state of the switch from the encrypted payload, when of course it needs to be decrypted first. Thanks again FreddyFox!

2.4.0-BETA-9.jar
The main feature of this version is support for Dynamic IP Addresses.
There is a new switch available under the advanced Thing properties (open the SHOW MORE area in PaperUI) - it’s marked “Static IP” and it defaults to ON. If you are on a network where your Broadlink device might periodically be issued a different IP address, move it to the OFF position.
Now, if we lose communications with this Thing, rather than mark it OFFLINE, we’ll go into a mini-Discovery mode, and attempt to find the device on the network again, using its MAC address (which never changes). If we find the device, we update its current IP address in its Thing config, and everything continues without a hiccup. If we fail to find it, it goes OFFLINE as normal.
I should give credit to the LIFX binding which uses a very similar scheme - by chance I happened to come across it and realised it would be great for Broadlink devices too. As an extra bonus, it prompted me to redesign the entire discovery system and make it asynchronous; as a result it is now MUCH more reliable at finding devices (particularly if you have multiple Broadlink devices on your network) without having to scan repeatedly.

2.4.0-BETA-10.jar
This version fixes a few small issues:
  • RM3 devices can now have a polling frequency specified (this was an omission from the thing-types.xml configuration file)
  • Thing logging extracted to its own class and improved: Device status ONLINE/OFFLINE/undetermined shown in logs as ^/v/?
  • Each ThingHandler's network socket is now explicitly closed when we lose contact with the device. This seems to help subsequent reconnections.


2.4.0-BETA-11.jar
This version attempts to improve general reliability. The Broadlink network protocol always acknowledges every command sent to a device, so logically, the sendPacket() and receivePacket() functions have been coalesced to sendAndReceivePacket(). This in turn allowed for a simple retry mechanism to be implemented. If we time out waiting for a response from the device, we immediately retry, sending the packet again. Together with some improved logging, this should hopefully be enough to fix (or at least understand) devices prematurely being marked "offline" on unreliable networks.

Here's the latest JAR anyway, put it into your addons directory and let me know in the comments how it goes for you: https://dl.bintray.com/themillhousegroup/generic/org.openhab.binding.broadlink-2.4.0-BETA-11.jar

Friday, 24 August 2018

Building a retro game with React.js. Part 5 - Fill 'er up

By now I had a good portion of the "player's part" of the game complete. Movement around the game area, drawing lines, detecting intersection with existing lines and preventing illegal movement were all done, using the div-based "drawing" that I was familiar with. But now it was time to draw the filled colour block that results when a closed area has been completed. And I was looking at a wall of complexity that I didn't know how to scale.



The answer was to challenge my own comfort zone. Allow myself to quote, myself:
I could do that with an HTML canvas, but I'm (possibly/probably wrongly) not going to use a canvas. I'm drawing divs dammit!
After all, if ever there was a time and place to learn how to do something completely new, isn't it a personal "fun" project? After a quick scan around the canvas-in-React landscape I settled upon React-konva, which has turned out to be completely awesome for my needs, as evidenced by the amount of code changes actually needed to go from my div-based way of drawing a line to the Konva way:
The old way:

const Line = styled('div')`
  position: absolute;
  border: 1px solid ${FrenzyColors.CYAN};
  box-sizing: border-box;
  padding: 0;
`;

renderLine = (line, lineIndex) => {
  const [top, bottom] = minMax(line[2], line[4]);
  const [left, right] = minMax(line[1], line[3]);
  const height = (bottom - top) + 1;
  const width = (right - left) + 1;
  return <Line key={`line-${lineIndex}`}
                  style={{ top: `${top}px`, 
                           left: `${left}px`,
                           width: `${width}px`,
                           height: `${height}px` }} />;
}

The new way:

import { Line } from 'react-konva';

renderLine = (line, lineIndex) => {
  return (
    <Line
      key={`line-${lineIndex}`}
      points={line.slice(1)}
      stroke={FrenzyColors.CYAN}
      strokeWidth={2}
    />
  );
}

... and the jewel in the crown? Here's how simple it is to draw a closed polygon with the correct fill colour; by total fluke, my "model" for lines and polygons is virtually a one-to-one match with the Konva Line object, making this amazingly straightforward:
renderFilledArea = (filledPolygon) => {
  const { polygonLines, color } = filledPolygon;
  const flatPointsList = polygonLines.reduce((acc, l) => {
    const firstPair = l.slice(1, 3);
    return acc.concat(firstPair);
  }, []);

  return (
    <Line
      points={flatPointsList}
      stroke={FrenzyColors.CYAN}
      strokeWidth={2}
      closed
      fill={color}
    />
  );
}

Time Tracking
Probably at around 20 hours of work now.

Monday, 16 July 2018

Building a retro game with React.js. Part 4 - Drawing the line somewhere

In the previous instalment of this series, I was able to get the player's "sprite" (actually just a div with a border!) to move around the existing lines on the edge of the screen. The next logical step is to allow the player to draw their own lines, which, upon joining at both ends to existing lines, will become part of the "navigable" world the player can manoeuvre through.

Bugs Galore
It was at this point where I started being plagued by off-by-one errors; it seemed everywhere I turned I was encountering little one-pixel gaps when drawing lines, because:
  • My on-screen lines are actually 2px wide
  • My line-drawing function was doing an incorrect length calculation (had to do (right - left) + 1)
  • I was not updating my position at the right time, so was storing my "old" position as the current line's end point; and;
  • I was naively using setState and expecting the new this.state to be visible immediately

My solution to almost all of these problems (with the exception of the UI line-drawing function) was to write a heap of unit tests; these generally flushed things out pretty quickly.

Writing the line-drawing function was a weird experience. Virtually every software development "environment" I've ever used before, from BBC Basic on my Acorn Electron on, has had a function like drawLine(startX, startY, endX, endY);. And I could do that with an HTML canvas, but I'm (possibly/probably wrongly) not going to use a canvas. I'm drawing divs dammit! Here's what my function looks like:
renderLine = (line, lineIndex) => {
  const [top, bottom] = minMax(line[2], line[4]);
  const [left, right] = minMax(line[1], line[3]);
  const height = (bottom - top) + 1;
  const width = (right - left) + 1;
  return <Line key={`line-${lineIndex}`}
                  style={{ top: `${top}px`, 
                           left: `${left}px`,
                           width: `${width}px`,
                           height: `${height}px` }} />;
}
Where minMax is a simple function that returns [min, max] of its inputs, and Line is a React-Emotion styled div:
const Line = styled('div')`
  position: absolute;
  border: 1px solid ${FrenzyColors.CYAN};
  box-sizing: border-box;
  padding: 0;
`;
Notice that I resisted the temptation to pass the top, left etc into the Line wrapper. The reason for this is that doing so results in a whole new CSS class being created, and getting applied to the line, every time one of these computed values changes. This seems wasteful when most of the line's attributes remain the same! So I use an inline style to position the very-thin divs where I need it.
Time Tracking
Up to about 12 hours by my rough estimate.

Friday, 15 June 2018

Building a retro game with React.js. Part 3 - I Like To Move It

So with most of the graphical pieces in position, it's time to make things move around.

Again, starting with the easy stuff, I wanted the four directional keys to move the Player around. But in Frenzy, you can only move (as opposed to draw) along the boundaries of the game area and on lines you have already drawn. So if we look at my first iteration of the code in GameArea to handle a request to move the Player left, it's something like this:
 
update = () => {
  if (this.keyListener.isDown(this.keyListener.LEFT)) {
    this.moveLeft();
  }
};

moveLeft = () => {
  if (this.canMove(Direction.LEFT)) {
    this.setState({
       playerX : this.state.playerX -1
    });
  }
}
I ended up bundling quite a lot of smarts into the Direction enumeration in order to make the logic less "iffy" and more declarative. That one Direction.LEFT key encapsulates everything that is needed to check whether a) the player is on a line that has the correct orientation (horizontal) and b) there is room on that line to go further to the left.
A line looks like this:
[Orientation.HORIZONTAL, 0, 0, 478, 0], // startX, startY, endX, endY
and Direction looks like this:
export const Direction = {
  LEFT: {
    orientation: Orientation.HORIZONTAL,
    primaryCoord: (x, y) => y,
    lineToPrimaryCoord: (line) => line[2],
    secondaryCoord: (x, y) => x,
    testSecondary: (c, line) => c > Math.min(line[1], line[3])
  },
  ...
}

My test for whether I can move in a certain direction is:
static canPlayerMoveOnExistingLine = (playerX, playerY, direction, lines) => {
  const candidates = lines.filter(line => {
    return (line[0] === direction.orientation)
  });
    
  const pri = direction.primaryCoord(playerX, playerY);
  const primaryLines = candidates.filter(candidateLine => {
    return direction.lineToPrimaryCoord(candidateLine) === pri;
  });

  if (primaryLines.length > 0) {
    const sec = direction.secondaryCoord(playerX, playerY);
    const found = primaryLines.find(line => {
      return direction.testSecondary(sec, line);
    });

    return typeof found !== 'undefined';
  }
  return false;
} 
Declared static for ease of testing - easy and well worth doing for something like this where actually moving the player around is time-consuming and tedious. It's working well as it stands, although as we all know, naming things is hard. It's pretty easy to follow the process though. At this point I'm holding a lines array in this.state and doing filter and find operations on it as you can see above. We'll have to wait and see whether that will be fast enough. It may well be a good idea to keep a currentLine in state, as most of the time it will be unchanged from the last player movement. Next up, it's time to start drawing some new lines on the screen!

Kudos
I am starting to build up some tremendous respect for the original author of this game; although often dismissed as "very simple" there are some tricky little elements to coding this game and I'm only just scratching the surface. To achieve the necessary performance on an 8-bit, 1MHz processor with RAM measured in the handfuls of kilobytes is super impressive. Assembly language would have been necessary for speed, making the development and debugging a real pain. I haven't even started thinking about how to do the "fill" operation once a line has been drawn and it encloses some arbitrary space, but I suspect the original developer "sniffed" the graphics buffer to see what was at each pixel location - a "luxury" I don't think I'll have!
Time Tracking
Up to about 6 hours now.

Monday, 11 June 2018

Building a retro game with React.js. Part 2 - Screens

Continuing to nibble my way towards the hard parts, I've finished the basic designs of the welcome page, main game page and the paused-game page, including wiring up the key listeners so it all works as expected.

One thing I hadn't thought about was the behaviour of listening to the P key to pause and also resume. The first iteration of the code would flicker madly between the game page and the paused page whenever the P key was pressed - the game loop runs so fast that the app would transition multiple times between the two pages, because the isDown test would simply toggle the state of this.state.paused. I messed around for a while with storing the UNIX time when the key was first pressed, and comparing it to the current time, before realising I really just needed to know what "direction" was being requested, and then waiting for this "request" to finish before completing the transition.
...

debounce = (testCondition, key, newState) => {
  if (testCondition) {
    if (this.keyListener.isDown(key)) {
      // No. They are still holding it
      return true;
    }
    this.setState(newState);
   }
   return false;
};
 
handleKeysWhilePaused = () => {
  if (this.debounce(this.state.pauseRequested, P, {
    pauseRequested: false,
    paused: true
  })) {
    return;
  }

  if (this.debounce(this.state.unpauseRequested,P, {
    unpauseRequested: false,
    paused: false
  })) {
    return;
  }

  if (this.keyListener.isDown(Q)) {
    this.props.onGameExit();
  }

  if (this.keyListener.isDown(P)) {
    this.setState({unpauseRequested: true});
  }
}

handleGameplayKeys = () => {
  if (this.keyListener.isDown(P)) {
    this.setState({pauseRequested: true});
  }
}

update = () => {
  if (this.state.pauseRequested || this.state.paused) {
    this.handleKeysWhilePaused();
  } else {
    this.handleGameplayKeys();
  }
};
...
Effectively I've implemented an isUp(), which is surprisingly not in the react-game-kit library. Oh well, it's all good learning.
Time tracking
I'd estimate the total time spent on the game so far would be 3-4 hours.

Friday, 8 June 2018

A New Old Thing; Building a retro game with React.js. Part 1 - Background and Setup

I've blogged before about entering the fast-paced world of React.js, after a couple of years I'm still (on the whole) enjoying my day job working with it. Over this period React has done a pretty good job of delivering the "maintainable large JavaScript application" promise, but in the apps I built we've seen a few problems that stemmed from our developers' differing levels of experience with design patterns, immutability concepts, higher-order functions and higher-order components.

At the risk of being immodest, I'm comfortable with those concepts - Design Patterns from waaaay back and the functional paradigms from my five-year (and counting) love affair with Scala. What I wanted to explore was - what would happen if I built a React app by myself, endeavouring to write the cleanest, purest software based upon the best starting-point we currently have? How productive could I be? How long would it take to build a working full app? How would maintenance go? How quickly could I add a new feature?

As my day job is basically building CRUD apps, I wanted to do something a lot more fun for this side-project. And what could be more fun than a game? (Mental note: ask people working at Electronic Arts...) There's also a pleasing circularity in building a game and documenting how I did it - back in my earliest days of having a computer, aged about 7, I would buy magazines with program listings and laboriously enter them, line-by-line, while marvelling at how anyone could really have been good enough to know how to do this.

The Game
I'll admit, I've never built a non-trivial game before, but I think attempting an 8-bit home computer game I remember fondly from my childhood, on top of 2018's best front-end technologies, should be about the right level of difficulty.

The game I'll be replicating is called Frenzy; a Micro Power publication for the BBC B, Acorn Electron and Commodore 64. My machine was the Electron - basically a low-cost little brother to the mighty Beeb; highly limited in RAM and CPU, titles for this platform usually needed substantial trimming from their BBC B donor games, despite using the same BBC BASIC language.

Check out the links above for more details and screenshots, but the game is basically a simplified version of "Qix" or "Kix" where the object is to fill in areas of the screen without being hit by one or more moving enemies.

Just for the hell of it, I'm going to place this game front-and-centre on my homepage at http://www.themillhousegroup.com, which I just nuked for this purpose. The page is now a React app being served off a Play Scala backend as per my new-era architecture, and the key technologies I'm using so far are: I'm sure more will follow.

Initial Development
To develop the game, I decided to start from the start. The welcome page would need to be suitably old-skool but would force me to consider a few things:
  • What screen size should I be working to?
  • Can I get a suitably chunky, monospaced font?
  • Press Space to start sounds easy, but how do I make that actually work?
Decisions
The original Frenzy must have operated in the BBC's graphical MODE 1 because it used a whopping 4 colours and the pixels were square. So that means the native resolution was 320x256. While it would be tempting to stick to that screen size and thus have it fit on every smartphone screen, I've decided to double things and target a 640x512 effective canvas.
Some searching for 8-bit fonts led me to "Press Start 2P" which, while intended to honour Namco arcade machines, is near enough to the chunky fonts I remember fondly from my childhood home computer that I can't go past it:
As a tiny nod to the present, the "screen" is actually slightly transparent and has a drop shadow - showing how far we've come in 34 years!
The final piece of the welcome screen was achieved by mounting the FrenzyGame component in a React-Game-Kit Loop and using the KeyListener to subscribe to the keys I care about - a quick perusal of the demo game showed me how to use it:
class FrenzyGame extends Component {

  constructor(props) {
    super(props);
    this.keyListener = new KeyListener();

    this.state = {
      gameOver: true  
    };
  }

  componentDidMount() {
    this.loopSubscriptionId = this.context.loop.subscribe(this.update);
    this.keyListener.subscribe([
      this.keyListener.SPACE
    ]);
  }

  componentWillUnmount() {
    this.context.loop.unsubscribe(this.loopSubscriptionId);
    this.keyListener.unsubscribe();
  }

  update() {
    if (this.state.gameOver) {
      if (this.keyListener.isDown(this.keyListener.SPACE)) {
        this.setState({ gameOver: false });
      }
    }
  };

  ...

  render() {
    return this.state.gameOver 
      ? this.renderWelcomeScreen() 
      : this.renderGame();
  }
}

Friday, 4 May 2018

Raspberry Pi 3 Model B+

My Synology NAS is coming up to 10 years of age, and asking it to do all its usual functions, plus run a few solid Java apps: ... was all a bit much for its 700MHz ARM processor, and particularly its 256Mb of RAM. Jenkins was the final straw, so I was looking around for other low-power devices that could run these apps comfortably. One gigabyte of RAM being a definite requirement. My Googling came up with Raspberry Pi devices, which surprised me as I'd always considered them a little "weak" as general purpose servers, more for doing single duties or as clients.

But that was before I knew about the Raspberry Pi 3, Model B+. This little rocket boots up its Raspbian (tweaked Debian) OS in a few seconds, has 1Gb of RAM and a quad-core 1.4GHz ARM processor that does a great job with the Java workloads I'm throwing at it. And look at the thing - it's about the size of a pack of cards:
A quad-core server with 1Gb of RAM, sitting on 3TB of storage. LEGO piece for scale. I wonder what 1998-me would have made of that!

With wired and wireless Ethernet, scads of USB 3.0 ports and interesting GPIO pin possibilities, this thing is ideal for my home automation projects. And priced so affordably that (should it become necessary) running a fleet of these little guys is quite plausible. If like me, you had thought the Raspberry Pi was a bit of a toy, take another look!

Monday, 23 April 2018

Green Millhouse - Fixing the OpenHAB BroadLink Binding (part 2)

Development of the Broadlink OpenHAB binding has been going very well, but with just one scenario (and arguably the most-common one) not working reliably. With the OpenHAB server up-and-running and the Broadlink binding installed and configured for a particular disconnected device, turn the device on.

The copious logging I've added to the binding told me that this was resulting in an error with code -7 when the device was first detected, and I was a little stumped as to how to proceed. As many before me have done, I started entering random semi-related strings into Google to see what might turn up. And I hit gold. The author of the Python Broadlink library did some great work documenting the Broadlink protocol, and while perusing it a particular sentence jumped out at me:

You must obtain an authorisation key from the device before you can communicate.

Of course! When the binding first boots it calls authenticate for each device it knows about, and it does the same when a device setting is changed. But it does not when, after finding a newly-booted device, tries to get its status!

I added an authenticated boolean in the base Handler class, only setting it when authenticate() succeeds and clearing it whenever we lose contact with the device. Any attempt to communicate with the device first checks the boolean and does the authenticate() dance if necessary. And it works like a charm. We're very nearly there now - I just want to DRY up some of the payload-handling methods which seem to have been copy-pasted, add a 'presence' channel that mirrors the internal state of the binding, and remove the heavy dependence on IP addresses, and I think it's good to go.

Saturday, 31 March 2018

Green Millhouse - Fixing the OpenHAB BroadLink Binding (part 1)

You can follow along at Github, but my rebuilding of the Broadlink OpenHAB binding is nearing completion.

I've been building and testing locally with my A1 Air Quality Sensor, and since fixing some shared-state issues in the network layer, haven't yet experienced any of the reliability problems that plagued the original binding.

For reasons that aren't clear (because I'm working from a decompiled JAR file), the original binding was set up like this in the base Thing handler (which all Broadlink Things inherit from):
public class BroadlinkBaseThingHandler extends BaseThingHandler {
   private static DatagramSocket socket = null;
   static boolean commandRunning = false;
   ...
}

As soon as I saw those static members, alarms started ringing in my head - especially when combined with an inheritance model, you've got a definite "fragile base class" problem at compile-time, and untold misery at runtime when multiple subclasses start accessing the socket like it's their exclusive property!

An attempt to mitigate the race-conditions which must have abounded, the commandRunning boolean only complicated matters:
    public boolean sendDatagram(byte message[])
    {
        try
        {
            if(socket == null || socket.isClosed())
            {
                socket = new DatagramSocket();
                socket.setBroadcast(true);
            }
            InetAddress host = InetAddress.getByName(thingConfig.getIpAddress());
            int port = thingConfig.getPort();
            DatagramPacket sendPacket = new DatagramPacket(message, message.length, new InetSocketAddress(host, port));
            commandRunning = true;
            socket.send(sendPacket);
        }
        catch(IOException e)
        {
            logger.error("IO error for device '{}' during UDP command sending: {}", getThing().getUID(), e.getMessage());
            commandRunning = false;
            return false;
        }
        return true;
    }

    public byte[] receiveDatagram()
    {
        try {
            socket.setReuseAddress(true);
            socket.setSoTimeout(5000);
        } catch (SocketException se) {
            commandRunning = false;
            socket.close();
            return null;
        }

        if(!commandRunning) {
            logger.error("No command running - device '{}' should not be receiving at this time!", getThing().getUID());
            return null;
        }

        try
        {
            if(socket != null)
            {
                byte response[] = new byte[1024];
                DatagramPacket receivePacket = new DatagramPacket(response, response.length);
                socket.receive(receivePacket);
                response = receivePacket.getData();
                commandRunning = false;
                socket.close();
                return response;
            }
        }
        catch (SocketTimeoutException ste) {
            if(logger.isDebugEnabled()) {
                logger.debug("No further response received for device '{}'", getThing().getUID());
            }
        }

        catch(Exception e)
        {
            logger.error("IO Exception: '{}", e.getMessage());
        }

        commandRunning = false;
        return null;
    }

So we got a pseudo-semaphore that is trying to detect getting into a bad state (because of shared-state), but itself is shared-state, thereby experiencing the same unreliability.

Here's what the new code looks like:
public class BroadlinkBaseThingHandler extends BaseThingHandler {
    private DatagramSocket socket = null;
    ...

    public boolean sendDatagram(byte message[], String purpose) {
        try {
            logTrace("Sending " + purpose);
            if (socket == null || socket.isClosed()) {
                socket = new DatagramSocket();
                socket.setBroadcast(true);
                socket.setReuseAddress(true);
                socket.setSoTimeout(5000);
            }
            InetAddress host = InetAddress.getByName(thingConfig.getIpAddress());
            int port = thingConfig.getPort();
            DatagramPacket sendPacket = new DatagramPacket(message, message.length, new InetSocketAddress(host, port));
            socket.send(sendPacket);
        } catch (IOException e) {
            logger.error("IO error for device '{}' during UDP command sending: {}", getThing().getUID(), e.getMessage());
            return false;
        }
        logTrace("Sending " + purpose + " complete");
        return true;
    }

    public byte[] receiveDatagram(String purpose) {
        logTrace("Receiving " + purpose);

        try {
            if (socket == null) {
                logError("receiveDatagram " + purpose + " for socket was unexpectedly null");
            } else {
                byte response[] = new byte[1024];
                DatagramPacket receivePacket = new DatagramPacket(response, response.length);
                socket.receive(receivePacket);
                response = receivePacket.getData();
//                socket.close();
                logTrace("Receiving " + purpose + " complete (OK)");
                return response;
            }
        } catch (SocketTimeoutException ste) {
            logDebug("No further " + purpose + " response received for device");
        } catch (Exception e) {
            logger.error("While {} - IO Exception: '{}'", purpose, e.getMessage());
        }

        return null;
    }    


A lot less controversial I'd say. The key changes:
  • Each subclass instance (i.e. Thing) gets its own socket
  • No need to track commandRunning - an instance owns its socket outright
  • The socket gets configured just once, instead of being reconfigured between Tx- and Rx-time
  • Improved diagnostic logging that always outputs the ThingID, and the purpose of the call


The next phase is now stress-testing the binding with a second heterogeneous device (sadly I don't have another A1, that would be great for further tests), my RM3 Mini IR-blaster. I'll be trying adding and removing the devices at various times, seeing if I can trip the binding up. The final step will be making sure the Thing discovery process (which is the main reason to upgrade to OpenHAB 2, and is brilliant) is as good as it can be. After that, I'll be tidying up the code to meet the OpenHAB guidelines and hopefully getting this thing into the official release!

Sunday, 25 February 2018

Green Millhouse - Temp Monitoring 2 - Return of the BroadLink A1 Sensor!

So after giving up on the BroadLink A1 Air Quality Sensor a year ago, I'm delighted to report that is back in my good books after some extraordinary work from some OpenHAB contributors. Using some pretty amazing techniques they have been able to reverse-engineer the all-important crypto keys used by the Broadlink devices, thus "opening up" the protocol to API usage.

Here's the relevant OpenHAB forum post - it includes a link to a Beta-quality OpenHAB binding, which I duly installed to my Synology's OpenHAB 2 setup, and it showed itself to be pretty darn good. Both my A1 and my new BroadLink RM3 Mini (wifi-controlled remote blaster) were discovered immediately and worked great "out of the box".

However, I discovered that after an OpenHAB reboot (my Synology turns itself off each night and restarts each morning to save power) the BroadLink devices didn't come back online properly; it was also unreliable at polling multiple devices, and there were other niggly little issues identified by other forum members in the above thread. Worst of all, the original developer of the binding (one Cato Sognen) has since gone missing from the discussion, with no source code published anywhere!

Long story short, I've decided to take over the development of this binding - 90% of the work has been done, and thanks to the amazing JAD Decompiler, I was able to recover the vast majority of the source as if I'd written it myself. At the time of writing I am able to compile the binding and believe I have fixed the multiple-device problems (the code was using one shared static Socket instance and a shared mutable Boolean to try and control access to it...) and am looking at the bootup problems. And best of all, I'm doing the whole thing "in the open" over on Github - everyone is welcome to scrutinise and hopefully improve this binding, with a view to getting it included in the next official OpenHAB release.

Thursday, 25 January 2018

OpenShift - the 'f' is silent

So it's come to this.

After almost-exactly four years of free-tier OpenShift usage for Jenkins purposes, I have finally had to throw up my hands and declare it unworkable.

The first concern was earlier in 2017 when, with minimal notice, they announced the end-of-life of the OpenShift 2.0 platform which was serving me so well. Simultaneously they dropped the number of nodes available to free-tier customers from 3 to 1. A move I would have been fine with if there had been any way for me to pay them down here in Australia - a fact I lamented about almost 2 years ago.

Then, in the big "upgrade" to version 3, OpenShift disposed of what I considered to be their best feature - having the configuration of a node held under version control in Git; push a change, the node restarts with the new config. Awesome. Instead, version 3 handed us a complex new ecosystem of pods, containers, services, images, controllers, registries and applications, administered through a labyrinth of somewhat-complete and occasionally-buggy web pages. Truly a downgrade from my perspective.

The final straw was the extraordinarily fragile and flaky nature of the one-and-only node (or is it "pod"? Or "application"? I can't even tell any more) that I have running as a Jenkins master. Now this is hardly a taxing thing to run - I have a $5-per-month Vultr instance actually being a slave and doing real work - yet it seems to be unable to stay up reliably while doing such simple tasks as changing a job's configuration. It also makes "continuous integration" a bit of a joke if pushing to a repository doesn't actually end up running tests and building a new artefact because the node was unresponsive to the webhook from Github/Bitbucket. Sigh.

You can imagine how great it is to see this page when you've just hit "save" on the meticulously-detailed configuration for a brand new Jenkins job...

So, in what I hope is not a taste of things to come, I'm de-clouding my Jenkins instance and moving it back to the only "on-premises" bit of "server hardware" I still own - my Synology DS209 NAS. Stay tuned.