Sunday, 27 August 2023

Searching for the next SPA

I've been quite taken with a particular style of casual, "clever" game that rose to prominence during The COVID Years but still has a charm that keeps me visiting almost daily:

  • Wordle (the original, and the "rags to riches" ideal)
  • Quordle (a beautiful initial implementation, albeit reduced now)
  • Heardle (recently escaped from the clutches of Spotify)
and most-recently:

  • Connections

There are a heap of common factors amongst these games (and I'll optimistically include my own Cardle here too) that I think make them feel so "nice":

  • Rejection of obvious monetization strategies
  • Feel resolutely mobile-first in UI/UX (large elements, zero scrolling!)
  • Delightful levels of polish (micro-interactions, animation etc)
  • Focused; not just a Single *Page* App, but almost a Single *Pane* App

Of course there's also the little matter of having a great idea with suitably nice mechanics and, frequently, creativity (Connections, I think, excels in having creative, challenging content by Wyna Liu that is pitched just *chef's kiss*) but I think there are still areas of the word-association, letter-oriented game landscape to be explored.

For this next one I will also be trying out Svelte after reading the superb "Things you forgot (or never knew) because of React" by Josh Collinsworth, which very nicely articulated what feels a little "ick" about React development these days, and paints a very nice picture of what's on the other side of the fence. The scoped-styling and in-built animation abilities in particular seem like a perfect fit for this kind of app.

Now I just need an idea...

Sunday, 30 July 2023

Can you handle the truth?

JavaScript/ECMAScript/TypeScript are officially everywhere these days and with them comes the idiomatic use of truthiness checking.

At work, recently I had to fix a nasty bug where the truthiness of an optional value was used to determine what "mode" to be in, instead of a perfectly-good enumerated type located nearby. Let me extrapolate this into a worked example that might show how dangerous this is:

 
type VehicleParameters = {
   roadSpeed: number;
   engineRPM: number;
   ...
   cruiseControlOn: boolean;
   cruiseControlSpeed: number | undefined;
}   
and imagine, running a few times a second, we had a function:

function maintainCruiseSpeed(vp: VehicleParameters) {
   const { roadSpeed, cruiseControlSpeed } = vp;
   
   // Truthiness of the stored speed is (wrongly) treated as "cruise is on"
   if (cruiseControlSpeed && roadSpeed < cruiseControlSpeed) {
      accelerate();
   }
}

Let's suppose the driver of this vehicle hits "SET" on their cruise control stalk to lock in their current speed of 100km/h as their desired automatically-maintained speed. The control module sets the cruiseControlOn boolean to true, and copies the current value of roadSpeed (being 100) into cruiseControlSpeed.

Now imagine the driver disengages cruise control, and the boolean is correctly set to false, but the cruiseControlSpeed is retained, as it is very common for a cruise system to have a RESUME feature that goes back to the previously-stored speed.

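And all of a sudden we have an Unintended Acceleration situation: cruiseControlSpeed is still set (and truthy), so maintainCruiseSpeed can keep calling accelerate() even though cruise control is off. Yikes.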
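
To make that concrete, here's a minimal sketch of the scenario. It assumes the check leans on the truthiness of cruiseControlSpeed as shorthand for "cruise is on"; the trimmed-down type and the two wouldAccelerate helpers are hypothetical, and they return a boolean rather than actually calling accelerate().

```typescript
// Hypothetical, trimmed-down version of the VehicleParameters type.
type VehicleParameters = {
  roadSpeed: number;
  cruiseControlOn: boolean;
  cruiseControlSpeed: number | undefined;
};

// Returns true when the (imaginary) accelerate() call would be made.
function wouldAccelerateTruthy(vp: VehicleParameters): boolean {
  const { roadSpeed, cruiseControlSpeed } = vp;
  // Truthiness of the stored speed stands in for "is cruise on?" - the bug.
  return Boolean(cruiseControlSpeed && roadSpeed < cruiseControlSpeed);
}

// Same comparison, but gated on the flag that actually records the mode.
function wouldAccelerateExplicit(vp: VehicleParameters): boolean {
  const { roadSpeed, cruiseControlOn, cruiseControlSpeed } = vp;
  return cruiseControlOn && cruiseControlSpeed !== undefined && roadSpeed < cruiseControlSpeed;
}

// Cruise was disengaged at 100km/h; the driver has since slowed to 60km/h.
const afterDisengage: VehicleParameters = {
  roadSpeed: 60,
  cruiseControlOn: false,
  cruiseControlSpeed: 100, // retained for RESUME
};

console.log(wouldAccelerateTruthy(afterDisengage));   // true
console.log(wouldAccelerateExplicit(afterDisengage)); // false
```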

As simple as can be, but no simpler

Don't get me wrong, I like terse code; one of the reasons I liked Scala so much was the succinctness after escaping from the famously long-winded Kingdom of Nouns. I also loathe redundant and/or underperforming fields, in particular Booleans that shadow another bit of state, e.g.:

  const [isLoggedIn] = useState(false);
  const [loggedInUser] = useState(undefined);

That kind of stuff drives me insane. What I really like is when we can be JavaScript-idiomatic AND use the power of TypeScript to prevent combinations of things that should not be. How?

TypeScript Unions have entered the chat

Let's define some types that model the behaviour we want:

  • When cruise is turned on we need a target speed, and there's no resume speed
  • When cruise is turned off we zero the target speed, and the resume speed
  • When cruise is set to coast (or the brake pedal is pressed) we zero the target speed, but store a resume speed
  • When cruise is set to resume we need a target speed to get back to, and there's no resume speed

type VehicleParameters = {
  roadSpeed: number;
  engineRPM: number;
  cruiseControlSettings: CruiseControlSettings;
} 

type CruiseControlSettings = 
       CruiseOnSettings | 
       CruiseOffSettings | 
       CruiseCoastSettings | 
       CruiseResumeSettings

type CruiseOnSettings = {
  mode: CruiseMode.CruiseOn
  targetSpeedKmh: number;
  resumeSpeedKmh: 0;
}
type CruiseOffSettings = {
  mode: CruiseMode.CruiseOff
  targetSpeedKmh: 0;
  resumeSpeedKmh: 0;
}
type CruiseCoastSettings = {
  mode: CruiseMode.CruiseCoast
  targetSpeedKmh: 0;
  resumeSpeedKmh: number;
}
type CruiseResumeSettings = {
  mode: CruiseMode.CruiseResume
  targetSpeedKmh: number;
  resumeSpeedKmh: 0;
}

Let's also write a new version of maintainCruiseSpeed, still in idiomatic ECMAScript (i.e. using truthiness):

function maintainCruiseSpeed(vp: VehicleParameters) {
  const { roadSpeed, cruiseControlSettings } = vp;
  
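  // Truthiness is safe here: the types force targetSpeedKmh to 0 unless cruise is on or resuming
  if (cruiseControlSettings.targetSpeedKmh && roadSpeed < cruiseControlSettings.targetSpeedKmh) {
    accelerate();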
  }
}

And finally, let's try and update the cruise settings to an illegal combination:

function illegallyUpdateCruiseSettings():CruiseControlSettings {
  return {
    mode: CruiseMode.CruiseOff,
    targetSpeedKmh: 120,
    resumeSpeedKmh: 99,
  }
}
... but notice now, you can't; you get a TypeScript error:
Type 
'{ mode: CruiseMode.CruiseOff; targetSpeedKmh: 120; 
resumeSpeedKmh: number; }'
is not assignable to type 'CruiseControlSettings'.
  Types of property 'targetSpeedKmh' are incompatible.
    Type '120' is not assignable to type '0'

I'm not suggesting that TypeScript types will unequivocally save your critical code from endangering human life, but a little thought expended on sensibly modelling conditions just might help.
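
For completeness, here's a minimal sketch of how the mode field itself could drive the decision. The CruiseMode enum is an assumption (the types above use it but its definition is never shown), the union is written inline for brevity, and shouldAccelerate is a hypothetical helper rather than anything from real vehicle code.

```typescript
// Assumed definition; the types above reference CruiseMode without showing it.
enum CruiseMode {
  CruiseOn,
  CruiseOff,
  CruiseCoast,
  CruiseResume,
}

type CruiseControlSettings =
  | { mode: CruiseMode.CruiseOn; targetSpeedKmh: number; resumeSpeedKmh: 0 }
  | { mode: CruiseMode.CruiseOff; targetSpeedKmh: 0; resumeSpeedKmh: 0 }
  | { mode: CruiseMode.CruiseCoast; targetSpeedKmh: 0; resumeSpeedKmh: number }
  | { mode: CruiseMode.CruiseResume; targetSpeedKmh: number; resumeSpeedKmh: 0 };

// Hypothetical helper: decide on `mode`, never on the truthiness of a speed.
function shouldAccelerate(roadSpeed: number, settings: CruiseControlSettings): boolean {
  switch (settings.mode) {
    case CruiseMode.CruiseOn:
    case CruiseMode.CruiseResume:
      return roadSpeed < settings.targetSpeedKmh;
    case CruiseMode.CruiseOff:
    case CruiseMode.CruiseCoast:
      return false;
  }
}

const coasting: CruiseControlSettings = {
  mode: CruiseMode.CruiseCoast,
  targetSpeedKmh: 0,
  resumeSpeedKmh: 100,
};
const cruising: CruiseControlSettings = {
  mode: CruiseMode.CruiseOn,
  targetSpeedKmh: 100,
  resumeSpeedKmh: 0,
};

console.log(shouldAccelerate(60, coasting)); // false
console.log(shouldAccelerate(60, cruising)); // true
```

Because the switch covers every member of the union, adding a fifth mode later should produce a compile error here until it's handled.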

Sunday, 25 June 2023

In praise of ETL, part three: Love-you-Load-time

Finishing off my three-part series as a newly-minted ETL fanboi, we get to the Load stage. One could be forgiven for thinking there is not a great deal to get excited about at this point; throw data at the new system and it's Job Done. But as usual with ETL, there are hidden pleasures lurking that you might not have considered until you've got your feet wet.

Mechanical Sympathy

In the "customer migration" ETL project I worked on, the output of the Transform stage for a given customer was dumped into a JSON file in an AWS S3 bucket. At Load time, the content of a bucket was scooped out and fed into the target system. Something we noticed quite quickly as the migration project ramped up was that the new system did not "like" being hit with too many "create new customer" API calls per second. It was pretty simple to implement a rate limit system in the Load stage (only!) to ensure we were being mechanically-sympathetic to the new system, while still being able to go as fast as possible in the other stages.
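
As a rough illustration (not the code we actually ran), a Load-only rate limit can be as small as the sketch below. LoadRateLimiter, loadCustomers and the createCustomer callback are all hypothetical names; the clock is injectable purely so the spacing logic can be checked without real waiting.

```typescript
// Hypothetical limiter: spaces calls at least (1000 / maxPerSecond) ms apart.
class LoadRateLimiter {
  private nextSlotMs = 0;
  private readonly intervalMs: number;
  private readonly now: () => number;

  constructor(maxPerSecond: number, now: () => number = () => Date.now()) {
    this.intervalMs = 1000 / maxPerSecond;
    this.now = now;
  }

  // Reserves the next free slot and returns how long the caller should wait for it.
  reserveDelayMs(): number {
    const current = this.now();
    const slot = Math.max(current, this.nextSlotMs);
    this.nextSlotMs = slot + this.intervalMs;
    return slot - current;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// `createCustomer` stands in for the target system's "create new customer" API call.
async function loadCustomers<T>(
  customers: T[],
  createCustomer: (customer: T) => Promise<void>,
  limiter: LoadRateLimiter,
): Promise<void> {
  for (const customer of customers) {
    await sleep(limiter.reserveDelayMs());
    await createCustomer(customer);
  }
}
```

Only the Load stage holds a limiter like this, so Extract and Transform are free to run at whatever pace suits them.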

Optimal Throughput

Indeed, we had a similar rate-limit in our Extract for the benefit of our source system(s) - albeit at a higher rate as its API seemed to be able to handle reading a fair bit faster than the new system's API could write. And there's another benefit - we weren't being throttled by the speed of the slower system; we could still extract as fast as the source would allow, transform and buffer into S3, then load at the optimal speed for the new system. You could get fancy and call it Elastic Scaling or somesuch, but really, if we'd used some monolithic process to try and do these customer migrations, we wouldn't have had the fine-grained control.

Idempotency is Imperative

One last tip; strive to ensure your Load stage does not alter the output of the Transform in any way, or you'll lose one of the key advantages of the whole ETL architecture. If you can't look at a transform file (e.g. a JSON blob in an S3 bucket in our case) and know that it's exactly what was sent to the target system, then your debugging just got a whole lot harder. Even something as innocent as populating a createdAt field with new Date() could well bite you (for example if the date has to be in a particular format). If you've got to do something like that, consider passing the date in, in the correct format, as an additional parameter to the Load stage, so there's at least some evidence of what the field was actually set to. There's really nothing worse than not being able to say with confidence what you actually sent to the target system.

We didn't do this, but if there were a "next time" I'd also store a copy of this payload in an S3 bucket, just for quick verification purposes.
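
A minimal sketch of the "pass the date in" idea, under the assumption that the Load stage builds its payload in one small function. TransformedCustomer and buildLoadPayload are hypothetical names, and the field values are made up for illustration.

```typescript
// Hypothetical shape of a Transform output file.
type TransformedCustomer = { id: string; name: string };

// Builds the exact payload for the target system. There is no new Date() in here:
// the timestamp arrives pre-formatted, so the same inputs always give the same payload.
function buildLoadPayload(transformed: TransformedCustomer, createdAt: string) {
  return { ...transformed, createdAt };
}

const payload = buildLoadPayload(
  { id: "c-123", name: "Example Pty Ltd" },
  "2023-06-25T00:00:00Z",
);

console.log(JSON.stringify(payload));
// {"id":"c-123","name":"Example Pty Ltd","createdAt":"2023-06-25T00:00:00Z"}
```

With the timestamp recorded as an input to the job, the transform file plus that one parameter is enough to reconstruct what was sent.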

Saturday, 6 May 2023

In praise of ETL, part two; Trouble-free Transforms

Continuing with my series about the unexpected pleasures of ETL, we get to the Transform stage.

I've already mentioned in the previous post the huge benefit of separating the extraction and transformation stages, giving an almost-limitless source of test fixtures for transform unit tests. On the subject of testing, it strikes me that there's a parallel in ETL with the classic "cost of failure" model that is typically used to justify adoption of unit tests to dinosaur-era organisations that don't use them; viz:

(Graph from DeepSource.com)

My contention is that failure in each E/T/L stage has a similar cost profile (of course YMMV, but it applied in our customer-migration scenario):

An error during Extract
  • (Assuming an absolute minimum of logic exists in the extraction code)
  • Most likely due to a source system being overloaded/overwhelmed by sheer number of extraction requests occurring at once
  • Throttle them, and retry the process
  • Easy and cheap to restart
  • Overall INEXPENSIVE
An error during Transform
  • Easy to reproduce via unit tests/fixtures
  • Rebuild, redeploy, re-run
  • Overall MEDIUM EXPENSE
An error during Load
  • Most likely in "somebody else's code"
  • Investigation/rectification may require cross-functional/cross-team/cross-company communications
  • Re-run for this scenario may be blocked if target system needs to be cleared down
  • Overall HIGH EXPENSE

Thus it behooves us (such a great vintage phrase that!) to get our transforms nice and tight; heavy unit-testing is an obvious solution here but also careful consideration of what the approach to transforming questionable data should be. In our case, our initial transform attempts took the pragmatic, Postel-esque "accept garbage, don't throw an error, return something sensible" approach. So upon encountering invalid data, for example, we'd log a warning, and transform it to an undefined object or empty array as appropriate.

This turned out to be a problem, as we weren't getting enough feedback about the sheer amount of bad input data we were simply skimming over, resulting in gaps in the data being loaded into the new system.

So in the next phase of development, we became willingly, brutally "fragile", throwing an error as soon as we encountered input data that wasn't ideal. This would obviously result in a lot of failed ETL jobs, but it alerted us to the problems which we could then mitigate in the source system or with code fixes (and unit tests) as needed.

Interestingly, it turned out that in the "long tail" of the customer migration project, we had to return (somewhat) to the "permissive mode" in order to get particularly difficult customer accounts migrated. The approach at that point was to migrate them with known holes in their data, and fix them in the TARGET system.

Here's my crude visualisation of it. I don't know if this mode of code evolution has a name but I found it interesting.
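
In code terms, the two modes described above might look something like this sketch. The Mode type, the SourceCustomer shape and transformPhones are hypothetical; the real transforms handled far more fields than one list of phone numbers.

```typescript
type Mode = "strict" | "permissive";

// Hypothetical source record: phone numbers should be a list of strings, but may be anything.
type SourceCustomer = { id: string; phones?: unknown };

function transformPhones(source: SourceCustomer, mode: Mode): string[] {
  const { phones } = source;
  if (Array.isArray(phones) && phones.every((p) => typeof p === "string")) {
    return phones as string[];
  }
  if (mode === "strict") {
    // "Fragile" phase: fail the job loudly so the bad data gets noticed.
    throw new Error(`Customer ${source.id}: invalid phones ${JSON.stringify(phones)}`);
  }
  // Permissive phase: warn, load what we can, and fix it in the target system later.
  console.warn(`Customer ${source.id}: invalid phones, substituting []`);
  return [];
}
```

Making the mode an explicit parameter means the same transform code (and the same unit tests) can serve both phases of the project.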

Sunday, 16 April 2023

Micro-Optimisation #393: More Log Macros!

I've posted some of my VSCode Log Macros previously, but wherever there is repetitive typing, there are further efficiencies to be gleaned!

Log, Label and Prettify a variable - [ Ctrl + Option + Command + J ]

You know what's better than having the contents of your console.log() autogenerated?

Having the whole thing inserted for you!

How do I add this?

On the Mac you can use ⌘-K-S to see the pretty shortcut list, then hit the "Open Keyboard Shortcuts (JSON)" icon in the top-right to get the text editor to show the contents of keybindings.json. And by the way, execute the command Developer: Toggle Keyboard Shortcuts Troubleshooting to get diagnostic output on what various special keystrokes map to in VSCode-speak (e.g. on a Mac, what Ctrl, Option and Command actually do).

keybindings.json
// Place your key bindings in this file to override the defaults
[
  {
    "key": "ctrl+meta+alt+j", 
    "when": "editorTextFocus",
    "command": "runCommands",
    "args": {
      "commands": [
        {
          "command": "editor.action.copyLinesDownAction"
        },
        {
          "command": "editor.action.insertSnippet",
          "args": {
            "snippet": "\nconsole.log(`${TM_SELECTED_TEXT}: ${JSON.stringify(${TM_SELECTED_TEXT}$1, null, 2)}`);\n"
          }
        },
        {
          "command": "cursorUp"
        },
        {
          "command": "editor.action.deleteLines"
        },
        {
          "command": "cursorDown"
        },
        {
          "command": "editor.action.deleteLines"
        }
      ]
    }
  }
]

This one uses the new (for April 2023, VSCode v1.77.3) runCommands command, which, as you might infer, allows commands to be chained together in a keybinding. A really nice property of this is that you can Command-Z your way back out of the individual commands; very helpful for debugging the keybinding, but also potentially just nice-to-have.

The trick here is to retain the text selection so that ${TM_SELECTED_TEXT} can continue to contain the right thing, without clobbering whatever might be in the editor clipboard at this moment. We do this by copying the line down. This helpfully keeps the selection right on the variable where we want it. We then blast over the top of the selection with the logging line, but by sneakily inserting \n symbols at each end, we break up the old line into 3 lines, where the middle one is the only one we want to keep. So we delete the above and below.
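
As an illustration, and assuming the snippet expands the way I've described, selecting a variable called customer (a made-up example) and hitting the shortcut should leave a line like the second one here beneath the original:

```typescript
// Made-up variable; in practice this is whatever you had selected in the editor.
const customer = { id: 42, name: "Ada" };

// The line the keybinding is expected to insert for a selection of `customer`:
console.log(`customer: ${JSON.stringify(customer, null, 2)}`);
```

The null, 2 arguments to JSON.stringify are what give the pretty, two-space-indented output in the console.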

Saturday, 25 March 2023

In praise of ETL, part one; E's are good

I've written previously about how at work I've been using an ETL (Extract, Transform, Load) process for customer migrations, but that was mostly in the context of a use case for AWS Step Functions.

Now I want to talk about ETL itself, and how good it's been as an approach. It's been around for a while so one would expect it to have merits, but I've found some aspects to be particularly neat and wanted to call them out specifically. So here we go.

An Extract is a perfect test fixture

I'd never realised this before, but the very act of storing the data you plan on Transforming, and Loading, is tremendously powerful. Firstly, it lets you see exactly what data your Transform was acting upon; secondly, it gives you replay-ability using that exact data (if that's what you want/need) and thirdly, you've got an instant source of test fixture data for checking how your transform code handles that one weird bug that you just came across in production.

My workflow for fixing transform-stage bugs literally became:

  • Locate JSON extract file for the process that failed
  • Save as local JSON file in test fixtures directory of the transform code
  • Write a test to attempt to transform this fixture (or sub-component of it)
  • Test should fail as the production code does
  • Fix transform code, test should now pass
  • Commit fixed code, new test(s) and fixture
  • Release to production
  • Re-run ETL process; bug is gone
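
The test-writing steps in that workflow can be sketched roughly as follows. Everything here is hypothetical: the fixture is inlined as a JSON string to keep the example self-contained (in practice it would be the saved extract file in the fixtures directory), and transformEmails stands in for whichever piece of transform code had the bug.

```typescript
// Stand-in for a saved extract file copied from the failed process.
const extractFixture = JSON.parse(`{
  "customerId": "1234",
  "contacts": [{ "email": "a@example.com" }, { "email": null }]
}`);

// Hypothetical transform under test: collects contact emails, skipping missing ones.
function transformEmails(extract: { contacts: { email: string | null }[] }): string[] {
  return extract.contacts
    .map((c) => c.email)
    .filter((email): email is string => email !== null);
}

// The regression test: the production fixture must now transform cleanly.
const emails = transformEmails(extractFixture);
if (emails.length !== 1 || emails[0] !== "a@example.com") {
  throw new Error(`unexpected emails: ${JSON.stringify(emails)}`);
}
```

The fixture gets committed alongside the test, so the "one weird bug" stays covered for good.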

Monday, 27 February 2023

Stepping up, and back, with the new Next.js "app" directory

I'm toying around with a new web-based side project and I thought it was time to give the latest Next.js version a spin. Although I've used Create-React-App (generally hosted on Netlify) more recently, I've dabbled with Next.js in one capacity or another since 2018, and this time some server-side requirements made it a better choice.

The killer feature of the 2023 "beta version" of Next.js (which I assume will eventually be named Next.js 14) is the app directory, which takes Next's already-excellent filesystem-based routing (i.e. if you create a file called bar.tsx in a directory called foo, you'll find it served up at /foo/bar without writing a line of code) and amps it up. A lot.

I won't try and reiterate their excellent documentation, but their nested layouts feature is looking like an absolute winner from where I'm sitting, and I'd like to explain why by taking you back in time. I've done this before when talking about React-related stuff when I joked that the HTML <img> tag was like a proto-React component. And I still stand by that comparison; I think this instant familiarity is one of the fundamental reasons why React has "won" the webapp developer mindshare battle.

Let me take you back to 1998. The Web is pretty raw, pretty wild, and mostly static pages. My Dad's website is absolutely no exception. I've meticulously hand-coded it in vi as a series of stand-alone HTML pages which get FTP'ed into position on his ISP's web server. Although I'm dimly aware of CSS, it's mainly still used for small hacks like removing the underlines from links (I still remember being shown the way to do this with an inline style tag and thinking it would never take off) - and I'm certainly not writing a separate .css file to be included by every HTML file. As a result, everything is styled "inline" so-to-speak, but not even in the CSS way; just mountains of widths and heights and font faces all over the place. It sucked, but HTML was getting better all the time so we just put up with it, used all that it offered, and automated what we could. Which was exactly what I did. If you dare to inspect the source of the above Wayback Machine page, you'll see that it uses HTML frames (ugh), which was a primitive way of maintaining a certain amount of UI consistency while navigating around the site.

The other thing I did to improve UI consistency was a primitive form of templating. Probably more akin to concatenation, but I definitely had a header.htm which was crammed together with (for example) order-body.htm to end up with order.htm using a DOS batch file that I ran to "pre-process" everything prior to doing an FTP upload - a monthly occurrence as my Dad liked to keep his "new arrivals" page genuinely fresh. Now header.htm definitely wasn't valid HTML as it would have had unclosed tags galore, but it was re-used for several pages that needed to look the same, and that made me feel efficient.

And this brings me to Next.js and the nesting layouts functionality I mentioned before. To achieve what took me a pile of HTML frames, some malformed HTML documents and a hacky batch file, all I have to do is add a layout.tsx and put all the pages that should use that UI alongside it. I can add a layout.tsx in any subdirectory and it will apply from there "down". Consistency via convention over configuration, while still nodding to the hierarchical filesystem structures we've been using since Before The Web. It's just really well thought-out, and a telling example of how much thought is going into Next.js right now. I am on board, and will be digging deeper this year for sure.

Sunday, 29 January 2023

Sneaking through the Analog Hole

I perhaps-foolishly recently agreed to perform a media-archiving task. A series of books-on-tape (yes, on physical audio cassettes), almost unplayable at this point in the century, needed to be moved onto a playable medium. For this particular client, that meant onto Audio CDs (OK so we're moving forward, but not too far!). I myself didn't have a suitable playback device, but quickly located a bargain-priced solution, second-hand on eBay (of course) - an AWA E-F34U that appears to be exclusively distributed by the Big W retail chain here in Australia:

This device purports to be a one-USB-cable solution to digitising the contents of analogue cassettes. Unfortunately, the example I just purchased had extremely severe issues with its USB implementation. The audio coming straight off the USB cable would jump between perfectly fine for a few seconds, to glitchy, stuttering and repeating short sections, to half-speed slooooow with the attendant drop in pitch. Unusable.

I only hope that the problem is isolated to my unit (which was cheap and described as "sold untested" so I have no-one to blame but myself) - if not, someone's done a really bad job at their USB Audio implementation. Luckily, the USB Power works absolutely fine, so I had to resort to the old "Analog Hole" solution via my existing (rather nice) USB Audio Interface, a Native Instruments Komplete Audio 1 which I picked up after my previous interface, a TASCAM FireOne, finally kicked the bucket.

In the following picture, you can see my digitising solution. AWA tape transport (powered by USB) to 3.5mm headphone socket, through a 1/4" adaptor to a short guitar lead and into the Komplete Audio 1's Line In. From there, it goes in via the KA1's (fully-working!) USB connection to GarageBand on the Mac. A noise gate and a little compression are applied, and once each side of each tape has been captured, it gets exported directly to an MP3 file. I intend to present the client with not only the Audio CDs but also a data CD containing these MP3s so that future media formats can hopefully be more easily accommodated.

What if I didn't already have a USB audio interface? Would the client have given up, with their media stuck in the analog era, never to be heard again?

It amused me that analog technology was both the cause of this work - in that this medium and the ability to play it has gone from ubiquitous in the 1980s to virtually extinct - and its solution, using an analog interface to get around a deficient digital one.