Tuesday 12 April 2011

Sophistication via Behavioural Chaining

A Nice Pattern

A few years ago I had the pleasure of working with a truly excellent Java developer in the UK, Simon Morgan. I learnt a lot from looking at Simon's code, and he was a terrific guy to boot. One of the really eye-opening things he was doing was obtaining very sophisticated behaviour by stringing together simple evaluation modules.


This is just like how programmers have always solved problems - breaking them down into manageable chunks - but it goes much further. We're not solving a problem here (well, we are, but the solution sorta falls out of the end rather than being explicit); rather, we are approximating the sophistication of an expert human's behaviour when solving the problem. Wow.


Simon was using a hybrid of two patterns: Chain of Responsibility and Strategy. The basic approach was to iterate over an injected list of Strategy implementations, where the Strategy interface would normally be as simple as:

Operand applyTo(Operand op);

But instead of returning a possibly-modified Operand, he defined a Scorer interface that looked like this:

float determineScore(Scenario scenario);

Individual Scorers can be as simple or as complicated as required. For Simon's particular case, each one tended to inspect the database, looking for a particular situation/combination, and arrive at a score based on how close that was to the "ideal". For this, it made sense to have an AbstractDatabaseAccessingScorer which every Scorer extended.
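
Simon's actual code isn't shown here, but a minimal sketch of that arrangement might look like the following - RecentFormScorer and the use of javax.sql.DataSource are my own illustrative assumptions, and Scenario is the domain type named above, details elided:

import javax.sql.DataSource;

public interface Scorer {
    /** Returns 0.0 (nothing like the ideal) up to 1.0 (the ideal situation). */
    float determineScore(Scenario scenario);
}

/** Shared plumbing for scorers that need to inspect the database. */
public abstract class AbstractDatabaseAccessingScorer implements Scorer {
    private final DataSource dataSource;

    protected AbstractDatabaseAccessingScorer(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Subclasses run their situation-specific queries through this. */
    protected DataSource dataSource() {
        return dataSource;
    }
}

/** A hypothetical concrete scorer, rating one narrow aspect of a Scenario. */
public class RecentFormScorer extends AbstractDatabaseAccessingScorer {
    public RecentFormScorer(DataSource dataSource) {
        super(dataSource);
    }

    @Override
    public float determineScore(Scenario scenario) {
        // Query for the scenario's recent form and map how close it is
        // to the ideal onto the 0.0 .. 1.0 range. Placeholder value here.
        return 0.8f;
    }
}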


The float that each scorer returned was multiplied into a running total that started at 1.0. At the end of a scoring run, a possible Scenario would have a score somewhere between 0.0 and 1.0. Some aspect of the Scenario would then be tweaked, and the score calculated again. At the end of the evaluation run, the highest-scoring Scenario would be selected as the optimal course of action.
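
In sketch form, that driving loop might look something like this - all names beyond Scorer and Scenario are my own, not Simon's actual code:

import java.util.List;

public class EvaluatorChain {
    private final List<Scorer> scorers;

    public EvaluatorChain(List<Scorer> scorers) {
        this.scorers = scorers;
    }

    /** Multiplies each Scorer's result into a running total starting at 1.0. */
    public float score(Scenario scenario) {
        float total = 1.0f;
        for (Scorer scorer : scorers) {
            total *= scorer.determineScore(scenario);
        }
        return total;
    }

    /** Scores every candidate and returns the highest-scoring Scenario. */
    public Scenario selectBest(List<Scenario> candidates) {
        Scenario best = null;
        float bestScore = -1.0f;
        for (Scenario candidate : candidates) {
            float score = score(candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }
}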


While this worked very well, Simon realised that in developing his Scorers, he'd unwittingly assigned some of them lower importance, by having them return scores only in the range 0.0 to 0.5, for example. He went on to refactor this out: each Scorer was instead required to return a score across the full 0.0 to 1.0 range, and was assigned a weight multiplier, so that some Scorers could be given greater power in influencing the choice of Scenario. This really boosted the power and subtlety of the system - to the extent that he started logging his scoring runs profusely in order to get some understanding of how his home-grown "neural net" was coming up with some results.
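
The post doesn't record exactly how the weight was applied. Note that naively multiplying every score by a constant would factor straight out of the running product and leave the ranking of Scenarios unchanged, so here is a sketch of one scheme that does work - interpolating each raw score towards the product-neutral value of 1.0, so that a low-weight Scorer drags the total down less:

/** Wraps a Scorer, damping its influence on the running product.
    weight = 1.0 passes the raw score through; weight = 0.0 is neutral. */
public class WeightedScorer implements Scorer {
    private final Scorer delegate;
    private final float weight; // expected to be in 0.0 .. 1.0

    public WeightedScorer(Scorer delegate, float weight) {
        this.delegate = delegate;
        this.weight = weight;
    }

    @Override
    public float determineScore(Scenario scenario) {
        float raw = delegate.determineScore(scenario);
        // Linear interpolation between neutral (1.0) and the raw score,
        // so the result always stays within 0.0 .. 1.0.
        return 1.0f - weight + weight * raw;
    }
}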


Often, the winning scenario was decided by final scores as close as 0.00000123 versus 0.00000122 - with margins that narrow, it was worthwhile flagging the situation to allow a human to examine it and possibly tweak some weight modifiers to get the optimal outcome. In time, this would lead to an even better approximation of an expert human's selection behaviour.
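
Detecting those close calls is simple enough; a minimal sketch, with the relative-margin threshold entirely my own choice:

public final class CloseCallDetector {
    /** Flags a run whose top two scores are within a small relative margin. */
    public static boolean needsHumanReview(float best, float runnerUp,
                                           float relativeMargin) {
        if (best <= 0.0f) {
            return true; // degenerate run: nothing scored above zero
        }
        return (best - runnerUp) / best < relativeMargin;
    }
}

// e.g. 0.00000123 vs 0.00000122 with a 1% margin:
// (0.00000123 - 0.00000122) / 0.00000123 ≈ 0.008, which is < 0.01, so flag it.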


We never came up with a name for this pattern, but it has always stuck in my mind as a nice one (albeit with limited applications). Evaluator Chain seems to sum it up fairly well, and I'm working on a library that will give a convenient, templated API for domain-specific implementations, the first of which will be the selection of a winning sports team based on past performance data.


So if this is my last blog post, you'll know I've cracked it and made my fortune in sports betting ...
