Thursday, 20 March 2014

Future, Meet Past

Just because you're using a nice, shiny new language with all the latest bells-and-whistles doesn't make you immune from having to deal with problems that are almost as old as computers themselves. The case in point; watching a given directory, waiting for a new file to appear in it, and then doing something.

I was presented with solving this problem in Idiomatic Scala™ and was rather surprised to find very little built into the standard library to help.
Turning to Scala's grand-daddy was also rather astonishing - the "new, improved, async" NIO facilities are still, well, a bit clunky when you consider that there's not even an event-driven (i.e. register a callback) way to watch a directory.
So I set about implementing a directory watcher that works the way I'd like it to - namely, instantly returning a Future that will be completed only when the file I'm after has arrived (which of course I specify with a matching function). Here's my usage pattern:
val myDirWatcher = new DirectoryFileCreationWatcher(watchedDir)

myDirWatcher.awaitFile( _.endsWith("blah.txt") ).map { theNewDirectoryState =>
// Do something with the dir now that you know that *blah.txt is in it
}

And here's how I implemented it:
import java.nio.file._
import java.nio.file.StandardWatchEventKinds._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.collection.JavaConversions._
 
class DirectoryFileCreationWatcher(directoryToWatch:Path) {

  val watcher = FileSystems.getDefault.newWatchService
 
  // Work properly on Mac:
  // http://stackoverflow.com/questions/9588737/is-java-7-watchservice-slow-for-anyone-else
  val kinds:Array[Kind[_]] = Seq(StandardWatchEventKinds.ENTRY_CREATE).toArray
  directoryToWatch.register(
    watcher, 
    kinds,     
    SensitivityWatchEventModifier.HIGH.asInstanceOf[WatchEvent.Modifier]
  )
 
  /**
   *
   * @param pathMatcher a function that returns true if this is the file we're looking for
   * @return a Future holding a Path that represents the directory in its "new" state
   */
  def awaitFile( pathMatcher: Path => Boolean):Future[Path] = Future[Path] {
    var foundMatch = false
    while (!foundMatch) {
      val watchKey = watcher.take // Blocks
      val events = watchKey.pollEvents
      foundMatch = events.exists { event =>
        val wep = event.asInstanceOf[WatchEvent[Path]]
        pathMatcher(wep.context)
      }
      watchKey.reset
    }
    directoryToWatch
  }
}

An annoying problem which I encountered during testing is that the JVM on MacOS does not implement the WatchService efficiently (i.e. by hooking into filesystem notifications), instead using the naïve polling approach. This will hopefully get rectified In Due Course™. As a result, I had to put quite lengthy sleeps into my test code (2000ms seemed to do it) after adding a file to a watched directory. That ugly bit of parameter-munging in the call to directoryToWatch.register() is configuring the "sensitive" version of the watcher, without which you'll need to wait 5+ seconds to be notified of changes. Ouch.

The really nice thing about using Futures in Scala is that once you've got one, you can just chain them up with map and friends, getting all of that asynchronous goodness with minimal boilerplate. Fun times.

Gist with unit tests is here.