Showing posts with label lambdas. Show all posts

Saturday, 24 February 2024

My Perfect AWS Console

Yeah, that's literally it. I love AWS and use a decent portion of their offerings, but I could honestly get by with two of the OG AWS features and one relative newcomer.

## AWS S3

The performance these days is absolutely top-notch (without even going down the [Directory Buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-buckets-overview.html) route). It's cheap enough that, with a well-designed path structure, you can put just about any workflow that can be represented with JSON into it. As in, you probably don't need [Step Functions](https://aws.amazon.com/step-functions/).

## AWS Lambda

I can't remember the last time I needed a server that hangs around all the time, whether for work or side-gigs. Lambdas just fit _so well_ with modern request-response patterns that it's difficult to justify anything else. Add some [Provisioned Concurrency](https://docs.aws.amazon.com/lambda/latest/operatorguide/provisioned-scaling.html) if you really need nice warm caches and connections, and you still get the super-fast deployment and observability of functions-in-the-cloud. And you're not limited to 30-second execution time any more either (it's currently [up to 15 minutes](https://blog.awsfundamentals.com/lambda-limitations)), so you can wait for those slow 3rd-party APIs.

*Protip:* The Lambda Test Console allows you to store (and share!) test JSON payloads for each lambda. This can be a superb way to perform ad-hoc jobs, or to re-process things that didn't quite work right the first time. Add a `dryRun?: boolean` option to the input shape and pass it through your lambda code to check things before opening the taps.

## AWS AppSync

Sure, the web console is a little clunky and bug-ridden (it won't reauthenticate its own IAM session, so your queries will eventually just ... die), but if you've got a GraphQL interface deep inside some WAF-protected VPN, this is a great way to give it a poke.
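Going back to the Lambda protip for a second, here is a minimal sketch of what threading `dryRun` through might look like. Everything other than `dryRun?: boolean` is made up for illustration: the `ReprocessInput` shape, the `orderIds` field and the `write` callback are hypothetical stand-ins for whatever your lambda really does.

```typescript
// Hypothetical input shape for an ad-hoc re-processing lambda.
// Only `dryRun?: boolean` comes from the post; the rest is illustrative.
interface ReprocessInput {
  orderIds: string[]; // hypothetical: the things to re-process
  dryRun?: boolean;   // when true, report what would happen but change nothing
}

interface ReprocessResult {
  dryRun: boolean;
  processed: string[]; // ids that were actually written
  log: string[];       // human-readable trace, handy in the Test Console output
}

// Stand-in for the real side effect (an S3 write, a 3rd-party API call, ...).
type Writer = (orderId: string) => void;

function reprocess(input: ReprocessInput, write: Writer): ReprocessResult {
  const dryRun = input.dryRun ?? false; // omitted means "do it for real"
  const processed: string[] = [];
  const log: string[] = [];

  for (const id of input.orderIds) {
    if (dryRun) {
      // Same code path up to the point of the side effect, then stop.
      log.push(`[dry run] would re-process ${id}`);
      continue;
    }
    write(id);
    processed.push(id);
    log.push(`re-processed ${id}`);
  }

  return { dryRun, processed, log };
}

// A real Lambda handler would be a thin wrapper that calls reprocess()
// with the incoming event and a Writer that performs the actual work.
```

A saved test payload of `{ "orderIds": ["a", "b"], "dryRun": true }` would then show you the log without touching anything; flip the flag once the output looks right.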

Saturday, 26 November 2022

AWS Step Functions - a pretty-good v1.0

I've been using Amazon's Step Functions functionality a fair bit at work recently, as a way to orchestrate and visualise a migration process that involves some Extract-Transform-Load steps and various other bits, each one being an AWS Lambda.

On the whole, it's been pretty good - it's fun to watch the process chug along with the flowchart-like UI automagically updating (unfortunately I can't show you any screenshots, but it's neat). There have been a couple of reminders, however, that this is a version 1.0 product, namely:

Why can't I resume where I failed before?

With our ETL process, frequently we'll detect a source data problem in the Extract or Transform stages. It would be nice if after fixing the data in place, we could go back to the failed execution and just ask it to resume at the failed step, with all of the other "state" from the execution intact.

Similarly, if we find a bug in our Extract or Transform lambdas themselves, it's super-nice to be able to monkey-patch them right there and then (remembering of course to update the source code in Git as well) - but it's only half as nice as it could be. If we could fix the broken lambda code and then re-run the execution that uncovered the bug, the cycle time would be outstanding.

Why can't you remember things for me?

Possibly related to the first point is the disappointing discovery that Step Functions have no "memory" (or "context", if you prefer) where you can stash a variable for use later in the pipeline. That is, you might expect to be able to declare 3 steps like this:

    Extract Lambda
      Inputs:
        accountId
      Outputs: 
        pathToExtractedDataBucket
        
    Transform Lambda
       Inputs:
         pathToExtractedDataBucket
       Outputs:
         pathToTransformedDataBucket
         
    Load Lambda
       Inputs:
         accountId
         pathToTransformedDataBucket
       Outputs:
         isSuccessful
  
But unfortunately that simply will not work (at time of writing, November 2022). The above pipeline will fail at runtime because accountId has not been passed through the Transform lambda in order for the Load lambda to receive it!
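To make the failure concrete, here is a toy simulation of the chaining described above, where each step's output simply becomes the next step's input. It is a sketch and not the Step Functions runtime: the `Step` type, `runPipeline` and the path strings are all hypothetical, and where the real pipeline fails at runtime, this version just reports `isSuccessful: false`.

```typescript
// Each step is modelled as a pure function from input JSON to output JSON.
type Json = Record<string, unknown>;
type Step = (input: Json) => Json;

// Extract only emits what it produced, as in the first declaration above.
const extract: Step = (input) => ({
  pathToExtractedDataBucket: `extracted/${String(input.accountId)}`,
});

// Transform neither needs nor knows about accountId.
const transform: Step = (input) => ({
  pathToTransformedDataBucket: `transformed/${String(input.pathToExtractedDataBucket)}`,
});

// Load needs both values; here it just reports whether it received them.
const load: Step = (input) => ({
  isSuccessful:
    input.accountId !== undefined &&
    input.pathToTransformedDataBucket !== undefined,
});

// Assumed chaining behaviour: a step's output replaces the state wholesale.
function runPipeline(steps: Step[], initial: Json): Json {
  return steps.reduce((state, step) => step(state), initial);
}

const result = runPipeline([extract, transform, load], { accountId: "acct-123" });
console.log(result); // isSuccessful is false: accountId never reached Load
```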

For me, this really makes a bit of a mockery of the reusability and composability of lambdas with step functions. To fix the situation above, we have to make the Extract Lambda emit accountId, and make the Transform Lambda aware of accountId and pass it through, even though it has no interest in, or need for, it! That is:

    Extract Lambda
      Inputs:
        accountId
      Outputs: 
        accountId
        pathToExtractedDataBucket
        
    Transform Lambda
       Inputs:
         accountId
         pathToExtractedDataBucket
       Outputs:
         accountId
         pathToTransformedDataBucket
         
    Load Lambda
       Inputs:
         accountId
         pathToTransformedDataBucket
       Outputs:
         isSuccessful
  
That's really not good in my opinion, and makes for a lot of unwanted cluttering-up of otherwise reusable lambdas, dealing with arguments that they don't care about, just because some other entity needs them. Fingers crossed this will be rectified soon, as I'm sure I'm not the first person to have been very aggravated by this design.
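In the meantime, one way to keep the clutter out of the lambdas themselves might be to do the pass-through in a single shared wrapper rather than by hand in every function. The sketch below uses the same toy model of steps as plain functions; `withPassThrough` is a hypothetical helper of my own naming, not anything Step Functions provides.

```typescript
type Json = Record<string, unknown>;
type Step = (input: Json) => Json;

// Hypothetical helper: copies every input field into the output, so values
// like accountId survive steps that don't care about them. If a step emits a
// key that already exists in the input, the step's value wins.
function withPassThrough(step: Step): Step {
  return (input) => ({ ...input, ...step(input) });
}

// The step bodies stay exactly as "selfish" as in the first declaration.
const extract: Step = (input) => ({
  pathToExtractedDataBucket: `extracted/${String(input.accountId)}`,
});

const transform: Step = (input) => ({
  pathToTransformedDataBucket: `transformed/${String(input.pathToExtractedDataBucket)}`,
});

const load: Step = (input) => ({
  isSuccessful:
    input.accountId !== undefined &&
    input.pathToTransformedDataBucket !== undefined,
});

// Same assumed chaining as before: each output becomes the next input.
function runPipeline(steps: Step[], initial: Json): Json {
  return steps.reduce((state, step) => step(state), initial);
}

const beforeLoad = runPipeline(
  [withPassThrough(extract), withPassThrough(transform)],
  { accountId: "acct-123" },
);
const result = load(beforeLoad);
console.log(result); // isSuccessful is true: accountId rode along untouched
```

The trade-off is that the state only ever grows, and every step now sees fields it has no business seeing, so this moves the clutter rather than removing it.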