The most recent website release at my current site can only be described as une grande débâcle.
The reason was an incompatibility between the newly-deployed webapp and an (in-house, but non-local) service it depended on. It wasn't until the webapp refused to start in production that the incompatibility was detected.
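In hindsight, a fail-fast compatibility check run against staging would have surfaced the mismatch days before anyone pressed the production button. A minimal sketch in shell, where `EXPECTED_API` and the simulated service response are hypothetical stand-ins for whatever version handshake the service actually exposes:

```shell
#!/bin/sh
# Hypothetical pre-release compatibility check: before promoting the
# webapp, confirm the dependent service speaks the API version we expect.
set -eu

EXPECTED_API="2"

# In a real pipeline this would query the service itself, e.g.:
#   ACTUAL_API=$(curl -fsS "$SERVICE_VERSION_URL")
# Here the response is simulated so the script is self-contained.
ACTUAL_API="2"

if [ "$ACTUAL_API" != "$EXPECTED_API" ]; then
    echo "ABORT: service speaks API v$ACTUAL_API, webapp expects v$EXPECTED_API" >&2
    exit 1
fi
echo "compatibility check passed: API v$ACTUAL_API"
```

Wired into the deployment pipeline, a check like this turns a production outage into a failed staging build.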
Now I'm a very big fan of service-oriented architectures, particularly if they are of a lightweight (i.e. no XML/Schemas/SOAP Envelopes/WSDLs) nature, but they also require a greater level of diligence than a traditional big-ball-of-code when it comes to environments.
The problem here was that each project made its way through development→test→staging→production while connecting to other projects doing exactly the same thing. So in staging, our webapp pointed to the staging version of that other service. But because our website was released to production on its own, the versions no longer lined up, and a frenzy of finger-pointing began.
So how do we simultaneously avoid future débâcles and lower the collective blood pressure on deployment day? Well, for the comprehensive answer I can only point you towards Continuous Delivery, but in short:
- Make staging and production physically identical and interchangeable
- Run up your whole stack (your code and everyone else's services) in staging, code-frozen, as release day approaches
- Belt staging with regression tests, performance tests, whatever you've got
- On release day, "flip the switch" on your router, making staging the "live" environment
- Keep the old prod "warm", ready to flip it back if any problems are found
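The "flip the switch" step above can be sketched with nothing more than a symlink swap (the same trick applies to a router or load-balancer rule); the paths and version strings here are purely illustrative:

```shell
#!/bin/sh
# Sketch of a blue-green flip: "live" is a symlink that the web server
# (or router) serves from; repointing it swaps staging and production.
set -eu

ROOT=$(mktemp -d)
mkdir -p "$ROOT/blue" "$ROOT/green"
echo "v1" > "$ROOT/blue/version"    # current production build
echo "v2" > "$ROOT/green/version"   # soak-tested staging build

# Blue starts as production.
ln -sfn "$ROOT/blue" "$ROOT/live"
echo "live is $(cat "$ROOT/live/version")"

# Release day: flip the switch to green.
ln -sfn "$ROOT/green" "$ROOT/live"
echo "live is $(cat "$ROOT/live/version")"

# Blue stays warm, untouched; rollback is the same one-liner in reverse.
```

Because the old environment is never modified, rolling back is as fast and as boring as rolling forward.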
This strategy dissipates the stress and risk of releasing new software across the entire "we're in staging" period, rather than concentrating it in one horrible day.
Needless to say, I'll be recommending a Continuous-Delivery approach in our next retro!