Safe Software Deployments: Through the Looking Glass

We’ve covered a lot of ground in this Safe Software Deployment series, from the 180 Rule to Z Deployments to the Goldilocks Gauge. But there is an elephant in the room. Or should I say, a jabberwock.

In Lewis Carroll’s novel Through the Looking-Glass, Alice discovers that the mirror above her mantel is not a mirror at all, but a doorway to another world in which things work very differently. When developers push software from staging to production, they often have a similar experience. Though they want to believe that staging and production are the same, they discover that staging is not a mirror at all, and production is another world, in which things work very differently. And out of that distortion come bugs and outages.

The bottom line is this: Staging ≠ Production. And it never will. There are simply too many variables between these two environments to ever achieve exact alignment. The environments can differ in hardware: CPU cores, threads per core, cache size, microcode, bus architectures, memory size, firmware. Or in software and configuration: OS versions, compilers, libraries. Or in networking: topology, traffic profiles, edge caching, DNS and directory services. And of course we know that no matter how diligently you test in staging, customer workloads exercise the software in different ways. But the major reason they differ is that each environment has had a different set of software deployed to it over time, with a different set of configuration parameters and a different combination of patches, hacks, and rollbacks.
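To make that divergence visible rather than assumed, here is a minimal sketch in Python that compares two environment "fingerprints" and reports every attribute that differs. The function name `diff_environments`, the dictionary keys, and the sample values are all hypothetical, chosen only for illustration; how you would actually collect such a fingerprint depends on your own tooling.

```python
from typing import Any


def diff_environments(
    staging: dict[str, Any], production: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return every key whose value differs, mapped to (staging, production)."""
    missing = "<missing>"  # placeholder for a key present on only one side
    drift = {}
    for key in sorted(staging.keys() | production.keys()):
        s_val = staging.get(key, missing)
        p_val = production.get(key, missing)
        if s_val != p_val:
            drift[key] = (s_val, p_val)
    return drift


# Hypothetical fingerprints; real ones would come from your inventory tooling.
staging = {"os": "Ubuntu 22.04", "cpu_cores": 8, "libssl": "3.0.2"}
production = {
    "os": "Ubuntu 22.04",
    "cpu_cores": 64,
    "libssl": "3.0.13",
    "cdn": "enabled",
}

for key, (s_val, p_val) in diff_environments(staging, production).items():
    print(f"{key}: staging={s_val} production={p_val}")
```

A report like this will not close the gap, but it gives you and your manager a concrete list to talk about.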

That staging is a mirror of production is one of the great delusions of software development. Developers often tell this lie to themselves, to overcome their fear of pushing to prod. Worse, developers often know the truth, but don’t really know how to explain this ambiguity to their management chain, leading to inevitable trust issues when deployments fail.

So what can we do about it?

First, accept reality. Modern, distributed software systems are nonlinear, and all test environments are simulacrums. It’s particularly important to help managers understand that even if it were possible to create exact duplicates of production – which it’s not – it would be practically and financially unjustifiable.

Second, approach testing like an actuary. The leap from staging to production is essentially an exercise in probability. You should know your architecture, operational characteristics, and costs well enough to prioritize tests and reduce risk. You may even want to create two or more test environments that are deliberately different to reduce the odds of failure. And you should continue to run tests after the release, so you can surface bugs before your customers do.
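One way to picture the actuarial approach is to rank candidate tests by the expected loss they cover per minute of test time. The sketch below is only an illustration under stated assumptions: the class `CandidateTest`, the function `prioritize`, the greedy selection rule, and every probability and cost figure are invented for the example, not taken from any real system.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    failure_probability: float  # estimated chance the failure reaches production
    outage_cost: float          # estimated cost if it does, in any consistent unit
    run_minutes: float          # how long the test takes to run

    @property
    def expected_loss(self) -> float:
        # Classic expected value: probability times cost.
        return self.failure_probability * self.outage_cost


def prioritize(tests: list[CandidateTest], budget_minutes: float) -> list[CandidateTest]:
    """Greedily pick the tests covering the most expected loss per minute."""
    ranked = sorted(
        tests, key=lambda t: t.expected_loss / t.run_minutes, reverse=True
    )
    chosen, used = [], 0.0
    for test in ranked:
        if used + test.run_minutes <= budget_minutes:
            chosen.append(test)
            used += test.run_minutes
    return chosen


# Hypothetical estimates for illustration only.
candidates = [
    CandidateTest("schema_migration", 0.05, 200_000, 30),
    CandidateTest("cache_failover", 0.10, 20_000, 10),
    CandidateTest("ui_smoke", 0.20, 1_000, 5),
    CandidateTest("full_load_test", 0.02, 100_000, 120),
]

for test in prioritize(candidates, budget_minutes=60):
    print(test.name)
```

The greedy rule is a simplification, and the estimates will always be rough. The point is to make the bet explicit instead of pretending that staging removes it.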

Third, if you can, treat both production and staging like cattle, not pets. If you have an enlightened software organization, one that believes in best practices, set up your systems so that you can blow away and recreate both your production and staging environments at regular intervals. This will reset many of the deviations and environment drift that build up over time.

Finally, you need automated rollbacks (the 180 Rule) that work reliably (Z Deployments), and deployments that are the right size for optimum efficiency and safety (the Goldilocks Gauge). ;-)
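As a rough illustration of what "automated rollback" means in practice, here is a minimal Python sketch: deploy, run a few health checks, and revert without human intervention if any of them fails. The function `deploy_with_rollback` and its callback parameters are hypothetical stand-ins for whatever your deployment tooling provides, and this is not a description of how the 180 Rule or Z Deployments are implemented.

```python
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[], None],
    rollback: Callable[[], None],
    is_healthy: Callable[[], bool],
    checks: int = 3,
) -> bool:
    """Deploy, then roll back automatically if any health check fails.

    Returns True if the new version stays, False if it was rolled back.
    """
    try:
        deploy()
        for _ in range(checks):
            if not is_healthy():
                raise RuntimeError("health check failed")
    except Exception:
        # Any failure, in the deploy itself or in a check, triggers the rollback.
        rollback()
        return False
    return True


# Toy in-memory "environment" standing in for real infrastructure.
state = {"version": "v1"}


def deploy_v2() -> None:
    state["version"] = "v2"


def rollback_to_v1() -> None:
    state["version"] = "v1"


kept = deploy_with_rollback(deploy_v2, rollback_to_v1, is_healthy=lambda: False)
print(kept, state["version"])
```

A real pipeline would space the health checks out over time and watch actual metrics, but the shape is the same: the decision to roll back is made by the system, not by a nervous human at 2 a.m.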

I saved this column for last because completely eliminating the differences between staging and production is not a solvable problem. And frankly, you shouldn’t even try. But you don’t want to get caught flat-footed either, so you need a system of best practices that greatly reduces risk and fosters confidence. In other words, a system of Safe Software Deployment that helps you overcome the fear of pushing to prod.

And most importantly, everyone in your organization needs to have a common understanding of the problem, from the top down. So feel free to print this post out, slide it under the door of your manager, and slink away like that Cheshire Cat.

Have another technique for managing the divergence between staging and production? Share it with me at @MarkLovesTech.
