Benjamin Cane
December 19, 2025

Canary deployments are an operational superpower, but the complexity they bring isn’t for everyone. So why not just use Blue/Green deployments instead? 🦸‍♂️

Let’s break it down.

🎞️ A Quick Recap

Both Blue/Green and Canary start the same way:

Take two instances (or clusters) of a service and deploy the new code version to the idle one.

Where they differ is how traffic shifts.

🐤 Canary

Canary deployments gradually shift traffic from old to new.

Both versions serve live traffic during the transition.
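
To make the gradual shift concrete, here is a minimal sketch of one common way to split traffic: hash each user into a bucket from 0 to 99 and send the lowest buckets to the new version. The names `bucket_for` and `pick_version` and the per-user hashing are assumptions for illustration, not something a particular load balancer or service mesh prescribes.

```python
import hashlib


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def pick_version(user_id: str, canary_percent: int) -> str:
    """Route a user to "new" if their bucket falls under the canary percentage."""
    return "new" if bucket_for(user_id) < canary_percent else "old"


if __name__ == "__main__":
    # Hypothetical user IDs, purely for demonstration.
    for percent in (5, 25, 100):
        served_by_new = sum(
            pick_version(f"user-{i}", percent) == "new" for i in range(1000)
        )
        print(f"{percent}% canary: {served_by_new} of 1000 users on the new version")
```

Because the bucket depends only on the user ID, a given user keeps landing on the same version at a given percentage, and anyone already on the new version stays there as the percentage grows.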

🔵|🟢 Blue/Green

A Blue/Green deployment shifts traffic all at once.

Only one instance serves traffic at any time; there is no gradual ramp-up.
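
By contrast, the all-or-nothing switch can be sketched as a single pointer to the active instance. `BlueGreenRouter` is a hypothetical class for illustration; in practice the "pointer" is usually a load balancer target or a DNS record.

```python
class BlueGreenRouter:
    """Tracks which of two instances ("blue" or "green") receives all traffic."""

    def __init__(self, active: str = "blue") -> None:
        if active not in ("blue", "green"):
            raise ValueError("active must be 'blue' or 'green'")
        self.active = active

    @property
    def idle(self) -> str:
        """The instance not serving traffic; new code is deployed here."""
        return "green" if self.active == "blue" else "blue"

    def route(self) -> str:
        """Every request goes to the active instance."""
        return self.active

    def cut_over(self) -> None:
        """Flip all traffic to the idle instance in one step."""
        self.active = self.idle


if __name__ == "__main__":
    router = BlueGreenRouter()
    print(router.route())  # blue
    router.cut_over()
    print(router.route())  # green
```

Rolling back is the same operation: calling `cut_over()` again sends all traffic back to the previous instance.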

⚙️ Why Canary Is More Complex

Running two versions at the same time (with both taking traffic) introduces challenges:

  • Backward compatibility
  • Shared (or replicated) databases
  • Sticky sessions
  • Context-aware routing
  • Event ordering across versions
  • Consistency of state

Blue/Green avoids most of this. You still need a rollback plan, but you don’t have to worry about two versions operating in parallel.

So if Canary is so complicated… why use it?

🏅 Why Canary Is Worth It (Sometimes)

Canary shines when:

  • The system is highly critical
  • It must run 24/7 with no interruption
  • You cannot accept even a brief outage
  • You want to reduce the blast radius of regressions
  • You release often and need tight control/quick fallback

Canary lets you validate a new version with a small percentage of traffic before gradually increasing it. If something breaks, you can shift traffic back to the old version quickly.

More importantly, when something does break, only the portion of traffic on the new version is impacted.
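
The validate-then-ramp loop described above might look something like this. It is a rough sketch: the step sizes, the 1% error threshold, and the `error_rate_at` callback (standing in for whatever monitoring you have) are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def run_canary(
    error_rate_at: Callable[[int], float],
    steps: Sequence[int] = (1, 5, 25, 50, 100),
    max_error_rate: float = 0.01,
) -> Tuple[int, List[int]]:
    """Ramp canary traffic step by step, rolling back on the first bad reading.

    Returns (final percent of traffic on the new version, steps attempted).
    """
    attempted: List[int] = []
    current = 0
    for percent in steps:
        current = percent
        attempted.append(percent)
        # Hypothetical health check: observed error rate at this traffic level.
        if error_rate_at(percent) > max_error_rate:
            return 0, attempted  # roll all traffic back to the old version
    return current, attempted


if __name__ == "__main__":
    # A release that only misbehaves once it takes a quarter of the traffic.
    flaky = lambda percent: 0.2 if percent >= 25 else 0.0
    print(run_canary(flaky))  # (0, [1, 5, 25])
```

In this example the regression is caught at the 25% step, so the remaining 75% of traffic never sees the broken version.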

For high-risk and mission-critical systems, the complexity is worth it.

🧠 Final Thoughts

Blue/Green is a great default deployment strategy, and in many cases, the optimal one.

A perfect example is file-based batch workloads. Batch systems usually have flexibility in timing. You can:

  • Pause traffic
  • Cut over to the new version
  • Resume processing
  • And if it fails… reprocess the files

Yes, easier said than done, but still far simpler than Canary.
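
The batch cutover steps above can be sketched as follows. This is a simplified illustration: `cut_over_batch` and its parameters are hypothetical, intake is assumed to be paused before the function is called, and reprocessing is assumed to be safe (the same file can be processed twice without harm).

```python
from typing import Callable, List, Sequence, Tuple


def cut_over_batch(
    backlog: Sequence[str],
    new_version: Callable[[str], str],
    old_version: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """Process a paused backlog of files after cutting over to the new version.

    Returns (name of the version that produced the results, results).
    """
    try:
        # Cut over and resume: run every waiting file through the new version.
        return "new", [new_version(name) for name in backlog]
    except Exception:
        # Broad catch for the sketch only. Fall back to the old version and
        # reprocess every file from the start, discarding partial results.
        return "old", [old_version(name) for name in backlog]


if __name__ == "__main__":
    files = ["a.csv", "b.csv"]
    print(cut_over_batch(files, lambda n: "v2:" + n, lambda n: "v1:" + n))
```

The fallback path is what makes Blue/Green forgiving here: because the files are still available, a failed cutover costs time, not data.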

Both approaches have their place. The key is matching the deployment strategy to the system’s criticality and level of acceptable risk.
