Discovering Deployment Strategies

External reference: https://medium.com/buildpiper/canary-vs-blue-green-deployment-which-one-should-you-choose-a7d86d2929f0

Before diving head first into a shiny, geeky deployment method, beware of the cognitive load it puts on the people who will work with it¹. It is also good practice to pair any deployment with monitoring, to check that it did not introduce regressions².

Deployment may be run either manually or automatically. In the latter case, we call it continuous deployment³.

When deploying resources, you can either deploy everything at once, or do it incrementally.

The known strategies to deploy everything at once are:

recreate: you shut the old resources down, then start the new ones⁴,⁵,⁶,
big bang: you start the new resources, then shut the old ones down⁷,⁸,⁹,
blue/green: you start everything in a new (green) environment beside production (blue), test this new deployment and, if it meets your expectations, switch all the traffic to it¹⁰,¹¹,¹²,¹³,¹⁴. This implies some data structure compatibility between the old and the new version¹⁵.
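
As a rough illustration of the blue/green cutover, here is a minimal Python sketch. The Router class and the smoke_test callback are hypothetical stand-ins for a real load balancer and a real test suite; they do not come from any of the cited articles.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Router:
    """Hypothetical stand-in for a load balancer that sends all traffic to one environment."""
    live: str = "blue"


def blue_green_switch(router: Router, smoke_test: Callable[[str], bool]) -> bool:
    """Switch all traffic to the idle environment if it passes the smoke test.

    Returns True if traffic was switched, False if the live environment was kept.
    """
    idle = "green" if router.live == "blue" else "blue"
    if not smoke_test(idle):
        return False  # production was never touched, so there is nothing to roll back
    router.live = idle  # single cutover: every user moves at once
    return True


router = Router()
switched = blue_green_switch(router, smoke_test=lambda env: env == "green")
print(router.live, switched)
```

The point of the sketch is that the test runs against the idle environment, and that the only change visible to users is the single assignment that flips the router.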

Incremental deployments supposedly allow a more Zen approach, where you can more easily observe the deployment process and roll back if need be¹⁶, with no downtime¹⁷. On the other hand, they require a way to preserve the continuity of the user experience¹⁸.

It is unclear to me whether rolling deployment is the same as incremental deployment. The articles I read describe rolling deployment as gradually¹⁹ increasing the number of users²⁰, servers²¹, nodes²¹ or applications²² moved to the new version.

When the servers are what is being rolled, people call it ramped²³,²⁴. In practice, I also often see rolling used to describe ramped deployments²⁵,²⁶, while canary is used to describe rolling at the application level²⁷,²⁸.
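
To make the server-level reading of rolling deployment concrete, here is a minimal Python sketch that upgrades servers batch by batch and halts at the first unhealthy batch. The upgrade and healthy callbacks are hypothetical, and rolling the failed batch back is deliberately left out.

```python
from typing import Callable, Iterator


def rolling_batches(servers: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield the servers to upgrade at each step, batch_size at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(servers), batch_size):
        yield servers[start:start + batch_size]


def rolling_upgrade(
    servers: list[str],
    batch_size: int,
    upgrade: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> list[str]:
    """Upgrade batch by batch and return the servers upgraded and verified healthy.

    The rollout halts at the first batch containing an unhealthy server;
    the servers not yet reached keep running the old version.
    """
    verified: list[str] = []
    for batch in rolling_batches(servers, batch_size):
        for server in batch:
            upgrade(server)
        if not all(healthy(server) for server in batch):
            return verified
        verified.extend(batch)
    return verified


servers = ["s1", "s2", "s3", "s4", "s5"]
done = rolling_upgrade(servers, 2, upgrade=lambda s: None, healthy=lambda s: s != "s4")
print(done)  # ['s1', 's2']
```

In this run the second batch contains the unhealthy s4, so the rollout stops there and s5 is never touched.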

Two kinds of deployment do not aim at upgrading the stack per se: shadow deployment and A/B testing. The former uses a “shadow” environment that receives a copy of user requests²⁹,³⁰ and whose responses are never sent back to the user³¹. It helps monitor how the new version behaves before deciding whether to validate a release³¹,³². The latter temporarily redirects the traffic of a subset of users to an alternative version of a feature to find out how they behave³³,³⁴. Both are about measuring things before actually upgrading anything.
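
The shadow idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the handlers are plain functions, the record callback is hypothetical, and the copy is sent synchronously, whereas a real setup would most likely mirror traffic asynchronously at the proxy level.

```python
from typing import Callable

Handler = Callable[[str], str]


def serve_with_shadow(
    request: str,
    production: Handler,
    shadow: Handler,
    record: Callable[[str, str, str], None],
) -> str:
    """Answer with the production response and only record what the shadow did."""
    response = production(request)
    try:
        shadow_response = shadow(request)  # the copy of the request
    except Exception as exc:  # a failing shadow must never affect the user
        shadow_response = f"error: {exc!r}"
    record(request, response, shadow_response)  # kept for later comparison
    return response  # the shadow response is never sent back


observations = []


def broken_shadow(request: str) -> str:
    raise RuntimeError("boom")


answer = serve_with_shadow(
    "ping", str.upper, broken_shadow, lambda *row: observations.append(row)
)
print(answer)  # PING
```

Even though the shadow version crashes here, the user still gets the production answer, and the failure only shows up in the recorded observations.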

When you want finer-grained control over the dynamics of the upgrade, you can deploy the new version alongside the current one³⁵ and incrementally redirect users to it³⁶,³⁷. That way, you can monitor how things go before adding more users. This most likely uses tooling similar to A/B testing, but the aim here is to upgrade the stack, not just to monitor user behavior³⁸. This is particularly useful if you feel that something may go wrong and don’t want to impact everyone at once³⁹,⁴⁰,⁴¹. You may therefore want to target a particular subset of users who expect cutting-edge features and accept that they might be unstable from time to time⁴²,⁴³. If, on the other hand, you feel confident about the upgrade, a blue/green deployment is likely better suited⁴⁴. Also, to provide a consistent user experience, you need to deal with sticky sessions here.
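
One possible way to get stickiness without storing any session state is to derive the version from a hash of the user identifier. The Python sketch below assumes that every request carries a stable user identifier; the function names are made up for the illustration, and this is an alternative to session affinity at the load balancer, not something prescribed by the cited articles.

```python
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, buckets) using a cryptographic hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def version_for(user_id: str, canary_percent: int) -> str:
    """Send the user to the new version if their bucket falls under the canary share."""
    return "new" if bucket(user_id) < canary_percent else "old"


# The same user always gets the same answer for a given percentage.
print(version_for("alice", 10) == version_for("alice", 10))  # True
```

Because the bucket of a user never changes, raising the percentage only adds users to the new version: someone who was already on it at 10% stays on it at 50%.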

Now, if you want even finer-grained control over what end users can see, consider adopting trunk-based development, in which features are hidden behind feature flags. You can test the latest version (with all flags turned on) and the production version (with only the production flags turned on) and deploy often⁴⁵. Because you deploy the artifacts in a disabled state, there is less fear of things going wrong and incremental deployment becomes less necessary. The incremental upgrade can be done at the level you want⁴⁶. You can even imagine a user sending a particular header to ask for the new version of a feature⁴⁷. The aim is still to enable all the flags, but this can be done much more smoothly⁴⁸. You can implement blue/green, A/B testing and canary deployments using feature flags⁴⁹, more easily than at the network and session layers⁵⁰. On the other hand, it requires a large amount of work beforehand to test the code appropriately and to get a team doing trunk-based development.
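
The header-based opt-in could look like the following Python sketch. The flag names and the X-Enable-Features header are invented for the example, header lookup is simplified (real HTTP header names are case-insensitive), and a real system would presumably read the flags from a flag service rather than from a dictionary.

```python
from typing import Mapping

# Hypothetical flag states: the code for both features is deployed,
# but only "dark-mode" is released to everyone.
FLAGS = {"new-checkout": False, "dark-mode": True}


def is_enabled(flag: str, headers: Mapping[str, str], flags: Mapping[str, bool] = FLAGS) -> bool:
    """Return True if the flag is released, or if the request opts in to it by header."""
    requested = {name.strip() for name in headers.get("X-Enable-Features", "").split(",")}
    if flag in requested and flag in flags:
        return True  # explicit opt-in, only for flags that actually exist
    return flags.get(flag, False)


print(is_enabled("new-checkout", {}))  # False
print(is_enabled("new-checkout", {"X-Enable-Features": "new-checkout"}))  # True
```

Releasing the feature to everyone then amounts to flipping the flag to True, without deploying anything new.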

LaunchDarkly made a nice summary of this:

Strategy	Description	Benefits	Drawbacks	When to use	Downtime mitigation
Big-bang deployment	Simultaneous deployment of all changes, impacting all users at once	Simple and fast implementation	High risk, no easy rollback, downtime for entire application	For small, non-critical applications	Limited; consider feature flags for quick disabling of features
Rolling deployment	Gradual rollout to smaller user groups, reducing risk and allowing rollback	Lower risk, easier rollback, less downtime	More complex, higher user impact than canary/blue-green	When release frequency is a high priority	Gradual rollout limits impact; easier rollback
Recreate deployment	Recreation of the entire environment with new changes	Simple, predictable	Downtime, limited testing, scalability limitations	When only a single app version can run at a time	Limited; consider scheduling during off-peak hours
Canary deployment	Subset of users receive changes for early issue detection	Early issue detection, real user feedback, easy rollback	Slower release, increased infrastructure costs	When real user feedback is a priority	Limited impact; quick rollback possible
Blue/green deployment	Two identical production environments for seamless switchovers	Easy rollbacks, improved incident response, zero downtime	Extra infrastructure cost, not suitable for user-dependent rollouts	When release frequency is a priority with low risk tolerance	Instant rollback; zero downtime during switch
Shadow deployment	Deploys changes in parallel, unseen by users	Low-risk testing with real-world data	Complex setup, monitoring overhead, not suitable for all changes	When load testing is a priority and downtime tolerance is low	No impact on production; thorough pre-deployment testing

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

no silver bullet

There appears to be no ideal deployment method. If you have a bunch of users using the application frequently and you can “test” the user experience on a few users, a canary might be a good candidate.

If, on the other hand, the application is very seldom used, a canary might take too long (because of the long feedback loop) and a simple recreate might be enough.
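
A back-of-the-envelope calculation shows why the feedback loop gets long. The Python sketch below assumes that you want to observe a fixed number of canary requests before trusting the new version; the 500 samples, the 5% canary share and the traffic figures are arbitrary numbers picked for the illustration.

```python
def canary_observation_hours(
    requests_per_hour: float, canary_percent: float, min_samples: int
) -> float:
    """Hours needed before the canary has served min_samples requests."""
    canary_rate = requests_per_hour * canary_percent / 100  # requests per hour hitting the canary
    if canary_rate <= 0:
        raise ValueError("the canary receives no traffic")
    return min_samples / canary_rate


busy = canary_observation_hours(10_000, 5, 500)
quiet = canary_observation_hours(20, 5, 500)
print(busy, quiet)  # 1.0 500.0
```

With the same canary share, the busy application gets its answer in an hour, while the seldom-used one would need about three weeks.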

Doing canary also comes with the rabbit hole of sticky sessions.

Also, you may have to replicate user activity from the old infrastructure to the new one, so that when a user is moved to the new environment, everything they did in the previous one is kept.

Blue/green lacks the possibility to upgrade incrementally and the comfort of the feedback loop that comes with it, but it does not come with the sticky session hell. Like canary, it needs a way to “copy” the user data from blue to green dynamically so that the transition goes well.

Recreate implies downtime for users, but it spares you from juggling several deployments.

If you want more power over the upgrade, trunk-based development might help, but it requires a lot of effort (technical, but mostly psychological) to put in place.

In conclusion, no deployment strategy can be recommended for all situations; your mileage may vary⁵¹. Each is Zen along some dimensions and hard to deal with along others.

Before deciding which deployment method suits your use case, you definitely need:

good monitoring to understand user behavior,
a good idea of the code change, to find out how hellish keeping two environments side by side will be.

A complex strategy might increase your risk if it overtaxes your team’s capabilities

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
No matter what deployment strategy you’re using, it’s important to keep tabs on what’s happening in your environments

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
Continuous Deployment (CD) is an approach where every change that passes automated testing is automatically deployed to production. It’s all about speed and automation without human intervention

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

Continuous Deployment (CD) is a software deployment strategy that allows you to release new versions of your application to production at any time, without human intervention

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
In this deployment strategy, you shut down the old version of the application completely, deploy the new version, and then turn the whole system back on. This means there will be a downtime while the old software is shut down and the new one is booted up.

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
Unlike the big-bang deployment, which overwrites the older version on a running environment, recreate deployments involve terminating the previous version’s deployment and recreating the entire environment with the newly deployed application

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
recreate deployment, the old version is completely shut down, the new version is deployed, and then the server is turned back on for users. Like big-bang, this is quite a simple approach but inevitably results in downtime.

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
big bang deployment is a type of software deployment in which all of the changes are deployed to the production environment all at once.

[…]

contrast to a phased or incremental deployment, where the changes are deployed in stages or in small batches

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
As the name suggests, this deployment strategy involves deploying all changes at once, impacting all users simultaneously.

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
big-bang deployment is perhaps the simplest strategy, but it’s also the riskiest. In this approach, the software is delivered all at once, replacing the old version in an instantaneous transition

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
With this deployment strategy, we’ll have both the old and new versions of the software running side by side. You might also know it as the red/black deployment strategy. Just to clarify, the stable, or older version of the application is always referred to as blue (or red), while the newer version is green (or black).

[…]

Once the new version has been thoroughly tested and meets all the requirements, the load balancer will automatically switch traffic over to it

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
A blue/green deployment is one in which the existing application (blue environment) and the new deployment (green) are independently running in two identical production environments. Once testing is complete on the green environment, live traffic is redirected to it making it live for all users

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
Once the ‘Green’ environment is verified and ready, a router or load balancer is used to switch traffic from ‘Blue’ to ‘Green.’ The ‘Blue’ environment now becomes the staging environment for the next release

— https://dev.to/yogini16/understanding-different-types-of-deployments-38c0 ([2025-04-16 Wed])

↩︎
environments are called Blue and Green environments. At any time, only one of these is ‘live’ and serving users, with the other being updated to reflect new changes. When you want to update your application, you do so in the environment that isn’t live – e.g., the blue environment is live, the green environment is updated, then users switch over to the green environment.

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
running two identical environments, one serving as the active production environment (blue) and the other as a new release candidate (green). The new release candidate is thoroughly tested before being switched with the production environment, allowing for a smooth transition without any downtime or errors

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
For this to work, both environments must have the same database schema to share data seamlessly, or a system to synchronize different schema versions

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
rolls out changes incrementally, reducing the risk of downtime and allowing for easy rollbacks in case of errors

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
allows for the new version to be deployed while the old version is still running, ensuring that there is no interruption to service

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
incremental deployments are difficult to manage and requires routing a percentage of traffic to the updated application and ensuring that this change is sticky (i.e., the user continues to see the same version rather than switch between versions on subsequent requests).

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
rolling deployment is a strategy for updating and deploying new versions of software in a controlled and gradual manner

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
Rolling deployments (or rolling updates) adopt a gradual rollout approach, where changes are exposed to an increasing percentage of users incrementally until fully released

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
In a Rolling Deployment, the new version is deployed incrementally to subsets of servers or nodes, usually behind a load balancer. Each subset is taken offline, updated, and brought back into the rotation.

— https://dev.to/yogini16/understanding-different-types-of-deployments-38c0 ([2025-04-16 Wed])

↩︎ ↩︎
creating a new replica set with the updated version of the software while gradually scaling down the old replica set

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
ramped deployment, we start by updating a small percentage of servers at a time and gradually increase that percentage over time

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
ramped option gradually increases the percentage of each server running the new version, whereas the rolling approach takes turns to update each server in one go.

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
some servers run the new version of the application, while others continue to host the older one. As traffic comes in to the application, some users interact with the new code, while others land on the known-good production version.

— https://www.techtarget.com/searchitoperations/answer/When-to-use-canary-vs-blue-green-vs-rolling-deployment ([2025-04-17 Thu])

↩︎
In a rolling deployment, infrastructure instances of the previous version are slowly replaced with nodes running the new version iteratively

— https://www.koyeb.com/blog/blue-green-rolling-and-canary-continuous-deployments-explained#blue-green-deployment- ([2025-04-17 Thu])

↩︎
small percentage of traffic is then routed to the new replica set, while the majority of the traffic continues to be served by the original replica set.

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
In a canary deployment, the new version is released to a small number of users

— https://www.koyeb.com/blog/blue-green-rolling-and-canary-continuous-deployments-explained#rolling-deployment- ([2025-04-17 Thu])

↩︎
In this deployment strategy, we’ll deploy the new version alongside the old one, but users won’t have access to the new version right away. It’s like the new version is hiding in the shadows. We’ll send a copy or “fork” of the requests the old version receives to the shadow version to see how it will handle them when it goes live.

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
Shadow Deployment, a copy of the production traffic is mirrored to a shadow environment where a new version or feature is tested. The results of the testing are observed without affecting the live environment

— https://dev.to/yogini16/understanding-different-types-of-deployments-38c0 ([2025-04-16 Wed])

↩︎
shadow deployments differ is that they can simulate real traffic patterns under real load by replicating actual requests from production traffic and sending them to the shadow environment in real-time. The output of these requests are discarded and never shown to the user (thus the name “shadow”), but the metrics they provide can help uncover any issues that might occur in a real-world production environment.

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎ ↩︎
shadow deployment, also known as dark launching, the new version runs alongside the current version but doesn’t receive any real user traffic. Instead, it receives a copy of the production traffic, allowing DevOps engineers to test how it performs under real conditions without affecting users

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
A/B testing deployment is a way for developers to test out new versions of their software. They do this by deploying the new version alongside the older version, but only making the new version available to a select group of users.

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
A/B testing deployment involves running two versions of the application side by side and directing a portion of traffic to a select group of users. This allows you to compare the performance or user acceptance of different versions side-by-side

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
done by creating a new replica set with the updated version of the software while keeping the original replica set running

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
canary deployment, we’ll set up the new version and slowly shift production traffic from the older version to the new one. For instance, during the deployment process, the older version might still handle 75% of all traffic while the newer version handles the remaining 25%.

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
canary deployment is a technique for rolling out new features or changes to a small subset of users or servers before releasing the update to the entire system

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
Canary deployments should not be confused with A/B testing, which is for testing user experience and engagement across potential new versions

— https://www.koyeb.com/blog/blue-green-rolling-and-canary-continuous-deployments-explained#rolling-deployment- ([2025-04-17 Thu])

↩︎
fraction of the users (the “canaries”) is selected for the new version rollout. Performance and stability are monitored in real-time. If the canaries don’t experience issues, more users are switched over to the new version. This process continues incrementally until the new version is fully deployed or any issues are identified, in which case the deployment is halted.

— https://dev.to/yogini16/understanding-different-types-of-deployments-38c0 ([2025-04-16 Wed])

↩︎
Canary deployment allows you to dip your toe in the water before jumping in. A small subset of users or servers receives the new updates first (the canary release), allowing you to test the functionality and assess how it may impact the user experience before increasing traffic to the updated server (and thus the number of users) over time

— https://www.nearshore-it.eu/articles/deployment-strategies-101-types-pros-cons-devops-more/ ([2025-04-16 Wed])

↩︎
If issues are detected during the canary deployment, it can be quickly rolled back to the original replica set

— https://dev.to/pavanbelagatti/kubernetes-deployments-rolling-vs-canary-vs-blue-green-4k9p

↩︎
“canary” users are typically chosen randomly, but they can also be selected based on their current usage patterns or other characteristics that may make them more likely to respond positively to the change.

— https://devopsbootcamp.org/8-deployment-strategies-explained-and-compared/ ([2025-04-16 Wed])

↩︎
canary deployments are designed to target a small subset of users to allow for the early detection of issues without impacting the entire user base.

[…]

Besides early detection of bugs, this process usually targets a specific subset of “power users” who can provide user feedback on changes.

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
While Blue/green deployments are used to eliminate downtime, Canary deployments are used to test a new feature in a production environment with minimal risk.

— https://medium.com/buildpiper/canary-vs-blue-green-deployment-which-one-should-you-choose-a7d86d2929f0

↩︎
Strategies like trunk-based development, whereby code is constantly merged into the main branch, combined with continuous integration and continuous delivery (CI/CD) mean that new code is constantly being pushed to a production environment. However, it is only when the relevant feature flag is enabled, that the new features are actually released.

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
means that release doesn’t encompass the entire application code, but can be thought of on a feature-by-feature basis

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
Features are developed and included in the codebase but hidden behind a toggle. The toggle can be controlled externally, enabling or disabling features for different users or groups. Features can be gradually rolled out by toggling them on for specific users or groups, monitoring performance and user feedback.

— https://dev.to/yogini16/understanding-different-types-of-deployments-38c0 ([2025-04-16 Wed])

↩︎
original concept of DevOps, release came before the deployment process within the software development and release cycle. A release consisted of freezing code, after it had undergone testing and validation in the staging environment, and marking it ready for deployment to production. The application deployment was the process of moving this code to a production environment, thereby making it live.

The introduction of feature flags changed this. In a modern DevOps process, deployment now comes before release. This is because feature flags decouple the deploy from the release. Even though code is deployed to production, the code may not be running. Users will see the old version until the new version of the application is enabled. Strategies like trunk-based development, whereby code is constantly merged into the main branch, combined with continuous integration and continuous delivery (CI/CD) mean that new code is constantly being pushed to a production environment. However, it is only when the relevant feature flag is enabled, that the new features are actually released.

This also means that release doesn’t encompass the entire application code, but can be thought of on a feature-by-feature basis.

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
using feature flags it is possible to replicate a blue/green deployment strategy without requiring duplicate environments. The green deployment code would be deployed to the standard production environment but hidden behind a feature flag. The new functionality would only be visible to the internal testing team via targeting and, once the release has been validated, the flag would be turned on for all users.

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
Feature flags alter the equation when it comes to rolling or incremental releases because they become relatively trivial to manage

— https://launchdarkly.com/blog/deployment-strategies/ ([2025-04-16 Wed])

↩︎
Here are some scenarios where using feature flags make more sense than blue/green deployments. Like everything else in software, it’s about trade-offs,

— https://www.getunleash.io/blog/blue-green-deployments ([2025-04-16 Wed])

↩︎