Custom Canary Checks with ECS CodeDeploy Blue/Green Canary Deployment

Hey everyone,

I have an ECS Fargate cluster up and running with automated CI/CD via CodePipeline & CodeDeploy. The infrastructure is managed with Terraform and works as desired. My deployment strategy is blue/green, and I went for the CodeDeployDefault.ECSCanary10Percent5Minutes deployment configuration (as described here: https://docs.aws.amazon.com/AmazonECS/latest/userguide/deployment-type-bluegreen.html). The gradual traffic shift between the blue and green task sets works fine. However, I haven't found any way to influence the canary checks (for example, to abort the deployment if something is wrong) or to introduce custom canary checks during the deployment process. I also found very little information on how CodeDeploy determines the health of a canary deployment. I'd like to perform a number of checks during deployment and abort if any of them fails.

My questions are:

  • How is a canary release deemed healthy/unhealthy by default? (My assumption: the ALB target group health check.)
  • What's a possible best practice to observe deployment health during canary releases and perform automated rollback? Is this even possible currently?

Upon googling some more it seems that adding CloudWatch Alarms and an AlarmConfiguration to the Deployment Group could do the trick (as documented here: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-groups-configure-advanced-options.html). Is that the idiomatic way to achieve this? If so, how do I ensure that the CloudWatch alarm check is only performed against the service version that is currently being deployed?

Thanks for any pointers everyone!

Maik

maik
asked 2 years ago, 896 views
1 Answer
Accepted Answer

Another option is to add an AfterAllowTraffic lifecycle hook to your application's AppSpec file. For ECS deployments, each hook names a Lambda function, and CodeDeploy invokes it once traffic begins to flow into your canary tasks. You can use whatever custom logic you like in that function. To complete the lifecycle hook, your Lambda function (or its delegate) must call CodeDeploy's PutLifecycleEventHookExecutionStatus API action with a status of Succeeded or Failed.

You can find a tutorial on this here and a lifecycle hook reference here.
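As a rough illustration of the hook approach, here is a minimal Python sketch of such a Lambda function. It assumes the hook event carries the DeploymentId and LifecycleEventHookExecutionId fields and reports back through boto3's put_lifecycle_event_hook_execution_status. The run_checks function and its single placeholder check are hypothetical stand-ins for whatever health logic fits your service.

```python
def run_checks():
    """Hypothetical placeholder: replace with real checks (HTTP probes, queries, ...)."""
    return {"placeholder_check": True}


def evaluate_health(check_results):
    """Map named boolean check results to a CodeDeploy hook status string."""
    # An empty result set is treated as a failure so a misconfigured hook cannot pass silently.
    if check_results and all(check_results.values()):
        return "Succeeded"
    return "Failed"


def handler(event, context, client=None, checks=run_checks):
    """Lambda entry point for a CodeDeploy lifecycle hook (sketch, not a reference implementation)."""
    if client is None:
        # Imported lazily so the module can be exercised without boto3 installed.
        import boto3
        client = boto3.client("codedeploy")

    try:
        status = evaluate_health(checks())
    except Exception:
        # Any error while running the checks is reported as a failed hook.
        status = "Failed"

    # Tell CodeDeploy whether the hook passed; a "Failed" status fails the deployment.
    client.put_lifecycle_event_hook_execution_status(
        deploymentId=event["DeploymentId"],
        lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
        status=status,
    )
    return status
```

The client and checks parameters exist only to make the function easy to test locally with a stub; in Lambda the defaults are used. The function's execution role would also need permission to call codedeploy:PutLifecycleEventHookExecutionStatus.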

AWS
EXPERT
answered 2 years ago
  • Thanks for your thoughts Michael, I will have a look at this! Just so that I can put your idea into a little more context: what would you consider the "AWS-idiomatic" way, and which approach would you recommend in general? Would you say that plugging a Lambda into the AfterAllowTraffic hook or using an Alarm Configuration in the Deployment Group is the better way to handle this? Thanks.

  • The answer, as with most things, is "it depends." It comes down to a judgment call informed by the needs of your business and the specific application. If your deployment health can be expressed exclusively in terms of CloudWatch metrics, then a metric-based alarm could work. Otherwise, if it would be more useful to express your health evaluation logic in code, then a Lambda function will work better.
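For the alarm-based route, the following Python sketch builds the settings that would attach CloudWatch alarms to a deployment group and enable automatic rollback when one of them fires. The helper name alarm_rollback_settings and all names in the usage example are hypothetical; the keys follow the parameters of CodeDeploy's UpdateDeploymentGroup action as exposed by boto3. Since the infrastructure in the question is managed with Terraform, treat this only as an illustration of which settings are involved.

```python
def alarm_rollback_settings(application_name, deployment_group_name, alarm_names):
    """Build keyword arguments for boto3's codedeploy.update_deployment_group (sketch)."""
    return {
        "applicationName": application_name,
        "currentDeploymentGroupName": deployment_group_name,
        # CodeDeploy stops the deployment when any listed alarm goes into ALARM state.
        "alarmConfiguration": {
            "enabled": True,
            "alarms": [{"name": name} for name in alarm_names],
        },
        # Roll back automatically on failure or when an alarm stops the deployment.
        "autoRollbackConfiguration": {
            "enabled": True,
            "events": ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"],
        },
    }


# Usage (all names hypothetical; requires boto3 and AWS credentials):
#   import boto3
#   settings = alarm_rollback_settings("my-app", "my-deployment-group", ["green-5xx-rate"])
#   boto3.client("codedeploy").update_deployment_group(**settings)
```

Whether an alarm only reflects the version being deployed depends on how its metric is dimensioned, for example by scoping it to the green target group. That is an assumption about your setup and not something this sketch handles.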
