How to handle intermittent resource creation failures?

I have a service that deploys a new EC2 instance behind an ELB. The code works fine most of the time, but every once in a while I get the error "Target groups 'arn:aws:elasticloadbalancing:ca-central-arn:...' not found (Service: AmazonElasticLoadBalancing; Status Code: 400; Error Code: TargetGroupNotFound" when trying to register targets in the target group. Here is a code snippet:

AmazonElasticLoadBalancing client = AmazonElasticLoadBalancingClient.builder()
        ....
        .build();
...
CreateTargetGroupRequest createTargetGroupRequest = new CreateTargetGroupRequest();
...
CreateTargetGroupResult targetGroupResult = client.createTargetGroup(createTargetGroupRequest);
TargetGroup targetGroup = targetGroupResult.getTargetGroups().stream().findFirst().orElse(null);
assert targetGroup != null;
RegisterTargetsRequest registerTargetsRequest = new RegisterTargetsRequest();
registerTargetsRequest.setTargetGroupArn(targetGroup.getTargetGroupArn());
...
client.registerTargets(registerTargetsRequest);

When I get the error and check the target groups in the AWS Console, I can see that the target group exists but has no registered targets. Is this some obscure timing issue? Should I put in a delay between creating the target group and registering the targets? Would it be a good idea to retry the operation if it throws the TargetGroupNotFound exception?

Thanks for any suggestions.

1 Answer

While I haven't seen this particular problem, there is a high chance that this is a timing issue. It is very easy to write code that creates something (a target group in this case) and then tries to use it (here, registering targets with the target group) before the service control plane has had time to react.

Rather than just adding a delay, I would query the service to see whether the target group has been successfully created. If it hasn't, wait (say, for a second) and try again. You might even use exponential backoff in that loop, with a condition for permanent failure (just in case).

There's an excellent article about this in the Amazon Builders' Library.
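The polling loop described above could be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: `BackoffPoller` and `waitUntil` are hypothetical names, and the condition is a plain `BooleanSupplier` so the sketch runs without the AWS SDK. In the real service, the condition would presumably wrap a call such as `client.describeTargetGroups(...)` for the new ARN and return false when that call reports the group as not found.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the AWS SDK.
class BackoffPoller {

    /**
     * Checks the condition up to maxAttempts times. After each failed check
     * it sleeps, doubling the delay each time but never exceeding maxDelayMs.
     * Returns true as soon as the condition holds, or false if the attempts
     * run out or the thread is interrupted (the permanent-failure case).
     */
    static boolean waitUntil(BooleanSupplier condition, int maxAttempts,
                             long initialDelayMs, long maxDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulates a resource that only becomes visible on the third check.
        int[] checks = {0};
        boolean visible = waitUntil(() -> ++checks[0] >= 3, 5, 100, 2000);
        System.out.println("visible=" + visible + " after " + checks[0] + " checks");
    }
}
```

In the code from the question, the call to `client.registerTargets(registerTargetsRequest)` would only be made once `waitUntil` returns true, and a false result would be treated as a deployment failure. Because the service may still briefly report the group as missing after a successful check, the same helper could also wrap the register call itself, with the condition returning false when TargetGroupNotFound is thrown.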

AWS EXPERT, answered a year ago
  • Thanks for this. I will give it a try.
