I am trying to set up the AWS Node Termination Handler and am running into an issue where the EventBridge rules are invoked, but no messages show up in the SQS queue. I have verified that the termination handler can communicate with the SQS queue. I have also tested spinning instances up and down, and I can see the invocations for the EventBridge rules. However, no messages appear in the queue.
NOTE: I tried to attach a CloudWatch screenshot showing the rule invocations with no messages arriving in the queue, but it seems pictures are not supported here yet.
Below are my configs for this:
SQS policy:
resource "aws_sqs_queue_policy" "termination_handler_queue_policy" {
  queue_url = module.termination_handler_queue.sqs_queue_id
  policy = jsonencode({
    "Version" : "2012-10-17",
    "Id" : "sqspolicy",
    "Statement" : [
      {
        "Sid" : "TermEventsToHandlerQueue",
        "Effect" : "Allow",
        "Principal" : {
          "Service" : ["events.amazonaws.com", "sqs.amazonaws.com"]
        },
        "Action" : "sqs:*",
        "Resource" : "${module.termination_handler_queue.sqs_queue_name}",
        "Condition" : {
          "ArnEquals" : {
            "aws:SourceArn" : [
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-asg-lifecycle-rule",
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-ec2-status-rule",
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-ec2-spot-interruption-rule",
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-ec2-rebalance-rule"
            ]
          }
        }
      }
    ]
  })
}
EventBridge Config:
module "termination_handler_eventbridge" {
  source     = "terraform-aws-modules/eventbridge/aws"
  version    = "~> 1.14.0"
  create_bus = false
  rules = {
    node-termination-asg-lifecycle = {
      description = "Capture eks asg lifecycle events."
      event_pattern = jsonencode({
        "source" : ["aws.autoscaling"],
        "detail-type" : [
          "EC2 Instance Launch Successful",
          "EC2 Instance Terminate Successful",
          "EC2 Instance Launch Unsuccessful",
          "EC2 Instance Terminate Unsuccessful",
          "EC2 Instance-launch Lifecycle Action",
          "EC2 Instance-terminate Lifecycle Action"
        ],
        "detail" : {
          "AutoScalingGroupName" : ["eks-Group_A", "eks-Group_B"]
        }
      })
      enabled = true
    }
    node-termination-ec2-status = {
      description = "Capture ec2 status events"
      event_pattern = jsonencode({
        "source" : ["aws.ec2"],
        "detail-type" : ["EC2 Instance State-change Notification"]
      })
      enabled = true
    }
    node-termination-ec2-spot-interruption = {
      description = "Capture spot interruption events"
      event_pattern = jsonencode({
        "source" : ["aws.ec2"],
        "detail-type" : ["EC2 Spot Instance Interruption Warning"]
      })
      enabled = true
    }
    node-termination-ec2-rebalance = {
      description = "Capture ec2 rebalance events"
      event_pattern = jsonencode({
        "source" : ["aws.ec2"],
        "detail-type" : ["EC2 Instance Rebalance Recommendation"]
      })
      enabled = true
    }
  }
  targets = {
    node-termination-asg-lifecycle = [
      {
        name = "termination_handler-sqs-life"
        arn  = module.termination_handler_queue.sqs_queue_arn
      },
    ]
    node-termination-ec2-status = [
      {
        name = "termination_handler-sqs-status"
        arn  = module.termination_handler_queue.sqs_queue_arn
      },
    ]
    node-termination-ec2-spot-interruption = [
      {
        name = "termination_handler-sqs-int"
        arn  = module.termination_handler_queue.sqs_queue_arn
      },
    ]
    node-termination-ec2-rebalance = [
      {
        name = "termination_handler-sqs-rebalance"
        arn  = module.termination_handler_queue.sqs_queue_arn
      },
    ]
  }
  tags = {
    Name    = "node-termination-handler-bus"
    Service = "aws-node-termination-handler"
  }
}
Is there somewhere specific you can point me towards? The only SQS CloudTrail entries are for the creation of the queue and SetAttributes calls on it, and events.amazonaws.com just shows ListTargetsByRule and DescribeRule. I don't see anything useful in CloudTrail.
I'm sorry, I forgot that CloudTrail won't log access-denied actions and does not have an integration like the one it has with S3. What I have in mind is that your rule may not have sufficient permissions to push events into SQS. Have you tried giving the queue a wider SQS resource policy (for example without conditions, assuming you're working in a test environment) and checking whether that resolves the issue? I also noticed one thing in your SQS policy:
"Resource": "${module.termination_handler_queue.sqs_queue_name}"
Here you're referring to the name of the SQS queue, but since this is a policy, you should pass the ARN here. Regards
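To make the suggestion concrete, here is a rough sketch of how the queue policy from the question might look with the ARN in place of the name. It reuses `module.termination_handler_queue` and `local.account_id` from the question, so it is a fragment rather than a standalone configuration. Narrowing the action to `sqs:SendMessage` and keeping only `events.amazonaws.com` as the principal are my own assumptions about what EventBridge needs in order to deliver to the queue; they are not something the original config confirms.

```hcl
# Sketch only: depends on module.termination_handler_queue and
# local.account_id as defined in the question's configuration.
resource "aws_sqs_queue_policy" "termination_handler_queue_policy" {
  queue_url = module.termination_handler_queue.sqs_queue_id

  policy = jsonencode({
    Version = "2012-10-17"
    Id      = "sqspolicy"
    Statement = [
      {
        Sid    = "TermEventsToHandlerQueue"
        Effect = "Allow"
        # Assumption: only EventBridge needs to write to this queue.
        Principal = { Service = "events.amazonaws.com" }
        # Assumption: sending messages is the only permission required.
        Action = "sqs:SendMessage"
        # The fix: reference the queue ARN, not the queue name.
        Resource = module.termination_handler_queue.sqs_queue_arn
        Condition = {
          ArnEquals = {
            "aws:SourceArn" = [
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-asg-lifecycle-rule",
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-ec2-status-rule",
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-ec2-spot-interruption-rule",
              "arn:aws:events:us-east-2:${local.account_id}:rule/node-termination-ec2-rebalance-rule",
            ]
          }
        }
      }
    ]
  })
}
```

For the wider test mentioned above, you could temporarily remove the `Condition` block in a test environment. If messages start arriving, the rule ARNs in `aws:SourceArn` are the next thing to check against the names the module actually creates.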