Running SSM commands after an EC2 instance is created (Terraform)


I have created a Terraform module that creates an EC2 instance. The module also creates aws_ssm_association resources, each of which essentially runs a command on the instance.

My custom command works perfectly: it sets up Docker Swarm and deploys an application.

The problem is that I need to install the Amazon CloudWatch agent so I can send logs and metrics to CloudWatch.

I used the AWS-ConfigureAWSPackage document to install AmazonCloudWatchAgent. This only works when I run it from the AWS web console; when I run the same command with Terraform, it fails.

I then used my instance's user_data script to install the AmazonCloudWatchAgent.

Then I tried to set up the CloudWatch agent using the AmazonCloudWatch-ManageAgent document. Again, this works only from the AWS web console and not from Terraform.

I get the following error: CloudWatch Agent not installed. Please install it using the AWS-ConfigureAWSPackage SSM Document.

When I log in to the EC2 instance after creation, the agent is installed, and when I run the document from the AWS web console, it works.

Is there some race condition here? Is Terraform trying to run these commands too soon after the EC2 instance is created?

I have no idea how to handle this. I would really prefer all configuration on the EC2 instance to be done with Systems Manager rather than relying on the user_data script.

Any ideas would be greatly appreciated.

3 Answers

Accepted Answer

Okay, so I found a solution to my problem. I used the Terraform time_sleep resource to add a delay of about a minute, which gave the instance time to pass its status checks. After that, all the aws_ssm_association resources were created successfully.

Creating the time_sleep resource:

# Sleep
# Wait 1 minute
resource "time_sleep" "wait_60_seconds" {
  depends_on = [aws_instance.ec2-instance]

  create_duration = "60s"
}
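
Note that time_sleep comes from the hashicorp/time provider rather than the AWS provider. Terraform can normally infer this on its own, but declaring it explicitly keeps the module self-describing. A minimal sketch, assuming the module has no existing required_providers block to merge this into:

```hcl
terraform {
  required_providers {
    # Provides the time_sleep resource; no version constraint is pinned here,
    # which is an assumption you may want to tighten.
    time = {
      source = "hashicorp/time"
    }
  }
}
```

After adding it, run terraform init again so the provider is downloaded.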

You can then make the aws_ssm_association resource depend on the time_sleep resource, so that it is only created after the one-minute delay.

# SSM Run command
# Configure Cloudwatch Agent
resource "aws_ssm_association" "cloudwatch-config" {
  name = "AmazonCloudWatch-ManageAgent"

  targets {
    key    = "InstanceIds"
    values = [aws_instance.ec2-instance.id] # Use the correct instance ID from aws_instance
  }

  parameters = {
    action                        = "configure"
    mode                          = "ec2"
    optionalConfigurationSource   = "ssm"
    optionalConfigurationLocation = "CWA_config"
    optionalRestart               = "yes"
  }

  depends_on = [aws_ssm_association.cloudwatch-agent, time_sleep.wait_60_seconds]
}
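
The depends_on list refers to aws_ssm_association.cloudwatch-agent, which is not shown in this answer. As a rough sketch only, and assuming it is the install step described in the question (the AWS-ConfigureAWSPackage document installing AmazonCloudWatchAgent), it might look something like the block below. The 300-second timeout is an arbitrary illustrative value. As far as I understand, depends_on only orders the creation of the associations, so wait_for_success_timeout_seconds is used here to make Terraform wait for the install to report success before moving on to the configure step.

```hcl
# Hypothetical sketch of the install association; not taken from the original module.
resource "aws_ssm_association" "cloudwatch-agent" {
  name = "AWS-ConfigureAWSPackage"

  targets {
    key    = "InstanceIds"
    values = [aws_instance.ec2-instance.id]
  }

  parameters = {
    action = "Install"
    name   = "AmazonCloudWatchAgent"
  }

  # Assumed value: fail the apply if the association has not reached
  # Success within 5 minutes.
  wait_for_success_timeout_seconds = 300

  depends_on = [time_sleep.wait_60_seconds]
}
```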
answered a month ago

It appears that you're encountering a timing issue when trying to install and configure the CloudWatch agent using Terraform and AWS Systems Manager (SSM). This is likely due to a race condition between the EC2 instance creation, the SSM agent initialization, and the execution of your SSM commands.

Here are some suggestions to address this issue:

  1. Ensure proper IAM permissions: Make sure the IAM role attached to your EC2 instance includes the necessary managed policies, in particular AmazonSSMManagedInstanceCore and CloudWatchAgentServerPolicy. These allow the instance to interact with Systems Manager and CloudWatch.

  2. Use a null_resource with a local-exec provisioner: Instead of relying solely on the aws_ssm_association resource, you can use a null_resource with a local-exec provisioner in your Terraform code. This allows you to add a delay before executing the SSM commands, giving the instance and SSM agent time to initialize fully.

  3. Implement a retry mechanism: You can create a script that attempts to run the SSM command multiple times with a delay between attempts. This can help overcome temporary unavailability of the SSM agent.

  4. Schedule the association with SSM State Manager: The aws_ssm_association resource already creates a State Manager association. Rather than relying only on the run that happens right after creation, you can give the association a schedule (the schedule_expression argument) so that State Manager re-applies the document later, once the instance has fully initialized.

  5. Verify SSM agent status: Before running your SSM commands, you can add a step to check the status of the SSM agent on the instance. This ensures that the agent is ready to receive commands.

  6. Split the process: Consider separating the CloudWatch agent installation and configuration into two steps. First, ensure the agent is installed (either through user_data or an SSM command), and then use a separate SSM command to configure it.

  7. Use the latest SSM agent: Ensure that your EC2 instance is using the latest version of the SSM agent, as newer versions may have improved handling of race conditions.

By implementing one or more of these strategies, you should be able to reliably install and configure the CloudWatch agent using Terraform and Systems Manager, without relying solely on user_data scripts.
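
Suggestions 2, 3 and 5 can be combined into one rough sketch: a null_resource whose local-exec provisioner waits for the EC2 status checks and then polls Systems Manager until the instance reports as Online. This assumes the AWS CLI is installed and has credentials and a region configured on the machine running Terraform, that local-exec runs under a POSIX shell, and that the instance is named aws_instance.ec2-instance as in the accepted answer. The resource name wait_for_ssm and the 30 x 10 second retry budget are illustrative choices, not values from the original post.

```hcl
# Hypothetical helper: blocks until the instance is reachable through SSM.
resource "null_resource" "wait_for_ssm" {
  # Re-run the wait whenever the instance is replaced.
  triggers = {
    instance_id = aws_instance.ec2-instance.id
  }

  provisioner "local-exec" {
    command = <<-EOT
      # Wait for both EC2 status checks to pass.
      aws ec2 wait instance-status-ok --instance-ids ${aws_instance.ec2-instance.id} || exit 1

      # Poll the SSM agent's ping status, retrying up to 30 times.
      attempt=0
      while [ "$attempt" -lt 30 ]; do
        status=$(aws ssm describe-instance-information \
          --filters "Key=InstanceIds,Values=${aws_instance.ec2-instance.id}" \
          --query "InstanceInformationList[0].PingStatus" \
          --output text)
        if [ "$status" = "Online" ]; then
          exit 0
        fi
        attempt=$((attempt + 1))
        sleep 10
      done

      echo "SSM agent did not come online in time" >&2
      exit 1
    EOT
  }
}
```

An aws_ssm_association can then list null_resource.wait_for_ssm in its depends_on. Keep in mind that this only confirms the SSM agent is reachable; it does not prove that a user_data script has finished installing the CloudWatch agent.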
Sources
Install and configure the CloudWatch agent using the Amazon EC2 console to add additional metrics - Amazon Elastic Compute Cloud
Download, configure, and run the CloudWatch agent using SSM - Amazon CloudWatch
Resolve why SSM Agent doesn't start on EC2 Windows instance | AWS re:Post
Install and configure the unified CloudWatch Agent with SSM | AWS re:Post

answered 2 months ago

My approach is to install the CloudWatch agent during the initial boot via user_data. Would this work for you?

https://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/deploy-cloudwatch-agent-user-data-script.html

EXPERT
answered 2 months ago
