I have a few EC2 instances running Ubuntu 20.04 from which I want to collect system metrics and application logs, so I installed the CloudWatch agent on them. I set up the agent and have already tried running it as different users. The generated config file is below:
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/home/deploy/.pm2/logs/api-out.log",
            "log_group_name": "staging-logs",
            "log_stream_name": "PM2 log",
            "retention_in_days": 90
          },
          {
            "file_path": "/home/deploy/.pm2/logs/api-error.log",
            "log_group_name": "staging-logs",
            "log_stream_name": "PM2 error logs",
            "retention_in_days": 90
          }
        ]
      }
    }
  },
  "metrics": {
    "aggregation_dimensions": [
      [
        "InstanceId"
      ]
    ],
    "metrics_collected": {
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}
When I try to create a dashboard or alert based on this data, only the metrics for one instance are listed under "CWAgent > InstanceId". Under "CWAgent > host" all instances are listed, but only with mem_used_percent (no disk_used_percent). Also, no logs are sent to CloudWatch Logs.
Here is the CloudWatch agent log:
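The console lists metrics under the exact combination of dimension names they were published with, so one way to see where disk_used_percent ended up is to dump everything in the CWAgent namespace and group it by dimension names. Below is a minimal Python sketch of that grouping. The sample records and host name are made up for illustration (it is only an assumption here that the disk metrics carry extra dimensions such as device, fstype and path); real input would be the "Metrics" list from something like `aws cloudwatch list-metrics --namespace CWAgent --output json`.

```python
from collections import defaultdict


def group_by_dimension_names(metrics):
    """Group ListMetrics-style records by their sorted tuple of dimension names."""
    groups = defaultdict(set)
    for metric in metrics:
        names = tuple(sorted(d["Name"] for d in metric.get("Dimensions", [])))
        groups[names].add(metric["MetricName"])
    # Convert the sets to sorted lists so the result is stable and printable.
    return {names: sorted(metric_names) for names, metric_names in groups.items()}


# Hypothetical sample in the shape returned by the CloudWatch ListMetrics API.
sample = [
    {
        "Namespace": "CWAgent",
        "MetricName": "mem_used_percent",
        "Dimensions": [{"Name": "host", "Value": "ip-10-0-0-1"}],
    },
    {
        "Namespace": "CWAgent",
        "MetricName": "disk_used_percent",
        "Dimensions": [
            {"Name": "host", "Value": "ip-10-0-0-1"},
            {"Name": "device", "Value": "xvda1"},
            {"Name": "fstype", "Value": "ext4"},
            {"Name": "path", "Value": "/"},
        ],
    },
]

for names, metric_names in sorted(group_by_dimension_names(sample).items()):
    print(", ".join(names), "->", metric_names)
```

If disk_used_percent shows up in the output under a longer dimension combination than just host, that would explain why it is missing from the "CWAgent > host" view while still being published.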
2023/12/12 16:39:18 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json ...
2023/12/12 16:39:18 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/default ...
2023/12/12 16:39:18 I! Valid Json input schema.
2023/12/12 16:39:18 I! Detected runAsUser: cwagent
2023/12/12 16:39:18 I! Changing ownership of [/opt/aws/amazon-cloudwatch-agent/logs /opt/aws/amazon-cloudwatch-agent/etc /opt/aws/amazon-cloudwatch-agent/var] to 996:997
2023/12/12 16:39:18 I! Set HOME: /home/cwagent
2023-12-12T16:39:18Z I! Starting AmazonCloudWatchAgent CWAgent/1.300031.0b313 (go1.21.3; linux; amd64)
2023-12-12T16:39:18Z I! AWS SDK log level not set
2023-12-12T16:39:18Z I! creating new logs agent
2023-12-12T16:39:18Z I! [logagent] starting
2023-12-12T16:39:18.752Z info service/telemetry.go:76 Skipping telemetry setup. {"address": "", "level": "None"}
2023-12-12T16:39:18.779Z info service/service.go:138 Starting CWAgent... {"Version": "1.300031.0b313", "NumCPU": 1}
2023-12-12T16:39:18.779Z info extensions/extensions.go:31 Starting extensions...
2023-12-12T16:39:18.779Z info extensions/extensions.go:34 Extension is starting... {"kind": "extension", "name": "agenthealth/metrics"}
2023-12-12T16:39:18.779Z info extensions/extensions.go:38 Extension started. {"kind": "extension", "name": "agenthealth/metrics"}
2023-12-12T16:39:18Z I! cloudwatch: get unique roll up list []
2023-12-12T16:39:18.786Z info ec2tagger/ec2tagger.go:435 ec2tagger: Check EC2 Metadata. {"kind": "processor", "name": "ec2tagger", "pipeline": "metrics/host"}
2023-12-12T16:39:18Z I! cloudwatch: publish with ForceFlushInterval: 1m0s, Publish Jitter: 35.787405733s
2023-12-12T16:39:18.794Z info ec2tagger/ec2tagger.go:334 ec2tagger: EC2 tagger has started initialization. {"kind": "processor", "name": "ec2tagger", "pipeline": "metrics/host"}
2023-12-12T16:39:18.794Z info service/service.go:161 Everything is ready. Begin running and processing data.
2023-12-12T16:39:19.126Z info ec2tagger/ec2tagger.go:500 ec2tagger: Initial retrieval of tags succeeded {"kind": "processor", "name": "ec2tagger", "pipeline": "metrics/host"}
2023-12-12T16:39:19.126Z info ec2tagger/ec2tagger.go:411 ec2tagger: EC2 tagger has started, finished initial retrieval of tags and Volumes {"kind": "processor", "name": "ec2tagger", "pipeline": "metrics/host"}
The log shows no errors, so I would expect the metrics and logs to be collected. If anyone can provide any help I would be very grateful.
Thanks for your answer. I checked the role used by my instances and it has the default CloudWatchAgentServerPolicy attached. Below is the full policy:
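Since the agent runs as cwagent while the log files live under /home/deploy, one thing that could be ruled out is a plain file permission problem. The following is a minimal Python sketch of such a check, not a confirmed diagnosis. It assumes it is run as the agent user, for example with `sudo -u cwagent python3 check_access.py` (the script name is hypothetical), and it only reports the first directory that user cannot traverse or the file it cannot read.

```python
import os

# Paths copied from the "collect_list" entries in the agent config above.
LOG_FILES = [
    "/home/deploy/.pm2/logs/api-out.log",
    "/home/deploy/.pm2/logs/api-error.log",
]


def first_blocker(path):
    """Return the first ancestor directory the current user cannot traverse
    (or that does not exist), the path itself if it is missing or unreadable,
    or None if the file is readable."""
    path = os.path.abspath(path)
    ancestors = []
    current = os.path.dirname(path)
    while True:
        ancestors.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    # Check from the root downwards so the outermost problem is reported.
    for directory in reversed(ancestors):
        if not os.access(directory, os.X_OK):
            return directory
    return None if os.access(path, os.R_OK) else path


if __name__ == "__main__":
    for log_file in LOG_FILES:
        blocker = first_blocker(log_file)
        if blocker is None:
            print(f"OK: {log_file}")
        else:
            print(f"BLOCKED: {log_file} (problem at {blocker})")
```

`os.access` evaluates permissions for the user running the script, which is why the script has to be started as cwagent rather than as root or deploy.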