Hi,
I am trying to do something basic: pipe logs from a process running on an EC2 instance to CloudWatch. I followed the CloudWatch tutorials for installing, configuring, and starting the agent on the EC2 instance, setting up an IAM role for the instance, etc., but:
- No log group with the name specified in the agent config appears in the CloudWatch console (I have checked that the IAM role includes permission to create log groups and streams)
- After I manually created the log group (and stream), still no logs appear in the console
I have checked that I am looking at the correct region in my console.
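For anyone who wants to double-check the IAM side, here is a rough sketch of the check I mean. It takes a policy document (the JSON you can copy from the IAM console) and lists which log-related actions are not covered by any Allow statement. The `NEEDED` list is my assumption about what the agent uses for log delivery, and the function names are made up for illustration. It ignores Deny statements, resources, and conditions, so it can only rule things out, not confirm them.

```python
from fnmatch import fnmatchcase

# Assumption: the CloudWatch Logs actions the agent needs to create a
# group/stream and push events. Adjust if your setup differs.
NEEDED = [
    "logs:CreateLogGroup",
    "logs:CreateLogStream",
    "logs:PutLogEvents",
    "logs:DescribeLogStreams",
]


def as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def missing_actions(policy, needed=NEEDED):
    """Return the actions in `needed` that no Allow statement matches.

    Hypothetical helper: only looks at Effect/Action, nothing else.
    """
    allowed = []
    for stmt in as_list(policy.get("Statement", [])):
        if stmt.get("Effect") == "Allow":
            allowed.extend(as_list(stmt.get("Action", [])))
    # Action names are case-insensitive and may contain * and ? wildcards.
    return [
        action
        for action in needed
        if not any(fnmatchcase(action.lower(), pat.lower()) for pat in allowed)
    ]
```

An empty result only means the Allow statements mention every action in the list; the role still has to be the one actually attached to the instance.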
The CloudWatch agent's own log does contain some errors, where the agent lacks local permission to determine disk usage for a couple of Docker mounts, but I don't see how that could be related to the missing logs. I have also verified that the agent has access to the log file it is supposed to be tailing. Finally, the EC2 instances allow outbound traffic, since they need to talk to the wider internet.
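One way to re-verify file access from the agent user's point of view is sketched below. Since the agent runs as `cwagent` and the file lives under `/var/lib/docker`, every parent directory has to be traversable by that user, not just the file readable. The function names are hypothetical, and the check only looks at the classic owner/group/other mode bits (no ACLs, capabilities, or SELinux), so treat it as a first pass.

```python
import os
import stat


def can_access(st, uid, gids, want):
    """True if the mode bits in `st` grant `want` ('r' or 'x') to uid/gids."""
    if uid == 0:
        return True  # root bypasses read/traverse checks
    if st.st_uid == uid:
        bit = stat.S_IRUSR if want == "r" else stat.S_IXUSR
    elif st.st_gid in gids:
        bit = stat.S_IRGRP if want == "r" else stat.S_IXGRP
    else:
        bit = stat.S_IROTH if want == "r" else stat.S_IXOTH
    return bool(st.st_mode & bit)


def first_blocker(path, uid, gids):
    """Return the first path component the user cannot pass, or None.

    Parent directories need the execute (traverse) bit; the file itself
    needs the read bit.
    """
    path = os.path.abspath(path)
    parts = path.split(os.sep)
    current = os.sep
    for part in parts[1:-1]:
        current = os.path.join(current, part)
        if not can_access(os.stat(current), uid, gids, "x"):
            return current
    if not can_access(os.stat(path), uid, gids, "r"):
        return path
    return None


# Example call (run as root so every component can be stat'ed):
#   import pwd
#   user = pwd.getpwnam("cwagent")
#   gids = set(os.getgrouplist(user.pw_name, user.pw_gid))
#   print(first_blocker("/var/lib/docker/containers/.../...-json.log",
#                       user.pw_uid, gids))
```

If this returns a directory rather than `None`, the agent user cannot reach the file even when the file's own permissions look fine.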
Does anyone have any ideas for troubleshooting this?
Here are the logs produced by the agent:
2024/07/23 08:00:00 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json ...
2024/07/23 08:00:00 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json ...
2024/07/23 08:00:01 I! Valid Json input schema.
2024/07/23 08:00:01 I! Detected runAsUser: cwagent
2024/07/23 08:00:01 I! Changing ownership of [/opt/aws/amazon-cloudwatch-agent/logs /opt/aws/amazon-cloudwatch-agent/etc /opt/aws/amazon-cloudwatch-agent/var] to 997:997
2024/07/23 08:00:01 I! Set HOME: /home/cwagent
2024-07-23T08:00:01Z I! Starting AmazonCloudWatchAgent CWAgent/1.300042.0b733 (go1.22.5; linux; amd64) with log file /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log with log target lumberjack
2024-07-23T08:00:01Z I! AWS SDK log level not set
2024-07-23T08:00:01Z I! creating new logs agent
2024-07-23T08:00:01Z I! [logagent] starting
2024-07-23T08:00:01Z I! [logagent] found plugin cloudwatchlogs is a log backend
2024-07-23T08:00:01Z I! [logagent] found plugin logfile is a log collection
2024-07-23T08:00:01Z I! [logagent] start logs plugin file paths [/var/lib/docker/containers/d93b1346b7e8553117508f4e1f80590edb6040c93f81dd58dec5b2930e672f1f/d93b1346b7e8553117508f4e1f80590edb6040c93f81dd58dec5b2930e672f1f-json.log]
2024-07-23T08:00:01Z I! [inputs.logfile] turned on logs plugin
2024-07-23T08:00:01Z I! {"caller":"service@v0.98.0/telemetry.go:47","msg":"Skipping telemetry setup.","address":"","level":"None"}
2024-07-23T08:00:01Z I! {"caller":"awsxrayreceiver@v0.98.0/receiver.go:45","msg":"Going to listen on endpoint for X-Ray segments","kind":"receiver","name":"awsxray","data_type":"traces","udp":"127.0.0.1:2000"}
2024-07-23T08:00:01Z I! {"caller":"udppoller/poller.go:95","msg":"Listening on endpoint for X-Ray segments","kind":"receiver","name":"awsxray","data_type":"traces","udp":"127.0.0.1:2000"}
2024-07-23T08:00:01Z I! {"caller":"awsxrayreceiver@v0.98.0/receiver.go:56","msg":"Listening on endpoint for X-Ray segments","kind":"receiver","name":"awsxray","data_type":"traces","udp":"127.0.0.1:2000"}
2024-07-23T08:00:01Z I! {"caller":"service@v0.98.0/service.go:143","msg":"Starting CWAgent...","Version":"1.300042.0b733","NumCPU":8}
2024-07-23T08:00:01Z I! {"caller":"extensions/extensions.go:34","msg":"Starting extensions..."}
2024-07-23T08:00:01Z I! {"caller":"extensions/extensions.go:37","msg":"Extension is starting...","kind":"extension","name":"agenthealth/traces"}
2024-07-23T08:00:01Z I! {"caller":"extensions/extensions.go:52","msg":"Extension started.","kind":"extension","name":"agenthealth/traces"}
2024-07-23T08:00:01Z I! {"caller":"extensions/extensions.go:37","msg":"Extension is starting...","kind":"extension","name":"agenthealth/metrics"}
2024-07-23T08:00:01Z I! {"caller":"extensions/extensions.go:52","msg":"Extension started.","kind":"extension","name":"agenthealth/metrics"}
2024-07-23T08:00:01Z I! cloudwatch: get unique roll up list [[InstanceId]]
2024-07-23T08:00:01Z I! {"caller":"ec2tagger/ec2tagger.go:415","msg":"ec2tagger: Check EC2 Metadata.","kind":"processor","name":"ec2tagger","pipeline":"metrics/host"}
2024-07-23T08:00:01Z I! cloudwatch: publish with ForceFlushInterval: 1m0s, Publish Jitter: 42.599937511s
2024-07-23T08:00:01Z I! {"caller":"ec2tagger/ec2tagger.go:333","msg":"ec2tagger: EC2 tagger has started initialization.","kind":"processor","name":"ec2tagger","pipeline":"metrics/host"}
2024-07-23T08:00:01Z I! Started the statsd service on :8125
2024-07-23T08:00:01Z I! {"caller":"awsxrayreceiver@v0.98.0/receiver.go:90","msg":"X-Ray TCP proxy server started","kind":"receiver","name":"awsxray","data_type":"traces"}
2024-07-23T08:00:01Z I! {"caller":"service@v0.98.0/service.go:169","msg":"Everything is ready. Begin running and processing data."}
2024-07-23T08:00:01Z W! {"caller":"localhostgate/featuregate.go:63","msg":"The default endpoints for all servers in components will change to use localhost instead of 0.0.0.0 in a future version. Use the feature gate to preview the new default.","feature gate ID":"component.UseLocalHostAsDefaultHost"}
2024-07-23T08:00:01Z I! Statsd listener listening on: [::]:8125
2024-07-23T08:00:01Z I! {"caller":"ec2tagger/ec2tagger.go:480","msg":"ec2tagger: Initial retrieval of tags succeeded","kind":"processor","name":"ec2tagger","pipeline":"metrics/host"}
2024-07-23T08:00:01Z I! {"caller":"ec2tagger/ec2tagger.go:391","msg":"ec2tagger: EC2 tagger has started, finished initial retrieval of tags and Volumes","kind":"processor","name":"ec2tagger","pipeline":"metrics/host"}
2024-07-23T08:00:02Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/var/lib/docker/overlay2/bdd6eaaf2eefb158c14e053fd49308a08009c83e7fd6dc622f4cef80702be676/merged"): permission denied
2024-07-23T08:00:02Z E! [inputs.disk] [SystemPS] => error getting disk usage ("/run/docker/netns/daa3427e69e9"): permission denied
Here is my config:
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/lib/docker/containers/d93b1346b7e8553117508f4e1f80590edb6040c93f81dd58dec5b2930e672f1f/d93b1346b7e8553117508f4e1f80590edb6040c93f81dd58dec5b2930e672f1f-json.log",
            "log_group_class": "STANDARD",
            "log_group_name": "reconstruction-worker-test-log",
            "log_stream_name": "{instance_id}",
            "retention_in_days": -1
          }
        ]
      }
    }
  },
  "metrics": {
    "aggregation_dimensions": [
      [
        "InstanceId"
      ]
    ],
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "ImageId": "${aws:ImageId}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      },
      "statsd": {
        "metrics_aggregation_interval": 60,
        "metrics_collection_interval": 10,
        "service_address": ":8125"
      }
    }
  },
  "traces": {
    "buffer_size_mb": 3,
    "concurrency": 8,
    "insecure": false,
    "traces_collected": {
      "xray": {
        "bind_address": "127.0.0.1:2000",
        "tcp_proxy": {
          "bind_address": "127.0.0.1:2000"
        }
      }
    }
  }
}