為什麼統一的 CloudWatch 代理程式不會將我的指標或日誌事件推送到 CloudWatch?
我已在我的 Amazon Elastic Compute Cloud (Amazon EC2) 執行個體上設定統一的 CloudWatch 代理程式,以將指標和日誌發佈到 Amazon CloudWatch。但我在 CloudWatch 主控台中看不到我的指標或日誌。為什麼統一的 CloudWatch 代理程式不會將我的指標和日誌推送到 CloudWatch?
簡短描述
統一的 CloudWatch 代理程式可能無法將您的指標或日誌推送到 CloudWatch,其原因有很多。例如,可能存在許可或連線錯誤,這會阻止代理程式發佈您的指標。當您檢閱統一的 CloudWatch 代理程式日誌時,您可能會看到如下錯誤:
- 代理程式日誌錯誤:未連線到端點
- 代理程式日誌錯誤:許可不足
解決方案
**注意:**如果您在執行 AWS Command Line Interface (AWS CLI) 命令時收到錯誤,請確保您使用的是最新的 AWS CLI 版本。
檢閱統一的 CloudWatch 代理程式日誌
透過代理程式日誌檔案,幫助您解決使用統一的 CloudWatch 代理程式套件時遇到的問題。您可能遇到以下常見問題之一:
- 您遇到與所需 AWS 服務端點或 VPC 端點的連線問題。
- 您沒有正確的許可,無法對 CloudWatch 進行支援 API 呼叫。
- 本機檔案系統中不存在日誌檔案。
您可能會在日誌中看到以下錯誤之一:
代理程式日誌錯誤:未連線到端點
2021-08-30T04:07:46Z E! cloudwatch: code: RequestError, message: send request failed, original error: Post "https://monitoring.us-east-1.amazonaws.com/": dial tcp 172.31.11.121:443: i/o timeout 2021-08-30T04:07:46Z W! 210 retries, going to sleep 1m0s before retrying. 2021-08-30T04:07:46Z E! cloudwatch: code: RequestError, message: send request failed, original error: Post "https://monitoring.us-east-1.amazonaws.com/": dial tcp 172.31.11.121:443: i/o timeout 2021-08-30T04:07:46Z W! 211 retries, going to sleep 1m0s before retrying.
代理程式日誌錯誤:許可不足
2021-08-30T02:15:45Z E! cloudwatch: code: AccessDenied, message: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: cloudwatch:PutMetricData, original error: 2021-08-30T02:15:45Z W! 1 retries, going to sleep 400ms before retrying. 2021-08-30T02:15:46Z E! WriteToCloudWatch failure, err: AccessDenied: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: cloudwatch:PutMetricData status code: 403, request id: f1171fd0-05b6-4f7d-bac2-629c8594c46e
確認與 CloudWatch 端點的連線
當到 CloudWatch 的流量不應經過公有網際網路時,您可以改用 VPC 端點。如果您使用的是 VPC 端點,請檢查以下內容:
- 如果您使用的是私有名稱伺服器,請確認 DNS 解析提供了準確的回應。
- 確認 CloudWatch 端點解析為私有 IP 地址。
- 確認與 VPC 端點關聯的安全群組允許來自主機的入站流量。
1. 檢查與指標端點的連線:
$ telnet monitoring.us-east-1.amazonaws.com 443 Trying 52.46.138.115... Connected to monitoring.amazonaws.com. Escape character is '^]'. ^] telnet> quit Connection closed.
2. 檢查與日誌端點的連線:
$ telnet logs.us-east-1.amazonaws.com 443 Trying 3.236.94.218... Connected to logs.us-east-1.amazonaws.com. Escape character is '^]'. ^] telnet> quit Connection closed
3. 檢查 VPC 端點是否解析為私有 IP 地址:
$ dig monitoring.us-east-1.amazonaws.com +short 172.31.11.121 172.31.0.13
檢閱統一的 CloudWatch 代理程式組態
代理程式組態檔案詳細説明發佈到 CloudWatch 的指標和日誌。檢閱代理程式組態檔案,確認包含要發佈的日誌和指標。
確認主機具有發佈指標和日誌的許可
AWS 受管政策 CloudWatchAgentServerPolicy 和 CloudWatchAgentAdminPolicy 可幫助您部署統一的 CloudWatch 代理程式並檢查您是否擁有正確的許可。使用這些政策作為參考,確保您的主機擁有正確的許可。
這些範例中的 AWS CLI 輸出顯示許可不足。
此代理程式啟動命令輸出顯示沒有任何 IAM 角色附加到 EC2 執行個體:
$ /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:CWT-Web-Server -s ****** processing amazon-cloudwatch-agent ****** /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:CWT-Web-Server --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default Region: us-east-1 credsConfig: map[] Error in retrieving parameter store content: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors Fail to fetch/remove json config: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors Fail to fetch the config!
此代理程式啟動命令輸出顯示不正確的 IAM 角色附加到 EC2 執行個體:
$ /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:CWT-Web-Server -s ****** processing amazon-cloudwatch-agent ****** /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:CWT-Web-Server --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default Region: us-east-1 credsConfig: map[] Error in retrieving parameter store content: AccessDeniedException: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:us-east-1:123456789012:parameter/CWT-Web-Server status code: 400, request id: b85b0a7a-0fb1-47b4-924f-be8cf43a3b4d Fail to fetch/remove json config: AccessDeniedException: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:us-east-1:123456789012:parameter/CWT-Web-Server status code: 400, request id: b85b0a7a-0fb1-47b4-924f-be8cf43a3b4d Fail to fetch the config!
在某些情況下,IAM 使用者可能位於命令列。獲取使用者/角色命令會返回與執行個體關聯的 IAM 使用者或角色:
$ aws sts get-caller-identity { "UserId": "AROA123456789012ABCDE:i-0744de7c842d2c2ba", "Account": "123456789012", "Arn": "arn:aws:sts::123456789012:assumed-role/CloudWatchAgentServerRole/i-0744de7c842d2c2ba" }
確認代理程式是否正確啟動
代理程式設計為使用 AWS CLI 啟動,並將組態檔案作為引數傳遞。使用這些有效的啟動命令。
Linux 命令:
- `$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:configuration-file-path` - `$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name`
Windows 命令:
- `& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c file:"C:\Program Files\Amazon\AmazonCloudWatchAgent\config.json"` - `& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name`
重要提示:不要從 Windows 控制面板啟動代理程式。
確認代理程式正在執行中
若要發佈指標和日誌,代理程式必須正在執行中。執行此命令,確認代理程式處於活動狀態。
$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status { "status": "running", "starttime": "2021-08-30T02:13:44+00:00", "configstatus": "configured", "cwoc_status": "stopped", "cwoc_starttime": "", "cwoc_configstatus": "not configured", "version": "1.247349.0b251399" }
更新代理程式組態後重新啟動代理程式
代理程式不會自動註冊對組態檔案的變更。如果更新代理程式組態以包含新的或不同的指標和日誌,請使用此命令重新啟動代理程式:
$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop ****** processing cwagent-otel-collector ****** cwagent-otel-collector has already been stopped ****** processing amazon-cloudwatch-agent ****** Redirecting to /bin/systemctl stop amazon-cloudwatch-agent.service $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:config.json ****** processing amazon-cloudwatch-agent ****** /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:config.json --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp Start configuration validation... /opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default 2021/08/31 02:45:37 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp ... Valid Json input schema. I! Detecting run_as_user... Configuration validation first phase succeeded /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml Configuration validation second phase succeeded Configuration validation succeeded amazon-cloudwatch-agent has already been stopped Redirecting to /bin/systemctl restart amazon-cloudwatch-agent.service $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status { "status": "running", "starttime": "2021-08-31T02:45:37+0000", "configstatus": "configured", "cwoc_status": "stopped", "cwoc_starttime": "", "cwoc_configstatus": "not configured", "version": "1.247349.0b251399" }
相關資訊
相關內容
- 已提問 1 年前lg...
- AWS 官方已更新 2 年前
- AWS 官方已更新 2 年前
- AWS 官方已更新 6 個月前