
Questions tagged with AWS Systems Manager


Is there a way to add exponential backoff to aws:executeAutomation in an SSM Automation document?

Put simply, I have written an SSM Automation. It creates a snapshot of each attached volume of a targeted instance by getting a list of volume IDs from a DescribeInstance call. It then uses rate execution of a second runbook, in an aws:executeAutomation action, which creates each snapshot, fanning out over those volume IDs. As you can see, I have already limited MaxConcurrency of this step to 1.

```
{
  "name": "SnapshotAllVolumes",
  "action": "aws:executeAutomation",
  "maxAttempts": 3,
  "onFailure": "Abort",
  "inputs": {
    "DocumentName": "MyCustomCreateSnapshotRunbook",
    "Targets": [
      {
        "Key": "ParameterValues",
        "Values": [ "{{ DescribeInstance.CurrentVolumes }}" ]
      }
    ],
    "TargetParameterName": "VolumeId",
    "RuntimeParameters": {
      "InstanceId": "{{ InstanceId }}",
      "InstanceName": "{{ GetInstanceName.Name }}"
    },
    "MaxConcurrency": "1"
  }
}
```

Where I have gotten into trouble is that I am attempting to execute this runbook via a Maintenance Window on ~60 instances, each averaging about 3 EBS volumes. I keep crashing into a rate limit on the above step:

> Step fails when it is Executing. Fail to start automation, errorMessage: Rate exceeded. Please refer to Automation Service Troubleshooting Guide for more diagnosis details.

Unfortunately, it doesn't tell me exactly which rate limit I'm hitting, but I think I can assume it is one of two: either the limit on API calls, or the limit on simultaneous SSM rate executions. Because the latter is much more restrictive (25 is the max), I think it's my most likely suspect. I've been dialing down the concurrency limit on my Maintenance Window. If my assumption about the rate limit is correct, I need to stay under 25 concurrent rate executions. With the max concurrency of 1 on the child runbook, no individual execution of the parent runbook should lead to more than 2 simultaneous rate executions (the parent, plus the child executions, which run consecutively, one per volume). This means the concurrency for my Maintenance Window needs to stay below 25/2, so I've reckoned my maximum safe concurrency is a mere 12.

Talking all this over with support, the recommended solution was to implement some sort of exponential backoff on the retries here, so that calls that are rate-limited on the first attempt are retried at different times, and so on. This could be done in code if I resorted to invoking a Lambda that executed the child Automation, instead of executing it directly in the parent runbook. I'd much rather not introduce a dependency on a Lambda here. Has anyone found a way to implement exponential backoff purely in SSM Automation? Perhaps it isn't even possible?
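For reference, a pattern that avoids the Lambda dependency mentioned above is an aws:executeScript step that starts the child automation itself and retries throttled calls with exponential backoff. A minimal sketch of the retry helper such a step could call (the runbook name and parameters mirror the question; the retry constants are arbitrary):

```
import time
import random

import boto3
from botocore.exceptions import ClientError

ssm = boto3.client("ssm")

def start_with_backoff(volume_id, instance_id, instance_name, max_attempts=5):
    """Start the child runbook, backing off exponentially when throttled."""
    for attempt in range(max_attempts):
        try:
            response = ssm.start_automation_execution(
                DocumentName="MyCustomCreateSnapshotRunbook",
                Parameters={
                    "VolumeId": [volume_id],
                    "InstanceId": [instance_id],
                    "InstanceName": [instance_name],
                },
            )
            return response["AutomationExecutionId"]
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise
            # Exponential backoff with jitter (1s, 2s, 4s, ... plus noise),
            # so executions throttled together retry at different times.
            time.sleep((2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("still throttled after {} attempts".format(max_attempts))
```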
1 answer · 0 votes · 4 views
AWS-User-7841548 · asked 7 days ago

Unable to create new OpsItems from EventBridge when using Input Transformer for deduplication and adding category and severity values

Apologies to all for the duplicate post; I created my login under the wrong account when I initially posted this question. I'm able to generate a new OpsItem for any EC2, SecurityGroup, or VPC configuration change using an EventBridge rule with the following event pattern:

```
{
  "source": ["aws.config"],
  "detail-type": ["Config Configuration Item Change"],
  "detail": {
    "messageType": ["ConfigurationItemChangeNotification"],
    "configurationItem": {
      "resourceType": ["AWS::EC2::Instance", "AWS::EC2::SecurityGroup", "AWS::EC2::VPC"]
    }
  }
}
```

The rule and target work great when using "Matched event" for the input, but I noticed that launching one EC2 instance using the AWS wizard creates at least three OpsItems, one for each resourceType. I'd therefore like to implement a deduplication string to cut the number of OpsItems generated down to one if possible, and I'd also like to attach a category and severity to the new OpsItem. I'm trying to use an Input Transformer as recommended by the AWS documentation, but even the simplest of Input Transformers, when applied, prevents any new OpsItems from being generated. When testing, I've also ensured that all previous OpsItems were resolved. Can anyone tell me what might be blocking the creation of any new OpsItems when using this Input Transformer configuration? Here's what I have configured now.

Input path:

```
{
  "awsAccountId": "$.detail.configurationItem.awsAccountId",
  "awsRegion": "$.detail.configurationItem.awsRegion",
  "configurationItemCaptureTime": "$.detail.configurationItem.configurationItemCaptureTime",
  "detail-type": "$.detail-type",
  "messageType": "$.detail.messageType",
  "notificationCreationTime": "$.detail.notificationCreationTime",
  "region": "$.region",
  "resourceId": "$.detail.configurationItem.resourceId",
  "resourceType": "$.detail.configurationItem.resourceType",
  "resources": "$.resources",
  "source": "$.source",
  "time": "$.time"
}
```

Input template:

```
{
  "awsAccountId": "<awsAccountId>",
  "awsRegion": "<awsRegion>",
  "configurationItemCaptureTime": "<configurationItemCaptureTime>",
  "resourceId": "<resourceId>",
  "resourceType": "<resourceType>",
  "title": "Template under ConfigDrift-EC2-Dedup4",
  "description": "Configuration Drift Detected.",
  "category": "Security",
  "severity": "3",
  "origination": "EventBridge Rule - ConfigDrift-EC2-Dedup",
  "detail-type": "<detail-type>",
  "source": "<source>",
  "time": "<time>",
  "region": "<region>",
  "resources": "<resources>",
  "messageType": "<messageType>",
  "notificationCreationTime": "<notificationCreationTime>",
  "operationalData": {
    "/aws/dedup": {
      "type": "SearchableString",
      "value": "{\"dedupString\":\"ConfigurationItemChangeNotification\"}"
    }
  }
}
```

Output when using the AWS-supplied sample event called "Config Configuration Item Change":

```
{
  "awsAccountId": "123456789012",
  "awsRegion": "us-east-1",
  "configurationItemCaptureTime": "2022-03-16T01:10:50.837Z",
  "resourceId": "fs-01f0d526165b57f95",
  "resourceType": "AWS::EFS::FileSystem",
  "title": "Template under ConfigDrift-EC2-Dedup4",
  "description": "Configuration Drift Detected.",
  "category": "Security",
  "severity": "3",
  "origination": "EventBridge Rule - ConfigDrift-EC2-Dedup",
  "detail-type": "Config Configuration Item Change",
  "source": "aws.config",
  "time": "2022-03-16T01:10:51Z",
  "region": "us-east-1",
  "resources": "arn:aws:elasticfilesystem:us-east-1:123456789012:file-system/fs-01f0d526165b57f95",
  "messageType": "ConfigurationItemChangeNotification",
  "notificationCreationTime": "2022-03-16T01:10:51.976Z",
  "operationalData": {
    "/aws/dedup": {
      "type": "SearchableString",
      "value": "{"dedupString":"ConfigurationItemChangeNotification"}"
    }
  }
}
```
1 answer · 0 votes · 2 views
AWS-User-1369b · asked 9 days ago


AWS:InstanceInformation folder created in S3 by Resource Data Sync cannot be queried by Athena because it has an invalid schema with duplicate columns

[After resolving my first issue](https://repost.aws/questions/QUXOInFRr1QrKfR0Bh9wVglA/aws-glue-not-properly-crawling-s-3-bucket-populated-by-resource-data-sync-specifically-aws-instance-information-is-not-made-into-a-table) with getting a resource data sync set up, I've now run into another issue with the same folder. When a resource data sync is created, it creates 13 folders following a structure like `s3://resource-data-sync-bucket/AWS:*/accountid=*/regions=*/resourcetype=*/instance.json`. When running the Glue crawler over this, a schema is created where partitions are made for each subpath with an `=` in it. This works fine for most of the data, except for the path starting with `AWS:InstanceInformation`. The instance information JSON files ALSO contain a "resourcetype" field, as can be seen here:

```
{"PlatformName":"Microsoft Windows Server 2019 Datacenter","PlatformVersion":"10.0.17763","AgentType":"amazon-ssm-agent","AgentVersion":"3.1.1260.0","InstanceId":"i","InstanceStatus":"Active","ComputerName":"computer.name","IpAddress":"10.0.0.0","ResourceType":"EC2Instance","PlatformType":"Windows","resourceId":"i-0a6dfb4f042d465b2","captureTime":"2022-04-22T19:27:27Z","schemaVersion":"1.0"}
```

As a result, there are now two "resourcetype" columns in the "aws_instanceinformation" table schema: the partition column from the S3 path and the field from the JSON. Attempts to query that table result in the error `HIVE_INVALID_METADATA: Hive metadata for table is invalid: Table descriptor contains duplicate columns`. I've worked around this issue by removing the offending field and setting the crawler to ignore schema updates, but this doesn't seem like a great long-term solution, since any changes made by AWS to the schema will be ignored. Is this a known issue with using this solution? Are there any plans to change how the AWS:InstanceInformation documents are structured so duplicate columns aren't created?
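For reference, one workaround that survives re-crawls is to rename the conflicting data column in the Glue table after each crawl, so the partition column from the S3 path and the JSON's ResourceType field no longer collide. A minimal boto3 sketch, assuming the database/table names below and that the crawler registered the OpenX JSON SerDe (whose `mapping.` property keeps a renamed column readable):

```
import boto3

glue = boto3.client("glue")

DATABASE = "resource_data_sync"      # hypothetical database name
TABLE = "aws_instanceinformation"

table = glue.get_table(DatabaseName=DATABASE, Name=TABLE)["Table"]

# Rename the JSON data column so it no longer collides with the
# "resourcetype" partition column derived from the S3 path, then map
# the new column name back to the original JSON key for the SerDe.
for col in table["StorageDescriptor"]["Columns"]:
    if col["Name"].lower() == "resourcetype":
        col["Name"] = "instance_resource_type"
serde_params = table["StorageDescriptor"]["SerdeInfo"].setdefault("Parameters", {})
serde_params["mapping.instance_resource_type"] = "resourcetype"

# update_table accepts only TableInput fields; strip read-only metadata.
read_only = {"DatabaseName", "CreateTime", "UpdateTime", "CreatedBy",
             "IsRegisteredWithLakeFormation", "CatalogId", "VersionId"}
table_input = {k: v for k, v in table.items() if k not in read_only}

glue.update_table(DatabaseName=DATABASE, TableInput=table_input)
```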
0 answers · 0 votes · 0 views
AWS-User-7460517 · asked 20 days ago

AWS Glue not properly crawling S3 bucket populated by "Resource Data Sync" -- specifically, "AWS:InstanceInformation" is not made into a table

I set up an S3 bucket that collects inventory data from multiple AWS accounts using the Systems Manager "Resource Data Sync". I was able to set up the data syncs to feed into the single bucket without issue, and the Glue crawler was created automatically. Now that I'm trying to query the data in Athena, I noticed there is an issue with how the crawler is parsing the data in the bucket. The folder "AWS:InstanceInformation" is not being turned into a table. Instead, it is turning all of the "region=us-east-1/" and "test.json" sub-items into tables, which are, obviously, not queryable. To illustrate further, each of the following paths is being turned into its own table:

* s3://resource-data-sync-bucket/AWS:InstanceInformation/accountid=12345679012/region=us-east-1
* s3://resource-data-sync-bucket/AWS:InstanceInformation/accountid=12345679012/test.json
* s3://resource-data-sync-bucket/AWS:InstanceInformation/accountid=23456790123/region=us-east-1
* s3://resource-data-sync-bucket/AWS:InstanceInformation/accountid=23456790123/test.json
* s3://resource-data-sync-bucket/AWS:InstanceInformation/accountid=34567901234/region=us-east-1
* s3://resource-data-sync-bucket/AWS:InstanceInformation/accountid=34567901234/test.json

This is ONLY happening with the "AWS:InstanceInformation" folder. All of the other folders (e.g. "AWS:DetailedInstanceInformation") are being properly turned into tables. Since all of this data was populated automatically, I'm assuming that we are dealing with a bug? Is there anything I can do to fix this?
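For reference, Glue crawlers have a grouping option ("Create a single schema for each S3 include path") that forces schema-compatible sibling folders into one table instead of one table per subpath, which matches the symptom described. A minimal boto3 sketch of applying it to an existing crawler (the crawler name is hypothetical):

```
import json
import boto3

glue = boto3.client("glue")

# Grouping policy that makes the crawler combine compatible sibling
# folders into a single table rather than one table per subfolder.
config = {
    "Version": 1.0,
    "Grouping": {"TableGroupingPolicy": "CombineCompatibleSchemas"},
}

glue.update_crawler(
    Name="resource-data-sync-crawler",  # hypothetical crawler name
    Configuration=json.dumps(config),
)
glue.start_crawler(Name="resource-data-sync-crawler")
```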
1 answer · 0 votes · 1 view
AWS-User-7460517 · asked 21 days ago

AWS Workspaces Latency

Hello AWS Experts! First and foremost, if you have stumbled upon my inquiry and are ready to aid me in resolving the issue, I appreciate you. My company has established an AWS WorkSpaces environment in the Virginia region. The WorkSpaces service is located in two private subnets within our Virginia VPC. When a U.S.-based user connects to the WorkSpaces environment, they do not deal with latency issues. Additionally, from WorkSpaces, the user will connect to a private EC2 instance using AWS Systems Manager (SSM) port forwarding. The user deals with slight latency when port forwarding/RDP into the instance. If there is a way to increase network speeds for SSM port forwarding, please inform me.

Furthermore, and this is where your phenomenal expertise is most needed: our users in Bulgaria experience high latency when attempting to connect to their WorkSpaces, and even higher latency when trying to use SSM to connect to their EC2 instance. When a Bulgarian user uses WorkSpaces, their round-trip time averages around 150 ms. I had them test their local network bandwidth; most averaged around 84 to 120 Mbps download and 140 to 160 Mbps upload. Additionally, they checked their "connection status" from the following website: https://clients.amazonworkspaces.com/Health. The site stated that Frankfurt provided the best connection; however, we do not want to create an entirely new VPC in the Frankfurt region for WorkSpaces, and have our customers question why a European IP is trying to connect to their assets. A VPN may be a possibility, but we're hoping for a better solution.

I have tried the following, none of which proved to improve network speeds for U.S. or Bulgarian members:

* Established an internal load balancer
* Established VPC endpoints
* Established Global Accelerator

So, how could one increase network speeds to WorkSpaces and SSM port forwarding? Thank you!
1 answer · 0 votes · 4 views
Major · asked a month ago

Receiving validation error on Lambda Payload input when invoked via SSM Automation document

Hello, I've cloned the AWS-UpdateWindowsAMI document and added two steps: invoking a Lambda function to update Parameter Store with the latest AMI ID, and creating EC2 tags. I'm getting the error below when I run into the aws:invokeLambdaFunction block. Everything else works.

> Step fails when it is validating and resolving the step inputs. Failed to resolve input: createImage.ImageId to type String. createImage is not defined in Automation Document MainSteps. Please refer to Automation Service Troubleshooting Guide for more diagnosis details. VerificationErrorMessage: Failed to resolve input: createImage.ImageId to type String. createImage is not defined in Automation Document MainSteps.

I've copied the Lambda function code straight out of [this document](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-walk-patch-windows-ami-simplify.html#automation-pet1). This is the updateSsmParam code in the document; I've only replaced the parameterName in the code.

```
{
  "name": "CreateImage",
  "action": "aws:createImage",
  "maxAttempts": 3,
  "onFailure": "Abort",
  "inputs": {
    "InstanceId": "{{ LaunchInstance.InstanceIds }}",
    "ImageName": "{{ TargetAmiName }}",
    "NoReboot": true,
    "ImageDescription": "{{ TargetImageDescription }}"
  }
},
{
  "name": "TerminateInstance",
  "action": "aws:changeInstanceState",
  "maxAttempts": 3,
  "onFailure": "Abort",
  "inputs": {
    "InstanceIds": [ "{{ LaunchInstance.InstanceIds }}" ],
    "DesiredState": "terminated"
  }
},
{
  "name": "updateSsmParam",
  "action": "aws:invokeLambdaFunction",
  "timeoutSeconds": 1200,
  "maxAttempts": 1,
  "onFailure": "Continue",
  "inputs": {
    "FunctionName": "Automation-UpdateSsmParam",
    "Payload": "{\"parameterName\":\"/ami/latest/us-east-1/win2k19sf-cb\", \"parameterValue\":\"{{createImage.ImageId}}\"}"
  }
},
```

Any help would be appreciated. Thanks, Justin
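Worth noting from the snippet itself: the step is named "CreateImage", while the Payload references `{{createImage.ImageId}}`; step-name references in Automation documents are case-sensitive, which matches the error's wording that "createImage is not defined in Automation Document MainSteps". A small offline checker for this class of mistake (a sketch; it validates only step-output references, and the document filename is hypothetical):

```
import json
import re

def check_step_references(document_json):
    """Report {{ Step.Output }} references whose step name isn't defined."""
    doc = json.loads(document_json)
    step_names = {step["name"] for step in doc.get("mainSteps", [])}
    problems = []
    for match in re.finditer(r"\{\{\s*(\w+)\.\w+\s*\}\}", document_json):
        ref = match.group(1)
        if ref not in step_names:
            problems.append(
                "'{}' is referenced but no step has that exact (case-sensitive) "
                "name; defined steps: {}".format(ref, sorted(step_names)))
    return problems

# Running this over the cloned document would flag 'createImage'.
print(check_step_references(open("my-update-windows-ami.json").read()))
```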
3 answers · 0 votes · 1 view
justingregory83 · asked 2 months ago

Can't register on-prem Amazon Linux 2 with Systems Manager

I'm trying to execute this command on an on-premises Amazon Linux 2 instance:

```
amazon-ssm-agent -register -code "<removed_code>" -id "<removed_id>" -region "us-east-1"
```

It fails with:

```
Error occurred fetching the seelog config file path: open /etc/amazon/ssm/seelog.xml: no such file or directory
Initializing new seelog logger
New Seelog Logger Creation Complete
2022-03-22 23:22:51 ERROR Registration failed due to error registering the instance with AWS SSM. InvalidParameter: 1 validation error(s) found.
- minimum field size of 36, RegisterManagedInstanceInput.ActivationId.
```

/var/log/amazon/ssm/errors.log contains:

```
2022-03-22 23:24:07 ERROR [newAgentIdentityInner @ identity_selector.go.96] Agent failed to assume any identity
2022-03-22 23:24:07 ERROR [NewAgentIdentity @ identity_selector.go.109] failed to find identity, retrying: failed to find agent identity
2022-03-22 23:24:13 ERROR [newAgentIdentityInner @ identity_selector.go.96] Agent failed to assume any identity
2022-03-22 23:24:13 ERROR [NewAgentIdentity @ identity_selector.go.109] failed to find identity, retrying: failed to find agent identity
2022-03-22 23:24:19 ERROR [newAgentIdentityInner @ identity_selector.go.96] Agent failed to assume any identity
2022-03-22 23:24:19 ERROR [Init @ bootstrap.go.75] failed to get identity: failed to find agent identity
2022-03-22 23:24:19 ERROR [run @ agent.go.130] error occurred when starting amazon-ssm-agent: failed to get identity: failed to find agent identity
2022-03-22 23:25:04 ERROR [processRegistration @ agent_parser.go.177] Registration failed due to error registering the instance with AWS SSM. InvalidParameter: 1 validation error(s) found.
- minimum field size of 36, RegisterManagedInstanceInput.ActivationId.
```
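The validation error is complaining that the supplied ActivationId is shorter than 36 characters; activation IDs are UUIDs, so this usually means the ID was truncated, pasted incompletely, or swapped with the code, rather than anything wrong on the instance itself. For reference, a fresh activation pair can be minted like this (a minimal boto3 sketch; the service role name is hypothetical):

```
import boto3

ssm = boto3.client("ssm", region_name="us-east-1")

# Creates a hybrid activation. The returned ActivationId is a 36-character
# UUID and must be passed unmodified to `amazon-ssm-agent -register -id ...`.
resp = ssm.create_activation(
    Description="On-prem Amazon Linux 2 fleet",
    IamRole="SSMServiceRole",  # hypothetical service role for SSM
    RegistrationLimit=10,
)

print("ActivationId: ", resp["ActivationId"])
print("ActivationCode:", resp["ActivationCode"])
```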
2 answers · 0 votes · 2 views
Jeff-Reed-at-AST · asked 2 months ago

Cannot see custom Systems Manager inventory in AWS Console

Hi, I created a basic custom inventory for disk usage by creating a Systems Manager association. I am running a PowerShell script to get the disk information but am unable to see the inventory in the console. The script runs fine and the file was written to the intended location on the target machine; however, the inventory doesn't show up in the console. Here's my script:

```
function Get-EbsVolumes {
    [CmdletBinding()]
    $Map = @{"0" = '/dev/sda1'}
    for ($x = 1; $x -le 26; $x++) { $Map.add($x.ToString(), [String]::Format("xvd{0}", [char](97 + $x))) }
    for ($x = 78; $x -le 102; $x++) { $Map.add($x.ToString(), [String]::Format("xvdc{0}", [char](19 + $x))) }
    return Get-WmiObject -Class Win32_DiskDrive | % {
        $Drive = $_
        $SN = $Drive.SerialNumber -replace '^(vol)([^ _]+)(?: *_.+)?$', '$1-$2'
        Get-WmiObject -Class Win32_DiskDriveToDiskPartition | Where-Object {$_.Antecedent -eq $Drive.Path.Path} | % {
            $D2P = $_
            $Partition = Get-WmiObject -Class Win32_DiskPartition | Where-Object {$_.Path.Path -eq $D2P.Dependent}
            $Disk = Get-WmiObject -Class Win32_LogicalDiskToPartition | Where-Object {$_.Antecedent -in $D2P.Dependent} | % {
                $L2P = $_
                Get-WmiObject -Class Win32_LogicalDisk | Where-Object {$_.Path.Path -in $L2P.Dependent} | % {
                    $L2V = $_
                    Get-WmiObject -Class Win32_Volume | Where-Object {$_.SerialNumber -eq ([Convert]::ToInt64($L2V.VolumeSerialNumber, 16))}
                }
            }
            New-Object PSObject -Property @{
                Device = $Map[$Drive.SCSITargetId.ToString()]
                Disk = [Int]::Parse($Partition.Name.Split(",")[0].Replace("Disk #",""))
                Boot = $Partition.BootPartition
                Partition = [Int]::Parse($Partition.Name.Split(",")[1].Replace(" Partition #",""))
                DriveLetter = If ($Disk -eq $NULL) {"NA"} else {$Disk.DriveLetter}
                VolumeDeviceID = If ($Disk -eq $NULL) {"NA"} else {$Disk.DeviceID}
                DriveType = If ($Disk -eq $NULL) {"NA"} else {$Disk.DriveType}
                IsSystemVolume = If ($Disk -eq $NULL) {"NA"} else {$Disk.SystemVolume}
                IsBootVolume = If ($Disk -eq $NULL) {"NA"} else {$Disk.BootVolume.ToString()}
                DriveLabel = If ($Disk -eq $NULL) {"NA"} else {$Disk.Label}
                BlockSize = $Disk.BlockSize
                Capacity = ([math]::Round($Disk.Capacity / 1GB, 0)).ToString()
                DiskType = $Partition.Type
                FreeSpace = [math]::Round($Disk.FreeSpace / 1GB, 0)
                SerialNumber = $SN
            }
        }
    } | Where-Object { $_.DriveType -eq 3 -and $_.IsSystemVolume -eq $false }
}

$data = Get-EbsVolumes | Select-Object @{n="Device";e={$_.Device}},@{n="Disk";e={$_.Disk.ToString()}},@{n="Boot";e={$_.Boot.ToString()}},@{n="DriveLetter";e={$_.DriveLetter.ToString()}},@{n="VolumeDeviceID";e={$_.VolumeDeviceID}},@{n="Partition";e={$_.Partition}},@{n="DriveType";e={$_.DriveType.ToString()}},@{n="IsSystemVolume";e={$_.IsSystemVolume.ToString()}},@{n="IsBootVolume";e={$_.IsBootVolume.ToString()}},@{n="DriveLabel";e={$_.DriveLabel.ToString()}},@{n="BlockSize";e={$_.BlockSize.ToString()}},@{n="Capacity";e={$_.Capacity.ToString()}},@{n="DiskType";e={$_.DiskType.ToString()}},@{n="FreeSpace";e={$_.FreeSpace.ToString()}},@{n="SerialNumber";e={$_.SerialNumber.ToString()}} | ConvertTo-Json -Compress
$content = "{`"SchemaVersion`" : `"1.0`", `"TypeName`": `"Custom:EbsVolumesInventory`", `"Content`": $data}"
$instanceId = Invoke-RestMethod -uri http://169.254.169.254/latest/meta-data/instance-id
$filepath = "C:\ProgramData\Amazon\SSM\InstanceData\" + $instanceId + "\inventory\custom\EbsVolumesInventory.json"
if (-NOT (Test-Path $filepath)) { New-Item $filepath -ItemType file }
Set-Content -Path $filepath -Value $content
```
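One thing worth verifying with custom inventory is the exact file shape the agent expects: TypeName must begin with "Custom:", and Content is expected to be an array of maps whose values are all strings. PowerShell's ConvertTo-Json unwraps single-element arrays, so on a machine with exactly one qualifying volume, $data above becomes a bare object rather than an array, which the agent may silently reject (stated here as a likely suspect, not a confirmed diagnosis). A minimal sketch of a well-formed file (instance ID and values hypothetical):

```
import json

# Shape the SSM agent expects for a custom inventory file:
# "Content" is a LIST of maps, and every value is a string.
payload = {
    "SchemaVersion": "1.0",
    "TypeName": "Custom:EbsVolumesInventory",
    "Content": [
        {
            "Device": "/dev/sda1",
            "DriveLetter": "D:",
            "Capacity": "100",    # numbers serialized as strings
            "FreeSpace": "42",
        }
    ],
}

path = (r"C:\ProgramData\Amazon\SSM\InstanceData"
        r"\i-0123456789abcdef0\inventory\custom\EbsVolumesInventory.json")

with open(path, "w") as f:
    json.dump(payload, f)
```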
1 answer · 0 votes · 4 views
AWS-User-5588065 · asked 3 months ago

Access Control in Secrets Manager for Federated Users

My scenario: I have my users in Azure AD. This is connected via single-account SSO into an AWS account using an IAM SAML IdP (PS: we are not using the AWS SSO service). We are using AWS Secrets Manager and want to store per-user secrets using a secret name path (e.g. /usersecrets/<azure_ad_username>/<secret_name>). When the users log in using Azure AD auth, they automatically assume the attached IAM role. I would like to do the following:

Requirement 1:
1. Allow users to list secrets, create secrets, and get secret values for any secret which has a name /usersecrets/<azure_ad_username>/* (here the azure_ad_username is what the AWS session sees when they assume the role to log in).
2. Deny access to any secret unless the request is coming from a federated user (i.e. local IAM users in the AWS account should not be able to see any secret in path /usersecrets/<azure_ad_username>/*).

Requirement 2: In addition to the federated Azure AD users, I also want to allow an EC2 instance role to be able to Get/List/Describe any secret. This EC2 role is in the same AWS account as the secrets and is attached to all Windows servers. This IAM role is to allow SSM Run Command to execute on these Windows machines and fetch the secret values (e.g. to get the secret of a user and create a local Windows user with the same name and password as in Secrets Manager, using PowerShell).

Questions: Can you help with some sample IAM policy for the role, or the Secrets Manager resource policy I can use to meet both requirements?
1 answer · 0 votes · 4 views
Alexa · asked 4 months ago

SSM Agent update failure

Hi all. We're just starting with SSM and hoping to use this quite extensively moving forward. I have a small handful of devices online using it, but I just noticed that one of my devices has gone offline. Luckily it is not in the field yet, so I was able to access it locally. What appears to have happened is that the SSM Agent was attempting to update but failed and never came back. I'm not so concerned that it didn't update, and more concerned that it didn't start again. Going through the SSM agent logs I came across this line:

```
"standardError": "E: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)\nE: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?\nWARNING: Could not install the python3-apt, this may cause the patching operation to fail.\nfailed to run commands: exit status 1
```

The SSM update log itself ends like this:

```
2022-01-20 21:47:09 DEBUG UpdateInstanceInformation Response{
}
2022-01-20 21:47:09 INFO initiating cleanup of other versions in amazon-ssm-agent and amazon-ssm-agent-updater folder
2022-01-20 21:47:09 INFO removing artifacts in the folder: /var/lib/amazon/ssm/update/amazon-ssm-agent
2022-01-20 21:47:09 INFO removed files and folders: 3.1.821.0
2022-01-20 21:47:09 INFO removing artifacts in the folder: /var/lib/amazon/ssm/update/amazon-ssm-agent-updater
2022-01-20 21:47:09 INFO removed files and folders: 3.1.715.0
2022-01-20 21:47:09 INFO initiating cleanup of files in update download folder
2022-01-20 21:47:09 INFO Successfully downloaded manifest
Successfully downloaded updater version 3.1.821.0
Updating amazon-ssm-agent from 3.1.715.0 to 3.1.821.0
Successfully downloaded https://s3.us-east-2.amazonaws.com/amazon-ssm-us-east-2/amazon-ssm-agent/3.1.715.0/amazon-ssm-agent-ubuntu-amd64.tar.gz
Successfully downloaded https://s3.us-east-2.amazonaws.com/amazon-ssm-us-east-2/amazon-ssm-agent/3.1.821.0/amazon-ssm-agent-ubuntu-amd64.tar.gz
Initiating amazon-ssm-agent update to 3.1.821.0
failed to install amazon-ssm-agent 3.1.821.0, ErrorMessage=The execution of command returned Exit Status: 125 exit status 125
Initiating rollback amazon-ssm-agent to 3.1.715.0
failed to uninstall amazon-ssm-agent 3.1.821.0, ErrorMessage=The execution of command returned Exit Status: 2 exit status 2
Failed to update amazon-ssm-agent to 3.1.821.0
```

Then the error log gives me this:

```
2022-01-20 16:29:54 ERROR [Submit @ processor.go.140] [ssm-agent-worker] [MessagingDeliveryService] [Association] [associationId=5752f0d0-1f57-492e-83f7-740484b81d73] Document Submission failed: Job with id 5752f0d0-1f57-492e-83f7-740484b81d73 already exists
2022-01-20 21:47:02 ERROR [AppendError @ context.go.129] failed to install amazon-ssm-agent 3.1.821.0, ErrorMessage=The execution of command returned Exit Status: 125 exit status 125
```

Could anyone help me out here? I cannot have these fail like this when they go out into the wild. The OS is Ubuntu Server 18 on this box, 20 on others we have.
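The `Could not get lock /var/lib/dpkg/lock-frontend` line means another apt/dpkg process (commonly unattended-upgrades on Ubuntu) held the lock while the updater tried to install its python3-apt dependency. A defensive pattern is to wait for that lock before triggering anything that calls apt; a minimal sketch (must run as root; the standard dpkg lock path is used):

```
import fcntl
import time

def wait_for_dpkg_lock(path="/var/lib/dpkg/lock-frontend", timeout=600):
    """Block until the dpkg frontend lock is free, or raise after timeout seconds."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            # Open in append mode so the lock file is never truncated.
            with open(path, "a") as lock:
                # Non-blocking probe: succeeds only when apt/dpkg is idle.
                fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
                fcntl.flock(lock, fcntl.LOCK_UN)
                return
        except OSError:
            time.sleep(5)
    raise TimeoutError("{} still held after {}s".format(path, timeout))

wait_for_dpkg_lock()
# Safe to run apt-based operations (such as the agent update) from here.
```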
0 answers · 0 votes · 5 views
AWS-User-7038179 · asked 4 months ago

Can Systems Manager Target Only Instances That Are Running?

I am using an Automation runbook to install and configure the CloudWatch agent on managed instances. When I execute my runbook on a tag-based resource group, it attempts to run on instances that do not have "running" status. When it ran on stopped instances, it would eventually time out. How can I target only running instances in my resource group for execution? Things I have tried or looked into:

* Adding a step to the start of my runbook to assertAwsResourceProperty, DescribeInstanceStatus, and "running". This prevents the timeout, but returns nothing when run against a stopped instance, therefore aborting and being marked as a failure. This is undesirable to me, because I don't see the runbook as having failed; rather, it skipped execution for a legitimate reason. Furthermore, if run on a large batch of machines, this leads to even one stopped instance ending in the entire parent execution being marked as failed.
* Filtering my resource group to only contain running instances. I saw no way of doing this.
* Adding a property to my runbook so that it can only be run on a running instance, similar to how I use the TargetType property to limit it to /AWS::EC2::Instance. This made the most sense to me, because as the developer I know my runbook can only be successful if the instance is running. I wanted something akin to being able to set my TargetType as service-provider::service-name::data-type-name::status. I haven't found any way of doing this.
* Applying a filter when picking my targets for execution in the Systems Manager console. This is another place where it made sense that I might find it, but I didn't. If, when I've chosen the resource group, the interactive instance picker were to update to display only the instances from that group, that would be unwieldy but would at least allow me to manually deselect instances that are not running. None of this appears to be the case.

Is there a simple way to accomplish this that I have overlooked?
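For reference, one way to keep this inside the runbook is to resolve the target list in a first aws:executeScript step that asks EC2 only for instances currently in the running state, and fan out over that step's output instead of the raw resource group. A minimal sketch of the lookup (the tag key and value are hypothetical):

```
import boto3

ec2 = boto3.client("ec2")

def running_instance_ids(tag_key="CloudWatchAgent", tag_value="install"):
    """Return only the tagged instances that are in the running state."""
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": "tag:{}".format(tag_key), "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    return [
        inst["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for inst in reservation["Instances"]
    ]

print(running_instance_ids())
```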
2 answers · 0 votes · 14 views
AWS-User-7841548 · asked 4 months ago

SSM agent won't get new tokens after network failure resolved

I have multiple machines running the hybrid SSM Agent. The machines in one network suffered a multi-day network outage. When the network issue was resolved, the SSM Agent wouldn't 'reconnect', and I cannot start sessions to access these machines. Here are the relevant log lines from `/var/log/amazon/ssm/amazon-ssm-agent.log`:

```
2021-12-23 13:42:22 INFO [ssm-agent-worker] [MessagingDeliveryService] increasing error count by 1
2021-12-23 13:42:24 ERROR [ssm-agent-worker] [MessagingDeliveryService] error when calling AWS APIs. error details - GetMessages Error: shared credentials are already expired, they were queried at 2021-12-21T11:30:10-06:00 and expired at 2021-12-21T18:30:10Z
2021-12-23 13:42:24 INFO [ssm-agent-worker] [MessagingDeliveryService] increasing error count by 1
2021-12-23 13:42:26 ERROR [ssm-agent-worker] [MessagingDeliveryService] error when calling AWS APIs. error details - GetMessages Error: shared credentials are already expired, they were queried at 2021-12-21T11:30:10-06:00 and expired at 2021-12-21T18:30:10Z
2021-12-23 13:42:26 INFO [ssm-agent-worker] [MessagingDeliveryService] increasing error count by 1
2021-12-23 13:42:29 ERROR [ssm-agent-worker] [MessagingDeliveryService] error when calling AWS APIs. error details - GetMessages Error: shared credentials are already expired, they were queried at 2021-12-21T11:30:10-06:00 and expired at 2021-12-21T18:30:10Z
2021-12-23 13:42:29 INFO [ssm-agent-worker] [MessagingDeliveryService] increasing error count by 1
2021-12-23 13:42:31 ERROR [ssm-agent-worker] [MessagingDeliveryService] error when calling AWS APIs. error details - GetMessages Error: shared credentials are already expired, they were queried at 2021-12-21T11:30:10-06:00 and expired at 2021-12-21T18:30:10Z
2021-12-23 13:42:31 INFO [ssm-agent-worker] [MessagingDeliveryService] increasing error count by 1
2021-12-23 13:42:33 ERROR [ssm-agent-worker] [MessagingDeliveryService] MessagingDeliveryService stopped temporarily due to internal failure. We will retry automatically after 15 minutes
```

That seems to repeat round and round. The credentials are now a couple of days old, as can be seen from the timestamps. I am assuming the "internal failure" is in trying to refresh the tokens. I restarted the agent on one machine (through `systemctl restart`) and it came back fine, so it's definitely some state in the running agents that is the problem. I have left the others in their failed state in case someone responds with something for me to test further.
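Until the underlying agent bug is addressed, a blunt stopgap, given that a restart is confirmed to recover the agent, is a watchdog that restarts it once the expired-credentials error starts repeating. A minimal sketch (log path as in the question; the threshold and window are arbitrary):

```
import subprocess
import time

LOG = "/var/log/amazon/ssm/amazon-ssm-agent.log"
MARKER = "shared credentials are already expired"

def tail_and_restart(threshold=5, window=300):
    """Restart the agent if MARKER repeats `threshold` times within `window` seconds."""
    hits = []
    with open(LOG) as f:
        f.seek(0, 2)  # start tailing at the end of the file
        while True:
            line = f.readline()
            if not line:
                time.sleep(1)
                continue
            if MARKER in line:
                now = time.time()
                hits = [t for t in hits if now - t < window] + [now]
                if len(hits) >= threshold:
                    subprocess.run(
                        ["systemctl", "restart", "amazon-ssm-agent"], check=True)
                    hits.clear()

tail_and_restart()
```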
0 answers · 1 vote · 5 views
DamonMaria · asked 5 months ago

SSM Patching Fails for ALL Windows Server 2019 EC2 Instances

I just started using SSM to manage Windows Server 2019 EC2 instance patching (security updates). I noticed that by default, AWS prevents the Windows OS from automatically running Windows Update. I followed the instructions for SSM Quick Setup, and the patching of my servers is failing with the following error message. (I have been searching ALL day for a resolution to this: modifying registry settings, running DISM commands, etc. Nothing helps. It seems like some type of certificate issue, but I can't resolve it.) Has anyone else had issues with getting SSM to patch AWS Windows Server 2019 EC2 instances?

```
Invoke-PatchBaselineOperation : Exception Details: An error occurred when attempting to search Windows Update.
Exception Level 1:
Error Message: A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider. (Exception from HRESULT: 0x800B0109)
Stack Trace: at WUApiLib.IUpdateSearcher.Search(String criteria)
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateAgent.SearchForUpdates(String searchCriteria)
At C:\ProgramData\Amazon\SSM\InstanceData\i-03638bdca902ef8fd\document\orchestration\86ed2eda-065a-49d3-b084-69bfc89c143d\PatchWindows\_script.ps1:233 char:13
+ $response = Invoke-PatchBaselineOperation -Operation Scan -SnapshotId ...
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : OperationStopped: (Amazon.Patch.Ba...UpdateOperation:FindWindowsUpdateOperation) [Invoke-PatchBaselineOperation], Exception
+ FullyQualifiedErrorId : Exception Level 1:
Error Message: Exception Details: An error occurred when attempting to search Windows Update.
Exception Level 1:
Error Message: A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider. (Exception from HRESULT: 0x800B0109)
Stack Trace: at WUApiLib.IUpdateSearcher.Search(String criteria)
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateAgent.SearchForUpdates(String searchCriteria)
Stack Trace: at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateAgent.SearchForUpdates(String searchCriteria)
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.SearchAndProcessResult(List`1 kbGuids)
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.SearchByGuidsPaginated(List`1 kbGuids, Int32 maxPageSize)
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.FilterWindowsUpdateSearch(List`1 filteringMethods)
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.FindWindowsUpdateOperation.DoWindowsUpdateOperation()
at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.DoBeginProcessing()
,Amazon.Patch.Baseline.Operations.PowerShellCmdlets.InvokePatchBaselineOperation
failed to run commands: exit status 4294967295
```
3 answers · 0 votes · 7 views
KevinM_BMW · asked 5 months ago

AWS Backup VSS snapshot fails

I am backing up about 45 Windows Server EC2 instances with AWS Backup. One of the AWS Backup jobs, covering about 35 of those instances, does a VSS snapshot as part of the backup. I get a lot of VSS failure messages. Some of them are VSS timeouts, which I understand is a Windows issue that occurs because of an unconfigurable 10-second max time for the snapshot to complete. Some are related to the AWS VSS provider. In AWS Backup the error is "Windows VSS Backup Job Error encountered, trying for regular backup". The job then completes, but without a VSS snapshot. In SSM, the Run Command error for this task is:

> Encountered unexpected error. Please see error details below. Message : The process cannot access the file 'C:\Program Files\Amazon\AwsVssComponents\vsserr.log' because it is being used by another process.

I tried to rename this file (just as a test, to see if it was in use) and it says it is in use by ec2-vss-agent.exe. So I stopped the EC2 VSS Windows service, but that did not stop the ec2-vss-agent.exe process and the error remained. I did an 'end task' on the ec2-vss-agent.exe process and then manually ran the VSS Run Command from SSM. It re-started the process, and it ran for a while before timing out, which is the other (unrelated?) issue we see too. I cannot find anything online about this issue or error and I'm at a loss as far as where to look from here. I need VSS snapshots of these servers. If anyone has any ideas about how to troubleshoot this or what else to look for, please let me know!
1 answer · 0 votes · 5 views
jhallock · asked 5 months ago

How to connect to private RDS from localhost

I have a private VPC with private subnets, a private jumpbox in one private subnet, and my private RDS Aurora MySQL serverless instance in another private subnet. I ran these commands on my local laptop to try to connect to my RDS via port forwarding:

```
aws ssm start-session --target i-0d5470040e7541ab9 --document-name AWS-StartPortForwardingSession --parameters "portNumber"=["5901"],"localPortNumber"=["9000"] --profile myProfile
aws ssm start-session --target i-0d5470040e7541ab9 --document-name AWS-StartPortForwardingSession --parameters "portNumber"=["22"],"localPortNumber"=["9999"] --profile myProfile
aws ssm start-session --target i-0d5470040e7541ab9 --document-name AWS-StartPortForwardingSession --parameters "portNumber"=["3306"],"localPortNumber"=["3306"] --profile myProfile
```

The connection to the server hangs. I had this error on my local laptop:

```
Starting session with SessionId: myuser-09e5cd0206cc89542
Port 3306 opened for sessionId myuser-09e5cd0206cc89542.
Waiting for connections...
Connection accepted for session [myuser-09e5cd0206cc89542]
Connection to destination port failed, check SSM Agent logs.
```

and these errors in `/var/log/amazon/ssm/errors.log`:

```
2021-11-29 00:50:35 ERROR [handleServerConnections @ port_mux.go.278] [ssm-session-worker] [myuser-017cfa9edxxxx] [DataBackend] [pluginName=Port] Unable to dial connection to server: dial tcp :3306: connect: connection refused
2021-11-29 14:13:07 ERROR [transferDataToMgs @ port_mux.go.230] [ssm-session-worker] [myuser-09e5cdxxxxxx] [DataBackend] [pluginName=Port] Unable to read from connection: read unix @->/var/lib/amazon/ssm/session/3366606757_mux.sock: use of closed network connection
```

I try to connect to RDS like this (screenshot: https://i.stack.imgur.com/RwiZ8.png). I even tried to put the RDS endpoint using an SSH tunnel, but it doesn't work (screenshot: https://i.stack.imgur.com/53GIh.png).

Are there any additional steps to do on the remote EC2 instance? It seems the connection is accepted, but the connection to the destination port doesn't work. Or is there any better way to connect to a private RDS in a private VPC when we don't have a site-to-site VPN or Direct Connect?
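One thing to note: AWS-StartPortForwardingSession forwards to a port on the managed instance itself, so forwarding 3306 only works if something on the jumpbox is listening on 3306; that matches the `dial tcp :3306: connection refused` in the agent log. Reaching RDS through the jumpbox needs either an SSH tunnel or the newer remote-host document. A sketch of launching the latter from Python (the RDS endpoint is hypothetical; the Session Manager plugin must be installed locally):

```
import subprocess

RDS_ENDPOINT = "mydb.cluster-abc123.us-east-1.rds.amazonaws.com"  # hypothetical

subprocess.run([
    "aws", "ssm", "start-session",
    "--target", "i-0d5470040e7541ab9",
    "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
    "--parameters",
    '{"host":["%s"],"portNumber":["3306"],"localPortNumber":["3306"]}' % RDS_ENDPOINT,
    "--profile", "myProfile",
], check=True)
# Then point the MySQL client at 127.0.0.1:3306.
```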
6 answers · 0 votes · 31 views
AWS-User-1737129 · asked 5 months ago

Error: user cannot terminate their own SSM session when trying to use SCP

We use AWS SSO to provide permissions for Session Manager access to systems. When trying to use Session Manager in conjunction with SCP, one of our users is getting the following error:

```
$ scp -r -i ~/.ssh/example-key-singapore system1/startsystem.sh legerity@i-06a0c25qb665a08eb.ap-southeast-1:
An error occurred (AccessDeniedException) when calling the TerminateSession operation: User: arn:aws:sts::001292317441:assumed-role/AWSReservedSSO_Example_739d002d2774bna6/john.doe@companyname.com is not authorized to perform: ssm:TerminateSession on resource: arn:aws:ssm:ap-southeast-1:001292317441:session/john.doe@companyname.com-08fce585f53bab614 because no identity-based policy allows the ssm:TerminateSession action
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
lost connection
```

The session that it says can't be terminated is actually one that is already terminated, so I can't figure out how it is erroring or why. I cannot replicate this error when giving myself the same permissions. This same user can access the same system via SSM (SSH equivalent) fine. The permissions assigned to this user are:

```
{
  "Effect": "Allow",
  "Action": [ "ssm:StartSession" ],
  "Resource": [
    "arn:aws:ec2:*:001292317441:instance/*",
    "arn:aws:ec2:*:053586226857:instance/*",
    "arn:aws:ssm:*:*:document/AWS-StartSSHSession"
  ]
},
{
  "Effect": "Allow",
  "Action": [ "ssm:TerminateSession", "ssm:ResumeSession" ],
  "Resource": [ "arn:aws:ssm:*:*:session/${aws:username}-*" ]
}
```

This same command using the same permissions works fine for me. The command should work according to the config in .ssh, which is:

```
host i-*.* mi-*.*
ProxyCommand bash -c "aws ssm start-session --target $(echo %h|cut -d'.' -f1) --region $(echo %h|/usr/bin/cut -d'.' -f2) --document-name AWS-StartSSHSession --parameters 'portNumber=%p'" --profile $(echo %h|cut -d '.' -f3)
```

Does anyone have any idea what might be happening?

Edited by: jonzen on Oct 29, 2021 3:38 AM
Edited by: jonzen on Oct 29, 2021 3:39 AM
2 answers · 0 votes · 0 views
jonzen · asked 7 months ago

(Fargate) ExecuteCommandAgent transitions from RUNNING to STOPPED

Hi, I recently followed all the guidance to enable ECS ExecuteCommand access for my containers (excellent feature). I followed https://aws.amazon.com/blogs/containers/new-using-amazon-ecs-exec-access-your-containers-fargate-ec2/, deployed the new infrastructure, and I was able to connect to my ECS Fargate task. Success! Very exciting. I go back today to try to troubleshoot a problem in our beta environment, and I keep getting:

> An error occurred (InvalidParameterException) when calling the ExecuteCommand operation: The execute command failed because execute command was not enabled when the task was run or the execute command agent isn't running. Wait and try again or run a new task with execute command enabled and try again.

My IAM permissions haven't changed and the cluster/service/task configuration hasn't changed. I double-checked by re-running Terraform to ensure the plans were the same (they are; no diff between what's in the account and what's in the templates). I spun up a new task and described the state of the task:

```
$ aws --region us-west-2 --profile <redacted> ecs describe-tasks --cluster <redacted> --tasks 1fe5bd64db4d428794fa0b956c1efda6 | jq '.tasks[0] | {"managedAgents": .containers[0].managedAgents, "enableExecuteCommand": .enableExecuteCommand}'
{
  "managedAgents": [
    {
      "lastStartedAt": "2021-04-07T09:10:01.885000-04:00",
      "name": "ExecuteCommandAgent",
      "lastStatus": "RUNNING"
    }
  ],
  "enableExecuteCommand": true
}
```

I then tried to connect:

```
aws --region us-west-2 --profile <redacted> ecs execute-command --cluster <redacted> \
  --task 1fe5bd64db4d428794fa0b956c1efda6 \
  --container <redacted> \
  --interactive \
  --command "/bin/bash"
```

Got the error I pasted above. Then I checked the status again:

```
$ aws --region us-west-2 --profile <redacted> ecs describe-tasks --cluster <redacted> --tasks 1fe5bd64db4d428794fa0b956c1efda6 | jq '.tasks[0] | {"managedAgents": .containers[0].managedAgents, "enableExecuteCommand": .enableExecuteCommand}'
{
  "managedAgents": [
    {
      "name": "ExecuteCommandAgent",
      "lastStatus": "STOPPED"
    }
  ],
  "enableExecuteCommand": true
}
```

There's ample RAM available. I'm not sure why it isn't working after nothing has changed, and I don't have any additional tools to troubleshoot this. Worse, it's also happening in our production environment. Any suggestions?
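To catch exactly when the agent flips from RUNNING to STOPPED, the describe-tasks call from the question can be looped; the same fields are available via boto3. A minimal sketch (the cluster name and task ID are placeholders):

```
import time
import boto3

ecs = boto3.client("ecs", region_name="us-west-2")

def watch_exec_agent(cluster, task_id, interval=10):
    """Poll the ExecuteCommandAgent status until it leaves RUNNING."""
    while True:
        task = ecs.describe_tasks(cluster=cluster, tasks=[task_id])["tasks"][0]
        agents = task["containers"][0].get("managedAgents", [])
        status = next((a["lastStatus"] for a in agents
                       if a["name"] == "ExecuteCommandAgent"), "ABSENT")
        print(time.strftime("%H:%M:%S"), status)
        if status != "RUNNING":
            return status
        time.sleep(interval)

watch_exec_agent("my-cluster", "1fe5bd64db4d428794fa0b956c1efda6")
```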
8 answers · 0 votes · 0 views
nscott · asked a year ago

Windows upgrade using AWSEC2-CloneInstanceAndUpgradeWindows automation fails

I am attempting to upgrade a Windows Server 2012 R2 EC2 instance to Windows Server 2019 using the AWS Systems Manager automation AWSEC2-CloneInstanceAndUpgradeWindows. I have followed the instructions outlined here to perform the upgrade: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/automated-upgrades.html

I have updated the SSM agent on the server that I am upgrading to the newest version. I have also updated the EC2Config service to the latest version and rebooted the server. I have created a new IAM role to use for the upgrade and attached the AmazonSSMManagedInstanceCore policy to it. I have set this role as the IAM role on the instance that I am attempting to upgrade. I am running the upgrade with a subnet that is in the same default VPC as the instance that I am trying to upgrade. The subnet is a different, previously unused default subnet within that default VPC, as the instructions suggest. The subnet does have "Auto-assign public IPv4 address" set to yes. The automation fails on step 7:

```
Automation step7: runUpgradeFrom2012R2Or2016
Failed to run automation with executionId: theId
Failed : {Status=[Failed], Output=[No output available yet because the step is not successfully executed, No output available yet because the step is not successfully executed], ExecutionId=theId}
```

I am having trouble narrowing down the problem or finding a cause for the failure based on the error above. Any pointers or help will be greatly appreciated! Thank you, Brian
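When the console shows only the generic step failure, the step-level failure message can often be pulled from the execution record itself, which may narrow this down. A minimal boto3 sketch (replace "theId" with the real execution ID):

```
import boto3

ssm = boto3.client("ssm")

def print_step_failures(execution_id):
    """Print status and failure details for every step in an Automation execution."""
    execution = ssm.get_automation_execution(
        AutomationExecutionId=execution_id
    )["AutomationExecution"]
    for step in execution.get("StepExecutions", []):
        print(step["StepName"], step.get("StepStatus"))
        if step.get("FailureMessage"):
            print("  failure:", step["FailureMessage"])
        if step.get("FailureDetails"):
            print("  details:", step["FailureDetails"])

print_step_failures("theId")
```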
1 answer · 0 votes · 0 views
briangAWS · asked a year ago