
Questions tagged with Amazon EC2



Unsupported Action in Policy for S3 Glacier/Veeam

Hello, I'm new to AWS S3 Glacier and I've run across an issue. I am working with Veeam to add S3 Glacier to my backup. I have the bucket created. I need to add the following to my bucket policy:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject", "s3:PutObject", "s3:GetObject", "s3:RestoreObject", "s3:ListBucket",
                "s3:AbortMultipartUpload", "s3:GetBucketVersioning", "s3:ListAllMyBuckets",
                "s3:GetBucketLocation", "s3:GetBucketObjectLockConfiguration",
                "ec2:DescribeInstances", "ec2:CreateKeyPair", "ec2:DescribeKeyPairs", "ec2:RunInstances",
                "ec2:DeleteKeyPair", "ec2:DescribeVpcAttribute", "ec2:CreateTags", "ec2:DescribeSubnets",
                "ec2:TerminateInstances", "ec2:DescribeSecurityGroups", "ec2:DescribeImages",
                "ec2:DescribeVpcs", "ec2:CreateVpc", "ec2:CreateSubnet", "ec2:DescribeAvailabilityZones",
                "ec2:CreateRoute", "ec2:CreateInternetGateway", "ec2:AttachInternetGateway",
                "ec2:ModifyVpcAttribute", "ec2:CreateSecurityGroup", "ec2:DeleteSecurityGroup",
                "ec2:AuthorizeSecurityGroupIngress", "ec2:AuthorizeSecurityGroupEgress",
                "ec2:DescribeRouteTables", "ec2:DescribeInstanceTypes"
            ],
            "Resource": "*"
        }
    ]
}
```

Once I put this in, the first error I get is "Missing Principal". So I added `"Principal": {},` under the Sid, but I have no idea what to put in the brackets. I changed it to `"*"` and that seemed to fix it. Not sure if this is the right thing to do? The next error I get is that all the `ec2:` actions and `s3:ListAllMyBuckets` give me an error of "Unsupported Action in Policy". This is where I get lost. Not sure what else to do. Do I need to open my bucket to the public? Is this a permissions issue? Do I have to recreate the bucket and disable Object Lock? Please help.
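A note on what the two errors are saying, with a minimal sketch. A bucket policy is resource-based, so it must name a `Principal` (ideally the IAM user or role Veeam authenticates as, not `"*"`, which opens the bucket up), and it only supports `s3:` actions that target the bucket or its objects. That is why every `ec2:` action and the account-wide `s3:ListAllMyBuckets` come back as "Unsupported Action in Policy"; those belong in an identity-based policy attached to the Veeam user instead, and no public access or Object Lock change is needed. The bucket name, user name, account ID, and the exact action split below are placeholders; the full action list Veeam requires comes from its own documentation.

```python
# Sketch: split the single policy into a bucket (resource-based) policy with a
# Principal, and an identity-based policy for account-wide and EC2 actions.
import json
import boto3

BUCKET = "my-veeam-bucket"                                      # placeholder
USER_ARN = "arn:aws:iam::111122223333:user/veeam-backup-user"   # placeholder

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "VeeamBucketAccess",
        "Effect": "Allow",
        "Principal": {"AWS": USER_ARN},   # the "Missing Principal" part
        "Action": [
            "s3:DeleteObject", "s3:PutObject", "s3:GetObject", "s3:RestoreObject",
            "s3:ListBucket", "s3:AbortMultipartUpload", "s3:GetBucketVersioning",
            "s3:GetBucketLocation", "s3:GetBucketObjectLockConfiguration"
        ],
        "Resource": [
            f"arn:aws:s3:::{BUCKET}",     # bucket-level actions
            f"arn:aws:s3:::{BUCKET}/*"    # object-level actions
        ]
    }]
}

identity_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "VeeamAccountAccess",
        "Effect": "Allow",
        # Unsupported in a bucket policy, fine in an identity policy:
        "Action": ["s3:ListAllMyBuckets", "ec2:Describe*", "ec2:RunInstances",
                   "ec2:TerminateInstances", "ec2:CreateTags"],
        "Resource": "*"
    }]
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(bucket_policy))
boto3.client("iam").put_user_policy(UserName="veeam-backup-user",
                                    PolicyName="VeeamEc2Access",
                                    PolicyDocument=json.dumps(identity_policy))
```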
2
answers
0
votes
5
views
amatuerAWSguy
asked 2 days ago

Elastic Beanstalk's ASG cannot create EC2 instances

**The problem**

I'm using Elastic Beanstalk to deploy applications alongside GitHub Actions. When the action is triggered, Beanstalk creates an ASG whose desired capacity creates at least 1 instance with the containerized application. For some reason, the ASG provided by Beanstalk started marking the instances as unhealthy almost immediately and terminating them. This process repeats 5 or 6 times and then returns an error state to the Beanstalk application. The ASG remains in *Provisioning state*, and when I looked at the ASG activity history log I got the following:

| Status | Description | Cause |
| --- | --- | --- |
| Cancelled | Launching a new EC2 instance. Status Reason: Instance became unhealthy while waiting for instance to be in InService state. Termination Reason: Client.InternalError: Client error on launch | At 2022-01-13 an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1 |

And the EB environment events throw 4 errors:

1. Creating Auto Scaling group named: awseb-[...]-AWSEBAutoScalingGroup-1V0R8Z9EJ8G8J failed. Reason: Group did not stabilize. {current/minSize/maxSize} group size = {0/1/1}.
2. Service:AmazonCloudFormation, Message:Stack named 'awseb-e-rvjtnttttf-immutable-stack' aborted operation. Current state: 'CREATE_FAILED' Reason: The following resource(s) failed to create: [AWSEBAutoScalingGroup].
3. Failed to deploy application.
4. Cannot complete command execution on the following instances as they are no longer running: [i-03449eff8756123c2].

**The following steps have been taken so far**

1. Review of IAM role permissions to allow creating EC2 instances.
2. Review of SGs and the connection between the load balancer and target group.

**Identified activities before the problem**

1. Enabled the ELB health check with a 300-second grace period for two of our ASGs.

**Personal point of view**

The problem seems not to be with Beanstalk directly but between the TG and the instances; maybe a VPC endpoint is needed to return health status from EC2 to TG? `Client -> LB -> TG -[HERE]> EC2`
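For anyone digging into a similar failure: the full termination cause is often easier to read from the ASG activity history via the API than from the EB console events. A minimal boto3 sketch (the group name below is a placeholder taken from the error above):

```python
# Pull the scaling activity history for the Beanstalk-managed ASG and print
# the status message and cause for each activity, newest first.
import boto3

asg = boto3.client("autoscaling")
activities = asg.describe_scaling_activities(
    AutoScalingGroupName="awseb-xxxx-AWSEBAutoScalingGroup-1V0R8Z9EJ8G8J",  # placeholder
    MaxRecords=20,
)["Activities"]

for a in activities:
    print(a["StatusCode"], "-", a.get("StatusMessage", ""))
    print("  description:", a["Description"])
    print("  cause:", a["Cause"])
```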
2
answers
0
votes
4
views
Osain Abitia
asked 3 days ago

EC2 unresponsive, status check failed, not reachable via ping/ssh

The EC2 instance is unresponsive after a forced shutdown and a normal start. When starting, the status check fails and the logs below are seen via the Get system log option in the AWS console. As there is now no way to access the instance, how can I resolve it?

```
----
[    3.766101] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
[    4.210518] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    4.862401] nvme0n1: p1 p2
[    4.876284] random: nonblocking pool is initialized
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ...
[    4.929843] device-mapper: uevent: version 1.0.3
[    4.931134] device-mapper: ioctl: 4.27.0-ioctl (2013-10-30) initialised: dm-devel@redhat.com
done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.25.2
[/sbin/fsck.ext4 (1) -- /dev/nvme0n1p2] fsck.ext4 -a -C0 /dev/nvme0n1p2
/dev/nvme0n1p2 contains a file system with errors, check forced.
/dev/nvme0n1p2: Inodes that were part of a corrupted orphan linked list found.
/dev/nvme0n1p2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
fsck exited with status code 4
done.
Failure: File system check of the root filesystem failed
The root filesystem on /dev/nvme0n1p2 requires a manual fsck
[    5.052672] ACPI: bus type USB registered
[    5.053800] usbcore: registered new interface driver usbfs
[    5.055203] usbcore: registered new interface driver hub
[    5.056611] usbcore: registered new device driver usb
[    5.058820] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    5.060950] ehci-pci: EHCI PCI platform driver
[    5.065352] uhci_hcd: USB Universal Host Controller Interface driver
[    5.068489] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    5.071678] hidraw: raw HID events driver (C) Jiri Kosina
[    5.074091] usbcore: registered new interface driver usbhid
[    5.075511] usbhid: USB HID core driver
```
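The log itself names the fix ("requires a manual fsck"), which has to be run from another instance since this one cannot boot. A minimal boto3 sketch of the usual volume-swap procedure, with placeholder instance IDs and assuming the root volume is the first block device mapping:

```python
# Stop the broken instance, move its root volume to a healthy helper instance
# in the same AZ, run fsck there, then move the volume back.
import boto3

ec2 = boto3.client("ec2")
BROKEN = "i-0123456789abcdef0"   # placeholder: the unresponsive instance
HELPER = "i-0fedcba9876543210"   # placeholder: a working instance, same AZ

ec2.stop_instances(InstanceIds=[BROKEN])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[BROKEN])

# Find the root volume of the broken instance (assumes first mapping).
root = ec2.describe_instances(InstanceIds=[BROKEN])["Reservations"][0] \
          ["Instances"][0]["BlockDeviceMappings"][0]["Ebs"]["VolumeId"]

ec2.detach_volume(VolumeId=root)
ec2.get_waiter("volume_available").wait(VolumeIds=[root])
ec2.attach_volume(VolumeId=root, InstanceId=HELPER, Device="/dev/sdf")

# On the helper instance, run e.g. `sudo fsck.ext4 -f /dev/nvme1n1p2`
# (the device name on the helper may differ), then detach the volume and
# re-attach it to the original instance under its original root device name.
```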
2
answers
0
votes
6
views
AWS-User-1149198
asked 6 days ago

Why is HTTPD failing to start? Why is TLS failing to start? Missing certificate key is not missing!

For context, I followed this tutorial to configure SSL/TLS on an EC2 instance: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/SSL-on-amazon-linux-2.html

Everything was working fine. I've installed a web application (Drupal 9) from a composer-based repo, maintained my code, fine. I updated some packages with yum, updated PHP, etc. I attempt to start Apache:

```
[ec2-user@ip-172-31-32-159 ~]$ sudo systemctl restart httpd
Job for httpd.service failed. See "systemctl status httpd.service" and "journalctl -xe" for details.
```

I check `journalctl -xe`. The important part appears to be:

```
-- Unit httpd-init.service has begun starting up.
Jan 10 00:10:41 ip-172-31-32-159.us-east-2.compute.internal httpd-ssl-gencerts[9368]: Missing certificate key!
Jan 10 00:10:41 ip-172-31-32-159.us-east-2.compute.internal systemd[1]: httpd-init.service: main process exited, code=exited, status=1/FAILURE
Jan 10 00:10:41 ip-172-31-32-159.us-east-2.compute.internal systemd[1]: Failed to start One-time temporary TLS key generation for httpd.service.
-- Subject: Unit httpd-init.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit httpd-init.service has failed.
--
-- The result is failed.
```

Here is something interesting. I check `vim /etc/httpd/conf.d/ssl.conf`. At line 100 is `SSLCertificateFile /etc/pki/tls/certs/localhost.crt`. Okay, very good. The interesting thing is that if I rename the file with `sudo mv /etc/pki/tls/certs/localhost.crt /etc/pki/tls/certs/localhost.crt.bak` and then try to start httpd with `sudo systemctl start httpd`, what is returned is `Job for httpd.service failed because the control process exited with error code.` Checking `journalctl -xe` again, we receive a different error:

```
-- Unit httpd.service has begun starting up.
Jan 10 00:42:56 ip-172-31-32-159.us-east-2.compute.internal httpd[9841]: AH00526: Syntax error on line 100 of /etc/httpd/conf.d/ssl.conf:
Jan 10 00:42:56 ip-172-31-32-159.us-east-2.compute.internal httpd[9841]: SSLCertificateFile: file '/etc/pki/tls/certs/localhost.crt' does not exist or
Jan 10 00:42:56 ip-172-31-32-159.us-east-2.compute.internal systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
Jan 10 00:42:56 ip-172-31-32-159.us-east-2.compute.internal systemd[1]: Failed to start The Apache HTTP Server.
```

Renaming localhost.crt to localhost.crt.bak changes the error, breaks the link, and SSLCertificateFile appropriately does not exist. Changing localhost.crt.bak back to localhost.crt restores the SSLCertificateFile link and changes the error back to claiming there is a missing certificate key, when we can see it there:

```
Jan 10 00:47:07 ip-172-31-32-159.us-east-2.compute.internal httpd-ssl-gencerts[9884]: Missing certificate key!
```

What is going on here?
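One hedged reading of the message: the `httpd-ssl-gencerts` helper behind httpd-init generates localhost.crt *and* a matching private key as a pair, so "Missing certificate key!" is plausibly complaining about the key file (/etc/pki/tls/private/localhost.key on Amazon Linux 2), not the cert you can see. A small diagnostic sketch, with the default paths as assumptions (adjust if your ssl.conf points elsewhere):

```python
# Check that both halves of the default cert/key pair exist, and if both do,
# confirm they actually belong together by comparing their public keys.
import os
import subprocess

CRT = "/etc/pki/tls/certs/localhost.crt"
KEY = "/etc/pki/tls/private/localhost.key"   # assumed default key path

for path in (CRT, KEY):
    print(path, "->", "present" if os.path.exists(path) else "MISSING")

if os.path.exists(CRT) and os.path.exists(KEY):
    crt_pub = subprocess.run(["openssl", "x509", "-in", CRT, "-noout", "-pubkey"],
                             capture_output=True, text=True).stdout
    key_pub = subprocess.run(["openssl", "pkey", "-in", KEY, "-pubout"],
                             capture_output=True, text=True).stdout
    print("pair matches" if crt_pub == key_pub else "CERT/KEY MISMATCH")
```

If the key really is missing, moving both halves of the pair aside (not just the cert) should let the helper regenerate a fresh self-signed pair on the next start.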
0
answers
0
votes
3
views
AWS-User-3495166
asked 7 days ago

EC2 instance fails to start after changing instance type from t2 to t3

I tried to update an EC2 instance from t2 to t3. Since the AZ I was running the instance in did not support t3 instances, I stopped the instance, created an image, and then tried to create an instance from that image in us-east-1c. The instance is running RockyOS v. 8.5. The instance did not start. Using the serial console, it appears as though the EBS volume was not detected. I verified that the ENA and NVMe drivers were installed.

I tried a series of experiments where I created new instances from the original AMI: I was able to create t2 instances, stop them, create images, and then create new t3 instances without issue. The main difference, of course, is that the production instance has a lot more data on it, has been updated via dnf update, etc. I suppose I could just create a brand new t3 instance and migrate the data over, but I would like to understand why I wasn't able to convert the instance from t2 to t3.

Some more information: the reason that the experiments worked was that the original AMI was based on RockyOS v. 8.4. That version lets me migrate between t2 and t3 without any issues. The production instance was updated at some point to version 8.5, and for some reason this version does not boot on t3 (Nitro) instance types. I repeated my experiment: I launched the original AMI on a t2, did an upgrade, and after changing the instance type to t3, the instance does not boot. While this doesn't provide a solution to the problem, at least now it is reproducible.

So, what is it about RockyOS v. 8.5 that prevents the migration to a Nitro instance? Both `modinfo ena` and `modinfo nvme` show that the drivers are present.
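A minimal sketch for double-checking the Nitro prerequisites from the API side: t3 requires ENA support to be flagged on the instance and AMI, not just the ena/nvme kernel modules inside the guest. The IDs below are placeholders:

```python
# Read the enaSupport attribute on the instance and the EnaSupport flag on
# the AMI; both must be enabled for Nitro (t3) instance types.
import boto3

ec2 = boto3.client("ec2")

attr = ec2.describe_instance_attribute(
    InstanceId="i-0123456789abcdef0",   # placeholder
    Attribute="enaSupport",
)
print("instance enaSupport:", attr["EnaSupport"].get("Value"))

img = ec2.describe_images(ImageIds=["ami-0123456789abcdef0"])["Images"][0]  # placeholder
print("AMI EnaSupport:", img.get("EnaSupport"))

# If the flag is off, it can be enabled while the instance is stopped:
# ec2.modify_instance_attribute(InstanceId="i-...", EnaSupport={"Value": True})
```

Since the 8.4-based instances boot fine on t3, these flags are probably already set here; in that case the suspect shifts to what the 8.5 update changed inside the guest (e.g. the initramfs built for the new kernel).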
2
answers
0
votes
7
views
AWS-User-4761960
asked 7 days ago

Container Insights on Amazon EKS Fluent Bit AccessDeniedException

I'm trying to add a Container Insight to my EKS cluster but running into a bit of an issue when deploying. According to my logs, I'm getting the following: ``` [error] [output:cloudwatch_logs:cloudwatch_logs.2] CreateLogGroup API responded with error='AccessDeniedException' [error] [output:cloudwatch_logs:cloudwatch_logs.2] Failed to create log group ``` The strange part about this is the role it seems to be assuming is the same role found within my EC2 worker nodes rather than the role for the service account I have created. I'm creating the service account and can see it within AWS successfully using the following command: ``` eksctl create iamserviceaccount --region ${env:AWS_DEFAULT_REGION} --name cloudwatch-agent --namespace amazon-cloudwatch --cluster ${env:CLUSTER_NAME} --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy --override-existing-serviceaccounts --approve ``` Despite the serviceaccount being created successfully, I continue to get my AccessDeniedException. One thing I found was the logs work fine when I manually add the CloudWatchAgentServerPolicy to my worker nodes, however this is not the implementation I would like and instead would rather us the automative approach of adding the service account and not touching the worker nodes directly if possible. The steps I followed can be found at the bottom of this [https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-prerequisites.html](). Thanks so much!
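Two hedged checks for why a pod falls back to the node role: first, the cluster's OIDC issuer must be registered as an IAM OIDC provider or IRSA annotations are silently ignored; second, the Container Insights quick-start manifests run Fluent Bit under a service account named `fluent-bit`, so an IRSA role bound only to `cloudwatch-agent` would not cover the Fluent Bit pods (a second `eksctl create iamserviceaccount --name fluent-bit ...` may be needed). A sketch of the first check, with the cluster name as a placeholder:

```python
# Verify the cluster's OIDC issuer is registered as an IAM OIDC provider,
# which is a prerequisite for IAM roles for service accounts (IRSA).
import boto3

eks = boto3.client("eks")
iam = boto3.client("iam")

issuer = eks.describe_cluster(name="my-cluster")["cluster"]["identity"]["oidc"]["issuer"]  # placeholder
issuer_host = issuer.replace("https://", "")

providers = [p["Arn"] for p in iam.list_open_id_connect_providers()["OpenIDConnectProviderList"]]
print("OIDC issuer:", issuer)
print("registered as IAM OIDC provider:", any(issuer_host in arn for arn in providers))
```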
0
answers
0
votes
3
views
AWS-User-8353451
asked 9 days ago

GPU fails to initialize on g5.xlarge instance

Hello, I have tried to create several g5.xlarge instances with various "quickstart" AMIs (Deep Learning AMI GPU TensorFlow 2.7.0 (Amazon Linux 2) 20211111 - ami-0850c76a5926905fb, Deep Learning AMI (Ubuntu 18.04) Version 54.0, ...). In all cases, the instance boots OK and both status checks pass, but the GPU is not accessible. For example, with Deep Learning AMI (Ubuntu 18.04) Version 54.0, nvidia-smi gives the error:

```
nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
```

With `dmesg` we can see the following errors:

```
[  308.148743] nvidia: probe of 0000:00:1e.0 failed with error -1
[  308.148756] NVRM: The NVIDIA probe routine failed for 1 device(s).
[  308.148756] NVRM: None of the NVIDIA devices were initialized.
[  308.148969] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
```

The NVIDIA drivers installed are:

```
apt list --installed | grep -i nvidia
libnvidia-container-tools/bionic,now 1.7.0-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.7.0-1 amd64 [installed,automatic]
nvidia-container-toolkit/bionic,now 1.7.0-1 amd64 [installed]
nvidia-docker2/bionic,now 2.8.0-1 all [installed]
nvidia-fabricmanager-450/now 450.142.00-1 amd64 [installed,upgradable to: 450.156.00-0ubuntu0.18.04.1]
```

The drivers are not updated when doing a system update (I tried to unhold the packages and update the system, but it does not solve the issue):

```
apt-mark showhold
linux-aws
linux-headers-aws
linux-image-aws
nvidia-fabricmanager-450
tensorflow-model-server-neuron
```

Any idea what I could try to solve the issue? Or do you know another Deep Learning AMI image that would work fine with this g5.xlarge? Thanks!
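A hedged observation: the g5 family uses the NVIDIA A10G GPU, which, as far as I know, is not supported by the 450-series driver branch pinned in the package list above, so a Deep Learning AMI released after the G5 launch (with a newer driver) may be the simplest fix. A sketch for finding more recent releases, with the region as a placeholder:

```python
# List the newest Amazon-owned Deep Learning AMI releases matching the
# TensorFlow / Amazon Linux 2 naming pattern used above.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region
images = ec2.describe_images(
    Owners=["amazon"],
    Filters=[{"Name": "name",
              "Values": ["Deep Learning AMI GPU TensorFlow*Amazon Linux 2*"]}],
)["Images"]

# Newest first; pick one released after the G5/A10G launch.
for img in sorted(images, key=lambda i: i["CreationDate"], reverse=True)[:5]:
    print(img["CreationDate"], img["ImageId"], img["Name"])
```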
1
answers
0
votes
2
views
AWS-User-9679185
asked 9 days ago

aws-elasticbeanstalk-ec2-role is not authorized to perform: secretsmanager:GetSecretValue although the default role is updated to include the policy

There is an EC2 instance attempting to get a secret from Secrets Manager, but it errors with the following:

```
Error getting database credentials from Secrets Manager AccessDeniedException: User: arn:aws:sts::{AccountNumber}:assumed-role/aws-elasticbeanstalk-ec2-role/i-{instanceID} is not authorized to perform: secretsmanager:GetSecretValue on resource: rds/staging/secretName because no identity-based policy allows the secretsmanager:GetSecretValue action
```

I have tried adding the following policy to the general aws-elasticbeanstalk-ec2-role to allow for access, but it is still not able to get the secrets:

GetSecretsPolicy:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": "arn:aws:secretsmanager:*:{AccountNumber}:secret:rds/production/secretName"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "secretsmanager:GetRandomPassword",
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": "arn:aws:secretsmanager:*:{AccountNumber}:secret:rds/staging/secretName"
        }
    ]
}
```

I continue to get the error and am wondering if there is something I can tweak to give it proper access to the secret values.
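A minimal sketch of the usual fix: Secrets Manager secret ARNs end with a random six-character suffix (e.g. `...:secret:rds/staging/secretName-AbCdEf`), so a `Resource` that stops exactly at the secret name never matches the real ARN. Appending `-??????` (or `*`) makes the pattern match. The account ID below is a placeholder:

```python
# Rewrite the role's inline policy with wildcard-suffixed secret ARNs.
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "ReadRdsSecrets",
        "Effect": "Allow",
        "Action": [
            "secretsmanager:GetResourcePolicy",
            "secretsmanager:GetSecretValue",
            "secretsmanager:DescribeSecret",
            "secretsmanager:ListSecretVersionIds"
        ],
        "Resource": [
            # "-??????" matches the random suffix Secrets Manager appends
            "arn:aws:secretsmanager:*:111122223333:secret:rds/staging/secretName-??????",
            "arn:aws:secretsmanager:*:111122223333:secret:rds/production/secretName-??????"
        ]
    }]
}

boto3.client("iam").put_role_policy(
    RoleName="aws-elasticbeanstalk-ec2-role",
    PolicyName="GetSecretsPolicy",
    PolicyDocument=json.dumps(policy),
)
```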
1
answers
0
votes
7
views
AWS-User-1866056
asked 11 days ago

InvalidParameterValue Error in docker compose deploy

I am trying to deploy two Docker containers via docker compose to ECS. This already worked before. Now I'm getting the following error:

> **DatabasemongoService TaskFailedToStart: Unexpected EC2 error while attempting to tag the network interface: InvalidParameterValue**

I tried deleting all resources in my account and recreating a default VPC, which the docker compose deploy uses. I tried tagging the network interface via the management web UI, which worked without trouble. I found this documentation about EC2 error codes: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/errors-overview.html

> **InvalidParameterValue**: A value specified in a parameter is not valid, is unsupported, or cannot be used. Ensure that you specify a resource by using its full ID. The returned message provides an explanation of the error value.

I don't get any output besides the error above to put my search on a new trail. Also, there is this entry talking about the error:

> InvalidNetworkInterface.InUse: The specified interface is currently in use and cannot be deleted or attached to another instance. Ensure that you have detached the network interface first. If a network interface is in use, you may also receive the **InvalidParameterValue** error.

As the compose CLI handles creation and deletion of network interfaces automatically, I assume this is not the problem. Below is my docker-compose.yaml file. I start it via `docker compose --env-file=./config/.env.development up` in the ecs context.

```yaml
version: '3'
services:
  feathers:
    image: xxx
    build:
      context: ./app
      args:
        - BUILD_MODE=${MODE_ENV:-development}
    working_dir: /app
    container_name: 'feather-container'
    ports:
      - ${BE_PORT}:${BE_PORT}
    environment:
      - MODE=${MODE_ENV:-development}
    depends_on:
      - database-mongo
    networks:
      - backend
    env_file:
      - ./config/.env.${MODE_ENV}
  database-mongo:
    image: yyy
    build:
      context: ./database
    container_name: 'mongo-container'
    command: mongod --port ${MONGO_PORT} --bind_ip_all
    environment:
      - MONGO_INITDB_DATABASE=${MONGO_DATABASE}
      - MONGO_INITDB_ROOT_USERNAME=${MONGO_USERNAME}
      - MONGO_INITDB_ROOT_PASSWORD=${MONGO_PASSWORD}
    ports:
      - ${MONGO_PORT}:${MONGO_PORT}
    volumes:
      - mongo-data:/data
    networks:
      - backend
networks:
  backend:
    name: be-network
volumes:
  mongo-data:
```

Any help, idea, or pointer in the right direction is very much appreciated!
0
answers
0
votes
6
views
jkonrath
asked 12 days ago

Why can't I ssh to an instance, given that SG and NACL are open?

I created an instance but cannot ssh to it. This happens with the command exactly as taken from the console, but also with the IP address. (I added a verbosity flag; see (1) below.) Strangely, the debug output shows an attempt to connect to 0.0.0.1. The key file has permissions (following `chmod 400`) of `-r--------`. The Security Group is wide open (2), as is the NACL (3). (Note: potentially-identifying IP addresses etc. have been slightly altered for security.)

(1)

```
% ssh -v 1 -i "/Users/user1/dev/intercloud/server-inter-cloud-us-east-2.pem" ec2-user@ec2-3-134-169-55.us-east-2.compute.amazonaws.com
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/user1/.ssh/config
debug1: /Users/user1/.ssh/config line 1: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Connecting to 0.0.0.1 [0.0.0.1] port 22.
debug1: connect to address 0.0.0.1 port 22: No route to host
ssh: connect to host 0.0.0.1 port 22: No route to host
```

(2)

```
Inbound Rules
Security group rule ID
sgr-0582e1d030c525c32    22          TCP    0.0.0.0/0    intercloud-sg
sgr-0f72f746d5e765465    5001        TCP    0.0.0.0/0    intercloud-sg
sgr-0cbe2cf01b08f84ba    0 - 65535   TCP    0.0.0.0/0    intercloud-sg

Outbound rules
Security group rule ID
sgr-037e39d86b69f12a8    All         All    0.0.0.0/0    intercloud-sg
```

(3)

```
Inbound rules
100    All traffic    All    All    0.0.0.0/0    Allow
*      All traffic    All    All    0.0.0.0/0    Deny

Outbound rules
100    All traffic    All    All    0.0.0.0/0    Allow
*      All traffic    All    All    0.0.0.0/0    Deny
```
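One likely explanation for the mysterious 0.0.0.1, worth checking before touching any network config: `-v` takes no argument, so in `ssh -v 1 -i ... user@host` the stray `1` is parsed as the destination host, and a bare "1" is a valid numeric address that expands to 0.0.0.1 under classic inet_aton rules. A short sketch demonstrating the expansion:

```python
# A single-part numeric string is a legal legacy IPv4 form: the whole value
# becomes the 32-bit address, so "1" -> 0.0.0.1.
import socket

packed = socket.inet_aton("1")     # b'\x00\x00\x00\x01'
print(socket.inet_ntoa(packed))    # -> 0.0.0.1
```

If that is what happened, re-running without the stray `1` (`ssh -v -i key.pem user@host`) would be the test.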
9
answers
0
votes
9
views
JoshuaFox
asked 12 days ago

EC2 Launch Template doesn't start Spot Instance (but works for on-demand instance)

My EC2 launch template doesn't work when using it to launch a Spot instance. The launch template is set to launch a c5.xlarge instance **associated with a pre-existing Elastic Network Interface** @ index 0. When launching a Spot instance, I receive the following cryptic message, and the Spot request fails:

> c5.xlarge, ami-b2b55cd5, Linux/UNIX: A network interface may not specify both a network interface ID and a subnet

First off, how can a **network interface** specify a network interface ID? I believe this error means to say "a spot instance may not specify both a network interface ID and a subnet", but I can't be sure. Secondly, my launch template *doesn't* specify a subnet directly - it only specifies a network interface ID, which in turn specifies the subnet.

As a troubleshooting step, I've tried launching an on-demand EC2 instance directly using the same launch template, via "**Launch Templates -> Actions -> Launch Instance from Template**" - when I do this, the EC2 instance launches successfully.

I've been able to reproduce this error consistently for over 9 months now, and am surprised that no one else has brought this up. What gives?

Here is my Spot config:

```
"MySpotFleet" : {
    "Type" : "AWS::EC2::SpotFleet",
    "Properties" : {
        "SpotFleetRequestConfigData" : {
            "AllocationStrategy" : "lowestPrice",
            "IamFleetRole" : { "Fn::GetAtt" : ["MyIAMFleetRole", "Arn"] },
            "InstanceInterruptionBehavior" : "stop",
            "LaunchTemplateConfigs" : [
                {
                    "LaunchTemplateSpecification" : {
                        "LaunchTemplateId" : { "Ref" : "MyLaunchTemplate" },
                        "Version" : { "Fn::GetAtt" : [ "MyLaunchTemplate", "LatestVersionNumber" ] }
                    }
                }
            ],
            "ReplaceUnhealthyInstances" : false,
            "SpotMaxTotalPrice" : "5.01",
            "SpotPrice" : "5.01",
            "TargetCapacity" : 1,
            "TerminateInstancesWithExpiration" : false,
            "Type" : "maintain",
            "ValidFrom" : "2021-01-01T00:00:00Z",
            "ValidUntil" : "2050-12-31T23:59:59Z"
        }
    },
    "DependsOn" : [ "MyLaunchTemplate" ]
}
```

If I replace the above Spot config with this on-demand instance config, it works:

```
"MyInstance" : {
    "Type" : "AWS::EC2::Instance",
    "Properties" : {
        "LaunchTemplate" : {
            "LaunchTemplateId" : { "Ref" : "MyLaunchTemplate" },
            "Version" : { "Fn::GetAtt" : [ "MyLaunchTemplate", "LatestVersionNumber" ] }
        }
    },
    "DependsOn" : [ "MyLaunchTemplate" ]
}
```

If it helps, here is my Launch Template:

```
"MyLaunchTemplate" : {
    "Type" : "AWS::EC2::LaunchTemplate",
    "Properties" : {
        "LaunchTemplateName" : "MyLaunchTemplate",
        "LaunchTemplateData" : {
            "IamInstanceProfile" : { "Arn" : { "Fn::GetAtt" : ["MyEC2IAMInstanceProfile", "Arn"] } },
            "ImageId" : "ami-b2b55cd5",
            "InstanceType" : "c5.xlarge",
            "NetworkInterfaces" : [
                {
                    "NetworkInterfaceId" : { "Ref" : "MyENI00" },
                    "DeviceIndex" : "0"
                }
            ],
            "InstanceInitiatedShutdownBehavior" : "stop",
            "KeyName" : "my-keypair"
        }
    }
}
```

And the ENI in question:

```
"MyENI00" : {
    "Type" : "AWS::EC2::NetworkInterface",
    "Properties" : {
        "Description" : "MyENI00",
        "GroupSet" : [ { "Ref" : "MySecurityGroup" } ],
        "PrivateIpAddresses" : [
            { "Primary" : true,  "PrivateIpAddress" : "172.16.0.100" },
            { "Primary" : false, "PrivateIpAddress" : "172.16.0.101" }
        ],
        "SourceDestCheck" : false,
        "SubnetId" : { "Ref" : "MySubnet" }
    }
}
```
0
answers
0
votes
4
views
AWS-User-7769226
asked 13 days ago

How to solve a "Connection refused" error on ECS task in awsvpc network mode?

Hi there, even though `containerPort` as well as `hostPort` are set, we experience trouble when connecting to an ECS task from outside the container (even the host EC2 instance cannot access it).

```
sh-4.2$ # This is the EC2 host of the task's container
sh-4.2$ curl -o /dev/null http://localhost/some/file.zip # Same with 127.0.0.1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to localhost port 80: Connection refused
```

Excerpt of the task definition:

```terraform
network_mode = "awsvpc" // needed in order to use A records for service discovery

network_configuration {
  subnets = [module.subnet_private.id]
}
```

Excerpt of the container definition:

```terraform
portMappings = [
  {
    hostPort      = 80, // must equal containerPort due to awsvpc networking mode
    containerPort = 80, // see nginx.conf
    protocol      = "tcp"
  }
]
```

Full docker inspect:

```
[
  {
    "Id": "f852e5f1f50154f3fab574eac406fd91038a2e5514053d777d21f81c5614dc79",
    "Created": "2022-01-03T18:52:30.356339157Z",
    "Path": "/docker-entrypoint.sh",
    "Args": [ "nginx", "-g", "daemon off;" ],
    "State": { "Status": "running", "Running": true, "Paused": false, "Restarting": false, "OOMKilled": false, "Dead": false, "Pid": 15694, "ExitCode": 0, "Error": "", "StartedAt": "2022-01-03T18:52:30.866257409Z", "FinishedAt": "0001-01-01T00:00:00Z" },
    "NetworkMode": "container:389dbe8d2c45cbb0ddddbbf2a8f46e62483124023880b96ef04319b7050ff5c5",
    "PortBindings": {},
    "RestartPolicy": { "Name": "", "MaximumRetryCount": 0 },
    "AutoRemove": false,
    "VolumeDriver": "", "VolumesFrom": [], "CapAdd": [], "CapDrop": [],
    "CgroupnsMode": "host",
    "Dns": null, "DnsOptions": null, "DnsSearch": null, "ExtraHosts": null, "GroupAdd": null,
    "IpcMode": "shareable", "Cgroup": "", "Links": null, "OomScoreAdj": 0, "PidMode": "",
    "Privileged": false, "PublishAllPorts": false, "ReadonlyRootfs": false,
    "SecurityOpt": null, "UTSMode": "", "UsernsMode": "",
    "ShmSize": 67108864, "Runtime": "runc", "ConsoleSize": [ 0, 0 ], "Isolation": "",
    "CpuShares": 1024, "Memory": 1073741824, "NanoCpus": 0,
    "CgroupParent": "/ecs/acafdacf06b9475b83e080cbd637f0fc",
    "BlkioWeight": 0, "BlkioWeightDevice": null, "BlkioDeviceReadBps": null, "BlkioDeviceWriteBps": null, "BlkioDeviceReadIOps": null, "BlkioDeviceWriteIOps": null,
    "CpuPeriod": 0, "CpuQuota": 0, "CpuRealtimePeriod": 0, "CpuRealtimeRuntime": 0,
    "CpusetCpus": "", "CpusetMems": "",
    "Devices": null, "DeviceCgroupRules": null, "DeviceRequests": null,
    "KernelMemory": 0, "KernelMemoryTCP": 0, "MemoryReservation": 0, "MemorySwap": 2147483648, "MemorySwappiness": null,
    "OomKillDisable": false, "PidsLimit": null,
    "Ulimits": [ { "Name": "nofile", "Hard": 65536, "Soft": 32768 } ],
    "CpuCount": 0, "CpuPercent": 0, "IOMaximumIOps": 0, "IOMaximumBandwidth": 0,
    "MaskedPaths": [ "/proc/asound", "/proc/acpi", "/proc/kcore", "/proc/keys", "/proc/latency_stats", "/proc/timer_list", "/proc/timer_stats", "/proc/sched_debug", "/proc/scsi", "/sys/firmware" ],
    "ReadonlyPaths": [ "/proc/bus", "/proc/fs", "/proc/irq", "/proc/sys", "/proc/sysrq-trigger" ],
    "Config": {
      "Hostname": "[REDACTED]",
      "Domainname": "",
      "User": "",
      "AttachStdin": false, "AttachStdout": false, "AttachStderr": false,
      "ExposedPorts": { "80/tcp": {} },
      "Cmd": [ "nginx", "-g", "daemon off;" ],
      "Image": "[REDACTED]",
      "Volumes": null,
      "WorkingDir": "",
      "Entrypoint": [ "/docker-entrypoint.sh" ],
      "OnBuild": null,
      "Labels": {
        "com.amazonaws.ecs.cluster": "Nginx_Build_agent_proxy",
        "com.amazonaws.ecs.container-name": "buildagent-proxy",
        "com.amazonaws.ecs.task-arn": "[REDACTED]",
        "com.amazonaws.ecs.task-definition-family": "buildagent-proxy",
        "com.amazonaws.ecs.task-definition-version": "20",
        "maintainer": "NGINX Docker Maintainers <docker-maint@nginx.com>"
      },
      "StopSignal": "SIGQUIT"
    },
    "NetworkSettings": {
      "Bridge": "", "SandboxID": "", "HairpinMode": false,
      "LinkLocalIPv6Address": "", "LinkLocalIPv6PrefixLen": 0,
      "Ports": {},
      "SandboxKey": "", "SecondaryIPAddresses": null, "SecondaryIPv6Addresses": null,
      "EndpointID": "", "Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0,
      "IPAddress": "", "IPPrefixLen": 0, "IPv6Gateway": "", "MacAddress": "",
      "Networks": {}
    }
  }
]
```
1
answers
1
votes
8
views
GenericAWSUser
asked 13 days ago

Trying to isolate IAM user to have AmazonEC2ReadOnlyAccess to only select instances using python boto3

Ok so the policy `arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess` looks like this:

```
{
    "Version": "2012-10-17",
    "Statement": [
        { "Effect": "Allow", "Action": "ec2:Describe*", "Resource": "*" },
        { "Effect": "Allow", "Action": "elasticloadbalancing:Describe*", "Resource": "*" },
        { "Effect": "Allow", "Action": [ "cloudwatch:ListMetrics", "cloudwatch:GetMetricStatistics", "cloudwatch:Describe*" ], "Resource": "*" },
        { "Effect": "Allow", "Action": "autoscaling:Describe*", "Resource": "*" }
    ]
}
```

This works to allow the IAM user to perform most EC2 read functions. The problem is that this is too permissive. What I need to do is allow all the same functionality, but ONLY for certain instances. So what I attempted to do is scope this down given a list of instance IDs `instanceids` (using Python boto3):

```python
ResourceIds = [f"arn:aws:ec2:{REGION_NAME}:{AWS_ACCOUNTID}:instance/{iid}" for iid in instanceids]

Ec2ReadOnlyPolicy = {
    "Version": "2012-10-17",
    "Statement": [
        { "Effect": "Allow", "Action": "ec2:Describe*", "Resource": ResourceIds },
        { "Effect": "Allow", "Action": "elasticloadbalancing:Describe*", "Resource": ResourceIds },
        { "Effect": "Allow", "Action": [ "cloudwatch:ListMetrics", "cloudwatch:GetMetricStatistics", "cloudwatch:Describe*" ], "Resource": ResourceIds },
        { "Effect": "Allow", "Action": "autoscaling:Describe*", "Resource": ResourceIds }
    ]
}

response = iam_client.put_group_policy(
    PolicyDocument=json.dumps(Ec2ReadOnlyPolicy),
    PolicyName=EC2_RO_POLICY_NAME,
    GroupName=UserGroupName,
)
```

The problem is that this doesn't seem to allow the user to list the instances they have access to:

```
$ aws ec2 describe-instances

An error occurred (UnauthorizedOperation) when calling the DescribeInstances operation: You are not authorized to perform this operation.
```

What am I doing wrong?
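The underlying constraint: the `ec2:Describe*` APIs do not support resource-level permissions, so any Describe statement whose `Resource` is narrower than `"*"` simply never matches, which produces exactly this UnauthorizedOperation (the same applies to the ELB, CloudWatch, and Auto Scaling Describe calls, where EC2 instance ARNs are not valid resources anyway). Describe has to stay on `"*"`; instance ARNs (or tag conditions) can scope mutating actions instead. A hedged sketch, with the Start/Stop actions as an illustrative stand-in:

```python
# Keep Describe* on "*" (required), and use the instance ARNs only for
# actions that actually support resource-level permissions.
import json
import boto3

REGION_NAME = "us-east-1"             # placeholders, as in the question
AWS_ACCOUNTID = "111122223333"
instanceids = ["i-0123456789abcdef0"]

resource_ids = [f"arn:aws:ec2:{REGION_NAME}:{AWS_ACCOUNTID}:instance/{iid}"
                for iid in instanceids]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # read-only visibility cannot be narrowed per instance
            "Effect": "Allow",
            "Action": "ec2:Describe*",
            "Resource": "*"
        },
        {   # actions with resource-level permission support can be scoped
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": resource_ids
        }
    ]
}

boto3.client("iam").put_group_policy(
    GroupName="my-user-group",          # placeholder
    PolicyName="Ec2ScopedPolicy",
    PolicyDocument=json.dumps(policy),
)
```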
2
answers
0
votes
14
views
chrisjd20
asked 13 days ago

Launched EC2 instance UNREACHABLE for Ubuntu 20.04 AMI with python 3.9 upgrade

I am using an **EC2 Ubuntu 20.04 VM**. Due to **[CVE-2021-3177][1]**, Python needs to be upgraded to the latest version of Python 3.9, which would be 3.9.5 currently. I did that using the `apt install` option as per the steps mentioned below:

    sudo apt update
    sudo apt upgrade -y
    sudo apt install python3.9

The above ensures that Python 3.9.5 is now available, but now both python3.8 and python3.9 are available. So next we use the update-alternatives command to make python3.9 the default version:

    sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1
    sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 2

Now that the alternatives are defined, we switch to option 2 as the default, i.e. Python 3.9:

    sudo update-alternatives --config python3

Once done, the following command points to the latest version:

    sudo python3 -V

However, if you use the `sudo apt update` command, you will see an error stating that

    Traceback (most recent call last):
      File "/usr/lib/cnf-update-db", line 8, in <module>
        from CommandNotFound.db.creator import DbCreator
      File "/usr/lib/python3/dist-packages/CommandNotFound/db/creator.py", line 11, in <module>
        import apt_pkg
    ModuleNotFoundError: No module named 'apt_pkg'
    Reading package lists... Done
    E: Problem executing scripts APT::Update::Post-Invoke-Success 'if /usr/bin/test -w /var/lib/command-not-found/ -a -e /usr/lib/cnf-update-db; then /usr/lib/cnf-update-db > /dev/null; fi'
    E: Sub-process returned an error code

To fix this, we have to add a link using the following commands:

    cd /usr/lib/python3/dist-packages/
    sudo ln -s apt-pkg.cpython-{38m,39m}-x86_64-linux-gnu.so

Next, I tried the following commands:

    apt purge python3-apt
    apt install python3-apt
    sudo apt install python3.9-distutils python3.9-dev

Once done, the following command no longer results in any errors:

    sudo apt update

This means that the issue is fixed. **I can use this machine, and use it after a reboot too.**

**But for some reason, if I create an AMI and launch an instance from it, that instance is unreachable.**

Appreciate your help.

[1]: https://nvd.nist.gov/vuln/detail/CVE-2021-3177
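A hedged sketch of one plausible failure mode: on Ubuntu 20.04, first-boot provisioning (cloud-init, which configures networking and injects SSH keys) runs under the distro's `/usr/bin/python3` and pulls in C extensions built only for 3.8, just like the `apt_pkg` failure above. Pointing the python3 alternative at 3.9 can therefore break first boot of an instance launched from the AMI even though the already-provisioned machine keeps working. This simulates the imports first boot would attempt:

```python
# Run the distro interpreter (whatever the alternative now points at) and try
# the imports that break when the C extension ABI doesn't match.
import subprocess

for mod in ("apt_pkg", "cloudinit"):
    r = subprocess.run(["/usr/bin/python3", "-c", f"import {mod}"],
                       capture_output=True, text=True)
    print(mod, "OK" if r.returncode == 0 else f"FAILS: {r.stderr.strip()}")
```

If this theory holds, installing 3.9 alongside 3.8 without repointing the `python3` alternative (and invoking it explicitly as `python3.9`) should yield an AMI that boots normally.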
0
answers
0
votes
1
views
awswiki
asked 13 days ago

AWS EC2 F1 ERROR: [v++ 60-773] caught Tcl error: ERROR: '2201011829' is an invalid argument.

Hi, I have been trying to build a hardware system image on an AWS f1.2xlarge instance. I am successfully able to run sw_emu. However, when I try to create the hardware image, I keep getting the following error. This is strange because the same code was able to be synthesized 2 days ago.

```
ERROR: [v++ 60-773] In '/home/centos/<cwd>/_x/runOnfpga/runOnfpga/vitis_hls.log', caught Tcl error: ERROR: '2201012237' is an invalid argument. Please specify an integer value.
```

Note: the error number is observed to change each time I run the hardware synthesis. And in vitis_hls.log, the following info is displayed:

```
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'runOnfpga_sitodp_32ns_64_4_no_dsp_1_ip'...
ERROR: '2201012237' is an invalid argument. Please specify an integer value.
    while executing
"rdi::set_property core_revision 2201012237 {component component_1}"
    invoked from within
"set_property core_revision $Revision $core"
    (file "run_ippack.tcl" line 1515)
INFO: [Common 17-206] Exiting Vivado at Sat Jan 1 22:37:40 2022...
ERROR: [IMPL 213-28] Failed to generate IP.
INFO: [HLS 200-111] Finished Command export_design CPU user time: 55.92 seconds. CPU system time: 2.62 seconds. Elapsed time: 55.3 seconds; current allocated memory: 1.956 GB.
command 'ap_source' returned error code
    while executing
"source runOnfpga.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel \#0 [list source $arg] "
```

Update: it seems like even the Xilinx examples give the same problem when run on an F1 instance. I tried to clone the [Vitis Accel Examples](https://github.com/Xilinx/Vitis_Accel_Examples.git) and it gave the same problem as above.

```
INFO: [v++ 200-789] **** Estimated Fmax: 339.33 MHz
ERROR: [v++ 213-28] Failed to generate IP.
ERROR: [v++ 60-300] Failed to build kernel(ip) vadd, see log for details: /home/centos/Vitis_Accel_Examples/sys_opt/multiple_devices/_x.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/vadd/vadd/vitis_hls.log
ERROR: [v++ 60-773] In '/home/centos/Vitis_Accel_Examples/sys_opt/multiple_devices/_x.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/vadd/vadd/vitis_hls.log', caught Tcl error: ERROR: '2201020036' is an invalid argument. Please specify an integer value.
ERROR: [v++ 60-773] In '/home/centos/Vitis_Accel_Examples/sys_opt/multiple_devices/_x.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/vadd/vadd/vitis_hls.log', caught Tcl error: ERROR: [IMPL 213-28] Failed to generate IP.
ERROR: [v++ 60-599] Kernel compilation failed to complete
ERROR: [v++ 60-592] Failed to finish compilation
INFO: [v++ 60-1653] Closing dispatch client.
make: *** [_x.hw.xilinx_aws-vu9p-f1_shell-v04261818_201920_2/vadd.xo] Error 1
```

Please suggest ways to debug this problem.
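A short worked check of the likely cause: the failing value is a build timestamp encoded as YYMMDDHHMM (e.g. 2201012237 = 2022-01-01 22:37, matching the Vivado exit time in the log), and any such value from 2022 onward no longer fits in a signed 32-bit integer, so `set_property core_revision` rejects it. That also explains why the number changes on every run and why the same code synthesized fine 2 days earlier, in 2021. This is the widely reported Xilinx "Y2K22" overflow, for which Xilinx published a patch; setting the system clock back into 2021 is the usual stopgap.

```python
# core_revision timestamps vs. the signed 32-bit integer limit.
INT32_MAX = 2**31 - 1            # 2147483647

for stamp in (2112312359,        # 2021-12-31 23:59 -> still fits
              2201012237,        # value from the first error, 2022-01-01 22:37
              2201020036):       # value from the Vitis Accel Examples run
    print(stamp, "fits" if stamp <= INT32_MAX else "overflows int32")
```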
3
answers
1
votes
2
views
riddlesingh
asked 15 days ago

r6i instances cause ENA issues

In the past weeks we have switched a number of instances over to the new r6i instance types. We have used r6i.xlarge, r6i.2xlarge, and r6i.4xlarge instances. These instance types seem to be prone to hangs in the ena driver. Network load on the instances ranges from low to high, so the actual amount of network traffic seems to be unrelated to the issue. The instances don't seem to recover from this on their own. All these instances have similar messages in the logs:

```
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 0, index 639. 5404000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 0, index 668. 5412000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 1, index 340. 5424000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 779. 5436000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 780. 5444000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 782. 5456000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 783. 5468000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Keep alive watchdog timeout.
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Trigger reset is on
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: tx_timeout: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: suspend: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: resume: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: wd_expired: 1
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: interface_up: 1
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: interface_down: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: admin_q_pause: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: queue_0_tx_cnt: 56154872
....
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_aborted_cmd: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_submitted_cmd: 53
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_completed_cmd: 53
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_out_of_space: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_no_completion: 0
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[10] offset[88] actual: req id[57015] offset[88]
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[11] offset[8] actual: req id[57016] offset[88]
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reg read32 timeout occurred
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[1] offset[88] actual: req id[57006] offset[0]
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[2] offset[8] actual: req id[57007] offset[0]
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reg read32 timeout occurred
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0: Can not reset device
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0: Can not initialize device
Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0: Reset attempt failed. Can not reset the device
```
0
answers
0
votes
4
views
LeonB
asked 20 days ago