Questions tagged with Compute

EC2 Instance Status Check fails when created by CloudFormation template

I created a CloudFormation stack in the **us-east-1** and **ap-south-1** regions using the template below:

    AWSTemplateFormatVersion: "2010-09-09"
    Description: Template for node-aws-ec2-github-actions tutorial
    Resources:
      InstanceSecurityGroup:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupDescription: Sample Security Group
          SecurityGroupIngress:
            - IpProtocol: tcp
              FromPort: 80
              ToPort: 80
              CidrIp: 0.0.0.0/0
            - IpProtocol: tcp
              FromPort: 443
              ToPort: 443
              CidrIp: 0.0.0.0/0
            - IpProtocol: tcp
              FromPort: 22
              ToPort: 22
              CidrIp: 0.0.0.0/0
      EC2Instance:
        Type: "AWS::EC2::Instance"
        Properties:
          ImageId: "ami-0d2986f2e8c0f7d01" #Another comment -- This is a Linux AMI
          InstanceType: t2.micro
          KeyName: node-ec2-github-actions-key
          SecurityGroups:
            - Ref: InstanceSecurityGroup
          BlockDeviceMappings:
            - DeviceName: /dev/sda1
              Ebs:
                VolumeSize: 8
                DeleteOnTermination: true
          Tags:
            - Key: Name
              Value: Node-Ec2-Github-Actions
      EIP:
        Type: AWS::EC2::EIP
        Properties:
          InstanceId: !Ref EC2Instance
    Outputs:
      InstanceId:
        Description: InstanceId of the newly created EC2 instance
        Value:
          Ref: EC2Instance
      PublicIP:
        Description: Elastic IP
        Value:
          Ref: EIP

The stack executes successfully and all the resources are created. Unfortunately, once the EC2 status checks are initialized, the instance status check fails and I cannot reach the instance over SSH. Creating an instance manually as the same IAM user works perfectly. These are the policies I have attached to the IAM user.

Managed policies:

* AmazonEC2FullAccess
* AWSCloudFormationFullAccess

Inline policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": [
            "iam:CreateInstanceProfile",
            "iam:DeleteInstanceProfile",
            "iam:GetRole",
            "iam:GetInstanceProfile",
            "iam:DeleteRolePolicy",
            "iam:RemoveRoleFromInstanceProfile",
            "iam:CreateRole",
            "iam:DeleteRole",
            "iam:UpdateRole",
            "iam:PutRolePolicy",
            "iam:AddRoleToInstanceProfile"
          ],
          "Resource": "*"
        },
        {
          "Sid": "VisualEditor1",
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:ListAllMyBuckets",
            "s3:CreateBucket",
            "s3:DeleteObject",
            "s3:DeleteBucket"
          ],
          "Resource": "*"
        }
      ]
    }

Thanks in advance for helping out. Have a good day
1
answers
0
votes
3
views
Code Panthers
asked a month ago

Lightsail Instance Fails Status checks after random CPU max out

I am having an issue with my Lightsail instance becoming inaccessible. Around 0300 UTC the CPU spikes to 100% usage (for less than 5 minutes), and then I begin to get instance status check failures. I cannot log in to the WordPress admin dashboard, nor can I access SSH via PuTTY or the web console. I have to manually go into Lightsail, stop the instance, and start it again. I intend this to become a production server for the website of an event I run. It is a very simple WordPress site that should run with no issues on a 1 GB, 1 vCPU instance. It is currently in development, so there is literally zero traffic to the site and no reason for it to be using any real resources. Looking through the syslog, I can see the following around the time of the CPU spike:

```
Mar 20 02:08:03 ip-172-26-4-68 systemd-networkd[445]: message repeated 10 times: [ eth0: DHCPv6 address 2600:1f16:237:5d00:7ba1:b960:d9e5:f2e9/128 timeout preferred 140 valid 450]
Mar 20 02:09:01 ip-172-26-4-68 CRON[18990]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Mar 20 02:09:02 ip-172-26-4-68 systemd[1]: Starting Clean php session files...
Mar 20 02:09:05 ip-172-26-4-68 systemd[1]: phpsessionclean.service: Succeeded.
Mar 20 02:09:05 ip-172-26-4-68 systemd[1]: Finished Clean php session files.
Mar 20 02:09:08 ip-172-26-4-68 systemd-networkd[445]: eth0: DHCPv6 address 2600:1f16:237:5d00:7ba1:b960:d9e5:f2e9/128 timeout preferred 140 valid 450
```

So it seems some kind of cron job is taking down the system and it does not recover. Can anyone help me figure out how to stop this? Thanks
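
To check which processes were actually logging during the spike window, something like the following might help. It is only a minimal sketch in Python: the function name `processes_in_window` and the regular expression are assumptions made for illustration, and it assumes the classic syslog line format shown in the excerpt above, with a window that does not cross midnight.

```python
import re
from collections import Counter
from datetime import time

# Matches lines such as:
#   Mar 20 02:09:01 ip-172-26-4-68 CRON[18990]: (root) CMD (...)
# Captures the HH:MM:SS timestamp, the process name, and the message.
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+([^\s\[:]+)(?:\[\d+\])?:\s(.*)$"
)


def processes_in_window(lines, start, end):
    """Count syslog entries per process with a timestamp in [start, end]."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        stamp, process, _message = match.groups()
        if start <= time.fromisoformat(stamp) <= end:
            counts[process] += 1
    return counts


# Sample lines copied from the syslog excerpt in the question.
sample = [
    "Mar 20 02:09:01 ip-172-26-4-68 CRON[18990]: (root) CMD ( [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)",
    "Mar 20 02:09:02 ip-172-26-4-68 systemd[1]: Starting Clean php session files...",
    "Mar 20 02:09:05 ip-172-26-4-68 systemd[1]: phpsessionclean.service: Succeeded.",
    "Mar 20 02:09:05 ip-172-26-4-68 systemd[1]: Finished Clean php session files.",
    "Mar 20 02:09:08 ip-172-26-4-68 systemd-networkd[445]: eth0: DHCPv6 address 2600:1f16:237:5d00:7ba1:b960:d9e5:f2e9/128 timeout preferred 140 valid 450",
]

print(processes_in_window(sample, time(2, 9, 0), time(2, 9, 5)))
```

Note that the excerpt covers 02:08 to 02:09, while the spike is reported around 0300 UTC, so the window passed in should be the one that matches the CPU graph.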
1
answers
0
votes
1
views
ProfessorNerdly
asked 2 months ago

[FAILED] Failed to start Initial cloud-init job (pre-networking)

Hi! I have an issue with an instance migrated from Azure to AWS. I installed the AWS kernel on the instance (Ubuntu 18.04) and enabled ENA support. But when the instance starts, the status check shows 1/2. I can see the following error in the system log in the console, and when I log in to the instance via the EC2 Serial Console and run the following command, this is the error displayed:

Command: *systemctl status cloud-init-local.service*

Error displayed:

    ● cloud-init-local.service - Initial cloud-init job (pre-networking)
       Loaded: loaded (/lib/systemd/system/cloud-init-local.service; enabled; vendor
      Drop-In: /lib/systemd/system/cloud-init-local.service.d
               └─50-azure-clear-persistent-obj-pkl.conf
       Active: failed (Result: exit-code) since Fri 2022-02-25 01:28:39 UTC; 10min a
      Process: 875 ExecStart=/usr/bin/cloud-init init --local (code=exited, status=1
      Process: 852 ExecStartPre=/bin/sh -xc if [ -e /var/lib/cloud/instance/obj.pkl
     Main PID: 875 (code=exited, status=1/FAILURE)

    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: Command: ['netplan', 'generate
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: Exit code: 1
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: Reason: -
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: Stdout:
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: Stderr: /etc/netplan/51-netcfg
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: ethernets:
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: ^
    Feb 25 01:28:41 UHP-ECOMM-US-DB2 cloud-init[875]: ------------------------------
    Feb 25 01:28:39 UHP-ECOMM-US-DB2 systemd[1]: cloud-init-local.service: Failed wi
    Feb 25 01:28:39 UHP-ECOMM-US-DB2 systemd[1]: Failed to start Initial cloud-init

Can you help me?
0
answers
0
votes
4
views
AWS-User-4194164
asked 3 months ago
1
answers
0
votes
4
views
oliveirafilipe
asked 4 months ago

[EC2 FPGA] XDMA transfers fail on f1.16xlarge

Hello, I've been using EC2 FPGAs for about 15 months now. I've always used f1.2xlarge instances with Ubuntu installed, and they worked as expected. Now, due to the amount of CPU-intensive work I need to do, I've decided to try a more robust f1.16xlarge. However, I ran into problems here. I've done all the steps: loaded the AGFI, checked via `lspci` that it is available, and then tried some simple XDMA read/write tests, just to make sure the connection is still there. Sadly, I get no communication with the PCIe FPGA board. Below is the dmesg output, which reports that the "magic" error in the descriptor happened. Again, I'm using the same driver, the same AGFI, and the same Python wrappers around the C invocation of the kernel. ``` [ 1683.589120] xdma:engine_service_final_transfer: engine 0-H2C0-MM, status error 0x80010. [ 1683.589123] xdma:engine_status_dump: SG engine 0-H2C0-MM status: 0x00080010: MAGIC_STOPPED,DESC_ERR:UNSUPP_REQ [ 1683.589126] 0-H2C0-MM, s 0x80010, aborted xfer 0x00000000e19d64e9, cmpl 0/1 [ 1683.589136] xdma:xdma_xfer_submit: xfer 0x00000000e19d64e9,1024, failed, ep 0x0. ``` EDIT: I've figured out the problem. After looking more closely at the `dmesg` output, I found that the AGFI was loaded on a different FPGA slot. I loaded the AGFI as I always do: `sudo fpga-load-local-image -S 0 -I $MY_AGFI_ID -H`. I don't see how it could end up on Slot 8. When I adjust my test to run on Slot 8, everything works as expected! To be honest, `dmesg` shows this pretty clearly: ``` [ 183.021774] xdma:remove_one: pdev 0x00000000999de0ae, xdev 0x0000000076ff6236, 0x00000000ca7f10f1. [ 183.021777] xdma:xpdev_free: xpdev 0x0000000076ff6236, destroy_interfaces, xdev 0x00000000ca7f10f1. [ 183.024133] xdma:xpdev_free: xpdev 0x0000000076ff6236, xdev 0x00000000ca7f10f1 xdma_device_close. 
[ 186.066065] pci 0000:00:0f.0: [1d0f:f000] type 00 class 0x058000 [ 186.066817] pci 0000:00:0f.0: reg 0x10: [mem 0x86000000-0x87ffffff] [ 186.067206] pci 0000:00:0f.0: reg 0x14: [mem 0x85200000-0x853fffff] [ 186.067855] pci 0000:00:0f.0: reg 0x18: [mem 0x5e000410000-0x5e00041ffff 64bit pref] [ 186.068493] pci 0000:00:0f.0: reg 0x20: [mem 0x5c000000000-0x5dfffffffff 64bit pref] [ 186.083784] pci 0000:00:0f.0: BAR 4: assigned [mem 0x5c000000000-0x5dfffffffff 64bit pref] [ 186.084214] pci 0000:00:0f.0: BAR 0: assigned [mem 0x86000000-0x87ffffff] [ 186.084317] pci 0000:00:0f.0: BAR 1: assigned [mem 0x85200000-0x853fffff] [ 186.084421] pci 0000:00:0f.0: BAR 2: assigned [mem 0x5e000410000-0x5e00041ffff 64bit pref] [ 186.084996] xdma:xdma_device_open: xdma device 0000:00:0f.0, 0x000000006c4610d7. [ 186.086074] xdma:map_single_bar: BAR0 at 0x86000000 mapped at 0x00000000f2cc5fa3, length=33554432(/33554432) [ 186.086088] xdma:map_single_bar: BAR1 at 0x85200000 mapped at 0x00000000caa22a31, length=2097152(/2097152) [ 186.086106] xdma:map_single_bar: BAR2 at 0x5e000410000 mapped at 0x000000003049f8eb, length=65536(/65536) [ 186.086109] xdma:map_bars: config bar 2, pos 2. [ 186.086110] xdma:map_single_bar: Limit BAR 4 mapping from 137438953472 to 2147483647 bytes [ 186.086115] xdma:map_single_bar: BAR4 at 0x5c000000000 mapped at 0x00000000bfd17001, length=2147483647(/137438953472) [ 186.086116] xdma:identify_bars: 4 BARs: config 2, user 0, bypass 4. [ 186.095983] xdma:pci_keep_intx_enabled: 0000:00:0f.0: clear INTX_DISABLE, 0x406 -> 0x6. [ 186.096158] xdma:irq_msix_channel_setup: engine 8-H2C0-MM, irq#572. [ 186.096193] xdma:irq_msix_channel_setup: engine 8-H2C1-MM, irq#573. [ 186.096225] xdma:irq_msix_channel_setup: engine 8-H2C2-MM, irq#574. [ 186.096270] xdma:irq_msix_channel_setup: engine 8-H2C3-MM, irq#575. [ 186.096301] xdma:irq_msix_channel_setup: engine 8-C2H0-MM, irq#576. [ 186.096334] xdma:irq_msix_channel_setup: engine 8-C2H1-MM, irq#577. 
[ 186.096366] xdma:irq_msix_channel_setup: engine 8-C2H2-MM, irq#578. [ 186.096397] xdma:irq_msix_channel_setup: engine 8-C2H3-MM, irq#579. [ 186.096431] xdma:irq_msix_user_setup: 8-USR-0, IRQ#580 with 0x000000000932b671 [ 186.096463] xdma:irq_msix_user_setup: 8-USR-1, IRQ#581 with 0x000000005edcc121 [ 186.096511] xdma:irq_msix_user_setup: 8-USR-2, IRQ#582 with 0x00000000249674d9 [ 186.096560] xdma:irq_msix_user_setup: 8-USR-3, IRQ#583 with 0x00000000d26d07c5 [ 186.096594] xdma:irq_msix_user_setup: 8-USR-4, IRQ#584 with 0x00000000c940ac79 [ 186.096627] xdma:irq_msix_user_setup: 8-USR-5, IRQ#585 with 0x000000001fccab2f [ 186.096666] xdma:irq_msix_user_setup: 8-USR-6, IRQ#586 with 0x0000000009c457eb [ 186.096699] xdma:irq_msix_user_setup: 8-USR-7, IRQ#587 with 0x000000002bedefd1 [ 186.096732] xdma:irq_msix_user_setup: 8-USR-8, IRQ#588 with 0x000000004ca712de [ 186.096765] xdma:irq_msix_user_setup: 8-USR-9, IRQ#589 with 0x00000000e191ad7b [ 186.096799] xdma:irq_msix_user_setup: 8-USR-10, IRQ#590 with 0x00000000026a9f8b [ 186.096833] xdma:irq_msix_user_setup: 8-USR-11, IRQ#591 with 0x00000000a7138ee8 [ 186.096868] xdma:irq_msix_user_setup: 8-USR-12, IRQ#592 with 0x00000000b0c4b138 [ 186.096902] xdma:irq_msix_user_setup: 8-USR-13, IRQ#593 with 0x000000007f7aa664 [ 186.096934] xdma:irq_msix_user_setup: 8-USR-14, IRQ#594 with 0x0000000070f6c0f6 [ 186.096970] xdma:irq_msix_user_setup: 8-USR-15, IRQ#595 with 0x000000009aed6be9 [ 186.096978] xdma:probe_one: 0000:00:0f.0 xdma8, pdev 0x000000006c4610d7, xdev 0x0000000044888d47, 0x0000000058d876e9, usr 16, ch 4,4. ``` Is this a bug of some kind? How to protect myself? Thank you in advance.
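
One way to guard against this kind of surprise is to read the device index out of the driver's own `probe_one` line before running any transfers. The following is a minimal Python sketch, not part of the XDMA driver or the AWS FPGA tooling: the helper names `xdma_indices` and `check_expected_index` are hypothetical, and it assumes the `probe_one` line keeps the format shown in the `dmesg` output above.

```python
import re

# Matches the driver's probe line, for example:
#   xdma:probe_one: 0000:00:0f.0 xdma8, pdev 0x..., xdev 0x..., usr 16, ch 4,4.
PROBE_RE = re.compile(r"xdma:probe_one: (\S+) xdma(\d+),")


def xdma_indices(dmesg_text):
    """Return {PCI address: xdma device index} for every probe_one line."""
    return {addr: int(index) for addr, index in PROBE_RE.findall(dmesg_text)}


def check_expected_index(dmesg_text, expected):
    """Raise if no probed device carries the index the test is about to use."""
    found = xdma_indices(dmesg_text)
    if expected not in found.values():
        raise RuntimeError(f"expected xdma{expected}, driver reported {found}")
    return found


# Sample line copied from the dmesg output in the question.
sample = (
    "[ 186.096978] xdma:probe_one: 0000:00:0f.0 xdma8, pdev 0x000000006c4610d7, "
    "xdev 0x0000000044888d47, 0x0000000058d876e9, usr 16, ch 4,4."
)

print(xdma_indices(sample))
```

In a real test harness the text would come from `dmesg` itself (which may need root), and the check would run right after loading the AGFI so that a mismatch fails early instead of surfacing as a descriptor error.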
1
answers
0
votes
2
views
jelicicm
asked 4 months ago

g4ad.16xlarge Ubuntu 18.04 GPU driver issue

I've installed the gpu driver, mate desktop and nice-dcv following the official guide: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-amd-driver.html Everything works fine for g4ad.xlarge ... g4ad.8xlarge. But when I change instance type to g4ad.16xlarge, Xorg won't start. My first guess was that it's related to multiple GPUs, but 8xlarge already has 2 of them and it works just fine. Here's the Xorg log: ``` [ 20.536] (==) Log file: "/var/log/Xorg.0.log", Time: Sat Jan 22 05:35:22 2022 [ 20.536] (==) Using config file: "/etc/X11/xorg.conf" [ 20.536] (==) Using system config directory "/usr/share/X11/xorg.conf.d" [ 20.537] (==) ServerLayout "Layout0" [ 20.537] (**) |-->Screen "Screen0" (0) [ 20.537] (**) | |-->Monitor "Monitor0" [ 20.537] (**) | |-->Device "Device0" [ 20.537] (**) |-->Input Device "Keyboard0" [ 20.537] (**) |-->Input Device "Mouse0" [ 20.537] (==) Automatically adding devices [ 20.537] (==) Automatically enabling devices [ 20.537] (==) Automatically adding GPU devices [ 20.537] (==) Automatically binding GPU devices [ 20.537] (==) Max clients allowed: 256, resource mask: 0x1fffff [ 20.537] (WW) The directory "/usr/share/fonts/X11/cyrillic" does not exist. [ 20.537] Entry deleted from font path. [ 20.537] (WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist. [ 20.537] Entry deleted from font path. [ 20.537] (WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist. [ 20.537] Entry deleted from font path. [ 20.537] (WW) The directory "/usr/share/fonts/X11/100dpi" does not exist. [ 20.537] Entry deleted from font path. [ 20.537] (WW) The directory "/usr/share/fonts/X11/75dpi" does not exist. [ 20.537] Entry deleted from font path. 
[ 20.537] (==) FontPath set to: /usr/share/fonts/X11/misc, /usr/share/fonts/X11/Type1, built-ins [ 20.537] (**) ModulePath set to "/opt/amdgpu/lib64/xorg/modules/drivers,/opt/amdgpu/lib/xorg/modules,/opt/amdgpu-pro/lib/xorg/modules/extensions,/opt/amdgpu-pro/lib64/xorg/modules/extensions,/usr/lib64/xorg/modules,/usr/lib/xorg/modules" [ 20.537] (**) Extension "DPMS" is disabled [ 20.537] (WW) Hotplugging is on, devices using drivers 'kbd', 'mouse' or 'vmmouse' will be disabled. [ 20.537] (WW) Disabling Keyboard0 [ 20.537] (WW) Disabling Mouse0 [ 20.537] (II) Loader magic: 0x55a22090c020 [ 20.537] (II) Module ABI versions: [ 20.537] X.Org ANSI C Emulation: 0.4 [ 20.537] X.Org Video Driver: 23.0 [ 20.537] X.Org XInput driver : 24.1 [ 20.537] X.Org Server Extension : 10.0 [ 20.537] (++) using VT number 7 [ 20.537] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration [ 20.538] (II) xfree86: Adding drm device (/dev/dri/card0) [ 20.538] (II) xfree86: Adding drm device (/dev/dri/card1) [ 20.539] (II) xfree86: Adding drm device (/dev/dri/card2) [ 20.540] (II) xfree86: Adding drm device (/dev/dri/card3) [ 20.549] (--) PCI:*(0:0:3:0) 1d0f:1111:0000:0000 rev 0, Mem @ 0xfe000000/4194304, BIOS @ 0x????????/131072 [ 20.549] (--) PCI: (0:0:26:0) 1002:7362:1002:0a34 rev 195, Mem @ 0x4040000000/268435456, 0x4080000000/2097152, 0xfe980000/524288 [ 20.549] (--) PCI: (0:0:27:0) 1002:7362:1002:0a34 rev 195, Mem @ 0x4050000000/268435456, 0x4080200000/2097152, 0xfea00000/524288 [ 20.549] (--) PCI: (0:0:28:0) 1002:7362:1002:0a34 rev 195, Mem @ 0x4060000000/268435456, 0x4080400000/2097152, 0xfea80000/524288 [ 20.549] (--) PCI: (0:0:29:0) 1002:7362:1002:0a34 rev 195, Mem @ 0x4070000000/268435456, 0x4080600000/2097152, 0xfeb00000/524288 [ 20.549] (II) LoadModule: "glx" [ 20.549] (II) Loading /opt/amdgpu-pro/lib/xorg/modules/extensions/libglx.so [ 20.549] (II) Module glx: vendor="X.Org Foundation" [ 20.549] compiled for 1.19.0, 
module version = 1.0.0 [ 20.549] ABI class: X.Org Server Extension, version 10.0 [ 20.549] (II) LoadModule: "amdgpu" [ 20.549] (II) Loading /opt/amdgpu/lib/xorg/modules/drivers/amdgpu_drv.so [ 20.550] (II) Module amdgpu: vendor="X.Org Foundation" [ 20.550] compiled for 1.19.6, module version = 19.1.0 [ 20.550] Module class: X.Org Video Driver [ 20.550] ABI class: X.Org Video Driver, version 23.0 [ 20.550] (II) AMDGPU: Driver for AMD Radeon: All GPUs supported by the amdgpu kernel driver [ 20.550] (II) AMDGPU(G0): [KMS] Kernel modesetting enabled. [ 20.551] (II) AMDGPU(G1): [KMS] Kernel modesetting enabled. [ 20.552] (II) AMDGPU(G2): [KMS] Kernel modesetting enabled. [ 20.554] (II) AMDGPU(G3): [KMS] Kernel modesetting enabled. [ 20.556] (EE) No devices detected. [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card0 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card0 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card1 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card1 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card2 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card2 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card3 [ 20.556] loading driver: amdgpu [ 20.556] (II) Applying OutputClass "AMDgpu" to /dev/dri/card3 [ 20.556] loading driver: amdgpu [ 20.556] (==) Matched amdgpu as autoconfigured driver 0 [ 20.556] (==) Matched amdgpu as autoconfigured driver 1 [ 20.556] (==) Matched ati as autoconfigured driver 2 [ 20.556] (==) Matched amdgpu as autoconfigured driver 3 [ 20.556] (==) Matched amdgpu as autoconfigured driver 4 [ 20.556] (==) Matched ati as autoconfigured driver 5 [ 20.556] (==) Matched amdgpu as autoconfigured driver 6 [ 20.556] (==) Matched amdgpu as autoconfigured driver 7 [ 20.556] 
(==) Matched ati as autoconfigured driver 8 [ 20.556] (==) Matched amdgpu as autoconfigured driver 9 [ 20.556] (==) Matched amdgpu as autoconfigured driver 10 [ 20.556] (==) Matched ati as autoconfigured driver 11 [ 20.556] (==) Matched modesetting as autoconfigured driver 12 [ 20.556] (==) Matched fbdev as autoconfigured driver 13 [ 20.556] (==) Matched vesa as autoconfigured driver 14 [ 20.556] (==) Assigned the driver to the xf86ConfigLayout [ 20.556] (II) LoadModule: "amdgpu" [ 20.556] (II) Loading /opt/amdgpu/lib/xorg/modules/drivers/amdgpu_drv.so [ 20.556] (II) Module amdgpu: vendor="X.Org Foundation" [ 20.556] compiled for 1.19.6, module version = 19.1.0 [ 20.556] Module class: X.Org Video Driver [ 20.556] ABI class: X.Org Video Driver, version 23.0 [ 20.556] (II) LoadModule: "ati" [ 20.556] (II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so [ 20.556] (II) Module ati: vendor="X.Org Foundation" [ 20.556] compiled for 1.19.6, module version = 18.0.1 [ 20.556] Module class: X.Org Video Driver [ 20.556] ABI class: X.Org Video Driver, version 23.0 [ 20.556] (II) LoadModule: "modesetting" [ 20.556] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so [ 20.556] (II) Module modesetting: vendor="X.Org Foundation" [ 20.556] compiled for 1.19.6, module version = 1.19.6 [ 20.556] Module class: X.Org Video Driver [ 20.556] ABI class: X.Org Video Driver, version 23.0 [ 20.556] (II) LoadModule: "fbdev" [ 20.556] (II) Loading /usr/lib/xorg/modules/drivers/fbdev_drv.so [ 20.556] (II) Module fbdev: vendor="X.Org Foundation" [ 20.556] compiled for 1.19.3, module version = 0.4.4 [ 20.556] Module class: X.Org Video Driver [ 20.556] ABI class: X.Org Video Driver, version 23.0 [ 20.556] (II) LoadModule: "vesa" [ 20.556] (II) Loading /usr/lib/xorg/modules/drivers/vesa_drv.so [ 20.556] (II) Module vesa: vendor="X.Org Foundation" [ 20.556] compiled for 1.19.3, module version = 2.3.4 [ 20.556] Module class: X.Org Video Driver [ 20.556] ABI class: X.Org Video Driver, 
version 23.0 [ 20.556] (II) AMDGPU: Driver for AMD Radeon: All GPUs supported by the amdgpu kernel driver [ 20.556] (II) modesetting: Driver for Modesetting Kernel Drivers: kms [ 20.556] (II) FBDEV: driver for framebuffer: fbdev [ 20.556] (II) VESA: driver for VESA chipsets: vesa [ 20.556] (WW) xf86OpenConsole: setpgid failed: Operation not permitted [ 20.556] (WW) xf86OpenConsole: setsid failed: Operation not permitted [ 20.556] (WW) Falling back to old probe method for modesetting [ 20.556] (WW) Falling back to old probe method for fbdev [ 20.556] (WW) Falling back to old probe method for vesa [ 20.556] (WW) Falling back to old probe method for modesetting [ 20.556] (WW) Falling back to old probe method for fbdev [ 20.556] (WW) Falling back to old probe method for vesa [ 20.556] (EE) No devices detected. [ 20.557] (EE) Fatal server error: [ 20.557] (EE) no screens found(EE) [ 20.557] (EE) Please consult the The X.Org Foundation support at http://wiki.x.org for help. [ 20.557] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information. [ 20.557] (EE) [ 20.579] (EE) Server terminated with error (1). Closing log file. ```
0
answers
0
votes
5
views
imbolc
asked 4 months ago

Instance will no longer connect to Bitnami/website MIA

Hello, my name is Justin and I own a small company in Virginia. Thank you in advance for your help and guidance. I started working on a new domain/website several weeks back after getting the technical end set up first, and surprisingly I've had no issues until now. My setup is a Google domain pointed to AWS (basic plan, $3.99/month), using Lightsail and WordPress. I followed the setup for my instance from an AWS certified specialist on YouTube. That said, I have not set up anything for storage, containers, etc., though I do have my static page set up for my landing page. On Sunday night I finished working on the site in WordPress with no issues other than a Jetpack warning that had been up for 2 days. It stated that my domain name and IP address were both fighting for the same spot and that I needed to select one or it would go into safe mode. Other than that, my main updates, apart from wording changes, were a few plugins I installed. The last one was PayPal, which was around $60 for the year. I went through the process of setting all the pages up and everything worked great. The following morning I had planned on fixing the Jetpack issue, but once I logged on, or attempted to, I realized that the website was down. All pages refused to load, and a few times I got a 504 gateway timeout error message. I attempted to log in through WordPress but had the same issue: nothing loaded. I then attempted to relaunch the instance via AWS; however, the instance wouldn't connect correctly to Bitnami. Basically, the black terminal screen comes up but is empty. I then created a second instance (I can only have 2 max) and attempted to connect to Bitnami again. This time it was a success. When I circled back to verify, my first instance still showed the same result of not completely connecting to Bitnami. I'm new and have a basic account, meaning no technical support. Any help would be GREATLY appreciated!! Sincerely, Justin
0
answers
0
votes
9
views
AWS-User-2354556
asked 4 months ago

How to debug CLI crash when running Lightsail container commands?

I only use a couple of commands in my workflow, but very often both randomly throw a cryptic error in the middle of a deploy on CI:

- `aws lightsail get-container-service-deployments --region us-west-2 --service-name my_service --output json`
- `aws lightsail get-container-images --region us-west-2 --output json --service-name my_service`

My CI stack is Ubuntu Server 18.04, which runs a GitHub Actions runner service.

```
Signal received: -1695824320, errno: 32575
Stack trace:
/usr/local/aws-cli/v2/2.3.0/dist/_awscrt.cpython-38-x86_64-linux-gnu.so(aws_backtrace_print+0x4d) [0x7f3f9483edbd]
/usr/local/aws-cli/v2/2.3.0/dist/_awscrt.cpython-38-x86_64-linux-gnu.so(+0x68513) [0x7f3f947b5513]
/lib/x86_64-linux-gnu/libc.so.6(+0x3f040) [0x7f3f9af46040]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(+0x1f9ed0) [0x7f3f9ab65ed0]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(+0xbb58b) [0x7f3f9aa2758b]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(+0x1fa930) [0x7f3f9ab66930]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(PyGC_Collect+0x81) [0x7f3f9ab67aa1]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(Py_FinalizeEx+0xe2) [0x7f3f9ab3ff02]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(Py_Exit+0x8) [0x7f3f9ab40818]
/usr/local/aws-cli/v2/2.3.0/dist/libpython3.8.so.1.0(+0x1d8c8b) [0x7f3f9ab44c8b]
aws(+0x378b) [0x55705f60f78b]
aws(+0x3b1f) [0x55705f60fb1f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f3f9af28bf7]
aws(+0x24fa) [0x55705f60e4fa]
```

Any ideas what to do about this?

**edit Jan 1, 2022** I turned on the `--debug` flag. **There is no debug output**. The command seems to crash immediately. If I run these commands multiple times, they crash randomly, and sometimes the CLI prints "Segmentation fault", which looks likely to be a memory access/management issue :( Something else worth saying is that this issue only presents itself on Ubuntu distros. I ran out of options to try, so I installed another OS (Arch Linux) and did not experience the same problem. Unfortunately, some of the other software I use is not well supported under Arch Linux, and I have to come back to Ubuntu :(
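
Until the root cause is found, a CI step could work around the intermittent crash by retrying when the CLI is killed by a signal. This is only a sketch of a workaround, not a fix, and it assumes the crashes really are transient, as described above. The function `run_with_retry` and its `runner` parameter are hypothetical names; the only library behavior relied on is that Python's `subprocess.run` reports a negative `returncode` when a child process is terminated by a signal on POSIX systems.

```python
import subprocess


def run_with_retry(cmd, attempts=3, runner=subprocess.run):
    """Run cmd, retrying only when the child was killed by a signal.

    On POSIX, subprocess reports death-by-signal as a negative return code
    (for example -11 for SIGSEGV). Ordinary non-zero exits are returned
    immediately, since retrying a genuine CLI error would only hide it.
    """
    result = None
    for _ in range(attempts):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode >= 0:
            return result
    return result  # still crashing after the last attempt


if __name__ == "__main__":
    # Command copied from the question; my_service is the poster's placeholder.
    completed = run_with_retry([
        "aws", "lightsail", "get-container-images",
        "--region", "us-west-2", "--output", "json",
        "--service-name", "my_service",
    ])
    print(completed.returncode)
```

The `runner` argument exists only so the retry logic can be exercised without calling the real CLI.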
2
answers
0
votes
4
views
AWS-User-Arman
asked 5 months ago

r6i instances cause ena issues

In the past weeks we have switched a number of instances over to the new r6i instance types. We have used r6i.xlarge, r6i.2xlarge and r6i.4xlarge instances. These instance types seem to be prone to hangs in the ENA driver. Network load on the instances ranges from low to high, so the amount of network traffic seems to be unrelated to the issue. The instance doesn't seem to recover from this on its own. All these instances have similar messages in the logs: ```Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 0, index 639. 5404000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 0, index 668. 5412000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 1, index 340. 5424000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 779. 5436000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 780. 5444000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 782. 5456000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 3, index 783. 5468000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Keep alive watchdog timeout. 
Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Trigger reset is on Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: tx_timeout: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: suspend: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: resume: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: wd_expired: 1 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: interface_up: 1 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: interface_down: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: admin_q_pause: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: queue_0_tx_cnt: 56154872 .... Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_aborted_cmd: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_submitted_cmd: 53 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_completed_cmd: 53 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_out_of_space: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: ena_admin_q_no_completion: 0 Dec 27 01:38:22 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[10] offset[88] actual: req id[57015] offset[88] Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[11] offset[8] actual: req id[57016] offset[88] Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reg read32 timeout occurred Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. expected: req id[1] offset[88] actual: req id[57006] offset[0] Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reading reg failed for timeout. 
expected: req id[2] offset[8] actual: req id[57007] offset[0] Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Reg read32 timeout occurred Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0: Can not reset device Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0: Can not initialize device Dec 27 01:38:23 bc-prod-053 kernel: ena 0000:00:05.0: Reset attempt failed. Can not reset the device```
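
To compare instances, it can help to reduce the "Found a Tx that wasn't completed on time" lines to a per-queue summary. The Python sketch below does that under the assumption that the message format matches the log above exactly; `summarize_missed_tx` is a hypothetical helper name, and the "overrun" is simply the reported elapsed time minus the reported timeout.

```python
import re
from collections import defaultdict

# Matches the tail of lines such as:
#   ... qid 0, index 639. 5404000 usecs have passed since last napi
#   execution. Missing Tx timeout value 5000 msecs
TX_RE = re.compile(
    r"qid (\d+), index (\d+)\. (\d+) usecs have passed since last napi execution\. "
    r"Missing Tx timeout value (\d+) msecs"
)


def summarize_missed_tx(lines):
    """Return {qid: (count, worst_overrun_ms)} for missed-Tx log lines."""
    counts = defaultdict(int)
    worst = defaultdict(int)
    for line in lines:
        match = TX_RE.search(line)
        if not match:
            continue  # ignore unrelated kernel messages
        qid, _index, usecs, timeout_ms = map(int, match.groups())
        overrun_ms = usecs // 1000 - timeout_ms  # elapsed ms beyond the timeout
        counts[qid] += 1
        worst[qid] = max(worst[qid], overrun_ms)
    return {qid: (counts[qid], worst[qid]) for qid in counts}


# Sample lines copied from the log in the question.
sample = [
    "Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 0, index 639. 5404000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs",
    "Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 0, index 668. 5412000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs",
    "Dec 27 01:38:20 bc-prod-053 kernel: ena 0000:00:05.0 eth0: Found a Tx that wasn't completed on time, qid 1, index 340. 5424000 usecs have passed since last napi execution. Missing Tx timeout value 5000 msecs",
]

print(summarize_missed_tx(sample))
```

For the excerpt above, every affected queue is roughly 0.4 to 0.5 seconds past the 5000 ms timeout at the same wall-clock second, which is the kind of pattern such a summary makes easy to spot across hosts.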
0
answers
0
votes
5
views
LeonB
asked 5 months ago

SSM Patching Fails for ALL Windows Server 2019 EC2 Instances

I just started using SSM to manage Windows Server 2019 EC2 instance patching (security updates). I noticed that, by default, AWS prevents the Windows OS from running Windows Update automatically. I followed the instructions for SSM Quick Setup, and patching of my servers is failing with the error message below. (I have been searching ALL day for a resolution to this: modifying registry settings, running DISM commands, etc. Nothing helps. It seems like some type of certificate issue, but I can't resolve it.) Has anyone else had issues getting SSM to patch AWS Windows Server 2019 EC2 instances? **Invoke-PatchBaselineOperation : Exception Details: An error occurred when attempting to search Windows Update. Exception Level 1: Error Message: A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider. (Exception from HRESULT: 0x800B0109)** Stack Trace: at WUApiLib.IUpdateSearcher.Search(String criteria) at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateAgent.SearchForUpdates(String searchCriteria) At C:\ProgramData\Amazon\SSM\InstanceData\i-03638bdca902ef8fd\document\orchestration\86ed2eda-065a-49d3-b084-69bfc89c14 3d\PatchWindows\_script.ps1:233 char:13 + $response = Invoke-PatchBaselineOperation -Operation Scan -SnapshotId ... + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + CategoryInfo : OperationStopped: (Amazon.Patch.Ba...UpdateOperation:FindWindowsUpdateOperation) [Invoke -PatchBaselineOperation], Exception + FullyQualifiedErrorId : Exception Level 1: Error Message: Exception Details: An error occurred when attempting to search Windows Update. Exception Level 1: Error Message: A certificate chain processed, but terminated in a root certificate which is not trusted by the t rust provider. 
(Exception from HRESULT: 0x800B0109) Stack Trace: at WUApiLib.IUpdateSearcher.Search(String criteria) at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateAgent.SearchForUpdates(String searc hCriteria) Stack Trace: at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateAgent.SearchForUpdates( String searchCriteria) at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.SearchAndProcessResult(Lis t`1 kbGuids) at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.SearchByGuidsPaginated(Lis t`1 kbGuids, Int32 maxPageSize) at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.FilterWindowsUpdateSearch( List`1 filteringMethods) at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.FindWindowsUpdateOperation.DoWindowsUpdateOperati on() at Amazon.Patch.Baseline.Operations.PatchNow.Implementations.WindowsUpdateOperation.DoBeginProcessing() ,Amazon.Patch.Baseline.Operations.PowerShellCmdlets.InvokePatchBaselineOperation failed to run commands: exit status 4294967295
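
The two numbers in this output are easier to read once they are decoded as 32-bit values. The short Python sketch below is illustrative only (the helper names `to_signed32` and `split_hresult` are made up for this example); it relies on the standard HRESULT bit layout, where bit 31 is the failure flag, bits 16 to 26 are the facility, and bits 0 to 15 are the code.

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def split_hresult(hresult):
    """Split an HRESULT into its failure flag, facility, and code fields."""
    hresult &= 0xFFFFFFFF
    return {
        "failure": bool(hresult >> 31),       # bit 31: severity
        "facility": (hresult >> 16) & 0x7FF,  # bits 16-26
        "code": hresult & 0xFFFF,             # bits 0-15
    }


# Values copied from the error output in the question.
print(to_signed32(4294967295))    # the reported exit status
print(split_hresult(0x800B0109))  # the reported HRESULT
```

Decoded this way, exit status 4294967295 is just -1 (a generic failure from the script), and 0x800B0109 is a failure in facility 11, the certificate facility, which matches the "root certificate which is not trusted" text in the message.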
3
answers
0
votes
7
views
KevinM_BMW
asked 5 months ago
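
The two numbers in the patching error above are easier to search for once decoded. As a rough sketch only: `0x800B0109` is the Windows error `CERT_E_UNTRUSTEDROOT`, and the exit status `4294967295` is simply `-1` read as an unsigned 32-bit value. The helper names `find_hresults` and `as_signed_32` and the `sample` string are illustrative, not part of any AWS or Windows tooling.

```python
import re

# Known meaning of the HRESULT quoted in the error above.
KNOWN_HRESULTS = {0x800B0109: "CERT_E_UNTRUSTEDROOT"}


def find_hresults(log_text):
    """Return the distinct HRESULT values (as ints) found in log_text, in order."""
    seen = []
    for match in re.finditer(r"HRESULT:\s*0x([0-9A-Fa-f]{8})", log_text):
        value = int(match.group(1), 16)
        if value not in seen:
            seen.append(value)
    return seen


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value >= (1 << 31) else value


# Short excerpt of the error text from the question (illustrative sample).
sample = (
    "not trusted by the trust provider. (Exception from HRESULT: 0x800B0109) "
    "failed to run commands: exit status 4294967295"
)

for code in find_hresults(sample):
    print(hex(code), KNOWN_HRESULTS.get(code, "unknown"), as_signed_32(code))

print("exit status as signed 32-bit:", as_signed_32(4294967295))
```

The decoded name points at the root certificate trust store on the instance rather than at the patch baseline itself, which matches the asker's suspicion of a certificate issue.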

Root User for Linux

I am not very fluent in using Linux at the command line. With that in mind, I set up Plesk on my Lightsail instance, and it is working great. What I am trying to do is migrate some websites from a colleague's AWS Lightsail instance which is also running Plesk. I have used this migration tool before, and it works very well once you get the servers authenticated and a connection established. Unfortunately, to do this, I have the option of providing the root user and password for the source server, or I can use SSH keys as credentials. We tried a few different combinations of credentials he uses to sign in to his AWS account, and to sign in to his Plesk dashboard, but nothing will work. I do not have enough familiarity with SSH keys to make a very educated guess on how to proceed, so I am hoping to simply figure out how to get or change the password for his root user. I do not think this is the same as the "root" user credentials I use to sign in to my AWS console. Why does this seem so complicated? When I have had a VPS with other providers in the past, I could usually get a root user password and this migration tool just works. I know this will require him (my colleague) to connect to his server, which I do not have credentials for at this point. He is willing to give me control for the period required to complete the migration, then he will change passwords again. What I am hoping for is a simple method for changing the root user password, and whether this would cause any issues by changing it, as long as he has it. Would this cause any issues between the VPS and the AWS console as far as administration goes? Any help or suggestions would be greatly appreciated. I have been working on this for over a week with no real progress. I am running out of time.
1
answers
0
votes
12
views
AWS-User-1242566
asked 5 months ago

Python Flask: OpenCV library does not work, produces HTTP code 502 Bad Gateway when trying to compare images

Hello, I'm having trouble with the Python OpenCV library running on an AWS Lightsail container instance. Some information:

* It is running on the python:3.7 Docker image.
* Python Flask app
* AWS Lightsail container instance
* Using the following packages: [link](https://pastebin.com/xD5gEqZH)
* Uses opencv-contrib-python-headless==4.5.4.60 for OpenCV.
* Error image: [link](https://ibb.co/r7Mm2DX)

When trying to compare two images, I receive an HTTP status code of 502 Bad Gateway, which is very strange. It works perfectly on my Windows machine locally, but on this Linux image it does not work.

```python
from cv2 import cv2
import logging

logger = logging.getLogger()


def compare_two_images(image_to_compare_file, image_to_compare_against_file):
    # Image imports
    # Features
    logger.warning("image_to_compare_file " + image_to_compare_file)
    logger.warning("image_to_compare_against_file " + image_to_compare_against_file)
    sift = cv2.SIFT_create()
    logger.warning("SIFT created " + str(sift is None))
    # QueryImage
    img1 = cv2.imread(image_to_compare_file, cv2.IMREAD_GRAYSCALE)
    logger.warning("IMG1 read created " + str(img1 is None))
    # Find the key points and descriptors with SIFT
    kp1, desc1 = sift.detectAndCompute(img1, None)
    logger.warning("DETECT AND COMPUTE " + str(kp1 is None) + " " + str(desc1 is None))
    img2 = cv2.imread(image_to_compare_against_file, cv2.IMREAD_GRAYSCALE)
    logger.warning("IMG2 read created " + str(img2 is None))
    kp2, desc2 = sift.detectAndCompute(img2, None)
    logger.warning("DETECT AND COMPUTE " + str(kp2 is None) + " " + str(desc2 is None))
    # BFMatcher with default params
    bf = cv2.BFMatcher()
    matches = bf.knnMatch(desc1, desc2, k=2)
    # Apply ratio test
    good = []
    for m, n in matches:
        if m.distance < 0.55 * n.distance:
            good.append([m])
```

It crashes on `kp1, desc1 = sift.detectAndCompute(img1, None)` and produces **502 Bad Gateway**. Then, on some other endpoints in my Python Flask app, it produces **503 Service Temporarily Unavailable** a few times. After that, I can see that the images were deleted. Any help is appreciated.
1
answers
0
votes
28
views
AWS-User-4298801
asked 5 months ago
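
The ratio test at the end of the posted function can be looked at in isolation. The sketch below does not address the 502 crash. It only shows the same filter with one guard added: `knnMatch(..., k=2)` can return fewer than two neighbours for a descriptor, in which case `for m, n in matches` would raise a `ValueError`. To stay runnable without OpenCV, a small `Match` tuple stands in for `cv2.DMatch`. That stand-in, the function name `ratio_test`, and the sample distances are assumptions made for illustration.

```python
from collections import namedtuple

# Stand-in for cv2.DMatch so the sketch runs without OpenCV;
# only the `distance` attribute is used by the ratio test.
Match = namedtuple("Match", ["distance"])


def ratio_test(knn_matches, ratio=0.55):
    """Keep the best match of each pair when it is clearly better than the runner-up."""
    good = []
    for pair in knn_matches:
        if len(pair) < 2:
            continue  # fewer than two neighbours: nothing to compare against
        best, second = pair[0], pair[1]
        if best.distance < ratio * second.distance:
            good.append([best])  # same list-of-lists shape as the posted code
    return good


# Made-up distances, for illustration only.
matches = [
    (Match(10.0), Match(40.0)),  # kept: 10 is below 0.55 * 40
    (Match(30.0), Match(35.0)),  # dropped: 30 is not below 0.55 * 35
    (Match(5.0),),               # skipped: only one neighbour returned
]

good = ratio_test(matches)
print(len(good), "good match(es)")
```

The threshold of 0.55 is taken from the question. Checking that `img1`, `desc1`, and `desc2` are not `None` before matching would be a sensible companion guard, since `cv2.imread` returns `None` when a file cannot be read.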
