Amazon Linux 2023 - issue with installing packages with cloud-init

0

I am attempting to install packages via cloud-init on an Amazon Linux 2023 instance. Sometimes it works, and sometimes it fails with: "RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)". This occurs for package installs in both a cloud-config and a script. The most annoying part is that it fails most of the time but not always, and not always in the same place. Sometimes it gets through the cloud-config packages and fails on the packages installed by the script; sometimes it doesn't get that far.

himmel
Asked 1 year ago · Viewed 1978 times
3 answers
2
Accepted answer

Based on my testing with the Amazon Linux 2023 image and SSM enabled, I see the following rpm processes running during the cloud-init phase of EC2 initialization. Each one holds the RPM transaction lock while it runs:

From ~17 to ~39 seconds rpm -U --force ./amazon-cloudwatch-agent.rpm

From ~70 to ~86 seconds rpm -U amazon-ssm-agent.rpm

I tested this using the following in my user-data file:

timeout=0
while [ -f /var/lib/rpm/.rpm.lock ];
do
  echo "$timeout seconds"
  ls -a /var/lib/rpm/
  file /var/lib/rpm/.rpm.lock
  # The [r] keeps grep from matching its own command line
  ps -ef | grep '[r]pm'
  # Stop watching after 120 seconds.
  if [ "$timeout" -eq 120 ]; then
    break
  fi
  sleep 1
  ((timeout++))
done

Instead of performing an install after a sleep or when the lock is released, you will have better luck if you just wrap your yum installs in retry logic. Here is an example:

max_attempts=5
attempt_num=1
success=false
while [ "$success" = false ] && [ "$attempt_num" -le "$max_attempts" ]; do
  echo "Trying yum install (attempt $attempt_num of $max_attempts)"
  # Chain with && so that a failure in either command triggers a retry;
  # checking $? afterwards would only see the result of the last command
  if yum update -y && yum install -y java-1.8.0 java-17-amazon-corretto-devel.x86_64 wget telnet; then
    echo "Yum install succeeded"
    success=true
  else
    echo "Attempt $attempt_num failed. Sleeping for 3 seconds and trying again..."
    sleep 3
    ((attempt_num++))
  fi
done
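
If you have several install steps, the same retry idea can be pulled into a small helper. This is only a sketch: the function name `retry` and its argument order (maximum attempts, delay in seconds, then the command) are my own choices for illustration, not something provided by yum or cloud-init.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# Returns 0 on success, otherwise the exit status of the last failed attempt.
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1 status=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Run the command exactly as given; stop as soon as it succeeds
    "$@" && return 0
    status=$?
    echo "Attempt $attempt of $max_attempts failed (exit $status): $*" >&2
    # Only sleep if another attempt is coming
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return "$status"
}
```

In a user-data script you could then write, for example, `retry 5 3 yum install -y wget telnet`, and check the return value to decide whether to abort the rest of the bootstrap.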
AWS
Answered 6 months ago
Expert
Reviewed 1 month ago
1

Hello,

The transaction lock fails because another process is already holding the RPM database lock, which blocks any other process from making changes until it finishes. You can try adding a sleep before the install.
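
A fixed sleep is a guess, so one variation is to wait until no package-manager process is running, with an upper bound. The sketch below assumes `pgrep` (from procps) is available on the instance; the function name `wait_for_exit` is made up for illustration. Note that this is inherently racy: another process can still start and take the lock between the check and your install, so it is best combined with a retry rather than used alone.

```shell
#!/bin/bash
# wait_for_exit TIMEOUT_SECONDS NAME [NAME...]
# Hypothetical helper: polls once per second until no process with any of the
# given exact names is running. Returns 0 when all are gone, 1 on timeout.
wait_for_exit() {
  local timeout=$1
  shift
  local waited=0 name busy
  while :; do
    busy=false
    for name in "$@"; do
      # pgrep -x matches the process name exactly
      if pgrep -x "$name" >/dev/null 2>&1; then
        busy=true
        break
      fi
    done
    if [ "$busy" = false ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

For example, `wait_for_exit 120 rpm yum dnf` would wait up to two minutes for those processes to finish before you start your own install.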

AWS
Answered 1 year ago
0

Hello Himmel.

Can you confirm you are running the rpm install with elevated permissions? i.e. sudo [command] or sudo su?

Thank you.

AWS
Expert
Answered 1 year ago
  • The install is being done by cloud-init, so yes, it's performed by root.

  • Can you share what your cloud-config looks like, so we can understand how it is being executed?
