Amazon Linux 2023 - issue with installing packages with cloud-init

0

I am attempting to install packages via cloud-init on an Amazon Linux 2023 instance. Sometimes it works, and sometimes it fails with: "RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)". This happens when installing packages both from the cloud-config and from a script. The most annoying part is that it fails most of the time but not always, and it doesn't always fail in the same place. Sometimes it gets through the cloud-config packages and fails on the packages installed by the script; sometimes it doesn't get that far.

himmel
asked 1 year ago · 1978 views
3 Answers
2
Accepted Answer

Based on my testing with the Amazon Linux 2023 image and SSM enabled, I see the following rpm processes running (and holding the RPM lock) during the cloud-init phase of the EC2 initialization:

From ~17 to ~39 seconds rpm -U --force ./amazon-cloudwatch-agent.rpm

From ~70 to ~86 seconds rpm -U amazon-ssm-agent.rpm

I tested this using the following in my user-data file:

timeout=0
while [ -f /var/lib/rpm/.rpm.lock ];
do
  echo "$timeout seconds"
  ls -a /var/lib/rpm/
  file /var/lib/rpm/.rpm.lock
  ps -ef|grep rpm
  # After 120 seconds, stop polling and leave the loop.
  if [ "$timeout" == 120 ]; then
    break
  fi
  sleep 1
  ((timeout++))
done

Instead of performing an install after a sleep or when the lock is released, you will have better luck if you just wrap your yum installs in retry logic. Here is an example:

max_attempts=5
attempt_num=1
success=false
while [ $success = false ] && [ $attempt_num -le $max_attempts ]; do
  echo "Trying yum install"
  # Run both commands; the attempt counts as failed if either one fails
  if yum update -y &&
     yum install java-1.8.0 java-17-amazon-corretto-devel.x86_64 wget telnet -y; then
    echo "Yum install succeeded"
    success=true
  else
    echo "Attempt $attempt_num failed. Sleeping for 3 seconds and trying again..."
    sleep 3
    ((attempt_num++))
  fi
done
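
The retry loop above can also be factored into a small reusable function, so that each install command in a user-data script gets the same treatment. This is only a minimal sketch and not part of the original answer: the function name `retry` and its argument order (maximum attempts, delay in seconds, then the command) are made up for illustration.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0 or
# MAX_ATTEMPTS attempts have been made.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed. Sleeping for $delay seconds and trying again..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example use in user data (package names taken from the loop above):
# retry 5 3 yum install -y wget telnet
```

Because the function returns a non-zero status once it gives up, the calling script can decide whether to abort or carry on.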
AWS
answered 6 months ago
EXPERT
reviewed 1 month ago
1

Hello,

Failing to get a transaction lock means that another process is currently holding the RPM lock and is blocking anything else from making changes. You can try adding a sleep.
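
A fixed sleep is a guess at how long the other process will take, so a bounded polling wait may be a little more robust. The following is a rough sketch under stated assumptions: `wait_until` and `no_rpm_running` are hypothetical helper names, and the check assumes the lock holder is a process literally named `rpm` (as in the `rpm -U` commands seen in the accepted answer). Another process could still grab the lock between the check and the install.

```shell
#!/usr/bin/env bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: polls COMMAND once per second until it exits 0.
# Returns 1 if TIMEOUT_SECONDS elapse first.
wait_until() {
  local timeout=$1 waited=0
  shift
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Assumption: the lock is held by a process named exactly "rpm".
no_rpm_running() {
  ! pgrep -x rpm >/dev/null
}

# Example use in user data:
# wait_until 120 no_rpm_running && yum install -y wget
```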

AWS
answered 1 year ago
0

Hello Himmel.

Can you confirm you are running the rpm install with elevated permissions, for example with sudo [command] or sudo su?

Thank you.

AWS
EXPERT
answered 1 year ago
  • The install is being done by cloud-init, so yes, it's performed by root.

  • Can you share what your cloud-config looks like, so we can understand how it is being executed?
