Amazon Linux 2023 - issue with installing packages with cloud-init

0

I am attempting to install packages via cloud-init on an Amazon Linux 2023 instance. Sometimes it works, and sometimes it fails with: "RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)". The issue occurs for package installs in both a cloud-config and a script. The most annoying part is that it fails most of the time but not always, and not always in the same place. Sometimes it gets through the cloud-config packages and fails on the packages installed by the script; sometimes it doesn't get that far.

himmel
asked 10 months ago, 1875 views
3 Answers
2
Accepted Answer

Based on my testing with the Amazon Linux 2023 image and SSM enabled, I see the following rpm processes running during the cloud-init phase of EC2 initialization. Each one holds the RPM transaction lock while it runs:

From ~17 to ~39 seconds: rpm -U --force ./amazon-cloudwatch-agent.rpm

From ~70 to ~86 seconds: rpm -U amazon-ssm-agent.rpm

I tested this using the following in my user-data file:

timeout=0
while [ -f /var/lib/rpm/.rpm.lock ];
do
  echo "$timeout seconds"
  ls -a /var/lib/rpm/
  file /var/lib/rpm/.rpm.lock
  ps -ef | grep rpm
  # Stop polling once 120 seconds have elapsed.
  if [ "$timeout" -eq 120 ]; then
    break
  fi
  sleep 1
  ((timeout++))
done

Instead of performing an install after a sleep or when the lock is released, you will have better luck if you just wrap your yum installs in retry logic. Here is an example:

max_attempts=5
attempt_num=1
success=false
while [ "$success" = false ] && [ "$attempt_num" -le "$max_attempts" ]; do
  echo "Trying yum install"
  # Treat the attempt as successful only if both the update and the install exit with 0.
  if yum update -y && yum install -y java-1.8.0 java-17-amazon-corretto-devel.x86_64 wget telnet; then
    echo "Yum install succeeded"
    success=true
  else
    echo "Attempt $attempt_num failed. Sleeping for 3 seconds and trying again..."
    sleep 3
    ((attempt_num++))
  fi
done
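
If several install commands need the same treatment, the retry loop above could be factored into a small function. This is only a sketch: the name retry and its argument order are made up for illustration and are not part of yum, dnf, or cloud-init.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits with 0 or
# MAX_ATTEMPTS attempts have been made, sleeping DELAY_SECONDS between tries.
retry() {
  local max_attempts=$1
  local delay=$2
  shift 2
  local attempt=1
  while true; do
    # Run the command exactly as passed; success ends the loop.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed. Sleeping for $delay seconds and trying again..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

In user data it might be called as retry 5 3 yum install -y wget telnet, and its exit status can be checked to decide whether the rest of the script should continue.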
AWS
answered 6 months ago
EXPERT
reviewed 24 days ago
1

Hello,

Failing to get a transaction lock means that another process is already holding the RPM lock and is blocking other processes from making changes. You can try adding a sleep before the install.
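
A bounded wait is one way to express the "sleep" idea without hard-coding a single delay. The sketch below is an assumption-laden illustration: wait_for and pkg_manager_idle are hypothetical names, the check assumes pgrep (from procps) is installed, and it assumes the competing processes show up under the names rpm, yum, or dnf. Another process can still take the lock between the check and the install, so this does not replace retrying.

```shell
#!/usr/bin/env bash
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: polls COMMAND once per second until it succeeds.
# Returns 1 if it has not succeeded after TIMEOUT_SECONDS seconds of waiting.
wait_for() {
  local timeout=$1
  shift
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Hypothetical check: succeeds when no process named exactly rpm, yum, or dnf
# is running. Assumption: pgrep is available; if it is missing, this check
# reports "idle" because the pgrep call itself fails.
pkg_manager_idle() {
  ! pgrep -x 'rpm|yum|dnf' >/dev/null 2>&1
}
```

A call such as wait_for 120 pkg_manager_idle before the install would wait up to about two minutes for other package operations to finish.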

AWS
answered 10 months ago
0

Hello Himmel.

Can you confirm you are running the rpm install with elevated permissions? i.e. sudo [command] or sudo su?

Thank you.

AWS
EXPERT
answered 10 months ago
  • The install is being done by cloud-init, so yes, it's performed by root.

  • Can you share what your cloud-config looks like, to help us understand how it is being executed?
