Compute environment with managed EC2 instance - unable to extend root partition size

0

I need more storage space in my Compute Environment, which runs a Batch job (Docker image from ECR). I thought it would be simple to extend the root file system of the EC2 instance running the job. I found that I should be able to use a Launch Template to assign my Compute Environment an EBS volume larger than the default 30G. This only partially works: running lsblk on the instance (from within the Docker container, as part of the Batch job), I see that the large EBS device is available:

NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
xvda    202:0    0   8T  0 disk 
`-xvda1 202:1    0   8T  0 part /etc/hosts

However, df -h shows that only the default 30G is available to the system:

Filesystem      Size  Used Avail Use% Mounted on
overlay          30G  5.1G   25G  17% /
tmpfs            64M     0   64M   0% /dev
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/xvda1       30G  5.1G   25G  17% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs            16G     0   16G   0% /proc/acpi
tmpfs            16G     0   16G   0% /proc/scsi
tmpfs            16G     0   16G   0% /sys/firmware

I tried adding a script to the UserData in the Launch Template to extend the partition and the file system (following a suggestion from Amazon Q), but it does not seem to work: /dev/xvda1 stays at 30G no matter what I try.

The EC2 instance is an m4.2xlarge. I'm using Serverless Framework to define my Batch components, and this is what the Launch Template and Compute Environment definitions look like:

    ExtendedStorageLaunchTemplate:
      Type: AWS::EC2::LaunchTemplate
      Properties:
        LaunchTemplateData:
          BlockDeviceMappings:
          - DeviceName: /dev/xvda
            Ebs:
              VolumeSize: 8192
              VolumeType: gp2
              DeleteOnTermination: true
          UserData:
            Fn::Base64: !Sub |
              MIME-Version: 1.0
              Content-Type: multipart/mixed; boundary="==BOUNDARY=="

              --==BOUNDARY==
              Content-Type: text/cloud-config; charset="us-ascii"

              #cloud-config
              cloud_final_modules:
              - [scripts-user, always]

              --==BOUNDARY==
              Content-Type: text/x-shellscript; charset="us-ascii"

              #!/bin/bash
              set -e
              # Extend the partition
              growpart /dev/xvda 1
              # Extend the file system (assuming it's ext4, adjust if using xfs)
              resize2fs /dev/xvda1

              --==BOUNDARY==--
    ComputeEnvironmentExtendedStorage:
      Type: AWS::Batch::ComputeEnvironment
      Properties:
        ComputeEnvironmentName: process-layer-service-compute-environment-extended-storage
        Type: MANAGED
        ServiceRole: !Ref BatchServiceRole
        ComputeResources:
          InstanceRole: !GetAtt ecsInstanceProfile.Arn
          InstanceTypes:
            - m4.2xlarge
          Type: EC2
          MaxvCpus: 1024
          MinvCpus: 0
          LaunchTemplate:
            LaunchTemplateId: !Ref ExtendedStorageLaunchTemplate
            Version: !GetAtt ExtendedStorageLaunchTemplate.LatestVersionNumber
          SecurityGroupIds:
            - sg-XXXXXXX
          Subnets:
            - subnet-XXXXXXX
        State: ENABLED

How can I make this work? Or is this the wrong approach?

Pawel
asked 25 days ago, 38 views
4 Answers
0

Hello Pawel,

You might want to consider using a secondary volume instead of expanding the root volume. You can refer to this GitHub resource.

Alternatively, you can create a custom AMI with an 8TiB root volume and include that custom AMI in your launch template definition.

answered 25 days ago
  • I thought expanding the root volume would be the simplest thing to do... How silly of me :)

0

AWS Batch supports Launch Templates, which let you customise the EC2 instance at launch. In the Launch Template you can define the EBS volume size.

https://docs.aws.amazon.com/batch/latest/userguide/launch-templates.html

EXPERT
answered 25 days ago
  • Hi Gary, if you look at the serverless.yml fragment posted in my question, you'll see this is exactly what I am doing. The EBS device I define in the Launch Template is available to the EC2 instance, but the root filesystem is not extended to the whole volume, and I cannot make it do so. I will try the approach suggested by Praveen: mount the extended volume as a separate one rather than extending the root volume, then use it in the Docker container to run the job.

0

I'll share here the way I solved my issue with help from an Amazon solutions architect - it might be useful for someone in the future.

As suggested by Praveen in one of the answers here, I went for a secondary volume on my EC2 Compute Environment, defined by a Launch Template. I then mount it as a volume in the Docker container used by my Batch Job Definition. The key point is that it's not enough to define an EBS volume and attach it to a device in the Launch Template. The volume also needs a file system and has to be mounted before it is available for use. Partitions larger than 2TiB can only be obtained with the GPT partitioning scheme. On top of that, some extra tools that are not available at launch on the EC2 instance need to be installed (xfsprogs).
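The 2TiB limit mentioned above comes from the MBR partition table, which stores sector counts in 32 bits. A minimal sketch of the arithmetic, assuming 512-byte sectors; `needs_gpt` is a hypothetical helper name used only for illustration, not an AWS or Linux tool:

```shell
#!/bin/bash
# MBR stores sector counts in 32 bits. Assuming 512-byte sectors, the largest
# addressable partition is 2^32 * 512 bytes = 2 TiB, i.e. 2048 GiB.
MBR_LIMIT_GIB=$(( (1 << 32) * 512 / 1024 / 1024 / 1024 ))

# needs_gpt SIZE_GIB (hypothetical helper): succeeds when a volume of
# SIZE_GIB is larger than what a single MBR partition can cover.
needs_gpt() {
  local size_gib=$1
  [ "$size_gib" -gt "$MBR_LIMIT_GIB" ]
}

# Sizes from this thread: the default 30G, the 1TB that worked, the 8TB that did not.
for size in 30 1024 8192; do
  if needs_gpt "$size"; then
    echo "${size} GiB: above the MBR limit, needs GPT"
  else
    echo "${size} GiB: fits in an MBR partition"
  fi
done
```

This only illustrates the size limit. Whether the root volume of a given AMI actually uses MBR or GPT would have to be checked on the instance itself.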

This can be done using a User Data script defined in the Launch Template, taking care to include MIME headers correctly. Something like this does the job:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash

# Update package list and install xfsprogs
yum update -y
yum install -y xfsprogs

# Create XFS filesystem on /dev/xvdb
mkfs.xfs /dev/xvdb

# Create mount point directory
mkdir -p /data

# Mount the volume
mount /dev/xvdb /data

# Add entry to fstab for persistent mounting
echo "UUID=$(blkid -s UUID -o value /dev/xvdb) /data xfs defaults,nofail 0 2" >> /etc/fstab

echo "EBS Volume mounted successfully"

--==MYBOUNDARY==--
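One possible refinement, offered as a sketch rather than as part of the solution above: if the user data script could run more than once on the same instance, the format step can be guarded so that it only runs on a blank device. `should_format` and `prepare_volume` are hypothetical helper names; the device and mount point are the ones used in the script above.

```shell
#!/bin/bash
# should_format EXISTING_FSTYPE (hypothetical helper): succeeds only when
# the device reports no filesystem type, i.e. it is still blank.
should_format() {
  [ -z "$1" ]
}

# prepare_volume DEVICE MOUNTPOINT (hypothetical helper): create an XFS
# filesystem only on a blank device, then mount it if not already mounted.
# Needs root and xfsprogs, so it is defined here but not called.
prepare_volume() {
  local device=$1 mountpoint=$2 fstype
  # blkid prints nothing when it finds no filesystem signature.
  fstype=$(blkid -s TYPE -o value "$device" 2>/dev/null || true)
  if should_format "$fstype"; then
    mkfs.xfs "$device"
  fi
  mkdir -p "$mountpoint"
  mountpoint -q "$mountpoint" || mount "$device" "$mountpoint"
}

# In user data this would be called as:
#   prepare_volume /dev/xvdb /data
```

The fstab line from the original script would need a similar guard, since appending it on every run would create duplicate entries.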
Pawel
answered 17 days ago
  • I’m glad you were able to resolve the issue. At the EC2 level, properly creating and mounting an EBS volume is crucial for Docker to access the filesystem and integrate it into its container environment. Raw EBS volumes can’t be used directly within containers. I’m also surprised that the operating system doesn’t include the XFS filesystem by default, as XFS has been widely used in many Linux systems for years (it’s always been my go-to choice).

-1

Hi,

This Knowledge Center article is probably the answer to your question: https://repost.aws/knowledge-center/extend-linux-file-system

Follow its guidance to extend your partition into the available disk space.
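As a hedged illustration of one step in that kind of procedure: the command that grows the file system depends on the file system type, which is why a script that assumes ext4 can fail on an XFS root. A minimal sketch, where `grow_cmd` is a hypothetical helper that only prints the command it would run:

```shell
#!/bin/bash
# grow_cmd FSTYPE DEVICE MOUNTPOINT (hypothetical helper): print the command
# that would grow a filesystem of the given type to fill its partition.
grow_cmd() {
  local fstype=$1 device=$2 mountpoint=$3
  case "$fstype" in
    xfs)            echo "xfs_growfs $mountpoint" ;;  # XFS is grown via its mount point
    ext2|ext3|ext4) echo "resize2fs $device" ;;       # ext* is grown via the device
    *)              echo "unsupported filesystem: $fstype" >&2; return 1 ;;
  esac
}

# On a real instance the type could be read with: findmnt -n -o FSTYPE /
# Here both cases are shown for the root device from the question.
grow_cmd xfs /dev/xvda1 /
grow_cmd ext4 /dev/xvda1 /
```

Printing the command instead of running it keeps the sketch safe to try anywhere; on an instance the printed command would be run as root after the partition itself has been grown.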

Best,

Didier

AWS
EXPERT
answered 25 days ago
EXPERT
reviewed 25 days ago
  • Hi Didier,

    I don't think this is the right answer. I'm basically already trying to do what's described in the article you posted. My use case is quite different as well - it's not a persistent EC2 instance. I have an AWS Batch Compute Environment, which describes the EC2 instance created only when the Batch Job is running. Another surprising thing - the approach I'm trying here (Launch Template defining the EBS drive with more storage to be attached to the EC2 instance) worked for a few months without the need to extend the partition. It was running just like that - I had 1TB available as /dev/xvda1 without doing anything to expand the partition and file system. Problems started when I tried to increase the volume size to 8TB. Now even when I roll back to the infrastructure definition which worked before, it does not work...
