I'm experiencing a very strange "No space left on device" error when using a custom AMI for AWS Batch.
The AMI was created from the ECS-Optimized Amazon Linux AMI 2017.03, to which a third EBS volume of 1000 GB was added. The Docker storage was extended as explained at http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-ami-storage-config.html, i.e.:
sudo vgextend docker /dev/xvdb
sudo lvextend -L+1000G /dev/docker/docker-pool
However, when launching a few jobs I almost immediately get the following error message:
.command.run.1: line 50: cannot create temp file for here-document: No space left on device
tee: .command.err: No space left on device
mkdir: cannot create directory ‘fastqc_SRR3192434_logs’: No space left on device
After logging in to the instance, there seems to be enough space:
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      7.8G  1.2G  6.5G  16% /
devtmpfs         32G   96K   32G   1% /dev
tmpfs            32G     0   32G   0% /dev/shm
$ sudo vgs
VG     #PV #LV #SN Attr   VSize    VFree
docker   2   1   0 wz--n- 1021.99g 224.00m
$ sudo lvs
LV          VG     Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
docker-pool docker twi-aot--- 1021.73g             2.46   11.87
$ docker info
Containers: 8
 Running: 3
 Paused: 0
 Stopped: 5
Images: 3
Server Version: 17.03.2-ce
Storage Driver: devicemapper
 Pool Name: docker-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: ext4
 Data file:
 Metadata file:
 Data Space Used: 28 GB
 Data Space Total: 1.097 TB
 Data Space Available: 1.069 TB
 Metadata Space Used: 3.031 MB
 Metadata Space Total: 25.17 MB
 Metadata Space Available: 22.13 MB
 Thin Pool Minimum Free Space: 109.7 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.51-10.52.amzn1.x86_64
Operating System: Amazon Linux AMI 2017.09
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.91 GiB
Name: ip-172-30-2-110
ID: VEEF:VQDY:Z72J:NY25:YIMO:BG7Z:J5EH:ZBXU:IKLX:OJN2:F7GM:EY5Q
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
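As a sanity check on the numbers above: lvs reports sizes in binary units (the lowercase "g" suffix means GiB), while docker info reports decimal GB/TB, so the two need converting before they can be compared. A minimal sketch of that conversion follows; the helper name gib_to_gb is made up for illustration and is not part of LVM or Docker.

```shell
# Hypothetical helper (not an LVM or Docker command): convert GiB, as printed
# by lvs/vgs with the lowercase "g" suffix, into decimal GB as used by docker info.
gib_to_gb() {
  awk -v gib="$1" 'BEGIN { printf "%.2f\n", gib * 1073741824 / 1000000000 }'
}

gib_to_gb 1021.73   # LSize of docker-pool from lvs; prints 1097.07 (about 1.097 TB)
gib_to_gb 10        # prints 10.74, matching "Base Device Size: 10.74 GB"
```

So the 1021.73g pool from lvs and the 1.097 TB "Data Space Total" from docker info appear to describe the same extended pool, and the 10.74 GB base device size corresponds to 10 GiB.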
Any idea what's wrong?
Edited by: paulecci on Oct 28, 2017 5:00 AM