How do I copy an EBS Snapshot to another region using coldsnap and the EBS Direct APIs?
This article provides step-by-step guidance on using coldsnap to copy EBS snapshot data block-by-block across AWS regions using the EBS Direct APIs, and then verify the integrity of the resulting data.
If you are migrating data from the Middle East (UAE) Region (me-central-1), then you might experience increased error rates as we continue making progress with restoration efforts. For additional information about recovery efforts and service updates that impact your AWS accounts, see the AWS Personal Health Dashboard. For assistance with this event, contact AWS Support through the AWS Management Console or the AWS Support Center.
This approach complements the snapshot-based and image-based methods covered in How do I migrate my Compute and Container resources to another region?.
This is part of a series of articles that provide general guidance on migrating resources from one Region to another. This article provides guidance on copying EBS snapshots to another AWS Region using coldsnap and the EBS Direct APIs. This method is useful when native snapshot copy or replication options are not available or practical.
For general guidance and a full list of domain and service-specific migration guides, see How do I migrate my resources to another region?
For other domains, see the following resources:
- How do I migrate my Security, Identity and Compliance resources to another region?
- How do I migrate my Compute and Container resources to another region?
- How do I migrate my Database resources to another region?
- How do I migrate my Database resources to another region using a logical dump?
- How do I migrate my Networking and Content Delivery resources to another region?
- How do I migrate my Storage resources to another region?
- How do I migrate my Application Integration resources to another region?
Overview
The standard approach to cross-region snapshot copies is aws ec2 copy-snapshot. However, there are scenarios where you need lower-level control over the snapshot data, or where the standard copy mechanism is unavailable.
coldsnap is an open-source command-line tool from AWS Labs that uses the EBS Direct APIs to download and upload EBS snapshot data block by block, without needing to create intermediate EBS volumes or manage volume attachments during the transfer itself.
The EBS Direct APIs provide six actions — three for reading (ListSnapshotBlocks, ListChangedBlocks, GetSnapshotBlock) and three for writing (StartSnapshot, PutSnapshotBlock, CompleteSnapshot).
coldsnap wraps these APIs into simple download and upload commands.
Key considerations
- EBS Direct APIs cannot be used with archived snapshots or public snapshots
- For encrypted snapshots, the principal also needs kms:Decrypt on the source KMS key, and kms:CreateGrant plus kms:GenerateDataKeyWithoutPlaintext on the target KMS key.
Prerequisites
- An EBS snapshot in the source region (e.g., snap-0abcdef1234567890 in us-east-1).
- An EC2 Linux instance with the Amazon Linux 2023 AMI in the target region (e.g., eu-west-1) with:
  - Sufficient local disk space to hold the snapshot dump (at least the size of the source volume).
  - An IAM role with EBS Direct API permissions in both source and target regions.
  - The AWS CLI configured with credentials that have the required permissions. If the IAM role is attached through an instance profile, the AWS CLI obtains its credentials from the instance profile.
Launch a temporary instance in the target Region
Launch a temporary instance in the target region. EBS and networking performance depend on the instance type.
Select the instance type that best suits your needs. We recommend at least a t3.large instance.
- EBS specifications: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-optimized.html
- Network specifications: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
Ensure the security group allows SSH (TCP 22) for Linux or use SSM Session Manager to connect to the instance.
The instance must be associated with an IAM Role with permissions for ebs:ListSnapshotBlocks, ebs:GetSnapshotBlock (source region) and ebs:StartSnapshot, ebs:PutSnapshotBlock, ebs:CompleteSnapshot (target region).
See Control access to EBS direct APIs using IAM.
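The permissions listed above could be written as an IAM policy along the following lines. This is a sketch rather than an official AWS policy: the function name coldsnap_policy and the statement IDs are hypothetical, and the wildcard Region and snapshot ID in the resource ARN are assumptions that you would normally narrow to your own source and target Regions.

```shell
# Prints an illustrative IAM policy document for coldsnap to stdout.
# Assumption: a single policy covers both Regions; tighten "Resource" as needed.
coldsnap_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadSourceSnapshot",
      "Effect": "Allow",
      "Action": ["ebs:ListSnapshotBlocks", "ebs:GetSnapshotBlock"],
      "Resource": "arn:aws:ec2:*::snapshot/*"
    },
    {
      "Sid": "WriteTargetSnapshot",
      "Effect": "Allow",
      "Action": ["ebs:StartSnapshot", "ebs:PutSnapshotBlock", "ebs:CompleteSnapshot"],
      "Resource": "arn:aws:ec2:*::snapshot/*"
    }
  ]
}
EOF
}

# Usage: write the document to a file, then attach it to the instance role.
# coldsnap_policy > coldsnap-policy.json
```

Encrypted snapshots additionally need the KMS permissions listed under Key considerations, which this sketch does not include.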
AMI_ID=$(aws ssm get-parameters \
--region eu-west-1 \
--names /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
--query 'Parameters[0].Value' \
--output text)
aws ec2 run-instances \
--region eu-west-1 \
--image-id "$AMI_ID" \
--instance-type t3.large \
--placement AvailabilityZone=eu-west-1a \
--key-name my-key-pair \
--security-group-ids sg-0123456789abcdef0 \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=recovery-temp-instance}]'
For Windows volumes, you also need to launch a Windows-based recovery instance:
AMI_ID=$(aws ssm get-parameters \
--region eu-west-1 \
--names /aws/service/ami-windows-latest/Windows_Server-2022-English-Full-Base \
--query 'Parameters[0].Value' \
--output text)
aws ec2 run-instances \
--region eu-west-1 \
--image-id "$AMI_ID" \
--instance-type t3.large \
--placement AvailabilityZone=eu-west-1a \
--key-name my-key-pair \
--security-group-ids sg-0123456789abcdef0 \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=recovery-temp-instance-win}]'
Note the InstanceId from the output. Wait for it to be running:
aws ec2 wait instance-running \
--region eu-west-1 \
--instance-ids i-0abc123def456789
Install Cargo
Log on to the recovery instance and install cargo:
# install cargo using the AL2023 package
sudo dnf install cargo -y
Install coldsnap on the EC2 instance in the target region
Install coldsnap from source. Important: do not install it in any other way:
# Install coldsnap
cargo install --git https://github.com/awslabs/coldsnap.git --branch develop
This step might take a long time, depending on the instance type, as it will compile the tool and all its dependencies from source.
Then add the folder that contains the binary to your PATH:
export PATH=$PATH:~/.cargo/bin
Verify the installation:
coldsnap --help
Download the snapshot from the source region
Important! Ensure the instance has enough disk space to download the snapshot. Move to the folder to which you will download the snapshot and use
df -h `pwd`
to verify before starting.
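The check above can also be scripted so that the download does not start without enough room. The following is a minimal sketch: the function name has_space_for is hypothetical, and it assumes the required size is given in GiB (which is how aws ec2 describe-snapshots reports VolumeSize).

```shell
# has_space_for DIR GIB: succeeds if the filesystem holding DIR has at least
# GIB GiB available. (Hypothetical helper introduced for this sketch.)
has_space_for() {
  local dir=$1 need_gib=$2
  local avail_kib
  # df -Pk prints POSIX-format output in 1024-byte units; column 4 is "Available".
  avail_kib=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kib" -ge $(( need_gib * 1024 * 1024 )) ]
}

# Usage (the snapshot ID and Region are the example values from this article):
# SIZE_GIB=$(aws ec2 describe-snapshots --region us-east-1 \
#   --snapshot-ids snap-0abcdef1234567890 \
#   --query 'Snapshots[0].VolumeSize' --output text)
# has_space_for . "$SIZE_GIB" || echo "Not enough free space" >&2
```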
Use coldsnap download to read the snapshot block by block from the source region and write it to a local file.
Specify the source region explicitly using --region before the download command and the --checkpoint option after the download command:
coldsnap \
--region us-east-1 \
download \
snap-0abcdef1234567890 \
snap-0abcdef1234567890.img \
--checkpoint
This uses the EBS Direct APIs (ListSnapshotBlocks and GetSnapshotBlock) to download each 512 KiB block of the snapshot and write it sequentially to the output file.
During download two files will be created:
- snap-0abcdef1234567890.img.partial, containing the partial snapshot data
- snap-0abcdef1234567890.img.coldsnap-progress, containing a checkpoint of the successfully downloaded chunks, which coldsnap can use if you restart it after a failure
Upon successful download, a single file, snap-0abcdef1234567890.img, will be present.
If the download fails to get some of the chunks of the snapshot, the output will be something like:
Failed to download snapshot:
Failed to get 3 blocks for snapshot 'snap-0abcdef1234567890':
blocks [3105, 4712, 58002]
Each block is 512 KiB so you can calculate where the gaps are in the partial file.
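The gap calculation can be sketched as follows. It assumes that block N of the snapshot occupies bytes N × 524288 up to (N + 1) × 524288 of the image file; the function name block_range is hypothetical, and the block indexes are the ones from the sample error output.

```shell
BLOCK_SIZE=524288   # 512 KiB, the block size stated above

# block_range INDEX: prints "START END" byte offsets of a block (END is exclusive).
block_range() {
  local index=$1
  local start=$(( index * BLOCK_SIZE ))
  echo "$start $(( start + BLOCK_SIZE ))"
}

# Print the byte range of each block reported as missing.
for b in 3105 4712 58002; do
  echo "block $b: bytes $(block_range "$b")"
done
```

Knowing the byte ranges lets you judge whether a gap falls inside a partition or file that matters before deciding whether to retry the download.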
If the download fails, the snapshot file also will remain with the .partial extension, and the .coldsnap-progress file will remain.
Rerunning the same coldsnap command retries only the missing blocks, which can help after network or other transient failures.
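Because a rerun resumes from the checkpoint, the download can be wrapped in a simple retry loop. This is a sketch under stated assumptions: the retry function and the RETRY_DELAY variable are hypothetical names introduced here, not part of coldsnap, and the sketch assumes coldsnap exits with a non-zero status when blocks are missing.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]: reruns COMMAND until it succeeds or
# MAX_ATTEMPTS is reached. RETRY_DELAY (seconds, default 30) is the pause
# between attempts.
retry() {
  local max=$1
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "${RETRY_DELAY:-30}"
  done
}

# Usage, with the same arguments as the download command in this section:
# retry 5 coldsnap --region us-east-1 download \
#   snap-0abcdef1234567890 snap-0abcdef1234567890.img --checkpoint
```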
The download time depends on the snapshot size, the network throughput, and the number of retries needed.
First check
If the snapshot you downloaded is from a Linux volume, you can now associate a loopback device to the downloaded image to mount it as a volume before uploading it to EBS in the target region to make sure the data in the snapshot is usable.
It should work with both successful and partial images:
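sudo losetup -f --show snap-0abcdef1234567890.img.partial
This assigns the first available loopback device to the image. If the image contains a partition table, add the -P option so that partition devices such as /dev/loop0p1 are also created. The output is the name of the device: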
/dev/loop0
You can now follow the steps in the section Filesystem check, mount, and recovery for Linux volumes, using the loopback device in place of /dev/nvme1n1, before uploading the image to create a snapshot in the target region.
Upload the dump as a new snapshot in the target region
Upload the local file as a new EBS snapshot in the target region:
coldsnap \
--region eu-west-1 \
upload \
--wait \
snap-0abcdef1234567890.img
The --wait flag causes coldsnap to poll until the snapshot reaches the completed state.
This uses the EBS Direct APIs (StartSnapshot, PutSnapshotBlock, CompleteSnapshot) to write each block into the new snapshot.
The command outputs the new snapshot ID, for example:
snap-0fedcba9876543210
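When scripting the upload, it can be worth capturing the printed ID and checking its shape before passing it to later commands. The sketch below assumes that the snapshot ID is the only text coldsnap writes to standard output, as described above; the function name is_snapshot_id is hypothetical.

```shell
# is_snapshot_id STRING: succeeds if STRING looks like an EBS snapshot ID
# ("snap-" followed by 8 or 17 lowercase hexadecimal characters).
is_snapshot_id() {
  printf '%s\n' "$1" | grep -Eq '^snap-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Usage:
# NEW_SNAPSHOT_ID=$(coldsnap --region eu-west-1 upload --wait snap-0abcdef1234567890.img)
# is_snapshot_id "$NEW_SNAPSHOT_ID" || { echo "Unexpected output: $NEW_SNAPSHOT_ID" >&2; exit 1; }
```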
Optionally, tag the snapshot for identification:
aws ec2 create-tags \
--region eu-west-1 \
--resources snap-0fedcba9876543210 \
--tags Key=Name,Value="Recovered from us-east-1" Key=Source,Value=snap-0abcdef1234567890
Create an EBS volume from the new snapshot in the same AZ as the recovery instance
Create a volume in a specific Availability Zone within the target region.
The volume and the instance you attach it to must be in the same AZ.
See Create an Amazon EBS volume.
aws ec2 create-volume \
--region eu-west-1 \
--availability-zone eu-west-1a \
--snapshot-id snap-0fedcba9876543210 \
--volume-type gp3 \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=recovery-volume}]'
Note the VolumeId from the output (e.g., vol-0abc123def456789).
Wait for the volume to become available:
aws ec2 wait volume-available \
--region eu-west-1 \
--volume-ids vol-0abc123def456789
Attach the volume to the temporary instance
For Linux you can reuse the Linux instance or launch a new one. For Windows you need to launch a Windows recovery instance as well.
Attach the volume to the instance. See Attach an Amazon EBS volume to an Amazon EC2 instance.
aws ec2 attach-volume \
--region eu-west-1 \
--volume-id vol-0abc123def456789 \
--instance-id i-0abc123def456789 \
--device /dev/sdf
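On Nitro-based instances such as t3, the volume does not appear as /dev/sdf but as an NVMe device, and its NVMe serial number is the volume ID without the hyphen. The following sketch uses that to find the right device on the instance; the function name volume_serial is hypothetical.

```shell
# volume_serial VOLUME_ID: prints the serial number that the EBS volume
# reports as an NVMe device (the volume ID with the hyphen removed).
volume_serial() {
  printf '%s\n' "$1" | tr -d '-'
}

# Usage on the instance: print the block device whose serial matches.
# lsblk -d -n -o NAME,SERIAL | \
#   awk -v s="$(volume_serial vol-0abc123def456789)" '$2 == s {print "/dev/" $1}'
```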
Filesystem check, mount, and recovery for Linux volumes
Note: filesystem checks ensure the integrity of the filesystem and not necessarily of all data stored. You should use the tooling relevant to your application to verify the consistency of the data stored in the snapshot. For instance, if you are running MySQL on the instance, you would run OPTIMIZE, CHECK and REPAIR on all tables.
Identify the device and run filesystem checks (Linux)
SSH into the temporary instance and identify the device:
# get the device name (nvme1n1 in this example)
lsblk
# verifies the type of the special device file
sudo file -s /dev/nvme1n1
The output should be something like:
/dev/nvme1n1: DOS/MBR boot sector, extended partition table (last)
Then get the partition filesystem information with:
sudo lsblk -f
The output will be something like:
NAME          FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
nvme0n1
├─nvme0n1p1   xfs          /     bb1ad377-aefa-4354-a419-b1d6a31d6d2c   79.7G    20% /
├─nvme0n1p127
└─nvme0n1p128 vfat   FAT16       3D07-5F4A                               8.7M    13% /boot/efi
nvme1n1
├─nvme1n1p1   xfs          /     e8b842a5-f549-434e-bc03-152dd3e41fc6
└─nvme1n1p128
In this case, the filesystem type of partition nvme1n1p1 is xfs.
Before mounting, run a filesystem check on the unmounted device. The tool depends on the filesystem type. See Make an Amazon EBS volume available for use.
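The choice of tool can be sketched as a small helper that maps the filesystem type reported by lsblk to the matching dry-run command. The function name dry_run_check is hypothetical, and the sketch covers only the two filesystem families discussed in this article.

```shell
# dry_run_check FSTYPE DEVICE: prints the no-modification check command
# for the given filesystem type, or fails for types it does not know.
dry_run_check() {
  local fstype=$1 device=$2
  case "$fstype" in
    ext2|ext3|ext4) echo "e2fsck -n $device" ;;
    xfs)            echo "xfs_repair -n $device" ;;
    *)              echo "No check known for filesystem type '$fstype'" >&2
                    return 1 ;;
  esac
}

# Usage (the device name is the example partition from this article):
# DEV=/dev/nvme1n1p1
# sudo $(dry_run_check "$(sudo lsblk -n -o FSTYPE "$DEV")" "$DEV")
```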
For ext4 filesystems:
# Dry-run check first (no modifications); target the partition that holds the filesystem
sudo e2fsck -n /dev/nvme1n1p1
# If errors are found, run with automatic repair
sudo e2fsck -y /dev/nvme1n1p1
For XFS filesystems:
# Dry-run check first (the -n option makes no modifications)
sudo xfs_repair -n /dev/nvme1n1p1
# If errors are found, run repair
sudo xfs_repair /dev/nvme1n1p1
Important: Always run the check on an unmounted filesystem. Running e2fsck or xfs_repair on a mounted filesystem can cause data corruption.
If the filesystem check reports unrecoverable errors, proceed to the section Run specialised file recovery tools (if needed) (Linux) before attempting to mount.
Mount the volume and clean up unnecessary data (Linux) - optional
If the filesystem check was successful, mount the volume:
sudo mkdir -p /mnt/recovery
sudo mount /dev/nvme1n1p1 /mnt/recovery
Remove unnecessary data and empty caches to reduce the volume size and prepare it for use:
# Clear package manager caches
sudo rm -rf /mnt/recovery/var/cache/yum/*
sudo rm -rf /mnt/recovery/var/cache/dnf/*
sudo rm -rf /mnt/recovery/var/cache/apt/archives/*.deb 2>/dev/null
# Clear temporary files
sudo rm -rf /mnt/recovery/tmp/*
sudo rm -rf /mnt/recovery/var/tmp/*
# Clear log files (optional — review before deleting)
sudo find /mnt/recovery/var/log -type f -name "*.gz" -delete
sudo find /mnt/recovery/var/log -type f -name "*.old" -delete
# Clear user-level caches
sudo rm -rf /mnt/recovery/home/*/.cache/*
Run specialised file recovery tools (if needed) (Linux)
If the filesystem check reported errors that could not be fully repaired, or if you suspect data loss, use specialised recovery tools.
Install recovery tools on the temporary instance:
wget https://www.cgsecurity.org/testdisk-7.2.linux26-x86_64.tar.bz2
tar xjf testdisk-7.2.linux26-x86_64.tar.bz2
cd testdisk-7.2
TestDisk — recovers lost partitions and repairs partition tables:
sudo ./testdisk_static /dev/nvme1n1
Follow the interactive menu to analyse the disk, search for lost partitions, and write a corrected partition table if needed.
PhotoRec (bundled with TestDisk) — recovers individual files regardless of filesystem state:
sudo ./photorec_static /dev/nvme1n1
Follow the prompts to select the partition, filesystem type, and output directory for recovered files.
For ext4 filesystems with journal issues:
# Replay the journal (target the partition that holds the filesystem)
sudo e2fsck -y /dev/nvme1n1p1
# If the journal is corrupt, recreate it (last resort)
sudo tune2fs -O ^has_journal /dev/nvme1n1p1
sudo e2fsck -y /dev/nvme1n1p1
sudo tune2fs -j /dev/nvme1n1p1
Filesystem check, mount, and recovery for Windows volumes
Bring the disk online and run chkdsk (Windows)
Connect to the Windows instance via RDP. By default, Windows keeps newly attached EBS volumes offline. See Make an Amazon EBS volume available for use and Resolve offline EBS volume on EC2 Windows instance.
Open PowerShell as Administrator and identify the disk:
Get-Disk
The attached volume will appear with OperationalStatus: Offline and PartitionStyle: MBR or GPT. Note the disk number (e.g., 1).
Bring the disk online read-only first, so chkdsk can scan without the OS writing to it. The -IsOffline and -IsReadOnly parameters are in different parameter sets and must be set in separate calls. Set read-only first, then bring online:
Set-Disk -Number 1 -IsReadOnly $true
Set-Disk -Number 1 -IsOffline $false
Identify the volume and its drive letter(s) or assign one:
Get-Partition -DiskNumber 1
# If no drive letter is assigned:
Get-Partition -DiskNumber 1 | Set-Partition -NewDriveLetter D
Run chkdsk in read-only scan mode for NTFS volumes:
chkdsk D: /scan
The /scan parameter runs an online scan without fixing any errors or taking the disk offline.
If errors are found, make the disk read-write and run the repair:
Set-Disk -Number 1 -IsReadOnly $false
chkdsk D: /f /r
- /f: fixes filesystem errors and requires exclusive access to the volume.
- /r: locates bad sectors and recovers readable information.
Important: chkdsk /f and /r require exclusive access to the volume. Do not run them on the system (C:) drive of a running instance, only on a secondary attached volume as shown here.
Clean up unnecessary data (Windows)
After chkdsk completes successfully, the volume is mounted at the assigned drive letter (e.g., D:\). Remove unnecessary data and caches:
If you have not already done so, set the disk back to read-write mode:
Set-Disk -Number 1 -IsReadOnly $false
# Clear Windows temp files
Remove-Item -Path "D:\Windows\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue
Remove-Item -Path "D:\Users\*\AppData\Local\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue
# Clear Windows Update cache
Remove-Item -Path "D:\Windows\SoftwareDistribution\Download\*" -Recurse -Force -ErrorAction SilentlyContinue
# Clear Windows Prefetch
Remove-Item -Path "D:\Windows\Prefetch\*" -Force -ErrorAction SilentlyContinue
# Clear user-level browser caches (Edge/Chrome)
Remove-Item -Path "D:\Users\*\AppData\Local\Microsoft\Edge\User Data\*\Cache\*" -Recurse -Force -ErrorAction SilentlyContinue
Remove-Item -Path "D:\Users\*\AppData\Local\Google\Chrome\User Data\*\Cache\*" -Recurse -Force -ErrorAction SilentlyContinue
Note: If the volume is a Windows system drive, avoid deleting files under D:\Windows\System32\config (registry hives) or D:\Windows\WinSxS (component store), as these are required for the OS to boot.
Run specialised recovery tools (if needed) (Windows)
If chkdsk reported unrecoverable errors, or if you need to recover deleted files or repair a corrupt partition table, use the following tools.
EC2Rescue for Windows Server — AWS-provided tool for diagnosing and repairing common Windows issues on EC2 instances. See Use EC2Rescue to troubleshoot EC2 Windows instance issues.
# Download EC2Rescue
Invoke-WebRequest -Uri "https://s3.amazonaws.com/ec2rescue/windows/EC2Rescue_latest.zip" -OutFile "$env:TEMP\EC2Rescue.zip"
Expand-Archive -Path "$env:TEMP\EC2Rescue.zip" -DestinationPath "$env:TEMP\EC2Rescue"
# Run EC2Rescue (interactive GUI)
& "$env:TEMP\EC2Rescue\EC2Rescue.exe"
EC2Rescue can fix boot configuration, restore registry hives from backup, and repair common OS-level issues.
Windows Recovery tools for corrupt registry:
If the volume is a Windows system drive with a corrupt registry, you can restore from the automatic backup. See Restore a corrupt registry on an EC2 Windows instance.
# Back up current registry hives
Copy-Item -Path "D:\Windows\System32\config\SYSTEM" -Destination "D:\Windows\System32\config\SYSTEM.bak"
Copy-Item -Path "D:\Windows\System32\config\SOFTWARE" -Destination "D:\Windows\System32\config\SOFTWARE.bak"
# Restore from RegBack (if available)
Copy-Item -Path "D:\Windows\System32\config\RegBack\SYSTEM" -Destination "D:\Windows\System32\config\SYSTEM" -Force
Copy-Item -Path "D:\Windows\System32\config\RegBack\SOFTWARE" -Destination "D:\Windows\System32\config\SOFTWARE" -Force
Note: Windows 10/Server 2019 and later no longer populate RegBack by default. If the directory is empty, EC2Rescue or a System Restore point is the alternative.
Third party options:
- TestDisk for Windows: recovers lost NTFS/FAT partitions and repairs partition tables. Download from cgsecurity.org and run from the recovery instance.
- PhotoRec for Windows (bundled with TestDisk): recovers individual files from damaged NTFS/FAT volumes regardless of filesystem state.
Clean up
After you have finished with the recovery, unmount/offline the volume and clean up the temporary resources.
Linux — unmount:
sudo umount /mnt/recovery
Windows — take the disk offline:
Set-Disk -Number 1 -IsOffline $true
Detach, snapshot, and clean up (both platforms):
# Detach the volume
aws ec2 detach-volume \
--region eu-west-1 \
--volume-id vol-0abc123def456789
# Create a final snapshot of the cleaned volume
aws ec2 create-snapshot \
--region eu-west-1 \
--volume-id vol-0abc123def456789 \
--description "Recovered and cleaned snapshot from us-east-1"
# Terminate the temporary instance
aws ec2 terminate-instances \
--region eu-west-1 \
--instance-ids i-0abc123def456789
# Delete the recovery volume (after snapshot is complete)
aws ec2 delete-volume \
--region eu-west-1 \
--volume-id vol-0abc123def456789
# Delete the local dump file if still on the instance
# rm snap-0abcdef1234567890.img
Related resources
- coldsnap on GitHub
- Use EBS direct APIs to access the contents of an EBS snapshot
- Control access to EBS direct APIs using IAM
- Copy an Amazon EBS snapshot (standard approach)
- Create an Amazon EBS volume
- Attach an Amazon EBS volume to an Amazon EC2 instance
- Make an Amazon EBS volume available for use (Linux)
- How do I migrate my Compute and Container resources to another region?