I'm working on an Image Recipe, and I'm running into an issue with the WebDownload Checksum that has me stumped.

Here's the component configuration that I have:

phases:
  - name: "build"
    steps:
      - action: "WebDownload"
        inputs:
          - algorithm: "SHA256"
            checksum: "31c589c76d96965ddeec3e3d89c0bf5322513dbe3f523dcc8d2352c6167cdc71"
            destination: "/tmp/pdflib.tar.gz"
            source: ""
        maxAttempts: 1
        name: "Download_PHPlib"
schemaVersion: 1

I've verified the checksum on my laptop and on 2 different Amazon Linux 2023 EC2 instances using the following

curl | sha256sum

which consistently results in

31c589c76d96965ddeec3e3d89c0bf5322513dbe3f523dcc8d2352c6167cdc71  -

If instead I do the following

curl -o /tmp/pdflib.tar.gz
sha256sum /tmp/pdflib.tar.gz

I consistently get

31c589c76d96965ddeec3e3d89c0bf5322513dbe3f523dcc8d2352c6167cdc71  /tmp/pdflib.tar.gz

However, when I build an image from the image recipe, it fails with the following logs:

2024-01-18T05:18:38.262000+00:00 0.1.0/1 Phase build
2024-01-18T05:18:38.262000+00:00 0.1.0/1 Step Download_PHPlib
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: STARTED EXECUTION
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Starting download
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Source:, Destination:/tmp/pdflib.tar.gz
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Target destination - /tmp/pdflib.tar.gz
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Creating directories /tmp
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Created directories /tmp
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Checking if destination /tmp/pdflib.tar.gz exists
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Creating file /tmp/pdflib.tar.gz
2024-01-18T05:18:38.264000+00:00 0.1.0/1 WebDownload: Created file /tmp/pdflib.tar.gz
2024-01-18T05:18:38.776000+00:00 0.1.0/1 WebDownload: Received HTTP status code for HEAD request: 200
2024-01-18T05:18:38.776000+00:00 0.1.0/1 WebDownload: Size of source = 48309824 bytes
2024-01-18T05:18:38.776000+00:00 0.1.0/1 WebDownload: Can download source of size 48309824 bytes
2024-01-18T05:18:38.921000+00:00 0.1.0/1 WebDownload: Received HTTP status code for GET request: 200
2024-01-18T05:18:38.921000+00:00 0.1.0/1 WebDownload: Success! Received HTTP status code 200
2024-01-18T05:18:38.921000+00:00 0.1.0/1 WebDownload: Copying HTTP response body to file /tmp/pdflib.tar.gz
2024-01-18T05:18:41.739000+00:00 0.1.0/1 WebDownload: Copied HTTP response body to file /tmp/pdflib.tar.gz
2024-01-18T05:18:41.739000+00:00 0.1.0/1 WebDownload: Matching checksums with algorithm SHA256...
2024-01-18T05:18:41.924000+00:00 0.1.0/1 Waiting for command to complete (command id: eced9d6e-6336-43dd-a8c9-584d0835c902). Attempt number: 1.
2024-01-18T05:18:42.204000+00:00 0.1.0/1 WebDownload: Matching given checksum 31c589c76d96965ddeec3e3d89c0bf5322513dbe3f523dcc8d2352c6167cdc71 with calculated checksum b9f3f512ea646b7158f2ca27c3a3776a1c8faf817c746e6f6924b396b30b32db
2024-01-18T05:18:42.204000+00:00 0.1.0/1 WebDownload: [ ERROR ] Checksums do not match. Deleting file /tmp/pdflib.tar.gz
2024-01-18T05:18:42.219000+00:00 0.1.0/1 WebDownload: Ending download operation - Destination /tmp/pdflib.tar.gz, Error checksums do not match
2024-01-18T05:18:42.219000+00:00 0.1.0/1 WebDownload: [ ERROR ] Operation could not be completed due to error - Source, Destination /tmp/pdflib.tar.gz, Error checksums do not match
2024-01-18T05:18:43.228000+00:00 0.1.0/1 Executor: FINISHED EXECUTION OF ALL DOCUMENTS

I'm really confused about where the checksum b9f3f512ea646b7158f2ca27c3a3776a1c8faf817c746e6f6924b396b30b32db is coming from. I set the Image Builder infrastructure to not delete the instance on completion, so I ran the commands shown above on the actual instance that ran and failed this job, and I get the same sha256sum result as before. I thought maybe it was using sha256hmac rather than sha256sum, but that results in 1cf38be352b26de149250acda5af07ab1819449126a02f88f247ec6b147a6cb1, which doesn't match the SHA256 result that WebDownload reports either.

Is there some obscure implementation of SHA256 that this uses, or is there a bug?

  • To make sure that I was specifying SHA256 correctly, I attempted to specify SHA256SUM as the algorithm, and got the following error

    Invalid action module inputs in phase 'build', step 'Download_PHPlib'. Error: WebDownload: Given algorithm SHA256SUM for source is not one of MD5, SHA1, SHA256, and SHA512

    So it seems that the SHA256 checksum WebDownload calculates is not consistent with what sha256sum reports for the same file.

Copying from my Slack response to Jamie: It looks like EC2 Image Builder is gunzipping the tarball and calculating the checksum on the decompressed .tar rather than on the .tar.gz:

% wget ""
2024-01-18 13:20:21 (3.74 MB/s) - ‘PDFlib-10.0.1-Linux-x64-php.tar.gz’ saved [48309824/48309824]
% shasum -a 256 PDFlib-10.0.1-Linux-x64-php.tar.gz
31c589c76d96965ddeec3e3d89c0bf5322513dbe3f523dcc8d2352c6167cdc71  PDFlib-10.0.1-Linux-x64-php.tar.gz
% gunzip PDFlib-10.0.1-Linux-x64-php.tar.gz
% shasum -a 256 PDFlib-10.0.1-Linux-x64-php.tar
b9f3f512ea646b7158f2ca27c3a3776a1c8faf817c746e6f6924b396b30b32db  PDFlib-10.0.1-Linux-x64-php.tar
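
To repeat that comparison without unpacking anything by hand, a rough sketch like the one below hashes both the file as stored and its gunzipped stream, then reports which of the two matches a given digest. The function name classify_gz_checksum is made up for illustration (it is not part of Image Builder), and the sketch assumes sha256sum and gzip are installed, as they were on the Amazon Linux 2023 instances mentioned in the question.

```shell
#!/usr/bin/env bash
# Sketch: tell whether a SHA-256 digest belongs to a .gz file as stored
# or to its decompressed contents.
# classify_gz_checksum is a hypothetical helper name used only for this example.

classify_gz_checksum() {
  local file="$1" expected="$2"
  local raw unpacked

  # Digest of the bytes on disk (same value `sha256sum file` prints).
  raw="$(sha256sum < "$file" | cut -d' ' -f1)"

  # Digest of the gunzipped stream; the file on disk is left untouched.
  unpacked="$(gzip -dc < "$file" | sha256sum | cut -d' ' -f1)"

  if [ "$expected" = "$raw" ]; then
    echo "compressed"
  elif [ "$expected" = "$unpacked" ]; then
    echo "decompressed"
  else
    echo "neither"
  fi
}

# When run as a script: ./check.sh <file.gz> <expected-sha256>
if [ "$#" -eq 2 ]; then
  classify_gz_checksum "$1" "$2"
fi
```

Passing the downloaded tarball and the "calculated checksum" from the WebDownload log should print "decompressed" if the explanation above holds, while the checksum from the component configuration should print "compressed".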

