Why is my CloudFront distribution returning application/octet-stream content for some files?

I have a CloudFront distribution with an S3 bucket set as its origin. I put two objects, index2.html and index3.html, into the bucket and tested the CloudFront endpoint with curl.

$ curl -isSL d2ai8k3b74ctok.cloudfront.net/index2.html
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 22
Connection: keep-alive
Date: Wed, 27 Sep 2023 08:16:44 GMT
Last-Modified: Wed, 27 Sep 2023 08:12:50 GMT
ETag: "63dc6718a6cc98446a099f6a22d254cf"
x-amz-server-side-encryption: AES256
Accept-Ranges: bytes
Server: AmazonS3
X-Cache: Miss from cloudfront
Via: 1.1 9496dc19277503ce2ac4d4d181a9a432.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: NRT57-P4
X-Amz-Cf-Id: h1pf9sakNTIEA7CQXdBQNBFxZtCbSbV7aw7_l3QWgdIvZrexbDKa_w==

<h1>Hello World!</h1>
$ curl -isSL d2ai8k3b74ctok.cloudfront.net/index3.html
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 22
Connection: keep-alive
Date: Wed, 27 Sep 2023 08:16:48 GMT
Last-Modified: Wed, 27 Sep 2023 08:16:40 GMT
ETag: "63dc6718a6cc98446a099f6a22d254cf"
x-amz-server-side-encryption: AES256
Accept-Ranges: bytes
Server: AmazonS3
X-Cache: Miss from cloudfront
Via: 1.1 17a02959a1dd77a49eeba1ffffcee214.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: NRT57-P4
X-Amz-Cf-Id: EjDzPVhRWRwHNGGf2qXG51DqMY2ig4HbL8gqamTkbouh7WSPB1m2wA==

<h1>Hello World!</h1>

CloudFront returned index2.html as application/octet-stream and index3.html as text/html, even though the two objects have the same content (their ETags are identical).

Why is my CloudFront distribution returning index2.html as application/octet-stream instead of text/html?

The only difference between index2.html and index3.html is that index2.html was created with PutObjectCommand from the AWS SDK for JavaScript, while index3.html was uploaded with the aws s3 cp AWS CLI command.

You can reproduce my configuration by creating a CloudFormation stack from the following template. (Note: index2.html is created automatically by the custom resource.)

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Using CloudFront distribution

Rules:
  TestVirginia:
    Assertions:
      - AssertDescription: Only us-east-1 is allowed
        Assert:
          Fn::Equals:
            - us-east-1
            - Ref: AWS::Region

Resources:
  # S3
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName:
        Fn::Sub: ${AWS::StackName}-s3bucket-${AWS::Region}
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - BucketKeyEnabled: true
            ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256

  S3BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket:
        Ref: S3Bucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: cloudfront.amazonaws.com
            Action:
              - s3:GetObject
            Resource:
              - Fn::Sub: ${S3Bucket.Arn}/*
            Condition:
              StringEquals:
                aws:SourceArn:
                  Fn::Sub: arn:${AWS::Partition}:cloudfront::${AWS::AccountId}:distribution/${Distribution}

  # CloudFront
  NoCachePolicy:
    Type: AWS::CloudFront::CachePolicy
    Properties:
      CachePolicyConfig:
        Name:
          Fn::Sub: ${AWS::StackName}-NoCachePolicy
        Comment: CloudFront no-cache policy
        DefaultTTL: 0
        MinTTL: 0
        MaxTTL: 0
        ParametersInCacheKeyAndForwardedToOrigin:
          EnableAcceptEncodingBrotli: false
          EnableAcceptEncodingGzip: false
          CookiesConfig:
            CookieBehavior: none
          HeadersConfig:
            HeaderBehavior: none
          QueryStringsConfig:
            QueryStringBehavior: none

  OriginAccessControl:
    Type: AWS::CloudFront::OriginAccessControl
    Properties:
      OriginAccessControlConfig:
        Name:
          Fn::Sub: ${AWS::StackName}-OriginAccessControl
        Description: Origin access control for S3
        OriginAccessControlOriginType: s3
        SigningBehavior: always
        SigningProtocol: sigv4

  Distribution:
    Type: AWS::CloudFront::Distribution
    Properties:
      DistributionConfig:
        Comment: CloudFront distribution
        Enabled: true
        Origins:
          - Id:
              Fn::GetAtt: S3Bucket.RegionalDomainName
            DomainName:
              Fn::GetAtt: S3Bucket.RegionalDomainName
            OriginAccessControlId:
              Ref: OriginAccessControl
            S3OriginConfig:
              OriginAccessIdentity: ""
        DefaultCacheBehavior:
          CachePolicyId:
            Ref: NoCachePolicy
          AllowedMethods:
            - GET
            - HEAD
          CachedMethods:
            - GET
            - HEAD
          Compress: false
          TargetOriginId:
            Fn::GetAtt: S3Bucket.RegionalDomainName
          ViewerProtocolPolicy: allow-all
        DefaultRootObject: index.html

  # Custom resource
  S3ObjectFunctionPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName:
        Fn::Sub: ${AWS::StackName}-S3ObjectFunctionPolicy-${AWS::Region}
      Description: Policy for S3ObjectFunction
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: S3AccessPermissions
            Effect: Allow
            Action:
              - s3:PutObject
              - s3:DeleteObject
            Resource: "*"

  S3ObjectFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName:
        Fn::Sub: ${AWS::StackName}-S3ObjectFunctionRole-${AWS::Region}
      Description: Service role for S3ObjectFunction
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - Ref: S3ObjectFunctionPolicy

  S3ObjectFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName:
        Fn::Sub: ${AWS::StackName}-S3ObjectFunction
      Description: Custom resource function that puts and deletes S3 objects
      Role:
        Fn::GetAtt: S3ObjectFunctionRole.Arn
      Architectures:
        - arm64
      Runtime: nodejs18.x
      Handler: index.handler
      Timeout: 30
      InlineCode: |
        const https = require("https");
        const {
          DeleteObjectCommand,
          PutObjectCommand,
          S3Client,
        } = require("@aws-sdk/client-s3");

        const serialize = obj => JSON.stringify(obj, null, 2);

        const sendResponse = async (
          event,
          context,
          status,
          data,
          physicalResourceId,
          noEcho,
        ) => {
          const { StackId, RequestId, LogicalResourceId, ResponseURL } = event;
          const body = serialize({
            Status: status,
            Reason: `See the details in CloudWatch Log Stream: ${context.logStreamName}`,
            PhysicalResourceId: physicalResourceId || context.logStreamName,
            Data: data,
            StackId,
            RequestId,
            LogicalResourceId,
            NoEcho: noEcho || false,
          });

          const { hostname, pathname, search } = new URL(ResponseURL);
          const path = `${pathname}${search}`;
          const headers = {
            "Content-Type": "application/json",
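            // Content-Length is a byte count, which differs from string length for non-ASCII bodies
            "Content-Length": Buffer.byteLength(body),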
          };

          return await new Promise((resolve, reject) => {
            const req = https.request(
              { hostname, port: 443, path, method: "PUT", headers },
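              res => {
                const chunks = [];
                res.on("data", chunk => chunks.push(chunk));
                // Resolve on "end" so that an empty response body still settles the promise
                res.on("end", () => resolve(Buffer.concat(chunks).toString()));
              },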
            );
            req.on("error", e => {
              reject(e.message);
            });
            req.write(body);
            req.end();
          });
        };

        const putObject = async (Bucket, Key, Body) => {
          const client = new S3Client({});
          const command = new PutObjectCommand({ Bucket, Key, Body });
          return await client.send(command);
        };

        const deleteObject = async (Bucket, Key) => {
          const client = new S3Client({});
          const command = new DeleteObjectCommand({ Bucket, Key });
          return await client.send(command);
        };

        exports.handler = async (event, context) => {
          console.log(serialize(event));
          const { ResourceProperties } = event;
          const { Bucket, Key, Body } = ResourceProperties;

          try {
            if (event.RequestType === "Create" || event.RequestType === "Update") {
              await putObject(Bucket, Key, Body);
              return await sendResponse(
                event,
                context,
                "SUCCESS",
                { Bucket, Key, Body },
                `s3://${Bucket}/${Key}`,
              );
            } else if (event.RequestType === "Delete") {
              await deleteObject(Bucket, Key).catch(console.error);
              return await sendResponse(event, context, "SUCCESS");
            } else {
              throw new Error(`Invalid RequestType: ${event.RequestType}`);
            }
          } catch (error) {
            console.error(error);
            return await sendResponse(event, context, "FAILED", {});
          }
        };

  S3ObjectFunctionLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName:
        Fn::Sub: /aws/lambda/${S3ObjectFunction}

  Index2HtmlObject:
    Type: AWS::CloudFormation::CustomResource
    Properties:
      ServiceToken:
        Fn::GetAtt: S3ObjectFunction.Arn
      Bucket:
        Ref: S3Bucket
      Key: index2.html
      Body: |
        <h1>Hello World!</h1>

Outputs:
  DistributionDnsName:
    Description: Distribution domain name
    Value:
      Fn::GetAtt: Distribution.DomainName
1 Answer
Accepted Answer

Hello.

The Content-Type header that CloudFront returns comes from the Content-Type metadata stored on the object in your S3 bucket. S3 does not infer that value from the file extension; it stores whatever the uploading client sends. Some clients, such as the AWS CLI's aws s3 cp command, guess the type from the file extension, while lower-level API calls leave it to the caller. As a result, the same file can end up with different Content-Type metadata depending on how it was uploaded.

In your case, index2.html was created with PutObjectCommand, while index3.html was uploaded with aws s3 cp. The CLI guessed text/html from the .html extension, but the PutObjectCommand call in your custom resource does not specify a ContentType, so index2.html was stored with the generic application/octet-stream type that you see in the response. The ETags match because an ETag reflects only the object's content, not its metadata.
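
To illustrate the kind of extension-based guess that a client has to make on its own, here is a minimal sketch in Node.js. The lookup table and the guessContentType helper are hypothetical names for illustration only; this is not how the AWS CLI is implemented, and a real project would likely use a fuller MIME database.

```javascript
const path = require("path");

// Small illustrative table (an assumption, not an exhaustive MIME database)
const MIME_TYPES = {
  ".html": "text/html",
  ".css": "text/css",
  ".js": "text/javascript",
  ".json": "application/json",
  ".png": "image/png",
};

// Guess a Content-Type from the object key's extension,
// falling back to the generic binary type when the extension is unknown
const guessContentType = key =>
  MIME_TYPES[path.extname(key).toLowerCase()] || "application/octet-stream";

console.log(guessContentType("index2.html")); // text/html
console.log(guessContentType("data.bin")); // application/octet-stream
```

The returned value could then be passed as the ContentType field of the PutObjectCommand input.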

To resolve this issue and ensure that both files are served with the correct Content-Type header, you can do the following:

Check Content-Type Metadata: First, verify the Content-Type metadata for both index2.html and index3.html in your S3 bucket. You can do this using the AWS CLI:

aws s3api head-object --bucket your-bucket-name --key index2.html
aws s3api head-object --bucket your-bucket-name --key index3.html

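These commands display the metadata of each object, including its ContentType. Make sure it is set to text/html for both files. If it is not, you can update it by copying the object onto itself with new metadata. For a copy within S3, --content-type only takes effect together with --metadata-directive REPLACE:

aws s3 cp s3://your-bucket-name/index2.html s3://your-bucket-name/index2.html --content-type "text/html" --metadata-directive REPLACE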
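
If you would rather fix existing objects from code, an in-place copy can also be sketched with the AWS SDK for JavaScript v3. This is a rough sketch under assumptions: buildContentTypeFixInput and fixContentType are hypothetical helper names, and the caller is assumed to have permission to read and write the object. Note that MetadataDirective "REPLACE" drops any existing metadata that is not specified again in the request.

```javascript
// Build the input for CopyObjectCommand that copies an object onto itself
// and replaces its metadata with the given Content-Type
const buildContentTypeFixInput = (Bucket, Key, ContentType) => ({
  Bucket,
  Key,
  // CopySource must be URL-encoded; path separators in the key are kept as-is
  CopySource: `${Bucket}/${Key.split("/").map(encodeURIComponent).join("/")}`,
  ContentType,
  MetadataDirective: "REPLACE",
});

const fixContentType = async (Bucket, Key, ContentType) => {
  // Loaded lazily so that the builder above can be used without the SDK installed
  const { CopyObjectCommand, S3Client } = require("@aws-sdk/client-s3");
  const client = new S3Client({});
  const input = buildContentTypeFixInput(Bucket, Key, ContentType);
  return await client.send(new CopyObjectCommand(input));
};
```

For example, fixContentType("your-bucket-name", "index2.html", "text/html") would rewrite the metadata of index2.html without uploading the body again.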

Best regards, Andrii

answered 7 months ago
reviewed 7 months ago
  • Wow, you are absolutely right! The root cause of the wrong content type was improper metadata of the S3 object.

    I fixed my custom resource to support Content-Type and now it works as expected. Thank you so much!
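
    For anyone hitting the same problem, a fix along these lines might look like the sketch below. It assumes a hypothetical optional ContentType property on the custom resource (for example ContentType: text/html next to Key and Body in the template); putParamsFromProperties is an illustrative name, not the exact code used in the fixed stack.

```javascript
// Build the PutObjectCommand input from the custom resource's properties.
// ContentType is an assumed optional property; it is only included when the
// template provides it, and other properties such as ServiceToken are ignored.
const putParamsFromProperties = ({ Bucket, Key, Body, ContentType }) => ({
  Bucket,
  Key,
  Body,
  ...(ContentType ? { ContentType } : {}),
});

const putObject = async resourceProperties => {
  // Loaded lazily so that the builder above can be used without the SDK installed
  const { PutObjectCommand, S3Client } = require("@aws-sdk/client-s3");
  const client = new S3Client({});
  const input = putParamsFromProperties(resourceProperties);
  return await client.send(new PutObjectCommand(input));
};
```

    In the handler, putObject(event.ResourceProperties) would then store the object with the Content-Type declared in the template.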
