CloudFront VpcOrigin to EC2 Spot Instance

TL;DR: Trying to use VpcOrigin with EC2 Spot, running into escalating amounts of complexity, hoping for simple solution.

Problem Statement: We are deploying a web application on AWS using CloudFormation. Our goal is to maintain a single, active EC2 Spot Instance for cost optimization, leveraging an EC2 Spot Fleet (or Auto Scaling Group) to manage its provisioning, retries, and replacement when capacity is lost. The application must be served via CloudFront.

Current Architecture: Currently, our CloudFront distribution uses a VpcOrigin that points directly to the private DNS name of a single AWS::EC2::Instance. This instance is provisioned as a Spot Instance via a custom CloudFormation resource (SpotInstanceSelector) and is IPv6-only within a custom VPC. This setup is extremely fragile: the cheapest Spot pools often have little capacity, and there is no automated mechanism to update the CloudFront origin when the instance is replaced.

Desired Architecture: We want to transition to an EC2 Spot Fleet (or ASG with a mixed instances policy) with a target capacity of one (1) instance. The purpose of this fleet is solely to manage the Spot request lifecycle, automatically retrying with different instance types or Availability Zones until a single instance is successfully provisioned and maintained. This instance would also be IPv6-only and private.

Critical Architectural Constraints (Cost-Driven): We are explicitly avoiding certain AWS services due to their associated costs:

  1. No Application Load Balancer (ALB): Undesired due to extra fees.
  2. No NAT Gateway: Undesired due to extra fees.
  3. No Public IPv4 Addresses: Instances must remain private and only use IPv6 for outbound and inbound traffic within the VPC.

The Core Challenge: How can we reliably connect our CloudFront distribution (using a VpcOrigin) to this single, dynamically replaced, private, IPv6-only EC2 Spot Instance managed by an EC2 Spot Fleet, without using an Application Load Balancer, given that there is no built-in mechanism to automatically update the CloudFront origin when the instance's private IP/DNS changes?

The VpcOrigin expects a single private DNS name or IP address. When the Spot Fleet replaces the active instance, its private IP/DNS will change. We need a robust mechanism to automatically update the CloudFront VpcOrigin to point to the newly active instance from the fleet.

Considered/Attempted Approaches (and why they are problematic under constraints):

  • ALB: The standard solution for this, but explicitly disallowed due to cost.

  • Custom Lambda to monitor Spot Fleet and update CloudFront Origin: This seems like the only viable custom solution, but it would be complex:

    • The Lambda would need to monitor Spot Fleet events (instance launches/terminations).
    • It would need to identify the single, healthy, active instance from the fleet.
    • It would then need to update the CloudFront distribution's origin with the new instance's private IP/DNS.
    • Concerns remain about the speed of detection, CloudFront propagation time, and potential downtime during transitions.
  • Direct IP/DNS of a single instance from the fleet: Not feasible as the IP/DNS changes upon replacement.

We are looking for guidance on any AWS services, CloudFormation patterns, or custom solutions that can reliably manage CloudFront VpcOrigin updates for a single, dynamically replaced Spot instance from a Spot Fleet, under these specific IPv6-only, cost-constrained conditions, especially given the lack of an existing automated origin update mechanism.

3 Answers

When working with CloudFront VPC Origins and EC2 Spot Instances under your cost constraints, there are a few approaches you could consider:

Using AWS Cloud Map for Service Discovery

AWS Cloud Map could provide a solution for your dynamic instance scenario:

  1. Register your EC2 Spot Instance with Cloud Map when it launches
  2. Use the Cloud Map service discovery DNS name as a stable endpoint for your CloudFront VPC Origin
  3. When a new Spot Instance replaces the old one, it would register itself with Cloud Map, automatically updating the DNS record

This approach provides a consistent DNS endpoint that CloudFront can connect to, regardless of which specific EC2 instance is currently active. The Cloud Map service would handle the DNS updates when instances change.
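As a minimal sketch of step 1, the instance could self-register from its user data via the `servicediscovery` API. The service id, instance id, and address below are placeholders, and `AWS_INSTANCE_IPV6` is assumed to be the Cloud Map attribute backing AAAA records in the service's DNS namespace:

```python
def build_register_params(service_id: str, instance_id: str, ipv6: str) -> dict:
    """Parameters for servicediscovery.register_instance.

    instance_id is any stable key within the service; the EC2 instance id
    works well because replacements naturally get a fresh key.
    """
    return {
        "ServiceId": service_id,
        "InstanceId": instance_id,
        "Attributes": {"AWS_INSTANCE_IPV6": ipv6},
    }


def register_with_cloud_map(service_id: str, instance_id: str, ipv6: str) -> None:
    # Run from user data on the freshly launched Spot instance.
    import boto3
    sd = boto3.client("servicediscovery")
    sd.register_instance(**build_register_params(service_id, instance_id, ipv6))
```

The instance role would need `servicediscovery:RegisterInstance` (and `route53:ChangeResourceRecordSets` on the Cloud Map-managed zone, which Cloud Map handles internally).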

Using EC2 Instance ENI with Static Private IP

Another approach would be to:

  1. Create an Elastic Network Interface (ENI) with a static private IP address
  2. Configure your Spot Fleet to attach this ENI to whichever instance is currently active
  3. Point your CloudFront VPC Origin to this static private IP address

This way, even as instances come and go, the IP address remains constant because it's tied to the ENI rather than the instance.
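One caveat: neither Spot Fleet nor an ASG can attach a pre-existing ENI for you, so step 2 typically means the new instance claims the ENI from its own user data (after the old instance has terminated and the ENI has detached). A sketch, with `eni-...` and the device index as placeholders:

```python
def build_attach_params(eni_id: str, instance_id: str, device_index: int = 1) -> dict:
    """Parameters for ec2.attach_network_interface."""
    return {
        "NetworkInterfaceId": eni_id,
        "InstanceId": instance_id,
        "DeviceIndex": device_index,  # 0 is the primary interface; use 1+
    }


def claim_eni(eni_id: str, instance_id: str) -> None:
    # Run from user data; the instance role needs ec2:AttachNetworkInterface.
    import boto3
    ec2 = boto3.client("ec2")
    ec2.attach_network_interface(**build_attach_params(eni_id, instance_id))
```

Make sure the ENI is created without delete-on-terminate so it survives instance replacement.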

Considerations for Your Setup

For your IPv6-only, private subnet setup:

  • Ensure your VPC routes outbound IPv6 through an egress-only internet gateway (a regular internet gateway would make the instance's IPv6 address publicly reachable, which conflicts with keeping it private)
  • Configure appropriate inbound and outbound rules in your network ACLs
  • Update security groups to allow the CloudFront managed prefix list
  • Verify you have at least one available IPv4 address in your private subnet (required for CloudFront VPC Origins, even in IPv6-focused architectures)

Remember that both your CloudFront distribution and VPC Origin must reside in the same AWS account at this time.

These approaches should allow you to maintain a single active Spot Instance while providing a stable endpoint for CloudFront, without requiring an ALB or NAT Gateway.
Sources
CloudFront VPC Origin with ECS-EC2 (non-Fargate) Service | AWS re:Post
Use CloudFront VPC origins to secure applications | AWS re:Post
Introducing CloudFront Virtual Private Cloud (VPC) Origins: Shield your web applications from public internet | Networking & Content Delivery

answered 2 months ago
  • I'm trying the Cloud Map option described here

Something we've often used successfully when we need an ASG with only a single auto-replaced instance (e.g. a license server) is adding commands in the instance bootstrap to update a DNS A or AAAA record pointing to the new instance's IP address. It's very simple and effective.

answered 2 months ago
  • I'm trying to do something similar to this but with Cloud Map DNS instead of Route 53.

Here’s a design that keeps CloudFront’s VpcOrigin pointing at a stable private DNS name (Route 53 Private Hosted Zone), while your Spot-backed ASG freely replaces the single instance behind that name.

Key idea: never update the CloudFront origin again. Instead, point it once at origin.internal (a Route 53 PHZ record), and automate updating that DNS record to the current instance’s private IPv6 when replacements happen.

That gives you:

  • No ALB (and no NAT GW / public IPv4)
  • Works with a single, private, IPv6-only instance
  • Fast, simple failover: update DNS; CloudFront follows on next DNS refresh
  • No CloudFront distribution update/propagation each time

Below are two clean ways to “take action when a new instance is created” and rotate the AAAA record to the new instance. Pick one (or use both; the lifecycle-hook version gives tighter control).


0) One-time setup (static)

  1. VpcOrigin → stable name: Create a Route 53 Private Hosted Zone (PHZ) for internal.example.com and associate it with the VPC your CloudFront VPC origin uses. Set your CloudFront origin’s “Domain name” to origin.internal.example.com (it never changes again).

  2. Initial DNS record: Create an AAAA record for origin.internal.example.com with a low TTL (e.g., 30–60 seconds). We’ll keep this record updated to the single instance’s primary private IPv6.

  3. ASG (target capacity = 1): Use a mixed instances policy if you like; IPv6-only subnet; no public IPv4. Attach an instance profile that allows a very narrow Route 53 change permission (see the IAM policy below).
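The PHZ in step 1 can be created once via boto3; a sketch with the zone name, VPC id, region, and caller reference as placeholders:

```python
def build_phz_params(zone_name: str, vpc_id: str, vpc_region: str,
                     caller_ref: str) -> dict:
    """Parameters for route53.create_hosted_zone (private zone variant).

    caller_ref must be unique per creation attempt; a timestamp or UUID
    is the usual choice.
    """
    return {
        "Name": zone_name,
        "CallerReference": caller_ref,
        "HostedZoneConfig": {"PrivateZone": True},
        "VPC": {"VPCRegion": vpc_region, "VPCId": vpc_id},
    }


def create_private_zone(zone_name: str, vpc_id: str, vpc_region: str,
                        caller_ref: str) -> str:
    import boto3
    r53 = boto3.client("route53")
    resp = r53.create_hosted_zone(
        **build_phz_params(zone_name, vpc_id, vpc_region, caller_ref))
    return resp["HostedZone"]["Id"]  # e.g. "/hostedzone/Z..."
```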


1) Approach A — Update DNS from User Data (self-register on launch)

Let the instance, as soon as it boots, discover its own private IPv6 address and UPSERT the Route 53 AAAA record to itself. This is simple and has zero moving parts.

Minimal IAM permissions (instance role)

Scope this to your hosted zone only (replace Z1234567890ABC with your hosted zone id):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ChangeOnlyThisHostedZone",
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": "arn:aws:route53:::hostedzone/Z1234567890ABC"
    },
    {
      "Sid": "ListHostedZonesForSafety",
      "Effect": "Allow",
      "Action": ["route53:ListHostedZonesByName"],
      "Resource": "*"
    }
  ]
}

User Data (Amazon Linux) — one-shot “claim the name”

  • Reads primary private IPv6 from IMDSv2
  • UPSERTs the AAAA for origin.internal.example.com to that address
  • (Optional) health-gates with a tiny local check before finishing
#!/bin/bash
set -euo pipefail

# Get an IMDSv2 session token
TOKEN=$(curl -sX PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
imds() { curl -s -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/$1"; }

# First IPv6 address of the primary interface (each entry in the macs listing ends with '/')
MAC=$(imds network/interfaces/macs/ | head -n1)
IPV6=$(imds "network/interfaces/macs/${MAC}ipv6s" | head -n1)

HOSTED_ZONE_ID="Z1234567890ABC"
NAME="origin.internal.example.com."
TTL=30

# Write the Route 53 change batch
cat >/tmp/r53-upsert.json <<EOF
{
  "Comment": "Point origin.internal to current instance private IPv6",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "${NAME}",
      "Type": "AAAA",
      "TTL": ${TTL},
      "ResourceRecords": [{ "Value": "${IPV6}" }]
    }
  }]
}
EOF

# Apply the change
aws route53 change-resource-record-sets \
  --hosted-zone-id "${HOSTED_ZONE_ID}" \
  --change-batch file:///tmp/r53-upsert.json

# (Optional) wait until DNS resolves to me inside the VPC resolver
sleep 5
getent ahosts ${NAME} || true
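The final `getent` check can also be done from Python if your bootstrap already runs it; a small helper that asks the local resolver for the AAAA addresses of the name (origin.internal.example.com in the script above):

```python
import socket


def resolve_aaaa(name: str) -> list:
    """Return the IPv6 addresses the local resolver currently sees for name.

    Returns an empty list when the name does not resolve (yet), which lets a
    bootstrap loop poll until the UPSERT has propagated to the VPC resolver.
    """
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET6)
    except socket.gaierror:
        return []
    # info[4][0] is the address portion of the sockaddr tuple
    return sorted({info[4][0] for info in infos})
```

Polling this until it returns the instance's own IPv6 gives a cheap readiness gate before the user-data script exits.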

Pros

  • Easiest: no extra AWS resources.
  • Fast: updates happen as soon as the OS boots.

Cons

  • If two instances boot briefly (e.g., capacity thrash), the “last writer wins.” With ASG desired = 1 that’s acceptable, but see the lifecycle-hook approach for stricter control.
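Route 53 has no conditional writes, but you can reduce last-writer thrash by reading the record first and skipping the UPSERT when it already points at this instance. A sketch (zone id and record name are the same placeholders as above; note Route 53 returns names with a trailing dot):

```python
def should_claim(current_values: list, my_ipv6: str) -> bool:
    """Skip the UPSERT when the record already points only at this instance."""
    return current_values != [my_ipv6]


def current_aaaa_values(zone_id: str, name: str) -> list:
    """Fetch the current AAAA values for `name` (must include trailing dot)."""
    import boto3
    r53 = boto3.client("route53")
    resp = r53.list_resource_record_sets(
        HostedZoneId=zone_id,
        StartRecordName=name,
        StartRecordType="AAAA",
        MaxItems="1",
    )
    for rrset in resp["ResourceRecordSets"]:
        if rrset["Name"] == name and rrset["Type"] == "AAAA":
            return [rr["Value"] for rr in rrset["ResourceRecords"]]
    return []
```

This is best-effort only (two instances can still interleave the read and write), but with desired capacity 1 it keeps the common path idempotent.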

2) Approach B — Update DNS via ASG Lifecycle Hook at Pending:Wait

Use a lifecycle hook to pause the instance at Pending:Wait, run a Lambda (or SSM) to:

  1. Confirm the instance is the only desired capacity and is healthy enough,
  2. Update the Route 53 AAAA record to this instance’s private IPv6,
  3. Call CompleteLifecycleAction → instance proceeds to InService.

This avoids any race and guarantees CloudFront’s DNS points to the right replacement before the instance becomes active.

CloudFormation sketch (core pieces)

Resources:
  Asg:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: '1'
      MaxSize: '1'
      DesiredCapacity: '1'
      VPCZoneIdentifier: [subnet-abc123]
      LaunchTemplate: { LaunchTemplateId: !Ref Lt, Version: !GetAtt Lt.LatestVersionNumber }
      LifecycleHookSpecificationList:
        - LifecycleHookName: OnLaunchWait
          LifecycleTransition: autoscaling:EC2_INSTANCE_LAUNCHING
          HeartbeatTimeout: 300          # keep short to avoid extra cost
          DefaultResult: CONTINUE        # safety
          NotificationTargetARN: !Ref HookTopic
          RoleARN: !GetAtt HookRole.Arn

  HookTopic:
    Type: AWS::SNS::Topic

  HookRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal: { Service: [autoscaling.amazonaws.com] }
            Action: sts:AssumeRole
      Policies:
        - PolicyName: AllowASGToNotify
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action: "sns:Publish"
                Resource: !Ref HookTopic

  HookFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Timeout: 30
      Role: !GetAtt HookFunctionRole.Arn
      Environment:
        Variables:
          HOSTED_ZONE_ID: Z1234567890ABC
          RECORD_NAME: origin.internal.example.com.
          TTL: '30'
          ASG_NAME: !Ref Asg
      Code:
        ZipFile: |
          import os, json, boto3
          r53 = boto3.client('route53')
          asg = boto3.client('autoscaling')
          ec2 = boto3.client('ec2')

          def handler(event, ctx):
              msg = json.loads(event['Records'][0]['Sns']['Message'])
              inst_id = msg['EC2InstanceId']
              asg_name = msg['AutoScalingGroupName']

              # ensure this instance is the desired one (capacity=1)
              g = asg.describe_auto_scaling_groups(AutoScalingGroupNames=[asg_name])['AutoScalingGroups'][0]
              if int(g['DesiredCapacity']) != 1:
                  # if you later change capacity, adjust logic here
                  pass

              # fetch the instance's primary private IPv6
              inst = ec2.describe_instances(InstanceIds=[inst_id])['Reservations'][0]['Instances'][0]
              ipv6s = [ip['Ipv6Address'] for iface in inst['NetworkInterfaces'] for ip in iface.get('Ipv6Addresses', [])]
              if not ipv6s:
                  raise RuntimeError('no IPv6 address found on ' + inst_id)
              ipv6 = ipv6s[0]

              r53.change_resource_record_sets(
                  HostedZoneId=os.environ['HOSTED_ZONE_ID'],
                  ChangeBatch={
                    "Comment": "ASG launch: point origin to new instance",
                    "Changes": [{
                      "Action": "UPSERT",
                      "ResourceRecordSet": {
                        "Name": os.environ['RECORD_NAME'],
                        "Type": "AAAA",
                        "TTL": int(os.environ['TTL']),
                        "ResourceRecords": [{ "Value": ipv6 }]
                      }
                    }]
                  }
              )

              # let the instance proceed
              asg.complete_lifecycle_action(
                  AutoScalingGroupName=asg_name,
                  LifecycleHookName=msg['LifecycleHookName'],
                  LifecycleActionToken=msg['LifecycleActionToken'],
                  LifecycleActionResult='CONTINUE'
              )

              return {'status':'ok'}

  HookSub:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref HookTopic
      Protocol: lambda
      Endpoint: !GetAtt HookFunction.Arn

  HookInvokePerm:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref HookFunction
      Action: lambda:InvokeFunction
      Principal: sns.amazonaws.com
      SourceArn: !Ref HookTopic

  HookFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal: { Service: [lambda.amazonaws.com] }
            Action: sts:AssumeRole
      Policies:
        - PolicyName: AllowDnsAndAsg
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action: ["route53:ChangeResourceRecordSets"]
                Resource: "arn:aws:route53:::hostedzone/Z1234567890ABC"
              - Effect: Allow
                Action:
                  - autoscaling:CompleteLifecycleAction
                  - autoscaling:DescribeAutoScalingGroups
                Resource: "*"
              - Effect: Allow
                Action: ec2:DescribeInstances
                Resource: "*"
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: "*"

Pros

  • Deterministic: DNS is updated before the instance goes InService.
  • Avoids any race if a previous instance is still shutting down.
  • Centralized: logic lives in Lambda, not on the instance.

Cons

  • Slightly more moving parts (SNS + Lambda).
  • Keep the hook timeout reasonable (e.g., 300 s) to avoid waiting costs.

Termination cleanup (optional but nice)

If you want to avoid a brief period where the old instance still holds the name, you can add a termination lifecycle hook (Terminating:Wait) that clears or moves the record only if the instance still matches the current value. In practice, launch-time UPSERT (above) is typically enough because it overwrites the old value.
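The "only if it still matches" guard for such a termination hook can be a pure function; Route 53 DELETE conveniently refuses to apply unless the record set matches exactly, which doubles as a built-in safety check. A sketch (record name and addresses are placeholders):

```python
def build_conditional_delete(name: str, ttl: int, record_values: list,
                             my_ipv6: str):
    """Return a Route 53 DELETE change batch, or None if the record moved on.

    Called from a Terminating:Wait hook handler with the record's current
    values and the dying instance's IPv6. If a replacement instance has
    already claimed the name, we leave it alone.
    """
    if record_values != [my_ipv6]:
        return None  # a newer instance owns the name; do nothing
    return {
        "Comment": "Terminating instance releases the origin name",
        "Changes": [{
            "Action": "DELETE",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "AAAA",
                "TTL": ttl,
                "ResourceRecords": [{"Value": my_ipv6}],
            },
        }],
    }
```

If this returns a batch, pass it to `change_resource_record_sets` as in the launch handler; if it returns None, just complete the lifecycle action.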


Health & downtime considerations

  • TTL: Keep the AAAA TTL low (30–60 s). That’s your switchover time budget.

  • Bootstrap: If your app needs a minute to warm up, do it before DNS flips:

    • Lifecycle-hook flow can warm (SSM, local checks), then UPSERT, then CONTINUE.
    • User-data flow can warm, then UPSERT at the end.
  • Origin failover: CloudFront doesn’t health-check VPC origins the way an ALB health-checks its targets; if you want a safety net, define a second standby origin (e.g., a minimal static “maintenance” page served from a private S3 bucket via a VPC origin) in an origin group.
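If you go the origin-group route, the relevant DistributionConfig fragment looks roughly like this (origin ids "vpc-primary" and "s3-maintenance" are placeholders for your two configured origins; failover fires on the listed status codes and on connection failures, for GET/HEAD only):

```python
def build_origin_group(group_id: str, primary_id: str, fallback_id: str,
                       status_codes=(500, 502, 503, 504)) -> dict:
    """OriginGroups fragment for a CloudFront DistributionConfig.

    Both member OriginIds must already exist in the distribution's Origins.
    """
    return {
        "Quantity": 1,
        "Items": [{
            "Id": group_id,
            "FailoverCriteria": {
                "StatusCodes": {
                    "Quantity": len(status_codes),
                    "Items": list(status_codes),
                },
            },
            "Members": {
                "Quantity": 2,
                "Items": [
                    {"OriginId": primary_id},    # tried first
                    {"OriginId": fallback_id},   # used after failover
                ],
            },
        }],
    }
```

Cache behaviors then target the group's Id instead of an individual origin, so this is still a one-time distribution change rather than a per-replacement update.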


If any part of this explanation was helpful in solving the issue, feel free to let me know — happy to help further.

answered 2 months ago
  • I'm trying to do something similar to this but with Cloud Map DNS instead of Route 53.
