CannotPullContainerError in a public VPC/subnet. What am I missing or doing wrong?

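I created a brand-new AWS account (just to troubleshoot this problem), and the default VPC and subnets in every region of this account are pristine and unmodified.

This is the default VPC in us-east-1: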

$ aws ec2 describe-vpcs
{
    "Vpcs": [
        {
            "CidrBlock": "172.31.0.0/16",
            "DhcpOptionsId": "dopt-095a7873b289557a1",
            "State": "available",
            "VpcId": "vpc-08ba51697a37c5ad9",
            "OwnerId": "...",
            "InstanceTenancy": "default",
            "CidrBlockAssociationSet": [
                {
                    "AssociationId": "vpc-cidr-assoc-0dba5df7b176877b7",
                    "CidrBlock": "172.31.0.0/16",
                    "CidrBlockState": {
                        "State": "associated"
                    }
                }
            ],
            "IsDefault": true
        }
    ]
}

This is the route table for this VPC:

$ aws ec2 describe-route-tables --filters Name=vpc-id,Values=vpc-08ba51697a37c5ad9
{
    "RouteTables": [
        {
            "Associations": [
                {
                    "Main": true,
                    "RouteTableAssociationId": "rtbassoc-08e6f9833f341f6c4",
                    "RouteTableId": "rtb-000d61d5d0236d276",
                    "AssociationState": {
                        "State": "associated"
                    }
                }
            ],
            "PropagatingVgws": [],
            "RouteTableId": "rtb-000d61d5d0236d276",
            "Routes": [
                {
                    "DestinationCidrBlock": "172.31.0.0/16",
                    "GatewayId": "local",
                    "Origin": "CreateRouteTable",
                    "State": "active"
                },
                {
                    "DestinationCidrBlock": "0.0.0.0/0",
                    "GatewayId": "igw-0b7ed209f5cd38fa6",
                    "Origin": "CreateRoute",
                    "State": "active"
                }
            ],
            "Tags": [],
            "VpcId": "vpc-08ba51697a37c5ad9",
            "OwnerId": "..."
        }
    ]
}

As you can see, the second route allows egress to the internet:

{
    "DestinationCidrBlock": "0.0.0.0/0",
    "GatewayId": "igw-0b7ed209f5cd38fa6",
    "Origin": "CreateRoute",
    "State": "active"
}

So I assumed that an ECS Fargate task deployed into this VPC should be able to pull amazoncorretto:17-alpine3.15 from docker.io.

Nevertheless, whenever I deploy the CloudFormation stack, ECS cannot run the scheduled task because it fails to pull the image from Docker Hub and reports this error:

CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "docker.io/library/amazoncorretto:17-alpine3.15": failed to do request: Head https://registry-1.docker.io/v2/library/amazoncorretto/manifests/17-alpine3.15: dial ...

This is my CloudFormation template (I deliberately granted wide-open permissions to every role involved, to rule out insufficient IAM permissions as the cause):

AWSTemplateFormatVersion: "2010-09-09"
Description: ECS Cron Task
Parameters:
  AppName:
    Type: String
    Default: CronTask

  AppImage:
    Type: String
    Default: amazoncorretto:17-alpine3.15

  AppLogGroup:
    Type: String
    Default: ECS

  AppLogPrefix:
    Type: String
    Default: CronTask

  ScheduledTaskSubnets:
    Type: List<AWS::EC2::Subnet::Id>
    Default: "subnet-0031a6eaf7e52173c, subnet-01950a0d2d1e04dc1, subnet-0a1aa70f0421e2025, subnet-036abb95995a86c73, subnet-0f8b5043babfb9a7e, subnet-07cb2210ce2d5bb8f"

Resources:
  Cluster:
    Type: AWS::ECS::Cluster

  TaskRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
      Policies:
        - PolicyName: AdminAccess
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Action: "*"
                Effect: Allow
                Resource: "*"

  TaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
      Path: /
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
      Policies:
        - PolicyName: AdminAccess
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Action: "*"
                Effect: Allow
                Resource: "*"

  TaskScheduleRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: events.amazonaws.com
      Path: /
      Policies:
        - PolicyName: AdminAccess
          PolicyDocument:
            Statement:
              - Action: "*"
                Effect: Allow
                Resource: "*"

  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Cpu: 256
      Memory: 512
      NetworkMode: awsvpc
      TaskRoleArn: !Ref TaskRole
      ExecutionRoleArn: !Ref TaskExecutionRole
      Family: !Ref AppName
      RequiresCompatibilities:
        - FARGATE
      ContainerDefinitions:
        - Name: !Ref AppName
          Image: !Ref AppImage
          Command: ["java", "--version"]
          Essential: true
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-create-group: true
              awslogs-group: !Ref AppLogGroup
              awslogs-region: !Ref "AWS::Region"
              awslogs-stream-prefix: !Ref AppLogPrefix

  TaskSchedule:
    Type: AWS::Events::Rule
    DependsOn: 
      - TaskScheduleRole
      - DeadLetterQueue
    Properties:
      Description: Trigger the task once every minute
      ScheduleExpression: cron(0/1 * * * ? *)
      State: ENABLED
      Targets:
        - Arn: !GetAtt Cluster.Arn
          Id: ClusterTarget
          RoleArn: !GetAtt TaskScheduleRole.Arn
          DeadLetterConfig:
            Arn: !GetAtt DeadLetterQueue.Arn
          EcsParameters:
            LaunchType: FARGATE
            TaskCount: 1
            TaskDefinitionArn: !Ref TaskDefinition
            NetworkConfiguration:
              AwsVpcConfiguration:
                Subnets: !Ref ScheduledTaskSubnets

  DeadLetterQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: "CronTaskDeadLetterQueue"

  DeadLetterQueuePolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      Queues:
        - !Ref DeadLetterQueue
      PolicyDocument:
        Statement:
          - Action: "*"
            Effect: Allow
            Resource: "*"

What am I missing here? Why can't AWS pull the image from docker.io even though the task runs in a public subnet/VPC (see below)? Is something missing from my TaskSchedule resource?

TaskSchedule:
  Type: AWS::Events::Rule
  ...
  Properties:
    ...
    Targets:
      - ...
        EcsParameters:
          LaunchType: FARGATE
          TaskCount: 1
          TaskDefinitionArn: !Ref TaskDefinition
          NetworkConfiguration:
            AwsVpcConfiguration:
              Subnets: !Ref ScheduledTaskSubnets

Thanks in advance.

1 Answer

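Your task cannot communicate with the internet because it has not been assigned a public IP address. A route to an internet gateway only carries traffic for network interfaces that have a public IP. You need to add AssignPublicIp: ENABLED to your AwsVpcConfiguration. See https://docs.aws.amazon.com/eventbridge/latest/APIReference/API_AwsVpcConfiguration.html for details.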
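
As a minimal sketch of that change, the excerpt below repeats the EcsParameters block from the question's TaskSchedule target and adds the one new property. Everything except the AssignPublicIp line is copied from the question. The sketch assumes the task stays in the public default subnets; it has not been deployed against the asker's account.

```yaml
# Excerpt of the TaskSchedule target from the question.
# Only the AssignPublicIp line is new; the rest is unchanged.
EcsParameters:
  LaunchType: FARGATE
  TaskCount: 1
  TaskDefinitionArn: !Ref TaskDefinition
  NetworkConfiguration:
    AwsVpcConfiguration:
      # Gives the task's network interface a public IP so that traffic to
      # registry-1.docker.io can leave through the internet gateway route.
      AssignPublicIp: ENABLED
      Subnets: !Ref ScheduledTaskSubnets
```

If the task should not have a public IP, the usual alternative is to run it in private subnets whose default route points at a NAT gateway.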

EXPERT
answered 5 months ago
