AWS IoT Greengrass: How does the deployment Rollback function work? What kind of errors does it roll back?


In AWS IoT Greengrass, there is a Rollback option for deployments under "Deployment Policies". If I understood correctly, it rolls devices back to their previous configuration if the deployment fails. I wanted to test this, so I purposely built a non-working ECR Docker image and deployed it through a Greengrass component. (I basically introduced a Python FileNotFoundError by having my Dockerfile run a nonexistent Python script.)

Before that, I had a working container running. What I would like to see is my device rolling back to running the old (working) container after realizing that the container failed. However, this doesn't happen. Only the device state changes to unhealthy in the AWS console.

Now my question: What kind of errors is this Rollback function able to detect/handle? And do you have any suggestions on how I could achieve my goal of rolling back the device if the Docker CMD or any file therein produces an error?

Thanks a lot for your help!

2 Answers

Hi. Given that your device is unhealthy, it seems the rollback didn't occur or it failed. Please check the device deployment status (describe-job-execution) as described here (or look in the console): https://docs.aws.amazon.com/greengrass/v2/developerguide/check-deployment-status.html#check-device-deployment-status

And please check the Greengrass and component logs as described here: https://docs.aws.amazon.com/greengrass/v2/developerguide/monitor-logs.html
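As a rough sketch of the log check, the helper below scans a Greengrass log file for lines that report a component in the BROKEN state and prints the affected component names. The function name `broken_components` is hypothetical, and it assumes the `serviceName=..., currentState=...` key/value layout that Greengrass log lines typically use; adjust the pattern if your logs look different.

```shell
#!/bin/sh
# broken_components LOGFILE
# Prints the name of every component that LOGFILE reports as BROKEN,
# one name per line, without duplicates.
# Assumes log lines contain "serviceName=<name>" and "currentState=BROKEN".
broken_components() {
    grep 'currentState=BROKEN' "$1" \
        | sed -n 's/.*serviceName=\([^,}]*\).*/\1/p' \
        | sort -u
}

# Example call (path assumes the default /greengrass/v2 root folder):
#   broken_components /greengrass/v2/logs/greengrass.log
```

Reading files under the Greengrass root folder usually requires root permissions, so the example call may need to be run with sudo.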

Greg_B (AWS, Expert)
answered a year ago
  • Thanks for your answer. I've been trying to replace one component of my deployment with a new, different one (that is meant to fail). Maybe the rollback only works between different versions of the same component?


The job execution always stated SUCCESSFUL but the device is unhealthy.

This is my Component description:

{
  "RecipeFormatVersion": "2020-01-25",
  "ComponentName": "com.example.MyPrivateDockerComponent",
  "ComponentVersion": "1.1.6",
  "ComponentType": "aws.greengrass.generic",
  "ComponentDescription": "A component that runs a Docker container from a private Amazon ECR image.",
  "ComponentPublisher": "Amazon",
  "ComponentDependencies": {
    "aws.greengrass.DockerApplicationManager": {
      "VersionRequirement": ">=2.0.0 <2.1.0",
      "DependencyType": "HARD"
    },
    "aws.greengrass.TokenExchangeService": {
      "VersionRequirement": ">=2.0.0 <2.1.0",
      "DependencyType": "HARD"
    }
  },
  "Manifests": [
    {
      "Platform": {
        "os": "all"
      },
      "Lifecycle": {
        "Run": "docker run --rm 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror",
        "Shutdown": "docker stop $(docker ps -q --filter ancestor=242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror)"
      },
      "Artifacts": [
        {
          "Uri": "docker:242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror",
          "Unarchive": "NONE",
          "Permission": {
            "Read": "OWNER",
            "Execute": "NONE"
          }
        }
      ]
    }
  ],
  "Lifecycle": {}
}

And these are the component errors I get on CloudWatch:

[WARN] (Copier) com.example.MyPrivateDockerComponent: stderr. Usage:  docker stop [OPTIONS] CONTAINER [CONTAINER...]. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN

and

2023-03-27T09:11:24.110Z [WARN] (pool-2-thread-14) com.example.MyPrivateDockerComponent: shell-runner-error. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN, command=["docker stop $(docker ps -q --filter ancestor=242944196659.dkr.ecr.eu-central-1..."]}

So apparently there is a problem with the Shutdown command I defined. Because the container exits and is removed immediately after it fails, the Shutdown command can no longer find a container to stop. But is that really the reason why the Rollback doesn't work and the device becomes unhealthy?
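The "Usage: docker stop ..." message is what `docker stop` prints when it is called without any container ID, which happens here when `docker ps -q` returns nothing. A minimal sketch of a guarded stop is shown below. It only addresses the Shutdown error in the log and is not a confirmed explanation of the missing rollback; `stop_by_ancestor` is a hypothetical helper name, and the image reference is copied from the recipe.

```shell
#!/bin/sh
# Image reference copied from the recipe above.
IMAGE="242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror"

# stop_by_ancestor IMAGE
# Stops every running container that was started from IMAGE.
# Returns 0 without calling "docker stop" when no such container is running.
stop_by_ancestor() {
    ids=$(docker ps -q --filter "ancestor=$1")
    if [ -z "$ids" ]; then
        echo "no running container for $1, nothing to stop"
        return 0
    fi
    # Word splitting is intentional: one argument per container ID.
    docker stop $ids
}

# Example call, e.g. from a script referenced by the Shutdown lifecycle step:
#   stop_by_ancestor "$IMAGE"
```

With this guard, the Shutdown step exits cleanly when the container has already been removed by `--rm`, so the usage error should no longer show up in the component log.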

answered a year ago
