Hey there,
I am trying to test the rollback function when deploying a Docker container to a fleet of Raspberry Pis. To do so, I first deployed container 1, which calls a Python script that prints "Hello, world!" to the console. I then created a deliberately broken container 2, whose docker command tries to execute a Python script that does not exist. When I revise the deployment to include the component for container 2 instead of the previously running container 1, the component fails as expected and enters the broken state (currentState=BROKEN). However, no rollback to the previously working deployment with container 1 occurs. Why not?
The deployment status always shows "Succeeded" but the device status turns to "Unhealthy".
My deployment.json is as follows:
{
  "targetArn": "arn:aws:iot:eu-central-1:242944196659:thinggroup/flappiedoors",
  "revisionId": "40",
  "deploymentId": "ba6b2009-15c8-4b7b-ab90-905211bb3894",
  "deploymentName": "test_deployments",
  "deploymentStatus": "ACTIVE",
  "iotJobId": "1f18b898-9d95-4890-97c4-4c1ee6a68282",
  "iotJobArn": "arn:aws:iot:eu-central-1:242944196659:job/1f18b898-9d95-4890-97c4-4c1ee6a68282",
  "components": {
    "aws.greengrass.LogManager": {
      "componentVersion": "2.3.1",
      "configurationUpdate": {
        "merge": "{\"logsUploaderConfiguration\":{\"systemLogsConfiguration\":{\"uploadToCloudWatch\":\"true\",\"deleteLogFileAfterCloudUpload\":\"true\"},\"componentLogsConfigurationMap\":{\"com.example.MyPrivateDockerComponent\":{\"deleteLogFileAfterCloudUpload\":\"true\"}}}}"
      },
      "runWith": {}
    },
    "aws.greengrass.SecureTunneling": {
      "componentVersion": "1.0.13"
    },
    "com.example.MyPrivateDockerComponent": {
      "componentVersion": "2.0.0"
    }
  },
  "deploymentPolicies": {
    "failureHandlingPolicy": "ROLLBACK",
    "componentUpdatePolicy": {
      "timeoutInSeconds": 60,
      "action": "NOTIFY_COMPONENTS"
    }
  },
  "iotJobConfiguration": {
    "jobExecutionsRolloutConfig": {
      "maximumPerMinute": 1000
    }
  },
  "creationTimestamp": "2023-03-27T12:31:28.764Z",
  "isLatestForTarget": true,
  "tags": {}
}
For reference, this is the component recipe for the corresponding Docker containers. The only things I change between the two are the "ComponentVersion" and the container tag in the "Run" and "Shutdown" commands.
{
  "RecipeFormatVersion": "2020-01-25",
  "ComponentName": "com.example.MyPrivateDockerComponent",
  "ComponentVersion": "2.0.0",
  "ComponentType": "aws.greengrass.generic",
  "ComponentDescription": "A component that runs a Docker container from a private Amazon ECR image.",
  "ComponentPublisher": "Amazon",
  "ComponentDependencies": {
    "aws.greengrass.DockerApplicationManager": {
      "VersionRequirement": ">=2.0.0 <2.1.0",
      "DependencyType": "HARD"
    },
    "aws.greengrass.TokenExchangeService": {
      "VersionRequirement": ">=2.0.0 <2.1.0",
      "DependencyType": "HARD"
    }
  },
  "Manifests": [
    {
      "Platform": {
        "os": "all"
      },
      "Lifecycle": {
        "Run": "docker run 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:0.0.1",
        "Shutdown": "docker stop $(docker ps -a -q --filter ancestor=242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:0.0.1)"
      },
      "Artifacts": [
        {
          "Uri": "docker:242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror",
          "Unarchive": "NONE",
          "Permission": {
            "Read": "OWNER",
            "Execute": "NONE"
          }
        }
      ]
    }
  ],
  "Lifecycle": {}
}
These are my component logs:
2023-03-27T12:33:19.673Z [INFO] (pool-2-thread-33) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=STARTING, command=["docker run 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror"]}
2023-03-27T12:33:21.952Z [WARN] (Copier) com.example.MyPrivateDockerComponent: stderr. python3: can't open file 'hello_world.py': [Errno 2] No such file or directory. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:22.779Z [INFO] (Copier) com.example.MyPrivateDockerComponent: Run script exited. {exitCode=2, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:22.807Z [INFO] (pool-2-thread-31) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=STOPPING, command=["docker stop $(docker ps -a -q --filter ancestor=242944196659.dkr.ecr.eu-centra..."]}
2023-03-27T12:33:23.546Z [INFO] (Copier) com.example.MyPrivateDockerComponent: stdout. f07d93c3c983. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=STOPPING}
2023-03-27T12:33:23.594Z [INFO] (pool-2-thread-31) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=STARTING, command=["docker run 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror"]}
2023-03-27T12:33:25.985Z [WARN] (Copier) com.example.MyPrivateDockerComponent: stderr. python3: can't open file 'hello_world.py': [Errno 2] No such file or directory. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:26.714Z [INFO] (Copier) com.example.MyPrivateDockerComponent: Run script exited. {exitCode=2, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:26.756Z [INFO] (pool-2-thread-31) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=STOPPING, command=["docker stop $(docker ps -a -q --filter ancestor=242944196659.dkr.ecr.eu-centra..."]}
2023-03-27T12:33:27.511Z [INFO] (Copier) com.example.MyPrivateDockerComponent: stdout. 4ba1ed3b2ae0. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=STOPPING}
2023-03-27T12:33:27.513Z [INFO] (Copier) com.example.MyPrivateDockerComponent: stdout. f07d93c3c983. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=STOPPING}
2023-03-27T12:33:27.560Z [INFO] (pool-2-thread-31) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=STARTING, command=["docker run 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror"]}
2023-03-27T12:33:30.461Z [WARN] (Copier) com.example.MyPrivateDockerComponent: stderr. python3: can't open file 'hello_world.py': [Errno 2] No such file or directory. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:31.206Z [INFO] (Copier) com.example.MyPrivateDockerComponent: Run script exited. {exitCode=2, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:31.221Z [INFO] (pool-2-thread-31) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN, command=["docker stop $(docker ps -a -q --filter ancestor=242944196659.dkr.ecr.eu-centra..."]}
2023-03-27T12:33:31.943Z [INFO] (Copier) com.example.MyPrivateDockerComponent: stdout. 8523b3d4bc02. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN}
2023-03-27T12:33:31.944Z [INFO] (Copier) com.example.MyPrivateDockerComponent: stdout. 4ba1ed3b2ae0. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN}
2023-03-27T12:33:31.944Z [INFO] (Copier) com.example.MyPrivateDockerComponent: stdout. f07d93c3c983. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN}
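To make the restart cycle in these logs easier to see, here is a minimal Python sketch that pulls the Run exit codes and the sequence of component states out of such lines. The function name summarize and the two regular expressions are my own illustration and only assume the "{key=value, ...}" layout visible in the log lines above; this is not a Greengrass tool.

```python
import re

# Patterns for the "{key=value, ...}" context that ends each log line above.
EXIT_CODE = re.compile(r"Run script exited\. \{exitCode=(-?\d+)")
STATE = re.compile(r"currentState=(\w+)")


def summarize(lines):
    """Collect Run exit codes and the sequence of distinct consecutive states."""
    exit_codes, states = [], []
    for line in lines:
        exited = EXIT_CODE.search(line)
        if exited:
            exit_codes.append(int(exited.group(1)))
        state = STATE.search(line)
        # Record a state only when it differs from the previous one.
        if state and (not states or states[-1] != state.group(1)):
            states.append(state.group(1))
    return exit_codes, states


# Three lines copied verbatim from the component log above.
SAMPLE = '''
2023-03-27T12:33:19.673Z [INFO] (pool-2-thread-33) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Run, serviceName=com.example.MyPrivateDockerComponent, currentState=STARTING, command=["docker run 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:fileerror"]}
2023-03-27T12:33:22.779Z [INFO] (Copier) com.example.MyPrivateDockerComponent: Run script exited. {exitCode=2, serviceName=com.example.MyPrivateDockerComponent, currentState=RUNNING}
2023-03-27T12:33:31.221Z [INFO] (pool-2-thread-31) com.example.MyPrivateDockerComponent: shell-runner-start. {scriptName=services.com.example.MyPrivateDockerComponent.lifecycle.Shutdown, serviceName=com.example.MyPrivateDockerComponent, currentState=BROKEN, command=["docker stop $(docker ps -a -q --filter ancestor=242944196659.dkr.ecr.eu-centra..."]}
'''

codes, seen = summarize(SAMPLE.strip().splitlines())
print("exit codes:", codes)
print("states:", " -> ".join(seen))
```

Run over the full log above, this should report three exit codes of 2 and a state sequence that passes through STARTING, RUNNING and STOPPING before ending in BROKEN. Note that the component is already in RUNNING each time the Run script exits.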
Hi, could you update the description to provide the recipes for the component (assume 1.0.0 and 2.0.0), along with the contents of the deployment.json used? As ROLLBACK is performed at the deployment level, having the log entries would also be helpful.
Also, in the recipe provided, Run and Shutdown aren't compatible. As a result, the deployment can complete successfully even if the Run lifecycle script fails soon after starting. It is best to use Startup and Shutdown to properly track the return code of docker run.
@Gavin_A, thanks for your comment. I updated the question; I hope this helps. Also, I am using Shutdown to make sure that the previous container is stopped once I deploy a new one. Should I just replace "Run" with "Startup"?
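One way to try the suggestion from the comment is to rename the "Run" step to "Startup" in each manifest before publishing a new component version. The sketch below does that on a recipe loaded as a Python dict. The helper name use_startup_lifecycle is hypothetical, and the underlying assumption (taken from the comment and my reading of the Greengrass lifecycle documentation, not tested here) is that Startup waits for the command to exit and treats a non-zero exit code as a failure, which lets the deployment fail and roll back.

```python
import copy
import json


def use_startup_lifecycle(recipe):
    """Return a copy of the recipe with each manifest's "Run" step renamed to "Startup"."""
    updated = copy.deepcopy(recipe)  # leave the caller's recipe untouched
    for manifest in updated.get("Manifests", []):
        lifecycle = manifest.get("Lifecycle", {})
        # Only rename when there is a Run step and no Startup step already.
        if "Run" in lifecycle and "Startup" not in lifecycle:
            lifecycle["Startup"] = lifecycle.pop("Run")
    return updated


# Lifecycle section copied from the recipe in the question.
recipe = {
    "Manifests": [
        {
            "Lifecycle": {
                "Run": "docker run 242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:0.0.1",
                "Shutdown": "docker stop $(docker ps -a -q --filter ancestor=242944196659.dkr.ecr.eu-central-1.amazonaws.com/test_repo:0.0.1)",
            }
        }
    ]
}

print(json.dumps(use_startup_lifecycle(recipe)["Manifests"][0]["Lifecycle"], indent=2))
```

The Shutdown step is kept as it is, so the previous container is still stopped on redeploy. A container that is meant to keep running in the foreground would probably need a different arrangement, since Startup is expected to return; that case is not covered by this sketch.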