Best practices for auto-restarting a Docker application


I've created a custom component that downloads a Docker image from a private ECR repository and runs one program, as follows:

"Lifecycle": {
"Run": "docker run --cap-add=SYS_PTRACE --runtime=nvidia -e DISPLAY=$DISPLAY --privileged --volume /tmp/.X11-unix:/tmp/.X11-unix --net=host -e NVIDIA_VISIBLE_DEVICES=all -v $HOME/.Xauthority:/root/.Xauthority -v /run/udev/control:/run/udev/control -v /dev:/dev -v /sys/firmware/devicetree/base/serial-number:/sys/firmware/devicetree/base/serial-number -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics xxxxxxxxx.dkr.ecr.ap-southeast-2.amazonaws.com/smartdvr:latest my-program" }, ...

I've got the above running. However, if I manually stop the container, or my-program crashes, I don't see it auto-restarting.
What is the usual way to make sure the container stays running and the application restarts if, for example, the program crashes?

Is there an option in the custom component that I can set?
Or does everyone just start their program as a Linux service and let the service handle the restart?

clogwog
Asked 3 years ago, viewed 551 times
1 Answer

Hi clogwog,

Thanks for using Greengrass V2. If a process in a component's Run lifecycle exits with an error, the component goes to the ERRORED state and Greengrass restarts it automatically, up to 3 times. If it doesn't recover in those 3 attempts, the component is put in the BROKEN state and is no longer auto-restarted; you will need to deploy a fix for the issue. So if your application crashes and the docker run command exits with an error as a result, your Docker component will be restarted. This is likely what you're looking for, since you want to rerun the container when the containerized application crashes. However, if the error is never reported to Greengrass in this way, because the container keeps running or exits with code 0, Greengrass will not know about the issue and won't restart the component. You can check your component's log file and greengrass.log to see whether the component follows this path.
If you manually stop the container, Greengrass has no knowledge of that. In that case the run command also finishes with exit code 0, which Greengrass treats as success.
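Since the restart decision described above depends only on the exit code of the Run command, one possible approach is to wrap the command so that any exit counts as a failure. The following is a minimal sketch, not an official Greengrass feature: the function name run_as_service and the choice of exit code 1 are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: run a long-lived command and report ANY exit as a failure.
# The name run_as_service and the fallback code 1 are illustrative assumptions.
run_as_service() {
    # Run the wrapped command (for example the docker run line) in the foreground.
    "$@"
    status=$?
    echo "command exited with status $status" >&2
    if [ "$status" -eq 0 ]; then
        # A service that is meant to run forever should never exit cleanly,
        # so turn a 0 exit into a non-zero one.
        return 1
    fi
    # Pass real error codes through unchanged.
    return "$status"
}
```

A Run script could then call it as run_as_service docker run ... my-program and exit with its return value. The limit of 3 restart attempts before the BROKEN state still applies.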
Another way to restart a component is the Greengrass CLI, but that requires logging into the device: https://docs.aws.amazon.com/greengrass/v2/developerguide/gg-cli-component.html#component-restart
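For reference, a restart through the CLI might look like the sketch below. It assumes the Greengrass CLI component is deployed and that Greengrass is installed under the default root /greengrass/v2; the component name com.example.SmartDvr is hypothetical, so substitute your own.

```shell
#!/bin/sh
# Path to the Greengrass CLI. /greengrass/v2 is the default install root (assumption);
# override GG_CLI if your device uses a different location.
GG_CLI="${GG_CLI:-/greengrass/v2/bin/greengrass-cli}"

# Restart one component by name using the CLI's "component restart" command.
restart_component() {
    "$GG_CLI" component restart --names "$1"
}

# Example call (hypothetical component name), typically run as root on the device:
#   restart_component com.example.SmartDvr
```

The CLI usually needs root privileges, so the script would normally be run with sudo.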

Please update this thread if that doesn't address your concern, and provide the component and container logs.

Thanks,
Shagupta

AWS
Answered 3 years ago
