I think this might be a bug, here is the scenario - in the same deployment upgrade aws.greengrass.Nucleus from any version (in my case 2.9.1) to 2.9.4 and an example component called com.example.longinstall . Nucleus doesn't seem to respect the installing component install step timeout, and restarts everything in the middle of the install. This is a problem for things like MSIs which are backgrounded and can't be interrupted.
Component source:
---
RecipeFormatVersion: "2020-01-25"
ComponentName: "com.example.longinstall"
ComponentVersion: "1.0.0"
ComponentType: "aws.greengrass.generic"
ComponentDescription: "This component takes 120s to install"
ComponentPublisher: "Me"
Manifests:
- Platform:
os: "windows"
Lifecycle:
Install:
Script: "powershell -command \"for ($i = 0; $i -lt 120; $i++){ 'Installing ('+$i+')'\
\ | Write-Output; Start-Sleep -Seconds 1 }\""
RequiresPrivilege: true
Timeout: 150
Startup:
Script: "powershell -command \"write-output Startup\""
RequiresPrivilege: true
Shutdown:
Script: "powershell -command \"write-output Shutdown\""
RequiresPrivilege: true
Artifacts: []
Lifecycle: {}
Log from component:
2023-03-09T00:58:47.101Z [INFO] (pool-2-thread-22) com.example.longinstall: shell-runner-start. {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW, command=["powershell -command \u0022for ($i = 0; $i -lt 120; $i++){ 'Installing ('+$i+')' | W..."]}
2023-03-09T00:58:48.460Z [INFO] (Copier) com.example.longinstall: stdout. Installing (0). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T00:58:49.484Z [INFO] (Copier) com.example.longinstall: stdout. Installing (1). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T00:58:50.488Z [INFO] (Copier) com.example.longinstall: stdout. Installing (2). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T00:58:51.502Z [INFO] (Copier) com.example.longinstall: stdout. Installing (3). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
..snip..
2023-03-09T00:59:46.057Z [INFO] (Copier) com.example.longinstall: stdout. Installing (57). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T00:59:47.069Z [INFO] (Copier) com.example.longinstall: stdout. Installing (58). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T00:59:53.984Z [INFO] (pool-2-thread-10) com.example.longinstall: shell-runner-start. {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW, command=["powershell -command \u0022for ($i = 0; $i -lt 120; $i++){ 'Installing ('+$i+')' | W..."]}
2023-03-09T00:59:55.743Z [INFO] (Copier) com.example.longinstall: stdout. Installing (0). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T00:59:56.752Z [INFO] (Copier) com.example.longinstall: stdout. Installing (1). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
..snip..
2023-03-09T01:01:55.966Z [INFO] (Copier) com.example.longinstall: stdout. Installing (119). {scriptName=services.com.example.longinstall.lifecycle.Install.Script, serviceName=com.example.longinstall, currentState=NEW}
2023-03-09T01:01:57.019Z [INFO] (pool-2-thread-10) com.example.longinstall: shell-runner-start. {scriptName=services.com.example.longinstall.lifecycle.Startup.Script, serviceName=com.example.longinstall, currentState=STARTING, command=["powershell -command \u0022write-output Startup\u0022"]}
2023-03-09T01:01:58.281Z [INFO] (Copier) com.example.longinstall: stdout. Startup. {scriptName=services.com.example.longinstall.lifecycle.Startup.Script, serviceName=com.example.longinstall, currentState=STARTING}
Where do you see the component is being "restarted"? If the startup is timing out you will see the log "Timeout in install" and the component will move to the errored state and then retry the installation. This log will be in the greengrass.log file.
I see in your log that the
startup
script does execute which means that the installation must have completed.The timeout is definitely respected, but you may have something else going on which is causing a restart. You will need to review the greengrass.log file for things relating to your component.
Cheers,
Michael
Another possibility is that you are doing a local deployment but com.example.longinstall also exists in your AWS account. If you deploy, even locally, the cloud version of a component version will be deployed, not the local copy. So if this is the case, make sure you deploy a version which does not exist in your account.
This was with a cloud deployment, no local deployments ever on this machine. The deployment file only includes these components:
You're right, it does complete the second time, but the timeout fires in greengrass.log at 00:59:17, when the install section started at 00:58:47, suspiciously 30 seconds. I can attach a santized log here in a minute.
Looks like there's no attachment feature and each comment can only be 600char, so I'll put it in code blocks and hopefully not make the thread unreadable: Nuclueus restart
Component is restarted (but not installed?)
(part1)
Shutdown timeout exception:
(part 2)