I've been working with GGv2 and containerized Lambdas and have run into an issue where removing a Lambda via a deployment revision causes the aws.greengrass.LambdaManager component to get stuck in the BROKEN state.
To reproduce the issue, you can follow these steps:
- Create two distinct, containerized Lambda components (e.g., lambda1 and lambda2)
- Create a deployment containing both containerized Lambda components
- Verify the deployment is successfully pushed down to the GGv2 core device
- Revise the deployment to remove one of the containerized Lambda components (e.g., lambda2) and push it down to the GGv2 core device
- While processing the revised deployment, the aws.greengrass.LambdaManager component will complain that the Lambda that was removed from the deployment (e.g., lambda2) 'does not exist' and throw a LambdaNotFoundException:
2022-01-19T17:09:18.266Z [INFO] (aws.greengrass.LambdaManager-lifecycle) com.aws.greengrass.lambdamanager.LambdaManager: service-set-state. {serviceName=aws.greengrass.LambdaManager, currentState=ERRORED, newState=NEW}
2022-01-19T17:09:18.280Z [ERROR] (pool-2-thread-29) com.aws.greengrass.lambdamanager.LambdaManager: service-errored. {serviceName=aws.greengrass.LambdaManager, currentState=NEW}
com.aws.greengrass.lambdamanager.LambdaNotFoundException: Lambda component:<LAMBDA_COMPONENT_NAME> does not exist
at com.aws.greengrass.lambdamanager.system.v1subscription.RouteTable.lambda$normalizeToArn$2(RouteTable.java:139)
at java.base/java.util.Optional.orElseThrow(Unknown Source)
at com.aws.greengrass.lambdamanager.system.v1subscription.RouteTable.normalizeToArn(RouteTable.java:138)
at com.aws.greengrass.lambdamanager.system.v1subscription.RouteTable.convertConfigToRoute(RouteTable.java:106)
at com.aws.greengrass.lambdamanager.system.v1subscription.RouteTable.addRoute(RouteTable.java:98)
at com.aws.greengrass.lambdamanager.system.v1subscription.RouteTable.loadRoutesFromConfig(RouteTable.java:59)
at com.aws.greengrass.lambdamanager.system.RouterLambda.loadSubscription(RouterLambda.java:60)
at com.aws.greengrass.lambdamanager.LambdaManager.reloadV1Subscription(LambdaManager.java:337)
at com.aws.greengrass.lambdamanager.LambdaManager.lambda$install$1(LambdaManager.java:122)
at com.aws.greengrass.config.Topics.subscribe(Topics.java:469)
at com.aws.greengrass.lambdamanager.LambdaManager.install(LambdaManager.java:117)
at com.aws.greengrass.lifecyclemanager.Lifecycle.lambda$handleCurrentStateNew$5(Lifecycle.java:441)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
2022-01-19T17:09:18.287Z [INFO] (aws.greengrass.LambdaManager-lifecycle) com.aws.greengrass.lambdamanager.LambdaManager: service-set-state. {serviceName=aws.greengrass.LambdaManager, currentState=NEW, newState=BROKEN}
2022-01-19T17:09:18.312Z [INFO] (aws.greengrass.LambdaManager-lifecycle) com.aws.greengrass.status.FleetStatusService: fss-status-update-published. Status update published to FSS. {serviceName=FleetStatusService, currentState=RUNNING}
2022-01-19T17:09:18.318Z [ERROR] (aws.greengrass.LambdaManager-lifecycle) com.aws.greengrass.lambdamanager.LambdaManager: service-broken. service is broken. Deployment is needed. {serviceName=aws.greengrass.LambdaManager, currentState=BROKEN}
2022-01-19T17:09:19.201Z [WARN] (pool-2-thread-28) com.aws.greengrass.deployment.DeploymentConfigMerger: merge-config. merge-config-service BROKEN. {serviceName=aws.greengrass.LambdaManager}
2022-01-19T17:09:19.207Z [ERROR] (pool-2-thread-28) com.aws.greengrass.deployment.activator.DeploymentActivator: merge-config. Deployment failed. {deploymentId=8ce921fa-e850-40f7-96f7-dca557bcf130}
com.aws.greengrass.deployment.exceptions.ServiceUpdateException: Service aws.greengrass.LambdaManager in broken state after deployment
at com.aws.greengrass.deployment.DeploymentConfigMerger.waitForServicesToStart(DeploymentConfigMerger.java:194)
at com.aws.greengrass.deployment.activator.DefaultActivator.activate(DefaultActivator.java:84)
at com.aws.greengrass.deployment.DeploymentConfigMerger.updateActionForDeployment(DeploymentConfigMerger.java:150)
at com.aws.greengrass.deployment.DeploymentConfigMerger.lambda$mergeInNewConfig$0(DeploymentConfigMerger.java:102)
at com.aws.greengrass.lifecyclemanager.UpdateSystemPolicyService.runUpdateActions(UpdateSystemPolicyService.java:95)
at com.aws.greengrass.lifecyclemanager.UpdateSystemPolicyService.lambda$startup$0(UpdateSystemPolicyService.java:165)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
After this:
- The deployment's status is set to 'Failed'
- The GGv2 core device's status is set to 'Unhealthy'
- The aws.greengrass.LambdaManager component's status is set to 'Broken'
- Worst of all, the other Lambda which remains in the deployment (e.g., lambda1) is NOT started since the aws.greengrass.LambdaManager component is in a 'Broken' state
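The component states listed above can also be confirmed directly from the core device log. Below is a minimal Python sketch that does this, assuming the `service-set-state` line format shown in the log excerpt; the helper name `latest_states` and the regular expression are illustrative only and not part of any Greengrass tooling.

```python
import re

# Matches "service-set-state" lines in the format seen in the log excerpt,
# e.g. "... service-set-state. {serviceName=X, currentState=NEW, newState=BROKEN}".
STATE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<level>\w+)\] .*?service-set-state\. "
    r"\{serviceName=(?P<service>[^,}]+), "
    r"currentState=(?P<old>\w+), newState=(?P<new>\w+)\}"
)


def latest_states(lines):
    """Return {service_name: most recent newState} for the given log lines."""
    states = {}
    for line in lines:
        match = STATE_RE.match(line)
        if match:
            # Later lines overwrite earlier ones, so the last transition wins.
            states[match.group("service")] = match.group("new")
    return states


# Two lines copied from the log excerpt in this post.
SAMPLE = [
    "2022-01-19T17:09:18.266Z [INFO] (aws.greengrass.LambdaManager-lifecycle) "
    "com.aws.greengrass.lambdamanager.LambdaManager: service-set-state. "
    "{serviceName=aws.greengrass.LambdaManager, currentState=ERRORED, newState=NEW}",
    "2022-01-19T17:09:18.287Z [INFO] (aws.greengrass.LambdaManager-lifecycle) "
    "com.aws.greengrass.lambdamanager.LambdaManager: service-set-state. "
    "{serviceName=aws.greengrass.LambdaManager, currentState=NEW, newState=BROKEN}",
]

broken = [name for name, state in latest_states(SAMPLE).items() if state == "BROKEN"]
print(broken)
```

On a real device the same function can be fed the lines of the Nucleus log file (typically `/greengrass/v2/logs/greengrass.log` for a default install root, which is an assumption about the setup here).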
Pushing down the deployment again (i.e., revising it without making any changes) does not fix the issue. Revising the deployment to contain both original Lambdas restores the GGv2 core device to a working state; however, this is clearly not a desirable workaround.
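For reference, the two deployment revisions from the reproduction steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the component names `lambda1` and `lambda2` come from the steps above, while the version `1.0.0`, the deployment name, and the helper functions are hypothetical. The `push_revision` function uses the boto3 `greengrassv2` client's `create_deployment` call and needs AWS credentials and a real target ARN, so it is defined but not invoked here.

```python
def deployment_components(versions):
    """Build a `components` mapping in the shape create_deployment accepts."""
    return {name: {"componentVersion": version} for name, version in versions.items()}


def removed_components(before, after):
    """Return the names of components present in `before` but not in `after`."""
    return sorted(set(before) - set(after))


# Hypothetical versions; substitute the versions of your own Lambda components.
REVISION_1 = deployment_components({"lambda1": "1.0.0", "lambda2": "1.0.0"})
REVISION_2 = deployment_components({"lambda1": "1.0.0"})  # lambda2 removed


def push_revision(target_arn, components, name="lambda-removal-repro"):
    """Create a deployment revision for the given thing or thing group ARN."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("greengrassv2")
    response = client.create_deployment(
        targetArn=target_arn,
        deploymentName=name,
        components=components,
    )
    return response["deploymentId"]


print(removed_components(REVISION_1, REVISION_2))
```

Pushing `REVISION_1` and then `REVISION_2` to the same target corresponds to the steps that trigger the failure; pushing `REVISION_1` again corresponds to the recovery described above.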
Has anyone seen this issue? Or are there any workarounds/settings/configurations that I am missing here?
Relevant component versions:
- aws.greengrass.Nucleus: 2.5.3
- aws.greengrass.LambdaLauncher: 2.0.9
- aws.greengrass.TokenExchangeService: 2.0.3
- aws.greengrass.LambdaRuntimes: 2.0.8
- aws.greengrass.LambdaManager: 2.2.1
- aws.greengrass.LegacySubscriptionRouter: 2.1.4
Thanks!