ShadowManager encountering issues and shutting down IPC service

0

I am facing an issue where it seems that the ShadowManager is encountering some issues, and shutting down the IPC service I believe. I haven't had this issue before, and it only came when I added the ShadowManager.

After a little while, when it shuts it down, other components that need IPC are no longer able to subscribe to topics.

I've got the below from the greengrass.log


2023-05-04T18:42:48.323Z [INFO] (pool-2-thread-48) com.aws.greengrass.tes.TokenExchangeService: TokenExchangeService is shutting down!. {serviceName=aws.greengrass.TokenExchangeService, currentState=STOPPING}
2023-05-04T18:42:49.327Z [INFO] (pool-2-thread-48) com.aws.greengrass.tes.TokenExchangeService: Stopped server at port 43297. {serviceName=aws.greengrass.TokenExchangeService, currentState=STOPPING}
2023-05-04T18:42:49.328Z [INFO] (aws.greengrass.TokenExchangeService-lifecycle) com.aws.greengrass.tes.TokenExchangeService: service-set-state. {serviceName=aws.greengrass.TokenExchangeService, currentState=STOPPING, newState=FINISHED}
2023-05-04T18:43:03.109Z [ERROR] (aws.greengrass.ShadowManager-lifecycle) com.aws.greengrass.shadowmanager.ShadowManager: service-errored. {reason=Timeout in shutdown, serviceName=aws.greengrass.ShadowManager, currentState=STOPPING}
2023-05-04T18:43:03.110Z [INFO] (aws.greengrass.ShadowManager-lifecycle) com.aws.greengrass.shadowmanager.ShadowManager: service-set-state. {serviceName=aws.greengrass.ShadowManager, currentState=STOPPING, newState=ERRORED}
2023-05-04T18:43:03.115Z [WARN] (pool-2-thread-12) com.aws.greengrass.deployment.IotJobsHelper: Interrupted while subscribing to Iot Jobs descriptions topic. {ThingName=DOWNSTAIRS-TESTBENCH-2}
2023-05-04T18:43:03.116Z [WARN] (FleetStatusService-lifecycle) com.aws.greengrass.status.FleetStatusService: service-state-transition-interrupted. Service lifecycle thread interrupted. Thread will exit now. {serviceName=FleetStatusService, currentState=FINISHED}
2023-05-04T18:43:03.116Z [WARN] (aws.greengrass.SecretManager-lifecycle) com.aws.greengrass.secretmanager.SecretManagerService: service-state-transition-interrupted. Service lifecycle thread interrupted. Thread will exit now. {serviceName=aws.greengrass.SecretManager, currentState=FINISHED}
2023-05-04T18:43:03.114Z [WARN] (pool-2-thread-13) com.aws.greengrass.deployment.ShadowDeploymentListener: Interrupted while subscribing to device shadow topics. {}
2023-05-04T18:43:03.116Z [ERROR] (pool-2-thread-38) com.aws.greengrass.shadowmanager.sync.CloudDataClient: cloud-data-client-subscription-error. Failed to update subscriptions. {}
java.lang.InterruptedException
    at java.util.concurrent.CompletableFuture.reportGet(Unknown Source)
    at java.util.concurrent.CompletableFuture.get(Unknown Source)
    at com.aws.greengrass.mqttclient.MqttClient.subscribe(MqttClient.java:410)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.subscribeToShadows(CloudDataClient.java:260)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.lambda$updateSubscriptions$1(CloudDataClient.java:173)
    at com.aws.greengrass.util.RetryUtils.runWithRetry(RetryUtils.java:50)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.updateSubscriptions(CloudDataClient.java:171)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.lambda$updateSubscriptions$0(CloudDataClient.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

2023-05-04T18:43:03.117Z [WARN] (pool-2-thread-47) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Interrupted while waiting for sync requests. {}
2023-05-04T18:43:03.119Z [INFO] (pool-2-thread-47) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Finished processing sync requests. {}
2023-05-04T18:43:03.117Z [WARN] (pool-2-thread-42) com.aws.greengrass.mqttclient.MqttClient: Shutting down spooler task. {}
2023-05-04T18:43:03.122Z [INFO] (pool-2-thread-25) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Stop syncing. {}
2023-05-04T18:43:03.123Z [INFO] (pool-2-thread-25) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Cancel 1 sync thread(s). {}
2023-05-04T18:43:03.131Z [INFO] (Serialized listener processor) com.aws.greengrass.lifecyclemanager.KernelLifecycle: executor-service-shutdown-initiated. {}
2023-05-04T18:43:03.130Z [WARN] (pool-2-thread-13) com.aws.greengrass.mqttclient.MqttClient: Failed to run MQTT spooler. {}
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3df83b86 rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 163]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.aws.greengrass.mqttclient.MqttClient.triggerSpooler(MqttClient.java:427)
    at com.aws.greengrass.mqttclient.MqttClient.publish(MqttClient.java:524)
    at com.aws.greengrass.mqttclient.WrapperMqttClientConnection.publish(WrapperMqttClientConnection.java:96)
    at software.amazon.awssdk.iot.iotshadow.IotShadowClient.PublishGetNamedShadow(IotShadowClient.java:425)
    at com.aws.greengrass.deployment.ShadowDeploymentListener.publishToGetDeviceShadowTopic(ShadowDeploymentListener.java:310)
    at com.aws.greengrass.deployment.ShadowDeploymentListener.lambda$setupShadowCommunications$2(ShadowDeploymentListener.java:209)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

2023-05-04T18:43:03.133Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: Waiting for executors to shutdown. {}
2023-05-04T18:43:03.133Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: executor-service-shutdown-complete. {executor-terminated=false, scheduled-executor-terminated=true}
2023-05-04T18:43:03.136Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: context-shutdown-initiated. {}
2023-05-04T18:43:03.136Z [ERROR] (Thread-4) com.aws.greengrass.mqttclient.MqttClient: Error subscribing. {topic=$aws/things/DOWNSTAIRS-TESTBENCH-2/jobs/$next/namespace-aws-gg-deployment/get/accepted}
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.uniApply(Unknown Source)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at software.amazon.awssdk.crt.AsyncCallback$1.onSuccess(AsyncCallback.java:27)
    at software.amazon.awssdk.crt.mqtt.MqttClientConnection.onConnectionComplete(MqttClientConnection.java:138)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.updateSubscriptions(CloudDataClient.java:139)
    at com.aws.greengrass.shadowmanager.ShadowManager.startSyncingShadows(ShadowManager.java:553)
    at com.aws.greengrass.shadowmanager.ShadowManager$1.onConnectionResumed(ShadowManager.java:114)
    at com.aws.greengrass.mqttclient.CallbackEventManager.runOnInitialConnect(CallbackEventManager.java:49)
    at com.aws.greengrass.mqttclient.AwsIotMqttClient.lambda$connect$7(AwsIotMqttClient.java:246)
    ... 6 more

2023-05-04T18:43:03.138Z [ERROR] (Thread-4) com.aws.greengrass.mqttclient.MqttClient: Error subscribing. {topic=$aws/things/DOWNSTAIRS-TESTBENCH-2/tunnels/notify}
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.uniApply(Unknown Source)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at software.amazon.awssdk.crt.AsyncCallback$1.onSuccess(AsyncCallback.java:27)
    at software.amazon.awssdk.crt.mqtt.MqttClientConnection.onConnectionComplete(MqttClientConnection.java:138)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.updateSubscriptions(CloudDataClient.java:139)
    at com.aws.greengrass.shadowmanager.ShadowManager.startSyncingShadows(ShadowManager.java:553)
    at com.aws.greengrass.shadowmanager.ShadowManager$1.onConnectionResumed(ShadowManager.java:114)
    at com.aws.greengrass.mqttclient.CallbackEventManager.runOnInitialConnect(CallbackEventManager.java:49)
    at com.aws.greengrass.mqttclient.AwsIotMqttClient.lambda$connect$7(AwsIotMqttClient.java:246)
  • This log is indicative of Greengrass as a whole shutting down or restarting.

    Look for "system-shutdown" in the logs.

  • Yeah I can see it littered through my logs - but I am not sure if it has been initiated by me restarting the greengrass service, or because of something else. Here's a small snippet:

    2023-05-02T17:57:49.440Z [WARN] (Thread-1) com.aws.greengrass.lifecyclemanager.Kernel: Shutting down Nucleus due to external signal. {} 2023-05-02T17:57:49.455Z [INFO] (Copier) com.aws.greengrass.lifecyclemanager.GenericExternalService: Run script exited. {exitCode=143, serviceName=com.savic.VehicleDataUploader, currentState=RUNNING} 2023-05-02T17:57:49.456Z [INFO] (com.savic.Evaluator-lifecycle) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-set-state. {serviceName=com.savic.Evaluator, currentState=RUNNING, newState=ERRORED} 2023-05-02T17:57:49.453Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: system-shutdown. {main=services.main:FINISHED} 2023-05-02T17:57:49.461Z [INFO] (Thread-8) software.amazon.awssdk.eventstreamrpc.RpcServer: Server connection closed code [socket is closed.]: Id 37, Class ServerConnection, Refs 1 - <null>. {} 2023-05-02T17:57:49.462Z [INFO] (com.savic.ProximityStart-lifecycle) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-set-state. {serviceName=com.savic.ProximityStart, currentState=ERRORED, newState=STOPPING} 2023-05-02T17:57:49.463Z [INFO] (com.savic.Evaluator-lifecycle) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-set-state. {serviceName=com.savic.Eval

  • Shutting down Nucleus due to external signal -- you are telling Greengrass to shutdown, so then it does shutdown.

  • Look... I've been working on this last 1.5 years, with a comprehensive framework built around GG - so I am well aware when I shut down, start, and restart. This issue where suddenly experiencing is after the introduction of ShadowManager. What happens is after some undetermined time, the service simply no longer is able to subscribe to IPC, the device loses connectivity with AWS, and it no longer can accept GG component updates, or can I no longer SSH into the device via Secure Tunnell.

    The restarts initiatied from the outside are very likely either:

    1. Me restarting it via greengrass-cli
    2. Me restarting greengrass.service via systemctl; or
    3. Me rebooting the device abruptly
Sash
asked a year ago96 views
No Answers

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions