ShadowManager encountering issues and shutting down IPC service

0

I am facing an issue where it seems that the ShadowManager is encountering some issues, and shutting down the IPC service I believe. I haven't had this issue before, and it only came when I added the ShadowManager.

After a little while, when it shuts it down, other components that need IPC are no longer able to subscribe to topics.

I've got the below from the greengrass.log


2023-05-04T18:42:48.323Z [INFO] (pool-2-thread-48) com.aws.greengrass.tes.TokenExchangeService: TokenExchangeService is shutting down!. {serviceName=aws.greengrass.TokenExchangeService, currentState=STOPPING}
2023-05-04T18:42:49.327Z [INFO] (pool-2-thread-48) com.aws.greengrass.tes.TokenExchangeService: Stopped server at port 43297. {serviceName=aws.greengrass.TokenExchangeService, currentState=STOPPING}
2023-05-04T18:42:49.328Z [INFO] (aws.greengrass.TokenExchangeService-lifecycle) com.aws.greengrass.tes.TokenExchangeService: service-set-state. {serviceName=aws.greengrass.TokenExchangeService, currentState=STOPPING, newState=FINISHED}
2023-05-04T18:43:03.109Z [ERROR] (aws.greengrass.ShadowManager-lifecycle) com.aws.greengrass.shadowmanager.ShadowManager: service-errored. {reason=Timeout in shutdown, serviceName=aws.greengrass.ShadowManager, currentState=STOPPING}
2023-05-04T18:43:03.110Z [INFO] (aws.greengrass.ShadowManager-lifecycle) com.aws.greengrass.shadowmanager.ShadowManager: service-set-state. {serviceName=aws.greengrass.ShadowManager, currentState=STOPPING, newState=ERRORED}
2023-05-04T18:43:03.115Z [WARN] (pool-2-thread-12) com.aws.greengrass.deployment.IotJobsHelper: Interrupted while subscribing to Iot Jobs descriptions topic. {ThingName=DOWNSTAIRS-TESTBENCH-2}
2023-05-04T18:43:03.116Z [WARN] (FleetStatusService-lifecycle) com.aws.greengrass.status.FleetStatusService: service-state-transition-interrupted. Service lifecycle thread interrupted. Thread will exit now. {serviceName=FleetStatusService, currentState=FINISHED}
2023-05-04T18:43:03.116Z [WARN] (aws.greengrass.SecretManager-lifecycle) com.aws.greengrass.secretmanager.SecretManagerService: service-state-transition-interrupted. Service lifecycle thread interrupted. Thread will exit now. {serviceName=aws.greengrass.SecretManager, currentState=FINISHED}
2023-05-04T18:43:03.114Z [WARN] (pool-2-thread-13) com.aws.greengrass.deployment.ShadowDeploymentListener: Interrupted while subscribing to device shadow topics. {}
2023-05-04T18:43:03.116Z [ERROR] (pool-2-thread-38) com.aws.greengrass.shadowmanager.sync.CloudDataClient: cloud-data-client-subscription-error. Failed to update subscriptions. {}
java.lang.InterruptedException
    at java.util.concurrent.CompletableFuture.reportGet(Unknown Source)
    at java.util.concurrent.CompletableFuture.get(Unknown Source)
    at com.aws.greengrass.mqttclient.MqttClient.subscribe(MqttClient.java:410)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.subscribeToShadows(CloudDataClient.java:260)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.lambda$updateSubscriptions$1(CloudDataClient.java:173)
    at com.aws.greengrass.util.RetryUtils.runWithRetry(RetryUtils.java:50)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.updateSubscriptions(CloudDataClient.java:171)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.lambda$updateSubscriptions$0(CloudDataClient.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

2023-05-04T18:43:03.117Z [WARN] (pool-2-thread-47) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Interrupted while waiting for sync requests. {}
2023-05-04T18:43:03.119Z [INFO] (pool-2-thread-47) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Finished processing sync requests. {}
2023-05-04T18:43:03.117Z [WARN] (pool-2-thread-42) com.aws.greengrass.mqttclient.MqttClient: Shutting down spooler task. {}
2023-05-04T18:43:03.122Z [INFO] (pool-2-thread-25) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Stop syncing. {}
2023-05-04T18:43:03.123Z [INFO] (pool-2-thread-25) com.aws.greengrass.shadowmanager.sync.strategy.BaseSyncStrategy: sync. Cancel 1 sync thread(s). {}
2023-05-04T18:43:03.131Z [INFO] (Serialized listener processor) com.aws.greengrass.lifecyclemanager.KernelLifecycle: executor-service-shutdown-initiated. {}
2023-05-04T18:43:03.130Z [WARN] (pool-2-thread-13) com.aws.greengrass.mqttclient.MqttClient: Failed to run MQTT spooler. {}
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3df83b86 rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 163]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.aws.greengrass.mqttclient.MqttClient.triggerSpooler(MqttClient.java:427)
    at com.aws.greengrass.mqttclient.MqttClient.publish(MqttClient.java:524)
    at com.aws.greengrass.mqttclient.WrapperMqttClientConnection.publish(WrapperMqttClientConnection.java:96)
    at software.amazon.awssdk.iot.iotshadow.IotShadowClient.PublishGetNamedShadow(IotShadowClient.java:425)
    at com.aws.greengrass.deployment.ShadowDeploymentListener.publishToGetDeviceShadowTopic(ShadowDeploymentListener.java:310)
    at com.aws.greengrass.deployment.ShadowDeploymentListener.lambda$setupShadowCommunications$2(ShadowDeploymentListener.java:209)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

2023-05-04T18:43:03.133Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: Waiting for executors to shutdown. {}
2023-05-04T18:43:03.133Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: executor-service-shutdown-complete. {executor-terminated=false, scheduled-executor-terminated=true}
2023-05-04T18:43:03.136Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: context-shutdown-initiated. {}
2023-05-04T18:43:03.136Z [ERROR] (Thread-4) com.aws.greengrass.mqttclient.MqttClient: Error subscribing. {topic=$aws/things/DOWNSTAIRS-TESTBENCH-2/jobs/$next/namespace-aws-gg-deployment/get/accepted}
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.uniApply(Unknown Source)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at software.amazon.awssdk.crt.AsyncCallback$1.onSuccess(AsyncCallback.java:27)
    at software.amazon.awssdk.crt.mqtt.MqttClientConnection.onConnectionComplete(MqttClientConnection.java:138)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.updateSubscriptions(CloudDataClient.java:139)
    at com.aws.greengrass.shadowmanager.ShadowManager.startSyncingShadows(ShadowManager.java:553)
    at com.aws.greengrass.shadowmanager.ShadowManager$1.onConnectionResumed(ShadowManager.java:114)
    at com.aws.greengrass.mqttclient.CallbackEventManager.runOnInitialConnect(CallbackEventManager.java:49)
    at com.aws.greengrass.mqttclient.AwsIotMqttClient.lambda$connect$7(AwsIotMqttClient.java:246)
    ... 6 more

2023-05-04T18:43:03.138Z [ERROR] (Thread-4) com.aws.greengrass.mqttclient.MqttClient: Error subscribing. {topic=$aws/things/DOWNSTAIRS-TESTBENCH-2/tunnels/notify}
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
    at java.util.concurrent.CompletableFuture.uniApply(Unknown Source)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    at java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at software.amazon.awssdk.crt.AsyncCallback$1.onSuccess(AsyncCallback.java:27)
    at software.amazon.awssdk.crt.mqtt.MqttClientConnection.onConnectionComplete(MqttClientConnection.java:138)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@5bfdc9e rejected from java.util.concurrent.ThreadPoolExecutor@66954c7d[Shutting down, pool size = 10, active threads = 3, queued tasks = 0, completed tasks = 162]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.aws.greengrass.shadowmanager.sync.CloudDataClient.updateSubscriptions(CloudDataClient.java:139)
    at com.aws.greengrass.shadowmanager.ShadowManager.startSyncingShadows(ShadowManager.java:553)
    at com.aws.greengrass.shadowmanager.ShadowManager$1.onConnectionResumed(ShadowManager.java:114)
    at com.aws.greengrass.mqttclient.CallbackEventManager.runOnInitialConnect(CallbackEventManager.java:49)
    at com.aws.greengrass.mqttclient.AwsIotMqttClient.lambda$connect$7(AwsIotMqttClient.java:246)
  • This log is indicative of Greengrass as a whole shutting down or restarting.

    Look for "system-shutdown" in the logs.

  • Yeah I can see it littered through my logs - but I am not sure if it has been initiated by me restarting the greengrass service, or because of something else. Here's a small snippet:

    2023-05-02T17:57:49.440Z [WARN] (Thread-1) com.aws.greengrass.lifecyclemanager.Kernel: Shutting down Nucleus due to external signal. {} 2023-05-02T17:57:49.455Z [INFO] (Copier) com.aws.greengrass.lifecyclemanager.GenericExternalService: Run script exited. {exitCode=143, serviceName=com.savic.VehicleDataUploader, currentState=RUNNING} 2023-05-02T17:57:49.456Z [INFO] (com.savic.Evaluator-lifecycle) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-set-state. {serviceName=com.savic.Evaluator, currentState=RUNNING, newState=ERRORED} 2023-05-02T17:57:49.453Z [INFO] (Thread-1) com.aws.greengrass.lifecyclemanager.KernelLifecycle: system-shutdown. {main=services.main:FINISHED} 2023-05-02T17:57:49.461Z [INFO] (Thread-8) software.amazon.awssdk.eventstreamrpc.RpcServer: Server connection closed code [socket is closed.]: Id 37, Class ServerConnection, Refs 1 - <null>. {} 2023-05-02T17:57:49.462Z [INFO] (com.savic.ProximityStart-lifecycle) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-set-state. {serviceName=com.savic.ProximityStart, currentState=ERRORED, newState=STOPPING} 2023-05-02T17:57:49.463Z [INFO] (com.savic.Evaluator-lifecycle) com.aws.greengrass.lifecyclemanager.GenericExternalService: service-set-state. {serviceName=com.savic.Eval

  • Shutting down Nucleus due to external signal -- you are telling Greengrass to shutdown, so then it does shutdown.

  • Look... I've been working on this last 1.5 years, with a comprehensive framework built around GG - so I am well aware when I shut down, start, and restart. This issue where suddenly experiencing is after the introduction of ShadowManager. What happens is after some undetermined time, the service simply no longer is able to subscribe to IPC, the device loses connectivity with AWS, and it no longer can accept GG component updates, or can I no longer SSH into the device via Secure Tunnell.

    The restarts initiatied from the outside are very likely either:

    1. Me restarting it via greengrass-cli
    2. Me restarting greengrass.service via systemctl; or
    3. Me rebooting the device abruptly
Sash
已提问 1 年前98 查看次数
没有答案

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则