I have a Greengrass deployment that includes a custom component as well as version 2.1.11 of the AWS-provided Stream Manager component (aws.greengrass.StreamManager). Starting a few hours ago, the Stream Manager component has been logging the sequence of messages below every five minutes.
What does this message mean? Is StreamManager still operating normally or is data loss occurring? How can I recover?
I do not have physical access to the system where Greengrass is deployed. The system is online and I am able to create deployments. I am receiving logs from the device in CloudWatch via the AWS Greengrass Log Manager component. The log lines below are taken from the /aws/greengrass/UserComponent/[my-aws-region]/aws.greengrass.StreamManager log group.
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. 2024 Aug 23 15:55:16,902 [ERROR] (pool-7-thread-1) com.amazonaws.iot.greengrass.streammanager.export.upload.MessageUploaderTask: Encountered Throwable when exporting messages. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. com.amazonaws.iot.greengrass.streammanager.store.exceptions.InvalidStreamPositionException: the sequence number is out of bounds: 17965369. Bounds are from 17965369 to 17965369.. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at com.amazonaws.iot.greengrass.streammanager.store.log.MessageStreamLog.read(MessageStreamLog.java:363) ~[AWSGreengrassGreenlake-1.0-super.jar:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at com.amazonaws.iot.greengrass.streammanager.store.log.MessageInputStreamHandleLogImpl.read(MessageInputStreamHandleLogImpl.java:32) ~[AWSGreengrassGreenlake-1.0-super.jar:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at com.amazonaws.iot.greengrass.streammanager.export.upload.MessageUploaderTask.upload(MessageUploaderTask.java:66) ~[AWSGreengrassGreenlake-1.0-super.jar:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) [?:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
2024-08-23T22:55:16.902Z [INFO] (Copier) aws.greengrass.StreamManager: stdout. at java.lang.Thread.run(Thread.java:829) [?:?]. {scriptName=services.aws.greengrass.StreamManager.lifecycle.startup.script, serviceName=aws.greengrass.StreamManager, currentState=RUNNING}
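For reference, I pull these entries out of CloudWatch with a query roughly like the one below. This is only a sketch: the filter pattern and six-hour lookback are illustrative, and the log group name is the one quoted above (with the region placeholder left in).

```python
# Sketch: pull the recurring Stream Manager error out of the CloudWatch log group.
# The filter pattern and the six-hour lookback are illustrative, not my exact query.
import time

import boto3

logs = boto3.client("logs")

response = logs.filter_log_events(
    logGroupName="/aws/greengrass/UserComponent/[my-aws-region]/aws.greengrass.StreamManager",
    filterPattern='"InvalidStreamPositionException"',
    startTime=int((time.time() - 6 * 3600) * 1000),  # CloudWatch expects milliseconds
)
for event in response.get("events", []):
    print(event["timestamp"], event["message"])
```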
Since I posted the question I have learned that the system where Greengrass is running was reset to a Windows restore point. That does explain why Stream Manager finds its file-persistence cache in an inconsistent state, and of course any data written to disk after the restore point is lost. But I still do not understand what the implication is for the data export. Assuming the system was never offline while new data arrived, can I assume that all data has been forwarded to my configured export destination (Kinesis) despite this problem with persisting to the file cache?
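Since I can still push deployments, my current thought is to add something along these lines to my custom component so I can see how far the export has actually progressed. This is only a sketch against the Python Stream Manager SDK; error handling is omitted.

```python
# Sketch: compare each stream's newest stored sequence number with the sequence
# number last exported, using the Greengrass Stream Manager SDK from a component.
from stream_manager import StreamManagerClient

client = StreamManagerClient()  # talks to the local Stream Manager server
try:
    for stream_name in client.list_streams():
        info = client.describe_message_stream(stream_name=stream_name)
        storage = info.storage_status
        print(f"{stream_name}: stored sequence numbers "
              f"{storage.oldest_sequence_number}..{storage.newest_sequence_number}")
        for export in info.export_statuses or []:
            # A last_exported_sequence_number far behind newest_sequence_number,
            # or a populated error_message, would suggest the export is stuck.
            print(f"  export {export.export_config_identifier}: "
                  f"last exported {export.last_exported_sequence_number}, "
                  f"error message: {export.error_message!r}")
finally:
    client.close()
```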
I would expect data from streams that were processed before the corruption to have been forwarded to the destination correctly, up to roughly the point of the corruption. If you check your destination and find the last timestamp for your data, everything up to that point should be there.
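If it helps, something along these lines will walk each shard of the destination stream from a chosen timestamp and report the newest arrival time it sees. This is a rough boto3 sketch; the stream name and the 12-hour lookback are placeholders, and shard pagination and throttling are not handled.

```python
# Sketch: find the most recent record arrival time per shard in the Kinesis
# destination stream, starting shortly before the suspected corruption.
import time
from datetime import datetime, timedelta, timezone

import boto3

kinesis = boto3.client("kinesis")
stream_name = "MyKinesisStream"  # placeholder for the configured export stream
start = datetime.now(timezone.utc) - timedelta(hours=12)

for shard in kinesis.list_shards(StreamName=stream_name)["Shards"]:
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard["ShardId"],
        ShardIteratorType="AT_TIMESTAMP",
        Timestamp=start,
    )["ShardIterator"]
    latest_arrival = None
    while iterator:
        batch = kinesis.get_records(ShardIterator=iterator, Limit=1000)
        for record in batch["Records"]:
            latest_arrival = record["ApproximateArrivalTimestamp"]
        if batch.get("MillisBehindLatest", 0) == 0:
            break  # caught up to the tip of the shard
        iterator = batch.get("NextShardIterator")
        time.sleep(0.2)  # stay under the per-shard GetRecords rate limit
    print(f'{shard["ShardId"]}: newest arrival {latest_arrival}')
```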