aws-greengrass-stream-manager-sdk-python causes stream_manager.exceptions.InvalidRequestException: File does not exist at path


I am using aws-greengrass-stream-manager-sdk-python:

    if not self.client:
        self.client = StreamManagerClient()

I check the status of the file and it is good to go:

    self.logger.debug(f"appending file is: file://{file}. isfile={os.path.isfile(file)}. sufficient rwx:{os.access(file, os.R_OK)} {os.access(file, os.W_OK)} {os.access(file, os.X_OK)} access.")

Produces:

    isfile=True. sufficient rwx:True True True access.

Yet:

    # Append the S3 Task definition to the stream.
    print(f"Appending S3 Task Definition to stream {self.stream_name} with payload length: {len(payload)}")
    try:
        sequence_number = self.client.append_message(self.stream_name, payload)
    except Exception as e:
        print(f"Exception: {e}")
        assert False, "This is a test"
    self.logger.info(
        f"Successfully appended S3 Task Definition to stream with sequence number {sequence_number}."
    )

Produces:

    stream_manager.exceptions.InvalidRequestException: File does not exist at path < file name >

I got the same error from the sample code. https://github.com/aws-greengrass/aws-greengrass-stream-manager-sdk-python/blob/main/samples/stream_manager_s3.py

asked 4 months ago, 164 views
1 Answer
Accepted Answer

Hello,

Just because you have access to the file does not necessarily mean that Stream manager has access to the file, given the error you're seeing. Try writing the file to a guaranteed permissions-open path such as /tmp.

Take note that on Linux, file system permissions are required for every part of the path, not just the file. If you write a file to a/b/c, then the reader needs execute permission on directory a and directory b and then read permission on the file c.

Unless you have changed the configuration, Stream manager will be running as the user ggc_user, so that user must have the appropriate permissions to read the files.

You may change the user that a component runs as in a deployment by setting the runWith option. https://docs.aws.amazon.com/greengrass/v2/developerguide/create-deployments.html
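As an illustration of the runWith option, here is a hedged sketch that builds the `components` map for a deployment and, optionally, submits it with boto3. The component name `com.example.FileWriter`, the version, and the `ggc_user:ggc_group` value are assumptions chosen for the example. Check the linked deployment documentation for the exact fields your setup needs.

```python
def build_components(component_name, component_version, posix_user):
    """Build the 'components' map of a deployment with a runWith override.

    posix_user is given as "user" or "user:group".
    """
    return {
        component_name: {
            "componentVersion": component_version,
            "runWith": {"posixUser": posix_user},
        }
    }


def create_deployment(target_arn, components):
    """Submit the deployment; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_components works without boto3

    client = boto3.client("greengrassv2")
    return client.create_deployment(targetArn=target_arn, components=components)


# Hypothetical component that writes the files Stream manager later reads.
components = build_components("com.example.FileWriter", "1.0.0", "ggc_user:ggc_group")
```

Running the file-writing component as the same user as Stream manager is one way to keep the files readable. Opening up the directory permissions is another.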

Cheers,

Michael

AWS
EXPERT
answered 4 months ago
EXPERT
reviewed 4 months ago
  • Thank you for such a quick response. To clarify:

    There are three containers:

    1. The first container writes the file to a mount.
    2. The second container reads the file from the mount.
    3. Greengrass Nucleus runs in the third container.

    The third container is listening on port 8088. It does not need access to the file, right? Stream manager is running in the Nucleus. I bet port 8088 is just for signaling and the third container needs access to the file.

    The permissions of the directory on the host are "drwxr-xr-x". The SDK is in the second container. Inside the container the directory permissions are "drwxr-xr-x." The container is running as privileged and as root.

  • Containers are going to make this extremely complex. I cannot recommend that you do that.

    You can probably make this work, but I won't be able to help you with that. Just keep in mind what users exist both in and out of the container, what volumes are mounted, and what permissions are for those mounts and the files within.

    Files are not transferred over the HTTP connection (port 8088), that's why you provide a file path and not the file itself. Stream manager needs access to the file path that you tell it to read.
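Because only the path travels over the connection, the path in the task definition has to be valid where Stream manager runs, not where the SDK runs. Below is a minimal sketch of that idea for a multi-container setup. The mount points and the helper names are hypothetical, and it assumes the same volume is mounted in both containers. The `S3ExportTaskDefinition` and `Util.validate_and_serialize_to_json_bytes` calls follow the sample linked in the question.

```python
from pathlib import PurePosixPath


def to_stream_manager_url(local_path, local_mount, stream_manager_mount):
    """Rewrite a path under local_mount into a file URL as Stream manager sees it.

    Raises ValueError if local_path is not under local_mount.
    """
    relative = PurePosixPath(local_path).relative_to(local_mount)
    return (PurePosixPath(stream_manager_mount) / relative).as_uri()


def append_s3_task(client, stream_name, input_url, bucket, key):
    """Append an S3 export task; only the URL is sent, never the file contents."""
    # Imported lazily so the path helper above works without the SDK installed.
    from stream_manager import S3ExportTaskDefinition
    from stream_manager.util import Util

    task = S3ExportTaskDefinition(input_url=input_url, bucket=bucket, key=key)
    payload = Util.validate_and_serialize_to_json_bytes(task)
    return client.append_message(stream_name, payload)


# Hypothetical mounts: the SDK container sees the volume at /data,
# the Nucleus container sees the same volume at /greengrass/uploads.
url = to_stream_manager_url("/data/out/a.bin", "/data", "/greengrass/uploads")
```

If both containers mount the volume at the same location, no rewriting is needed. The file still has to be readable by the user Stream manager runs as.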

  • Perfect, it worked! The Nucleus component needs access to the file, as you confirmed above. If you could influence the documentation, which is already good, it just needs to make two more things clear:

    1. The documentation says the Nucleus component controls the life cycle of the Stream Manager component, but it does not say that Nucleus runs it. That would be helpful. I learned it from "ps -ef --forest".
    2. The documentation should say that the custom component using the Stream Manager client (SDK) is only signaling Stream Manager, and that Stream Manager must have access to the file.

    And as a bonus: https://github.com/aws-greengrass/aws-greengrass-stream-manager-sdk-python/tree/main is the SDK to use, not https://github.com/aws/aws-iot-device-sdk-python-v2 or https://github.com/aws/aws-greengrass-core-sdk-python

  • Glad it is working for you!

    1. Nucleus runs all components. That's the point of Nucleus. If you want to simplify more, Nucleus is Greengrass.
    2. The SDK at https://github.com/aws/aws-greengrass-core-sdk-python is acceptable; it includes the stream manager SDK as part of it. The documentation links to the SDK here: https://docs.aws.amazon.com/greengrass/v2/developerguide/manage-data-streams.html#stream-manager-requirements. Exporting to S3 is described here: https://docs.aws.amazon.com/greengrass/v2/developerguide/stream-export-configurations.html#export-to-s3. It states: "If a Docker container component writes input files to an input file directory, you must mount the directory as a volume in the container with write permissions."
