SageMaker Data Capture does not write files


I want to enable data capture for a specific endpoint (so far, only via the console). The endpoint works fine and also logs & returns the desired results. However, no files are written to the specified S3 location.

Endpoint Configuration

The endpoint is based on a training job with a scikit-learn classifier. It has a single production variant running on an ml.m4.xlarge instance. Data capture is enabled with a sampling percentage of 100%. As the data capture storage location I tried s3://<bucket-name> as well as s3://<bucket-name>/<some-other-path>. For the "Capture content type" I tried leaving everything blank, setting text/csv under "CSV/Text", and application/json under "JSON".
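For reference, the same configuration expressed through the API would look roughly like the sketch below; the bucket, model, and endpoint-config names are placeholders, and only the DataCaptureConfig block matters here:

import boto3

sm = boto3.client("sagemaker")

# Hypothetical names -- replace with your own model, endpoint config and bucket.
sm.create_endpoint_config(
    EndpointConfigName="my-endpoint-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-sklearn-model",
        "InstanceType": "ml.m4.xlarge",
        "InitialInstanceCount": 1,
    }],
    DataCaptureConfig={
        "EnableCapture": True,
        "InitialSamplingPercentage": 100,
        "DestinationS3Uri": "s3://<bucket-name>/<some-other-path>",
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
        "CaptureContentTypeHeader": {
            "CsvContentTypes": ["text/csv"],
            "JsonContentTypes": ["application/json"],
        },
    },
)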

Endpoint Invocation

The endpoint is invoked from a Lambda function with a boto3 SageMaker Runtime client. Here's the call:

# json is imported at module level; self.client is a boto3 "sagemaker-runtime"
# client created elsewhere in the class.
sagemaker_body_source = {
    "segments": segments,
    "language": language,
}
payload = json.dumps(sagemaker_body_source).encode()
response = self.client.invoke_endpoint(EndpointName=endpoint_name,
                                       Body=payload,
                                       ContentType='application/json',
                                       Accept='application/json')
result = json.loads(response['Body'].read().decode())
return result["predictions"]

Internally, the endpoint container runs a Flask API with an /invocations route that returns the result.
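For context, a minimal sketch of what that handler looks like; predict() stands in for the actual scikit-learn inference code, which isn't shown here:

import json
from flask import Flask, Response, request

app = Flask(__name__)

@app.route("/ping", methods=["GET"])
def ping():
    # Health check required by SageMaker hosting.
    return Response(status=200)

@app.route("/invocations", methods=["POST"])
def invocations():
    body = request.get_json(force=True)
    # predict() is a placeholder for the real model inference.
    predictions = predict(body["segments"], body["language"])
    return Response(json.dumps({"predictions": predictions}),
                    mimetype="application/json")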

Logs

The endpoint itself works fine and the Flask API is logging input and output:

INFO:api:body: {'segments': [<strings...>], 'language': 'de'}
INFO:api:output: {'predictions': [{'text': 'some text', 'label': 'some_label'}, ....]}
2 Answers
Accepted Answer

The issue turned out to be related to the IAM role. The default role (ModelEndpoint-Role) does not have permission to write files to S3. Capture worked via the SDK because it uses a different role in SageMaker Studio. I did not receive any error message about this.
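For anyone hitting the same problem, one fix is to attach an inline policy to the endpoint's role that allows writing to the capture bucket. A minimal sketch, where the role name comes from above and the bucket name is a placeholder you should replace:

import json
import boto3

iam = boto3.client("iam")

# Placeholder bucket -- use the bucket from your DestinationS3Uri.
capture_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:PutObject"],
        "Resource": "arn:aws:s3:::<bucket-name>/*",
    }],
}

iam.put_role_policy(
    RoleName="ModelEndpoint-Role",
    PolicyName="AllowDataCaptureWrites",
    PolicyDocument=json.dumps(capture_policy),
)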

Richard
answered 2 years ago

I had the same issue, and it was also related to IAM roles, but with a slight variation. When creating my SageMaker domain I created a role with the suggested AmazonSageMakerFullAccess policy attached. This role does have permission to write to S3 buckets, but only to those matching one of the patterns:

"arn:aws:s3:::*SageMaker*",
"arn:aws:s3:::*Sagemaker*",
"arn:aws:s3:::*sagemaker*",
"arn:aws:s3:::*aws-glue*"

As my bucket name did not match any of these patterns, the role wasn't allowed to write to it. After renaming my bucket to contain "sagemaker", data capture worked.
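To confirm that capture is actually working after a change like this, you can list the destination prefix; SageMaker writes the capture files under <destination>/<endpoint-name>/<variant-name>/YYYY/MM/DD/HH/. A quick check, with placeholder bucket, prefix and endpoint name:

import boto3

s3 = boto3.client("s3")

# Placeholder bucket/prefix -- adjust to your DestinationS3Uri and endpoint name.
response = s3.list_objects_v2(
    Bucket="my-sagemaker-bucket",
    Prefix="datacapture/my-endpoint/",
)
for obj in response.get("Contents", []):
    print(obj["Key"])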

answered 3 months ago
