Batch Transformation Mismatch

I am performing a batch transformation in a pipeline and joining its output with the ground-truth labels to create a ModelQuality monitoring step, following the guide here: https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-pipelines/tabular/model-monitor-clarify-pipelines/sagemaker-pipeline-model-monitor-clarify-steps.ipynb

Overview:

  • Using scikit-learn pipeline tools for data handling and prediction (i.e. the sklearn `pipeline` module)
  • Using my own inference script (based on the build-your-own example provided by AWS)
  • Reading in a CSV file of data, which successfully loads and predicts labels (output ex: ['an01' 'hxn2' 'sv4' ... 'ngn' 'ssv' 'ssv'])
  • The transformation step is unsuccessful and produces the following message:

2022-11-08T16:22:59.531:[sagemaker logs]: sagemaker-us-east-1-766029086407/TEST-monitor-steps/cipm6ea9musp/projectKW-V3-Tune-TrainRegister-process/output/test/test.csv: Fail to join data: mismatched line count between the input and the output

I am having trouble understanding whether the TransformInput configuration provided to the step is wrong or whether there is an error in the inference script itself.

Below is the code for the transformer step:

transformer = Transformer(
    model_name=step_create_model.properties.ModelName,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept="text/csv",
    assemble_with="Line",
    output_path=f"s3://{bucket}/Transform",
    sagemaker_session=sagemaker_session
)

step_transform = TransformStep(
    name="TransformStep",
    transformer=transformer,
    inputs=TransformInput(
        data=preprocessing_step.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
        join_source="Input",
        content_type="text/csv",
        split_type="Line",
    ),
)

Below is the code for the output_fn in the inference script:

import json

from sagemaker_containers.beta.framework import encoders, worker


def output_fn(prediction, accept):
    if accept == "application/json":
        instances = []
        for row in prediction.tolist():
            instances.append({"features": row})

        json_output = {"instances": instances}

        return worker.Response(json.dumps(json_output), mimetype=accept)
    elif accept == 'application/x-npy':
        return worker.Response(encoders.encode(prediction, accept), mimetype=accept)
    elif accept == 'text/csv':
        return worker.Response(encoders.encode(prediction, accept), mimetype=accept)
    else:
        # RuntimeException is not a Python built-in; RuntimeError is
        raise RuntimeError("{} accept type is not supported by this script.".format(accept))
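One common cause of the "mismatched line count" join failure is a 1-D prediction array serializing to a single CSV line instead of one line per record. A minimal sketch of a fix (assuming `prediction` is a numpy array; `to_csv_lines` is a hypothetical helper, not part of the SageMaker SDK) is to force a column shape before encoding in the `text/csv` branch:

```python
import numpy as np

def to_csv_lines(prediction):
    # Hypothetical helper: force a 2-D column shape so each record
    # serializes to its own CSV line ("a\nb\nc") rather than a
    # single line with many fields ("a,b,c").
    arr = np.asarray(prediction)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)
    return "\n".join(",".join(map(str, row)) for row in arr.tolist())

print(to_csv_lines(np.array(["ngn", "ssv", "ssv"])))  # three lines, one per record
```

With one output line per input record, batch transform can match lines between input and output and the join should succeed.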

Thanks for the help!

2 Answers

Because this is an account-specific issue, and one involving your own custom inference script, investigating the error you have provided in depth requires non-public information. Please open a support case with AWS using the following link, and our engineers will be glad to assist you further.

In your support case, kindly include the following details:

  1. BatchTransform Job ARN
  2. The overview you have included in your question for context.
  3. The inference script.
  4. CloudWatch BatchTransform job logs (say, 10 lines before the error and 5 lines after). These can be found in the "/aws/sagemaker/TransformJobs" log group, as described in [1].

[1] Log Amazon SageMaker Events with Amazon CloudWatch - https://docs.aws.amazon.com/sagemaker/latest/dg/logging-cloudwatch.html

AWS
answered a year ago

Agree that this sounds like it'd need some specific troubleshooting and good to raise with support.

However my initial guesses would be:

  • If your model output array is 1D, it might be getting serialized as 1,2,3,4,... instead of 1\n2\n3\n4\n... - in which case batch transform would see the output as one record with many fields, instead of many records with a single field each. Try adding the extra dimension / ensuring the CSV output format for a batched request is what you expect.
  • If you're trying to accept/ignore header rows in input batches, remember you need to return the same number of output rows. In general, the model should return the same number of records as the batch sent in.
  • If you're urgently struggling with this, you could try setting strategy="SingleRecord", which ensures each request sends only a single record, so there's less to go wrong. Note, however, that it'll be less resource-efficient (more HTTP/request overhead relative to actual payload than the default batched approach).
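The SingleRecord fallback from the last bullet would look like this on the Transformer from the question (a config sketch, not tested end-to-end; all other arguments unchanged):

```python
transformer = Transformer(
    model_name=step_create_model.properties.ModelName,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    strategy="SingleRecord",  # one record per request: slower, but simpler to debug
    accept="text/csv",
    assemble_with="Line",
    output_path=f"s3://{bucket}/Transform",
    sagemaker_session=sagemaker_session,
)
```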
AWS
EXPERT
Alex_T
answered a year ago
