SageMaker Model Monitor Data Quality on Random Cut Forest configuration issues


I am trying to set up a Random Cut Forest model with a Data Quality monitoring job attached. I managed to train and deploy the model with the data capture feature enabled.

# Training
rcf = sagemaker.RandomCutForest(
    role=role,
    instance_count=1,
    instance_type='ml.m4.xlarge',
    data_location=f"s3://{BUCKET}/random_cut_forest/input",
    output_path=f's3://{BUCKET}/random_cut_forest/output',
    num_samples_per_tree=1024,
    num_trees=50,
    serializer=JSONSerializer(),
    deserializer=CSVDeserializer()
)
rs = rcf.record_set(df_multi_measurements.drop("datetime", axis=1).to_numpy())
rcf.fit(rs, wait=False)
# Deploy
data_capture_config = DataCaptureConfig(
    enable_capture=True, 
    sampling_percentage=100, 
    destination_s3_uri=s3_capture_upload_path
)
rcf_inference = rcf.deploy(
    initial_instance_count=1, 
    instance_type='ml.m4.xlarge',
    endpoint_name=ENDPOINT_NAME,
    data_capture_config=data_capture_config,
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer(),
    )
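For reference, `CSVSerializer` sends each feature vector as a plain comma-separated line, which is why the captured input below shows `"encoding":"CSV"`. A stdlib-only sketch of the request body (the real serialization happens inside `predictor.predict()`; `to_csv_row` is just an illustrative helper, not part of the SDK):

```python
def to_csv_row(features):
    """Join one feature vector into a single CSV line, as a CSV serializer would."""
    return ",".join(str(float(f)) for f in features)

row = to_csv_row([4.15, 3.33])
print(row)  # -> 4.15,3.33
```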

Then, I configured and started the Model Monitor job:

# Model Monitor
my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    volume_size_in_gb=5,
    max_runtime_in_seconds=3600
)

my_default_monitor.suggest_baseline(
    baseline_dataset=baseline_data_uri + "/df_multi_measurements.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,
    wait=True,
    logs=False
)

my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=rcf_inference.endpoint,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)

But at the first run of the job I got this error:

Error: Encoding mismatch: Encoding is CSV for endpointInput, but Encoding is JSON for endpointOutput. We currently only support the same type of input and output encoding at the moment.

Data captured looked like:

{"captureData":{"endpointInput":{"observedContentType":"text/csv","mode":"INPUT","data":"4.150000013333333,3.330000003333333,...","encoding":"CSV"},"endpointOutput":{"observedContentType":"application/json","mode":"OUTPUT","data":"{\"scores\":[{\"score\":0.5794829282}]}","encoding":"JSON"}},"eventMetadata":{"eventId":"79add993-68cf-4903-9dfe-8275d164496f","inferenceTime":"2023-03-17T14:10:08Z"},"eventVersion":"0"}
...
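The mismatch can be confirmed programmatically by parsing one captured JSON-Lines record and comparing the two encodings. A stdlib-only sketch, with the field names taken from the capture sample above (the shortened `sample` string is illustrative, not a real capture line):

```python
import json

def capture_encodings(capture_line):
    """Return (input_encoding, output_encoding) for one data-capture record."""
    record = json.loads(capture_line)["captureData"]
    return (record["endpointInput"]["encoding"],
            record["endpointOutput"]["encoding"])

# Abbreviated stand-in for a real capture line.
sample = ('{"captureData":{"endpointInput":{"observedContentType":"text/csv",'
          '"mode":"INPUT","data":"4.15,3.33","encoding":"CSV"},'
          '"endpointOutput":{"observedContentType":"application/json",'
          '"mode":"OUTPUT","data":"{\\"scores\\":[{\\"score\\":0.57}]}",'
          '"encoding":"JSON"}},"eventVersion":"0"}')
print(capture_encodings(sample))  # -> ('CSV', 'JSON'): the mismatch Model Monitor rejects
```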

Later I tried to force both the input and the output to CSV, but with no luck.

After some tinkering, I managed to instruct data capture to collect requests in JSON only. Since I couldn't change the output format, data capture now records both input and output in the same (JSON) form.

The JSON requests look like this:

{
    "instances": [
        {
            'features': [3.8600000533333336, 3.5966666533333336...]
        }, 
        ...
    ]
}
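Building that request body from a plain feature matrix is straightforward. A stdlib-only sketch, where the `instances`/`features` keys mirror the request shape shown above (`build_rcf_request` is an illustrative helper, not an SDK function):

```python
import json

def build_rcf_request(rows):
    """Wrap a list of feature vectors in the JSON shape the endpoint accepts."""
    return json.dumps({"instances": [{"features": list(r)} for r in rows]})

payload = build_rcf_request([[3.86, 3.59], [4.15, 3.33]])
```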

and the model works correctly, returning its predictions:

b'{"scores":[{"score":0.6015237349},...]}'

Data captured now looks like:

{"captureData":{"endpointInput":{"observedContentType":"application/json","mode":"INPUT","data":"{\"instances\": [{\"features\": [3.8600000533333336, 3.5966666533333336, ...]}]}","encoding":"JSON"},"endpointOutput":{"observedContentType":"application/json","mode":"OUTPUT","data":"{\"scores\":[{\"score\":0.6015237349},{\"score\":0.4439660733},{\"score\":0.5100689867},{\"score\":0.5456048291},{\"score\":0.5099260466}]}","encoding":"JSON"}},"eventMetadata":{"eventId":"27e2c9cd-3301-419c-8d06-9ede4c6380e6","inferenceTime":"2023-03-17T17:10:18Z"},"eventVersion":"0"}

BUT... at the first run of this new configuration, the job fails in the data analysis part.

So, after some searching, I found that Model Monitor only works with tabular data or flat JSON, so I added a preprocessing step to the Model Monitor job: https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-and-post-processing.html

The preprocessing script looks like this:

import json
import random

"""
{
    "instances": [
        {
            'features': [3.8600000533333336, 3.5966666533333336...]
        }
        ...
    ]
}
"""

def preprocess_handler(inference_record):
    input_record = inference_record.endpoint_input.data
    print(input_record)
    input_record_dict = json.loads(input_record)
    
    features = input_record_dict["instances"][0]['features']
    
    return { str(i).zfill(20) : d for i, d in enumerate(features) }
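To sanity-check the handler without deploying anything, the record object can be stubbed locally, since the script only accesses `inference_record.endpoint_input.data`. Note that this handler flattens only `instances[0]`, so a capture record carrying several instances (the output above shows five scores) still yields a single flattened row. The `SimpleNamespace` stub below is made up for local testing and is not part of the Model Monitor API:

```python
import json
from types import SimpleNamespace

def preprocess_handler(inference_record):
    input_record = inference_record.endpoint_input.data
    input_record_dict = json.loads(input_record)
    features = input_record_dict["instances"][0]["features"]
    return {str(i).zfill(20): d for i, d in enumerate(features)}

# Stub mimicking the attribute access the real handler performs.
stub = SimpleNamespace(endpoint_input=SimpleNamespace(
    data=json.dumps({"instances": [{"features": [3.86, 3.59]},
                                   {"features": [4.15, 3.33]}]})))
flat = preprocess_handler(stub)
print(flat)  # -> {'00000000000000000000': 3.86, '00000000000000000001': 3.59}
```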

And now, at the first run, I again get an error, which this time I cannot make sense of at all:

2023-03-17 18:08:46,326 ERROR Main: No usable value for features
2023-03-17T19:08:46.935+01:00	No usable value for completeness
2023-03-17T19:08:46.935+01:00	Did not find value which can be converted into double

At this stage I feel a bit stuck. How can this be fixed? RCF and Model Monitor should be easier to integrate, in my opinion.

What am I doing wrong?

Asked a year ago · 181 views
No answers
