Create ML job V2 bug

I'm trying to move my workflow from the excellent SageMaker Studio to SageMaker notebooks so that I have more granular control over the ML models I build. I tried creating Autopilot jobs with the create_auto_ml_job() function, but every job failed after 4-5 minutes without producing any logs. To work around this, I tried the create_auto_ml_job_v2() function instead, following the API reference closely.

My data is tabular, but when I call the function like so:

client.create_auto_ml_job_v2(
    AutoMLJobName=job_name,
    AutoMLJobInputDataConfig=input_data_config,
    OutputDataConfig=output_data_config,
    AutoMLProblemTypeConfig={
        'TabularJobConfig': {
            'CompletionCriteria': {
                'MaxCandidates': 10,
                'MaxRuntimePerTrainingJobInSeconds': 600,
                'MaxAutoMLJobRuntimeInSeconds': 1800,
            },
            'FeatureSpecificationS3Uri': f's3://{bucket}/{prefix}/{feature_file}',
            'Mode': 'AUTO',
            'GenerateCandidateDefinitionsOnly': True,
            'ProblemType': 'MulticlassClassification',
            'TargetAttributeName': 'Y',
        }
    },
    RoleArn=role,
)

I get this error:

ParamValidationError: Parameter validation failed:
Unknown parameter in AutoMLProblemTypeConfig: "TabularJobConfig", must be one of: ImageClassificationJobConfig, TextClassificationJobConfig

According to the API documentation, TabularJobConfig is a valid parameter, so I'm struggling to see why this error is being thrown.

mluser
asked 10 months ago, 265 views
1 Answer
Hi there,

The documentation is correct. The error is most likely caused by an older version of boto3 that does not yet support all of the features of create_auto_ml_job_v2; the ParamValidationError is raised by the SDK's client-side validation before any request is sent. Please make sure boto3 is updated to version 1.28.2 by running the following command in a new cell in your notebook, and then restart the kernel:

!pip install boto3==1.28.2
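
If you want to confirm that the upgrade took effect after restarting the kernel, a check along these lines may help. It is only a sketch: the helper names (parse_version, is_recent_enough, supported_problem_type_configs, report) are made up for illustration, and it assumes the operation is registered in the SDK's service model as CreateAutoMLJobV2. It compares the installed boto3 version against 1.28.2 and lists the keys that the installed SDK accepts under AutoMLProblemTypeConfig, which is the same list quoted in the ParamValidationError.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_BOTO3 = "1.28.2"  # the version recommended in this answer


def parse_version(text):
    """Convert '1.28.2' to (1, 28, 2), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split(".")[:3]:
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        parts.append(int(leading) if leading else 0)
    return tuple(parts)


def is_recent_enough(installed, minimum=MIN_BOTO3):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def supported_problem_type_configs(client):
    """List the keys the installed SDK accepts under AutoMLProblemTypeConfig."""
    # Assumption: the operation name in the service model is "CreateAutoMLJobV2".
    operation = client.meta.service_model.operation_model("CreateAutoMLJobV2")
    config_shape = operation.input_shape.members["AutoMLProblemTypeConfig"]
    return sorted(config_shape.members)


def report():
    """Print the installed boto3 version and the supported problem type configs."""
    try:
        import boto3
        installed = version("boto3")
    except (ImportError, PackageNotFoundError):
        print("boto3 is not installed in this environment")
        return
    verdict = "OK" if is_recent_enough(installed) else "older than " + MIN_BOTO3
    print("boto3", installed, "->", verdict)
    client = boto3.client("sagemaker")
    print("Accepted keys:", supported_problem_type_configs(client))
```

Calling report() in a notebook cell prints the version and the accepted keys. If TabularJobConfig is missing from that list, the kernel is probably still using the old package.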

I have tested this with one of our sample notebooks and the AutoML job started without issue. See the sample code below:

from time import gmtime, strftime

import boto3

# bucket, prefix, target, and role are expected to be defined earlier in the notebook.

# Location of the training data in S3
input_data_config = [
    {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://{}/{}/train".format(bucket, prefix),
            }
        },
    }
]

# Tabular job restricted to XGBoost and LightGBM, run in ensembling mode
job_config = {
    "TabularJobConfig": {
        "CandidateGenerationConfig": {
            "AlgorithmsConfig": [
                {"AutoMLAlgorithms": ["xgboost", "lightgbm"]},
            ]
        },
        "TargetAttributeName": target,
        "Mode": "ENSEMBLING",
    }
}

output_data_config = {"S3OutputPath": "s3://{}/{}/output".format(bucket, prefix)}

# Timestamp suffix keeps the job name unique across runs
timestamp_suffix = strftime("%Y%m%d-%H-%M", gmtime())

auto_ml_job_name = "automl-housing-" + timestamp_suffix
print("AutoMLJobName: " + auto_ml_job_name)

client = boto3.client("sagemaker")

client.create_auto_ml_job_v2(
    AutoMLJobName=auto_ml_job_name,
    AutoMLJobInputDataConfig=input_data_config,
    OutputDataConfig=output_data_config,
    AutoMLProblemTypeConfig=job_config,
    # Uncomment to automatically deploy an endpoint
    # ModelDeployConfig={
    #     'AutoGenerateEndpointName': True,
    #     'EndpointName': 'autopilot-DEMO-housing-' + timestamp_suffix
    # },
    RoleArn=role,
)
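
Since the earlier jobs failed without any logs, it may also be worth polling the job after it starts and printing the failure reason if there is one. The following is a rough sketch rather than tested sample code: wait_for_automl_job and its poll_seconds and max_polls parameters are hypothetical names chosen for illustration, and it relies on describe_auto_ml_job_v2 returning AutoMLJobStatus, AutoMLJobSecondaryStatus and, for failed jobs, FailureReason.

```python
import time

# Statuses after which the job will not change any further
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_automl_job(client, job_name, poll_seconds=30, max_polls=120):
    """Poll an AutoML V2 job until it reaches a terminal status.

    Returns the last describe response, or raises TimeoutError if the job
    is still running after max_polls checks.
    """
    for _ in range(max_polls):
        response = client.describe_auto_ml_job_v2(AutoMLJobName=job_name)
        status = response["AutoMLJobStatus"]
        secondary = response.get("AutoMLJobSecondaryStatus", "")
        print("Status:", status, "/", secondary)
        if status in TERMINAL_STATUSES:
            if status == "Failed":
                # FailureReason may be absent, so fall back to a placeholder
                print("FailureReason:", response.get("FailureReason", "not provided"))
            return response
        time.sleep(poll_seconds)
    raise TimeoutError("Job " + job_name + " did not finish within the polling limit")
```

With the client and job name from the sample above, the call would be wait_for_automl_job(client, auto_ml_job_name).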
AWS
SUPPORT ENGINEER
answered 10 months ago
