How to create a custom inference script with the SageMaker SDK that calls a custom predict function (a combination of a rule-based method and an ML prediction) after model training?


I am currently working on a k-means clustering algorithm for my dataset. So far I have created a preprocess.py that preprocesses my data and stores it in an S3 bucket, plus a training step invoked via the Estimator SDK.

input_data = ParameterString(
    name="InputDataUrl",
    default_value="s3://ml-pipeline-jobs/input_files/mydata.csv",
)

# processing step for feature engineering
sklearn_processor = SKLearnProcessor(
    framework_version="0.23-1",
    instance_type=processing_instance_type,
    instance_count=processing_instance_count,
    base_job_name=f"{base_job_prefix}/sklearn-billofwork-preprocess",
    sagemaker_session=pipeline_session,
    role=role,
)
step_args = sklearn_processor.run(
    outputs=[
        ProcessingOutput(output_name="train_preprocessed", source="/opt/ml/processing/train/data_final"),
    ],
    code=os.path.join(BASE_DIR, "preprocess.py"),
    arguments=["--input-data", input_data],
)
step_process = ProcessingStep(
    name="PreprocessBillOfWorkData",
    step_args=step_args,
)

image_uri = sagemaker.image_uris.retrieve(
    framework="kmeans",
    region=region,
    py_version="py3",
    instance_type=training_instance_type,
)
kmeans = Estimator(
    image_uri=image_uri,
    sagemaker_session=pipeline_session,
    role=role,
    instance_type=training_instance_type,
    instance_count=1,
)
kmeans.set_hyperparameters(k=40, feature_dim=27295)

step_args_preprocess = TrainingInput(
    # Note: the key must match the ProcessingOutput name above ("train_preprocessed")
    s3_data=step_process.properties.ProcessingOutputConfig.Outputs["train_preprocessed"].S3Output.S3Uri,
    content_type="text/csv",
)

step_train = TrainingStep(
    name="TrainBowModel",
    estimator=kmeans,
    inputs={
        "train": step_args_preprocess,
    },
)

Now I would like to add a step that accepts part of the output data from step_process, and also accepts a .py file that performs some additional preprocessing on that new data and then calls .predict.

I was able to get as far as the training step using the AWS SDKs, but I am not sure how to proceed from there. I read about how inference is done with the AWS SDK and it seems there are four different options (real-time, serverless, asynchronous, and batch), but I am unsure which one suits my type of problem.

Kindly guide me. Thanks.

1 Answer

To implement custom prediction behavior, you can use SageMaker "script mode": you supply an inference script whose handler functions (model_fn, input_fn, predict_fn, output_fn) the serving container calls, and you can put any logic you like, rule-based or otherwise, inside predict_fn.
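As a sketch, an inference script could override predict_fn to combine a rule with the model's cluster assignment. Everything specific here is a hypothetical placeholder: the joblib artifact format, the distance threshold, and the "label far-away points as -1" rule; the built-in KMeans algorithm stores its artifact in its own format, so model_fn must be adapted to however your model was actually saved.

```python
# inference.py -- custom handlers loaded by a SageMaker script-mode container.
# The function names (model_fn, input_fn, predict_fn, output_fn) are the
# script-mode contract; the rule itself is an illustrative example.
import json
import math
import os


def model_fn(model_dir):
    """Load the trained model (assumption: artifact was saved with joblib)."""
    import joblib
    return joblib.load(os.path.join(model_dir, "model.joblib"))


def input_fn(request_body, request_content_type):
    """Parse a CSV request body into a list of float rows."""
    if request_content_type == "text/csv":
        return [[float(v) for v in line.split(",")]
                for line in request_body.strip().splitlines()]
    raise ValueError(f"Unsupported content type: {request_content_type}")


def predict_fn(rows, model, distance_threshold=10.0):
    """Combine a rule-based check with the ML prediction.

    Hypothetical rule: a point farther than distance_threshold from its
    assigned centroid is labelled -1 (outlier) instead of a cluster id.
    """
    labels = model.predict(rows)
    results = []
    for row, label in zip(rows, labels):
        centroid = model.cluster_centers_[label]
        dist = math.dist(row, centroid)
        results.append(-1 if dist > distance_threshold else int(label))
    return results


def output_fn(prediction, accept="application/json"):
    """Serialize the combined predictions as JSON."""
    return json.dumps(prediction)
```

Because the handlers are plain functions, the rule-plus-model logic can be unit-tested locally with a stub model before deploying.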

You can pass that script via the "entry_point" argument of a Model object, and add the model to your pipeline with a ModelStep.
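A rough sketch of the pipeline wiring, assuming an inference script named inference.py with custom handlers (the script name, instance type, and step name are illustrative, not prescribed):

```python
# Sketch: wrap the trained artifacts in a model that runs a custom
# inference script, then register it in the pipeline with a ModelStep.
from sagemaker.sklearn.model import SKLearnModel
from sagemaker.workflow.model_step import ModelStep

model = SKLearnModel(
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    role=role,
    entry_point="inference.py",   # script with model_fn/predict_fn handlers
    framework_version="0.23-1",
    sagemaker_session=pipeline_session,
)

step_create_model = ModelStep(
    name="CreateBowModel",
    step_args=model.create(instance_type="ml.m5.large"),
)
```

For scoring a batch of preprocessed data (your use case of feeding step_process output through .predict), the usual next step is a TransformStep using a Transformer built from this model; for on-demand predictions you would instead deploy the model to a real-time endpoint.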

AWS
answered 2 months ago
