Integrate Sklearn Processing Step in Inference Pipeline

Hello,

I am having trouble integrating a fitted sklearn preprocessor/estimator into my SageMaker pipeline. I define each step in its own function, as follows:

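from sagemaker.network import NetworkConfig
from sagemaker.processing import FrameworkProcessor, ProcessingOutput
from sagemaker.sklearn.estimator import SKLearn
from sagemaker.workflow.parameters import ParameterInteger
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.steps import ProcessingStep


def _get_step_preprocess(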
    pipeline_session: PipelineSession,
    processing_instance_count: ParameterInteger,
    role: str,
    # input_data_uri: ParameterString,
    subnet_id: str,
    security_group_id: str,
) -> ProcessingStep:
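    """
    Step 1: preprocess the data as the first step of the pipeline.

    Args:
        pipeline_session (PipelineSession): Session that defers job execution to the pipeline
        processing_instance_count (ParameterInteger): Number of instances
        role (str): SageMaker execution role
        subnet_id (str): Subnet the processing job runs in
        security_group_id (str): Security group attached to the processing job

    Returns:
        ProcessingStep: The configured preprocessing step
    """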

    network_config = NetworkConfig(
        enable_network_isolation=False,
        security_group_ids=[security_group_id],
        subnets=[subnet_id],
        encrypt_inter_container_traffic=True,
    )

    sklearn_processor = FrameworkProcessor(
        estimator_cls=SKLearn,
        framework_version="1.0-1",
        instance_count=processing_instance_count,
        instance_type="ml.m5.xlarge",
        sagemaker_session=pipeline_session,
        base_job_name="name",
        role=role,
        network_config=network_config,
    )

    processor_args = sklearn_processor.run(
        inputs=[],
        outputs=[
            ProcessingOutput(output_name="train", source="/opt/ml/processing/train"),
            ProcessingOutput(output_name="validation", source="/opt/ml/processing/validation"),
            ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
            ProcessingOutput(output_name="encoder", source="/opt/ml/processing/encoder"),
        ],
        code="main.py",
        source_dir="../sagemaker/step_preprocess",
    )

    step_preprocess = ProcessingStep(name="BankingSecondaryRejectionPreprocess", step_args=processor_args)

    return step_preprocess

I have seen in several examples that, besides executing a script as in the step above, it is also possible to fit a sklearn preprocessor and integrate it into the final pipeline model, and therefore into the inference endpoint. One example I came across is this: https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-python-sdk/scikit_learn_inference_pipeline/Inference%20Pipeline%20with%20Scikit-learn%20and%20Linear%20Learner.html

Nevertheless, I am not able to integrate the sklearn estimator from that example into the preprocessing step defined above. What is the right way to do this? Is it even possible? ProcessingStep does not seem to accept a fitted estimator as an argument.
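
For concreteness, one possible wiring can be sketched as follows, loosely based on the pattern in the linked example: the preprocessor is fitted in a training job rather than in the ProcessingStep, and the resulting artifact is chained in front of the actual model. This is an untested sketch under several assumptions. The function name, the step names, the script featurizer.py and its source directory are hypothetical. The script is assumed to fit the preprocessor, save it to the model directory, and define the inference hooks, as the featurizer script in the linked example does. downstream_model is assumed to be an already defined SageMaker model created with the same pipeline session.

```python
def build_inference_pipeline_steps(pipeline_session, role, step_preprocess, downstream_model):
    """Fit a sklearn preprocessor in a training job and chain it before another model.

    Returns the TrainingStep that fits the preprocessor and the ModelStep that
    creates the combined inference pipeline model.
    """
    # SDK imports are kept local so the function can be defined without the SDK installed.
    from sagemaker.inputs import TrainingInput
    from sagemaker.pipeline import PipelineModel
    from sagemaker.sklearn.estimator import SKLearn
    from sagemaker.sklearn.model import SKLearnModel
    from sagemaker.workflow.model_step import ModelStep
    from sagemaker.workflow.steps import TrainingStep

    # S3 location of the "train" output produced by the preprocessing step.
    train_uri = step_preprocess.properties.ProcessingOutputConfig.Outputs[
        "train"
    ].S3Output.S3Uri

    # Hypothetical script and directory: featurizer.py fits and saves the preprocessor.
    featurizer = SKLearn(
        entry_point="featurizer.py",
        source_dir="../sagemaker/step_featurizer",
        framework_version="1.0-1",
        instance_type="ml.m5.xlarge",
        role=role,
        sagemaker_session=pipeline_session,
    )

    # With a PipelineSession, fit() does not start a job; it returns step arguments.
    step_fit = TrainingStep(
        name="FitPreprocessor",
        step_args=featurizer.fit({"train": TrainingInput(s3_data=train_uri)}),
    )

    # Wrap the fitted artifact as a model so it can serve as the first container.
    featurizer_model = SKLearnModel(
        model_data=step_fit.properties.ModelArtifacts.S3ModelArtifacts,
        entry_point="featurizer.py",
        source_dir="../sagemaker/step_featurizer",
        framework_version="1.0-1",
        role=role,
        sagemaker_session=pipeline_session,
    )

    # Requests pass through the preprocessor first, then the downstream model.
    pipeline_model = PipelineModel(
        models=[featurizer_model, downstream_model],
        role=role,
        sagemaker_session=pipeline_session,
    )
    step_model = ModelStep(
        name="CreateInferencePipelineModel",
        step_args=pipeline_model.create(instance_type="ml.m5.xlarge"),
    )
    return step_fit, step_model
```

In this sketch the ProcessingStep keeps producing the train/validation/test splits, while the fitted preprocessor comes from the separate TrainingStep. Whether this is the recommended approach is exactly what the question is about.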

Thanks in advance

No answers