It depends on the dataset and on the question the ML model is meant to answer.
Yes, it is feasible to run HPO over preprocessing. However, an HPO job needs a specific objective, i.e. a metric to maximize or minimize across the whole tuning process. So the first thing to establish is whether the preprocessing step has such a target. If it does, they should be able to leverage Hyperparameter Tuning Jobs.
Here is how HPO works in SageMaker. First, we define each training job in a container that writes its output, and the job reads its hyperparameters from /opt/ml/input/config/hyperparameters.json. When we run the pipeline using HyperparameterTuner in SageMaker, each job receives its hyperparameter values through that file, and the tuner returns the model with the best score on the objective.
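As a rough illustration of the container side, here is a minimal sketch of an entry point that reads the hyperparameters file and logs a metric. The helper names (`load_hyperparameters`, `report_metric`), the `learning_rate` parameter and the placeholder score are assumptions made for the example; only the file path comes from the answer above. Values in that file arrive as strings, so the script casts them.

```python
import json
from pathlib import Path

# Path where SageMaker writes the hyperparameters for a training container.
HYPERPARAMETERS_PATH = Path("/opt/ml/input/config/hyperparameters.json")


def load_hyperparameters(path=HYPERPARAMETERS_PATH, defaults=None):
    """Return hyperparameters as a dict, falling back to defaults if the file is absent."""
    params = dict(defaults or {})
    if Path(path).exists():
        with open(path) as f:
            params.update(json.load(f))
    return params


def report_metric(name, value):
    """Print the metric on its own line so a regex can pick it up from the job logs."""
    line = f"{name}={value:.6f}"
    print(line)
    return line


if __name__ == "__main__":
    # "learning_rate" is a hypothetical hyperparameter used only for illustration.
    hp = load_hyperparameters(defaults={"learning_rate": "0.1"})
    learning_rate = float(hp["learning_rate"])  # values are strings, so cast
    # Placeholder score; a real job would train and evaluate a model here.
    report_metric("validation_score", 1.0 - learning_rate)
```

The defaults make the script runnable outside SageMaker, where the file does not exist.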
Option 1: if preprocessing has a clearly defined target, we can also run HPO separately for the data preprocessing step, by packaging the preprocessing function and its outputs in a container and calling HyperparameterTuner's fit to tune it.
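For Option 1, the preprocessing container has to expose its target as a metric the tuner can read. A minimal sketch, assuming the container prints a line such as `preprocess_score=0.87` (a hypothetical metric name) and that the regex is supplied to HyperparameterTuner through its `metric_definitions` argument; the extraction function below only mimics that matching locally so the regex can be checked before launching a job.

```python
import re

# Hypothetical metric definition, in the Name/Regex shape that
# HyperparameterTuner accepts through metric_definitions.
METRIC_DEFINITION = {
    "Name": "preprocess_score",
    "Regex": r"preprocess_score=([0-9\.]+)",
}


def extract_metric(log_text, definition=METRIC_DEFINITION):
    """Return the last value in log_text matching the metric regex, or None.

    Taking the last match is a choice made for this sketch.
    """
    matches = re.findall(definition["Regex"], log_text)
    return float(matches[-1]) if matches else None


# Sample log lines such as a preprocessing container might print.
sample_log = "loading data\npreprocess_score=0.42\npreprocess_score=0.87\n"
print(extract_metric(sample_log))
```

If the regex never matches the container's output, the tuner has no objective to optimize, so it is worth testing the pattern against real log lines first.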
Option 2: include the preprocessing and training code together in a single SageMaker Training Job. The drawback is that you can't use separate infrastructure for training and preprocessing.
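Option 2 can be pictured as one entry point where preprocessing and training hyperparameters arrive together and a single objective is reported. The sketch below is a toy under stated assumptions: the `clip` and `l2` hyperparameters, the clipping step and the one-feature ridge fit are all invented for illustration and stand in for whatever preprocessing and model are actually used.

```python
def preprocess(xs, clip):
    """Hypothetical preprocessing step: clip each feature to [-clip, clip]."""
    return [max(-clip, min(clip, x)) for x in xs]


def fit_ridge_slope(xs, ys, l2):
    """Fit y = w * x with an L2 penalty (closed form for a single feature)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + l2)


def run_job(hyperparameters, xs, ys):
    """Preprocess, train and score in one job; returns the mean squared error."""
    clip = float(hyperparameters["clip"])  # preprocessing hyperparameter
    l2 = float(hyperparameters["l2"])      # training hyperparameter
    features = preprocess(xs, clip)
    w = fit_ridge_slope(features, ys, l2)
    mse = sum((y - w * x) ** 2 for x, y in zip(features, ys)) / len(ys)
    print(f"mse={mse:.6f}")  # single objective for the tuner to minimize
    return mse


if __name__ == "__main__":
    # Values are passed as strings, mirroring hyperparameters.json.
    run_job({"clip": "10", "l2": "0"}, [1, 2, 3], [2, 4, 6])
```

Because both stages run inside the same job, the tuner searches `clip` and `l2` jointly against one metric, which is the trade-off against being able to size the preprocessing and training infrastructure separately.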
So it depends on what exactly they are looking for, but they can likely use SageMaker HPO.