Not able to create an endpoint


I can train and tune the model, but at deployment time the endpoint is never created and the deployment fails after some time with this error: "FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/train/train_features.csv'". My question is: if deployment cannot find train_features.csv because the path is incorrect, how did training and tuning succeed? They use the same train_features.csv file at the same path.

1 Answer

For training and tuning, you provide the train_features.csv file to the container through an input channel when you call the .fit() method. The file is downloaded from S3 and placed at /opt/ml/input/data/train/train_features.csv. After training completes, only the contents of /opt/ml/model are exported to S3 as model artifacts.
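
To make the channel-to-path mapping concrete, here is a minimal sketch of how a channel name turns into a directory inside the training container. The helper name channel_file_path is only for illustration, and the S3 URI in the comment is a placeholder rather than anything from your setup.

```python
import posixpath

# Root directory under which SageMaker places training input channels.
INPUT_DATA_ROOT = "/opt/ml/input/data"


def channel_file_path(channel_name, filename):
    """Return the in-container path of a file delivered through an input channel.

    Illustrative helper, not part of any SDK.
    """
    return posixpath.join(INPUT_DATA_ROOT, channel_name, filename)


# A call such as estimator.fit({"train": "s3://<your-bucket>/<prefix>/"})
# defines a channel named "train" (bucket and prefix are placeholders).
train_csv = channel_file_path("train", "train_features.csv")
```

This directory is only populated for training jobs, which is why the same path is empty on an inference host.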

During model deployment, a brand-new host is provisioned and the model artifacts are downloaded to /opt/ml/model. However, train_features.csv is not available on that host, because it is not part of the model artifacts and no input channel exists at inference time. You will need to fetch it from S3 and put it at the expected location, /opt/ml/input/data/train/train_features.csv. You can do this in an inference.py script, with a function that downloads the file from S3 before your model is loaded. Please have a look at this example for XGBoost.
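
As a rough sketch of what such an inference.py could look like (not the linked example): the bucket name, key, and the helper ensure_train_features below are hypothetical and need to be replaced with your own values. It assumes a framework container that calls model_fn(model_dir) to load the model, and that boto3 is installed in that container.

```python
import os

# Hypothetical values: replace with the bucket and key of your own file.
TRAIN_BUCKET = "my-example-bucket"
TRAIN_KEY = "data/train_features.csv"
LOCAL_PATH = "/opt/ml/input/data/train/train_features.csv"


def ensure_train_features(s3_client, bucket=TRAIN_BUCKET, key=TRAIN_KEY,
                          local_path=LOCAL_PATH):
    """Download the CSV from S3 unless it is already present locally."""
    if os.path.exists(local_path):
        return local_path
    # The input directory does not exist on an inference host, so create it.
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    s3_client.download_file(bucket, key, local_path)
    return local_path


def model_fn(model_dir):
    """Model-loading hook called by the framework container at startup."""
    import boto3  # imported here so the helper above can be used without boto3

    ensure_train_features(boto3.client("s3"))
    # Placeholder: load and return your model from model_dir here,
    # using whatever loader matches how the model was saved.
    raise NotImplementedError("load your model from model_dir here")
```

The endpoint's execution role also needs permission to read that S3 object (s3:GetObject), otherwise the download itself will fail. If the file is only needed at inference time for something like feature names or scaling statistics, it may be simpler to save that information into /opt/ml/model during training so it ships with the model artifacts.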

Hope it helps.

AWS
answered a year ago
