How do you find out which input features a SageMaker model requires for batch inference?


Say, for example, I have a trained SageMaker model artifact or a model in a model registry. Now I need to prepare the input dataset for batch inference with that model. How do I know which input features the model expects, so that I can prepare the data accordingly? Is there a way to find this out from the model artifact?

Asked 2 years ago, viewed 556 times
1 Answer

In short, this is not possible unless you manually associate some additional data with the model, or can trace back to somewhere the information is recorded.

I'd suggest checking out SageMaker Lineage Tracking to help with tracking connections like these, but the tools there generally work at the dataset level rather than the feature level. Since SageMaker serves a very broad variety of ML domains (e.g. from tabular to image, voice, video, text, and many more), the concept of "a feature" is tricky to scope without being overly restrictive: Is it a reference to SageMaker Feature Store? What about customers using alternative feature stores or plain CSV data?

If you're working in domains that support it (e.g. especially tabular), I might recommend SageMaker data quality profiling as a nice way to track this. A data quality baseline report will contain schema and also feature distribution information, and you can attach this report to your model package in Model Registry. It won't fully describe the source of your data of course, but will document the properties of it and also enable you to run data drift monitoring on your deployed models.
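If such a baseline report is available, the expected columns can be read back from its constraints file. The following is only a minimal sketch: it assumes the file follows the layout Model Monitor typically writes for `constraints.json` (a top-level `features` list whose entries carry `name` and `inferred_type`). The helper name `expected_features` and the sample document are made up for illustration.

```python
import json


def expected_features(constraints_json: str) -> list[tuple[str, str]]:
    """Return (name, inferred_type) pairs from a data quality constraints document."""
    doc = json.loads(constraints_json)
    return [
        (feature["name"], feature.get("inferred_type", "Unknown"))
        for feature in doc.get("features", [])
    ]


# Hypothetical baseline content, trimmed to the fields used above.
SAMPLE = json.dumps({
    "version": 0.0,
    "features": [
        {"name": "age", "inferred_type": "Integral", "completeness": 1.0},
        {"name": "income", "inferred_type": "Fractional", "completeness": 0.98},
    ],
})

for name, kind in expected_features(SAMPLE):
    print(f"{name}: {kind}")
```

In practice you would download the constraints file from the S3 location recorded on the model package and pass its contents to the helper; the column order in the file may or may not match what the model container expects, so treat it as documentation rather than a contract.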

You should see that supported request/response content types are also available as fields in Model Registry, and you can even attach a sample payload URL as used by Inference Recommender. If you need to store additional information that doesn't have a clear place in Model Registry, you could of course resort to Tags.
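As a rough sketch of where to look, the fields mentioned above can be pulled out of a `DescribeModelPackage` response. The helper names `describe_inputs` and `fetch_inputs`, and the sample response with its bucket paths, are hypothetical; which fields are actually populated depends on how the model package was registered.

```python
def describe_inputs(package: dict) -> dict:
    """Collect input-related hints from a DescribeModelPackage response."""
    spec = package.get("InferenceSpecification", {})
    quality = package.get("ModelMetrics", {}).get("ModelDataQuality", {})
    return {
        "content_types": spec.get("SupportedContentTypes", []),
        "response_types": spec.get("SupportedResponseMIMETypes", []),
        "sample_payload": package.get("SamplePayloadUrl"),
        "constraints_uri": quality.get("Constraints", {}).get("S3Uri"),
        "statistics_uri": quality.get("Statistics", {}).get("S3Uri"),
    }


def fetch_inputs(model_package_arn: str) -> dict:
    """Call SageMaker for a real model package (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so describe_inputs works without AWS access

    client = boto3.client("sagemaker")
    response = client.describe_model_package(ModelPackageName=model_package_arn)
    return describe_inputs(response)


# Hypothetical response, trimmed to the fields read above.
SAMPLE_RESPONSE = {
    "InferenceSpecification": {
        "SupportedContentTypes": ["text/csv"],
        "SupportedResponseMIMETypes": ["text/csv"],
    },
    "SamplePayloadUrl": "s3://example-bucket/payload/sample.tar.gz",
    "ModelMetrics": {
        "ModelDataQuality": {
            "Constraints": {
                "ContentType": "application/json",
                "S3Uri": "s3://example-bucket/baseline/constraints.json",
            }
        }
    },
}

print(describe_inputs(SAMPLE_RESPONSE))
```

If every value comes back empty, the package was registered without this information, and the model artifact alone is unlikely to tell you more.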

So there are multiple options to associate data information with your model packages - but if you have an existing model package without this information, there may not be an automatic way to trace the data "source" in your particular context.

AWS
Expert
Alex_T
Answered 2 years ago
