Hi Seun,
What type of endpoint are you deploying, and what error do you get at inference time? I'm not sure whether you are using Batch Transform. With batch inference, payloads can be on the order of gigabytes and jobs can run for days. If you're using another type of deployment, such as a real-time endpoint, you might be exceeding its payload size limit or its response timeout. See here for more details.
See this link for a simple script that shows the cause of the error in the Inference Recommender job. Inference Recommender is the easiest way to find the smallest instance type that suits the workload.
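In case the link is unavailable, here is a minimal sketch of how such a check might look. This is not the linked script. It assumes the boto3 SageMaker client and its `describe_inference_recommendations_job` call. The helper name `get_failure_reason` and the job name in the usage comment are hypothetical. The client is passed in as an argument so the function can be tried without AWS credentials.

```python
from typing import Any, Optional


def get_failure_reason(sm_client: Any, job_name: str) -> Optional[str]:
    """Return the failure reason of an Inference Recommender job.

    Returns None when the job has not failed. `sm_client` is expected to be
    a boto3 SageMaker client (or any object with the same method).
    """
    response = sm_client.describe_inference_recommendations_job(JobName=job_name)
    if response.get("Status") == "FAILED":
        # FailureReason is only present when the job has failed.
        return response.get("FailureReason", "no failure reason reported")
    return None


# Usage (requires boto3, AWS credentials and a configured region):
#   import boto3
#   client = boto3.client("sagemaker")
#   print(get_failure_reason(client, "my-recommender-job"))  # hypothetical job name
```

If the job-level reason is too generic, the per-benchmark steps of the job may carry more detail. Please check the SageMaker API reference for the exact fields.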
Hope this helps.
answered 7 months ago