Amazon SageMaker - Training Job / Data Wrangler


I have a customer who is interested in testing Amazon SageMaker and would like answers to the following questions:

Q1. When submitting a training job in Amazon SageMaker, if the job fails because of insufficient capacity, is there an auto-retry mechanism? If so, how is it set up?
Q2. Is the underlying SQL / MySQL infrastructure in Data Wrangler an AWS serverless database backend?
Q3. What is the backend database that supports SageMaker / SageMaker Data Wrangler?

Use case: Vision ML - Object detection (self-built algorithm)
Framework: TensorFlow 2.4.4

2 Answers

Hi,

Q1/ This is not a built-in feature of the training job API. You would need to implement it on your side with a try/except retry mechanism. If you instead start the jobs from SageMaker Pipelines, that service does offer this functionality (see: Retry mechanism).
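As an illustration of the client-side approach, here is a minimal sketch of a retry loop in Python. It assumes a boto3 SageMaker client is passed in, and it assumes that capacity failures show up on the failed job as a `FailureReason` beginning with `CapacityError`; check the reason reported by your own failed jobs before relying on that prefix. The function name, the `-try<N>` job name suffix, and the timing defaults are hypothetical choices made for this example.

```python
import time


def run_training_job_with_retry(sm_client, job_config, max_attempts=3,
                                poll_seconds=60, backoff_seconds=300,
                                sleep=time.sleep):
    """Submit a training job and resubmit it if it fails for lack of capacity.

    sm_client  : a boto3 SageMaker client, e.g. boto3.client("sagemaker")
    job_config : dict of keyword arguments for create_training_job
    Returns the name of the job that completed.
    """
    base_name = job_config["TrainingJobName"]
    for attempt in range(1, max_attempts + 1):
        # Training job names must be unique, so add a suffix per attempt.
        job_name = f"{base_name}-try{attempt}"
        sm_client.create_training_job(
            **{**job_config, "TrainingJobName": job_name}
        )

        # Poll until the job reaches a terminal state.
        while True:
            desc = sm_client.describe_training_job(TrainingJobName=job_name)
            status = desc["TrainingJobStatus"]
            if status in ("Completed", "Failed", "Stopped"):
                break
            sleep(poll_seconds)

        if status == "Completed":
            return job_name

        reason = desc.get("FailureReason", "")
        # Assumption: capacity problems are reported with this prefix.
        is_capacity_error = status == "Failed" and reason.startswith("CapacityError")
        if is_capacity_error and attempt < max_attempts:
            # Wait longer after each failed attempt before resubmitting.
            sleep(backoff_seconds * attempt)
            continue

        raise RuntimeError(f"{job_name} ended with status {status}: {reason}")
```

A call would look like `run_training_job_with_retry(boto3.client("sagemaker"), job_config)`, where `job_config` holds the same arguments you would otherwise pass to `create_training_job`. Any failure that is not capacity related is raised immediately rather than retried.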

Q2/Q3/ SageMaker Data Wrangler does NOT implement a database. It does offer the option to connect to a number of data sources, including databases. Is this what you meant? Can you elaborate a bit more on what you are looking for with these two points?

Thank you, G

AWS
Answered 2 years ago

Q1: To retry a training job, you can use EventBridge integrated with SageMaker Pipelines. For details, please refer to the following link: https://docs.aws.amazon.com/sagemaker/latest/dg/pipeline-eventbridge.html
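One possible starting point for the EventBridge side is a rule that matches failed training jobs, sketched below in Python. It assumes a boto3 EventBridge client (`boto3.client("events")`) is passed in; the function names and the rule name are hypothetical. The sketch only creates the rule. Attaching a target, such as the pipeline that resubmits the job, is left out and would follow the linked documentation.

```python
import json


def failed_training_job_pattern():
    """Event pattern matching SageMaker training jobs that ended as Failed."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Training Job State Change"],
        "detail": {"TrainingJobStatus": ["Failed"]},
    }


def create_failure_rule(events_client, rule_name="retry-failed-training-job"):
    """Create (or update) an EventBridge rule for failed training jobs.

    events_client : a boto3 EventBridge client, e.g. boto3.client("events")
    rule_name     : hypothetical name chosen for this example
    """
    return events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(failed_training_job_pattern()),
        State="ENABLED",
    )
```

The pattern matches every failed training job in the account and Region, so the target would still need to inspect the failure reason in the event detail to decide whether a retry makes sense.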

Answered a year ago
