Amazon SageMaker - Training Job / Data Wrangler


I have a customer who is interested in testing Amazon SageMaker and has the following questions:

Q1. If a training job submitted to Amazon SageMaker fails because of insufficient capacity, is there an automatic retry mechanism? If so, how is it set up?
Q2. Is the underlying SQL / MySQL infrastructure in Data Wrangler an AWS serverless database backend?
Q3. What backend database supports SageMaker / SageMaker Data Wrangler?

Use case: Vision ML - Object detection (self-built algorithm)
Framework: TensorFlow 2.4.4

AWS
asked 2 years ago, 390 views
2 Answers

Hi,

Q1/ This is not a built-in feature of the training job API. You would need to implement it on your side with a try/except retry loop. If you start the jobs with SageMaker Pipelines instead, that service does have this functionality (see: Retry mechanism).
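A minimal sketch of such a client-side retry loop, for illustration only. The helper names `run_with_retry` and `is_capacity_error` are hypothetical, and the check assumes the failure surfaces as an exception whose message contains "CapacityError"; adjust it to whatever your SDK version actually raises.

```python
import time


def is_capacity_error(exc):
    # Assumption: the failure message mentions "CapacityError".
    return "CapacityError" in str(exc)


def run_with_retry(start_job, is_retryable=is_capacity_error,
                   max_attempts=3, base_delay=60.0, sleep=time.sleep):
    """Call start_job(attempt) until it succeeds or attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between tries.
    Errors that are not retryable, or the last failure, are re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return start_job(attempt)
        except Exception as exc:
            if attempt == max_attempts or not is_retryable(exc):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With the SageMaker Python SDK this could be called as `run_with_retry(lambda n: estimator.fit(inputs, job_name=f"my-job-{n}"))`, where `estimator`, `inputs`, and the job name are placeholders. The attempt number goes into the job name because training job names must be unique.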

Q2/Q3/ SageMaker Data Wrangler does not implement a database. It can connect to a number of data sources, including databases. Is this what you meant? Could you elaborate a bit more on what you are looking for with these two points?

Thank you, G

AWS
answered 2 years ago

Q1: To retry a training job, you can use EventBridge together with SageMaker Pipelines. For details, see https://docs.aws.amazon.com/sagemaker/latest/dg/pipeline-eventbridge.html
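As a rough sketch of the EventBridge side, the snippet below builds an event pattern that matches failed training jobs. The function name is hypothetical, and the pattern matches every failure, not only capacity errors, so whatever the rule targets would still need to inspect the failure reason before restarting anything.

```python
import json


def failed_training_job_pattern():
    """Event pattern for SageMaker training jobs that end in the Failed state."""
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Training Job State Change"],
        "detail": {"TrainingJobStatus": ["Failed"]},
    }


# EventBridge expects the pattern as a JSON string.
pattern_json = json.dumps(failed_training_job_pattern())
```

With boto3, `pattern_json` could be passed as the `EventPattern` argument of `events.put_rule`, with the pipeline (or a Lambda function that resubmits the job) attached as the rule target.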

answered a year ago
