Amazon SageMaker - Training Job / Data Wrangler


I have a customer who is interested in testing Amazon SageMaker and has the following questions:

Q1. When submitting a training job in Amazon SageMaker, if the job fails because of insufficient capacity, is there an automatic retry mechanism? If so, how is it set up?
Q2. Is the underlying SQL / MySQL infrastructure in Data Wrangler an AWS serverless database backend?
Q3. What backend database supports SageMaker / SageMaker Data Wrangler?

Use case: Vision ML, object detection (self-built algorithm)
Framework: TensorFlow 2.4.4

AWS
asked 2 years ago · 390 views
2 Answers

Hi,

Q1/ This is not a built-in feature of the training job API. You would need to implement it on your side with a try/catch-style mechanism that resubmits the job. If you use SageMaker Pipelines to start the jobs instead, that does provide this functionality (see: Retry mechanism).
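To illustrate the "implement it on your side" option, here is a minimal sketch of a retry loop in Python. It assumes that a capacity failure shows up as a training job in status `Failed` whose `FailureReason` contains the text `CapacityError`; please verify that against the failures you actually see. The names `run_with_capacity_retry`, `start_job` and `wait_for_job` are hypothetical helpers introduced for this example, not part of any AWS SDK.

```python
import time

# Assumption: capacity failures are reported with this text in FailureReason.
CAPACITY_MARKER = "CapacityError"


def run_with_capacity_retry(start_job, wait_for_job, max_attempts=3,
                            base_delay_s=60.0, sleep=time.sleep):
    """Resubmit a training job while it keeps failing for lack of capacity.

    start_job(attempt) -> job name; must use a new, unique name per attempt.
    wait_for_job(job_name) -> dict with 'TrainingJobStatus' and, on failure,
    'FailureReason' (the field names used by DescribeTrainingJob).
    """
    result = None
    for attempt in range(1, max_attempts + 1):
        job_name = start_job(attempt)
        result = wait_for_job(job_name)

        if result.get("TrainingJobStatus") != "Failed":
            return result  # completed or stopped: nothing to retry
        if CAPACITY_MARKER not in result.get("FailureReason", ""):
            return result  # failed for another reason: do not retry blindly
        if attempt < max_attempts:
            # Exponential backoff before asking for capacity again.
            sleep(base_delay_s * 2 ** (attempt - 1))
    return result  # still failing after max_attempts
```

With boto3, `start_job` could wrap `create_training_job` (with an attempt suffix in the job name), and `wait_for_job` could use the `training_job_completed_or_stopped` waiter followed by `describe_training_job`. Switching to a different instance type between attempts is another option worth considering.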

Q2/Q3/ SageMaker Data Wrangler does NOT implement a database. It does offer the option to connect to a number of data sources, including databases. Is this what you meant? Can you elaborate a bit more on what you are looking for in these two points?

Thank you, G

AWS
answered 2 years ago

Q1: To retry a training job, you can integrate SageMaker Pipelines with EventBridge. For details, please refer to the following link: https://docs.aws.amazon.com/sagemaker/latest/dg/pipeline-eventbridge.html
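As a rough sketch of the EventBridge side, the Python snippet below builds an event pattern that matches failed training jobs and a small check that a handler (for example a Lambda function, which is an assumption here) could use to decide whether to resubmit. The pattern assumes the `SageMaker Training Job State Change` event with `TrainingJobStatus` and `FailureReason` in its `detail`; the linked page covers the pipeline execution events, so check the event fields against the documentation before relying on them. `should_retry` is a hypothetical helper.

```python
import json

# Assumed event pattern for failed SageMaker training jobs.
TRAINING_JOB_FAILED_PATTERN = {
    "source": ["aws.sagemaker"],
    "detail-type": ["SageMaker Training Job State Change"],
    "detail": {"TrainingJobStatus": ["Failed"]},
}


def pattern_json():
    """Return the pattern as the JSON string an EventBridge rule expects."""
    return json.dumps(TRAINING_JOB_FAILED_PATTERN)


def should_retry(event):
    """Return True only for failures that look like capacity shortages."""
    detail = event.get("detail", {})
    return (
        detail.get("TrainingJobStatus") == "Failed"
        # Assumption: capacity failures carry this text in FailureReason.
        and "CapacityError" in detail.get("FailureReason", "")
    )
```

The JSON string can be passed as `EventPattern` when creating the rule, for example with the boto3 `events` client's `put_rule`; the rule's target would then start a new job or pipeline execution when `should_retry` returns True.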

answered a year ago
