Do Amazon SageMaker manifest files enable dataset versioning?

0

Some Amazon SageMaker algorithms can train with a manifest JSON file that stores the mapping between images and their Amazon S3 ARNs and metadata, such as labels. This is a great option, because the manifest file is much smaller than the dataset itself. Because the manifest files are small, they can be used easily in versioning tools or saved as part of the model artifact. This appears to be the best construct enabling exact dataset versioning within SageMaker. i.e., if we exclude the creation of a unique training set hard copy per training job that can't be scaled to large datasets. Is my understanding accurate?

AWS
ESPERTO
posta 3 anni fa328 visualizzazioni
1 Risposta
0
Risposta accettata

If you create the conditions for immutability of the assets the manifest points to, then manifest enables exact dataset versioning with SageMaker. You can have a data store in Amazon S3 with all versions of the data assets and use the manifest files for creating and versioning datasets for specific usage.

If you don't guarantee immutability for the assets that the manifest points to, then your manifest becomes invalid.

con risposta 3 anni fa

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande