1 Answer
Hi, when you set S3DataDistributionType to ShardedByS3Key, SageMaker copies only a subset of the data to each ML compute instance launched for model training, rather than the full dataset. If n ML compute instances are launched for a training job, each instance gets approximately 1/n of the S3 objects. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
To answer your question ("How much data does each worker get to train on, 1 file or 2 files?"): each worker gets 1 file from the training channel.
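For illustration, here is a minimal sketch of how the channel might be declared and how the 1/n estimate works out. The dictionary follows the shape that boto3's `create_training_job` takes under `InputDataConfig`. The bucket name, prefix, and the helper `approx_objects_per_instance` are hypothetical, and the 2-object, 2-instance scenario is an assumption based on the question. The helper only estimates counts; it says nothing about which specific keys land on which instance.

```python
# Hypothetical channel definition in the shape boto3's create_training_job
# expects under InputDataConfig. The bucket and prefix are placeholders.
training_channel = {
    "ChannelName": "training",
    "InputMode": "File",  # the same sharding applies in Pipe mode
    "DataSource": {
        "S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://example-bucket/train/",  # placeholder URI
            "S3DataDistributionType": "ShardedByS3Key",
        }
    },
}


def approx_objects_per_instance(num_objects: int, num_instances: int) -> float:
    """Rough 1/n share of S3 objects per instance (illustrative helper)."""
    if num_instances <= 0:
        raise ValueError("num_instances must be positive")
    return num_objects / num_instances


# Assumed scenario: 2 objects in the channel, 2 training instances.
print(approx_objects_per_instance(2, 2))  # prints 1.0
```

If the estimate falls below 1, there are more instances than S3 objects, so some instances would likely receive no data.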
Answered 4 years ago