SageMaker Pipe Mode


Does SageMaker Pipe mode serve as a cost-saving measure, or is it just faster than File mode without being much cheaper? The potential savings would come from (1) not having to copy the data to the training instances and (2) the training instances needing less storage. Are these savings generally significant for customers?

AWS
Expert
Asked 4 years ago, 357 views
1 answer
Accepted Answer

To the best of my understanding, Pipe mode decreases startup time but frequently increases the bill.

In File mode, SageMaker billing starts after the data has been copied onto the container and control has been transferred to the user script.

In Pipe mode, reading the data starts only after control has been transferred, so the data transfer happens during billable time.
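For reference, the input mode is chosen per training job. The following is a minimal sketch of the relevant fragment of a CreateTrainingJob request, assuming the job is submitted through the low-level API. The helper name `training_input_config`, the bucket URI, and the channel name are made up for illustration, and a real request needs further fields (image, role, resources, output location) that are omitted here.

```python
def training_input_config(s3_uri, mode="Pipe", channel="training"):
    """Build the part of a CreateTrainingJob request that selects the input mode.

    Hypothetical helper: only the input-related fields are filled in.
    """
    # Only the two modes discussed in this thread are accepted here.
    if mode not in ("File", "Pipe"):
        raise ValueError(f"unsupported input mode: {mode!r}")
    return {
        "AlgorithmSpecification": {"TrainingInputMode": mode},
        "InputDataConfig": [
            {
                "ChannelName": channel,
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
    }


# Example with a placeholder bucket name.
config = training_input_config("s3://example-bucket/train/", mode="Pipe")
print(config["AlgorithmSpecification"])
```

Switching `mode` to `"File"` is the only change needed to compare the two behaviours on the same data.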

Furthermore, to the best of my knowledge, the data never hits the disk (EBS). This is fast, but it also means that if you make multiple passes over your data, you have to re-read it from S3 on every pass, on your dime (S3 requests and container wait time).
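To make the re-reading concrete: in Pipe mode each channel is exposed to the container as a named pipe, with a fresh pipe per pass named `<channel>_<epoch>` under `/opt/ml/input/data`. The sketch below assumes a channel called `training` and simply counts bytes per pass. The function names are illustrative, and a real script would parse records rather than count bytes.

```python
import os


def read_epoch(channel, epoch, data_dir="/opt/ml/input/data", chunk_size=1 << 20):
    """Yield raw byte chunks for one pass over a Pipe-mode channel."""
    # Each pass has its own stream, e.g. training_0, training_1, ...
    path = os.path.join(data_dir, f"{channel}_{epoch}")
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # empty read means the stream is exhausted
                break
            yield chunk


def bytes_per_epoch(channel, epochs, data_dir="/opt/ml/input/data"):
    """Stream the channel once per epoch and return the bytes seen in each pass."""
    return [
        sum(len(chunk) for chunk in read_epoch(channel, epoch, data_dir))
        for epoch in range(epochs)
    ]
```

Every call to `read_epoch` streams the full data set again, which is where the repeated S3 requests and the extra billable wait time come from.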

Pipe mode can still be a good idea, for example if you make only a few passes over the data and the data is too large to fit on an EBS volume.

Also, in PyTorch for example, data loading can happen in parallel: while the GPU is chugging away on one batch, the CPUs load and prepare the data for the next batch.
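The overlap can be illustrated without PyTorch. The sketch below is a standard-library stand-in, not PyTorch's own mechanism (there, a `DataLoader` with `num_workers` greater than zero plays this role): a background thread keeps a small buffer of batches filled while the consumer works on the current one. The name `prefetch` and the buffer depth are arbitrary choices for this example.

```python
import queue
import threading


def prefetch(batches, depth=2):
    """Yield items from `batches` while a background thread loads ahead."""
    buffer = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def producer():
        try:
            for batch in batches:
                buffer.put(batch)  # blocks while the buffer is full
        finally:
            buffer.put(done)  # always signal the end, even on error

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


# Usage: wrap any batch iterator; the training step runs while the next batch loads.
for batch in prefetch(iter(range(3))):
    pass  # the training step would go here
```

With Pipe mode, this kind of overlap hides part of the streaming latency, but only as long as loading a batch takes no longer than processing one.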

AWS
mkamp
Answered 4 years ago
