AWS SageMaker - Either the training channel is empty or the mini-batch size


I am trying to train a Linear Learner model in SageMaker. My training set has 422 rows split across 4 files in S3. I set the mini-batch size to 50.

I keep getting this error in SageMaker:

Customer Error: No training data processed. Either the training
channel is empty or the mini-batch size is too high. Verify that
training data contains non-empty files and the mini-batch size is less
than the number of records per training host.

I am using this InputDataConfig:

            {
                'ChannelName': 'train',
                'DataSource': {
                    'S3DataSource': {
                        'S3DataType': 'S3Prefix',
                        'S3Uri': 's3://MY_S3_BUCKET/REST_OF_PREFIX/exported/',
                        'S3DataDistributionType': 'FullyReplicated'
                    }
                },
                'ContentType': 'text/csv',
                'CompressionType': 'Gzip'
            }

I am not sure what I am doing wrong here. I tried increasing the number of records to 5547495, split across 6 files, and got the same error. That makes me think something is missing from the config itself, so SageMaker treats the training channel as absent. I tried changing the channel name from 'train' to 'training', since that is the word the error message uses, but then I got:

Customer Error: Unable to initialize the algorithm. Failed to validate
input data configuration. (caused by ValidationError)

Caused by: {u'training': {u'TrainingInputMode': u'Pipe',
u'ContentType': u'text/csv', u'RecordWrapperType': u'None',
u'S3DistributionType': u'FullyReplicated'}} is not valid under any of
the given schemas

I went back to 'train', as that seems to be what is needed. But what am I doing wrong with it?

Edited by: anshbansal on Jun 3, 2019 12:06 AM

asked 3 years ago
1 Answer

Found the problem. CompressionType was set to 'Gzip', but I had changed the export so that the files were no longer compressed. As soon as I changed it to 'None', the training went smoothly.
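For anyone hitting the same error, one way to catch this kind of mismatch before starting a training job is to look at the first two bytes of each file: gzip streams always begin with the bytes 0x1f 0x8b. The following is only a minimal sketch of that check. The function names (looks_gzipped, compression_type_for, check_prefix) are made up for illustration, and check_prefix assumes boto3 is installed, credentials are configured, and the bucket and prefix are your own values.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzipped(header: bytes) -> bool:
    """Return True if the given bytes start with the gzip magic number."""
    return header[:2] == GZIP_MAGIC


def compression_type_for(header: bytes) -> str:
    """Map the first bytes of a file to a CompressionType value ('Gzip' or 'None')."""
    return "Gzip" if looks_gzipped(header) else "None"


def check_prefix(bucket: str, prefix: str) -> dict:
    """Report the detected compression for each object under an S3 prefix.

    Hypothetical helper: needs boto3 and AWS credentials. Empty objects are
    flagged separately, since they cannot contribute any training records.
    """
    import boto3  # imported here so the helpers above work without boto3

    s3 = boto3.client("s3")
    report = {}
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Size"] == 0:
                report[obj["Key"]] = "empty"
                continue
            # Fetch only the first two bytes of the object.
            resp = s3.get_object(Bucket=bucket, Key=obj["Key"], Range="bytes=0-1")
            report[obj["Key"]] = compression_type_for(resp["Body"].read())
    return report


if __name__ == "__main__":
    # Local demonstration that needs no AWS access.
    plain = b"1.0,0.5,0.25\n"
    print(compression_type_for(plain))
    print(compression_type_for(gzip.compress(plain)))
```

If every file under the prefix reports 'None', then the channel's CompressionType should be 'None' as well. A mix of results, or any 'empty' entries, suggests the export needs fixing before the job is started.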

answered 3 years ago
