AWS SageMaker: splitting data between training and validation, and a ClientError


Dear all,

I have just started working with SageMaker, so I am new to this service. I have a dataset of facial images covering three races (black/white/asian) and two genders (male/female), and I would like to use AWS SageMaker to train a model on it. I created training and validation labels and separated the data into 6 classes for training and validation (asian female af, black male bm, white male wm, etc.):

s3://dbimgraces/af/

s3://dbimgraces/am/

s3://dbimgraces/bf/

s3://dbimgraces/bm/

s3://dbimgraces/wf/

s3://dbimgraces/wm/

My first question: SageMaker allows only 20 channels, but 6 training + 6 validation + 6 training labels + 6 validation labels = 24 channels, which SageMaker will not accept. How can I solve this? Do you have any advice on splitting and structuring the data for the model?

My second question: with every combination of channel names (train/validation/train-lb/validation-lb), I get one of the following errors. Why?

ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError) Caused by: Additional properties are not allowed ('validation-lb-bm', 'validation-lb-bf', 'train-lb-bf', 'train-lb-bm' were unexpected) Failed validating 'additionalProperties' in schema: {'$schema': 'http://json-schema.org/draft-04/schema#', 'additionalProperties': False, 'anyOf': [{'required': ['train']}, {'required': ['validation']}, {'optional': ['train_lst']}, {'optional': ['validation_lst']}, {'optional': ['model']}], 'definitions': {'data_channel': {'properties': {'ContentType': {'type': 'string'}}, 'type': 'object'}}, 'properties': {'model': {'$ref': '#/definitions/data_channel'}, 'train': {'$ref': '#/definitions/data_channel'}, 'train_lst': {'$ref': '#/definitions/data_channel'}, 'validation': {'$, e

ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError) Caused by: Additional properties are not allowed ('lb_bf', 'valid_af', 'valid_bm', 'lb_wm', 'tr-bm', 'valid_wm', 'tr-am', 'tr-wf', 'lb_bm', 'lb_am', 'lb_wf', 'tr-af', 'tr-bf', 'valid_bf', 'valid_wf', 'lb_af', 'valid_am', 'tr-wm' were unexpected) Failed validating 'additionalProperties' in schema: {'$schema': 'http://json-schema.org/draft-04/schema#', 'additionalProperties': False, 'anyOf': [{'required': ['train']}, {'required': ['validation']}, {'optional': ['train_lst']}, {'optional': ['validation_lst']}, {'optional': ['model']}], 'definitions': {'data_channel': {'properties': {'ContentType': {'type': 'string'}}, 'type': 'object'}}, 'properties': {'model': {'$ref': '#/definitions/data_channel'}, 'train': {'$ref': '#/definitions/data_channel'}, , e

ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError) Caused by: Additional properties are not allowed ('train-lst', 'validation-lst' were unexpected) Failed validating 'additionalProperties' in schema: {'$schema': 'http://json-schema.org/draft-04/schema#', 'additionalProperties': False, 'anyOf': [{'required': ['train']}, {'required': ['validation']}, {'optional': ['train_lst']}, {'optional': ['validation_lst']}, {'optional': ['model']}], 'definitions': {'data_channel': {'properties': {'ContentType': {'type': 'string'}}, 'type': 'object'}}, 'properties': {'model': {'$ref': '#/definitions/data_channel'}, 'train': {'$ref': '#/definitions/data_channel'}, 'train_lst': {'$ref': '#/definitions/data_channel'}, 'validation': {'$ref': '#/definitions/data_channel'}, , e

asked a year ago, 258 views
1 Answer

A channel in a SageMaker training job does not correspond to a class. It corresponds to a data source. You can use EFS or S3 as a channel. For an S3 channel, if you set the prefix to s3://dbimgraces, everything under that prefix is downloaded to /opt/ml/input/data/channel_name on the training instance.

The main job of a channel is to provide a download source and pattern. You can process the data further in your training script.
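To illustrate the "process the data further in your training script" part, here is a minimal sketch of how a custom training script might recover the classes from the subfolders of a single channel. The function name `group_files_by_class` and the channel name `train` are assumptions made for illustration; only the `/opt/ml/input/data/channel_name` location comes from the answer above. It applies to a script you write yourself, not to a built-in algorithm.

```python
import os
from collections import defaultdict

# Root under which SageMaker places each channel inside the training container.
CHANNEL_ROOT = "/opt/ml/input/data"


def group_files_by_class(channel_name, root=CHANNEL_ROOT):
    """Map each first-level subfolder of a channel (e.g. 'af', 'bm') to its files.

    Hypothetical helper: one channel holds all classes, and the class is
    taken from the folder name rather than from a separate channel.
    """
    channel_dir = os.path.join(root, channel_name)
    files_by_class = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(channel_dir):
        rel = os.path.relpath(dirpath, channel_dir)
        if rel == ".":
            continue  # ignore files that sit directly in the channel root
        class_name = rel.split(os.sep)[0]
        for name in sorted(filenames):
            files_by_class[class_name].append(os.path.join(dirpath, name))
    return dict(files_by_class)
```

With the bucket layout from the question, a single `train` channel pointing at the bucket prefix would yield keys such as `af`, `am`, `bf`, `bm`, `wf`, and `wm`, so six separate channels per split are not needed.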

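The error is related to how you set up your input data configuration. The schema printed in the error message accepts only the channel names train, validation, train_lst, validation_lst, and model, so names such as train-lst or tr-af are rejected. If you can share the corresponding section of your configuration, we can take a closer look.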
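As a sketch of an input data configuration that stays within the channel names listed in the error schema, the snippet below builds the `InputDataConfig` list in the shape that the `CreateTrainingJob` API expects. The helper `s3_channel` and the S3 paths for the list files are hypothetical; the content type `application/x-image` is the one the image classification documentation describes for image and list-file channels, so verify it against your algorithm version.

```python
# Channel names accepted by the schema quoted in the error messages.
ALLOWED_CHANNELS = {"train", "validation", "train_lst", "validation_lst", "model"}


def s3_channel(name, s3_uri, content_type="application/x-image"):
    """Build one InputDataConfig entry for an S3 prefix (hypothetical helper)."""
    if name not in ALLOWED_CHANNELS:
        raise ValueError(f"unexpected channel name: {name!r}")
    return {
        "ChannelName": name,
        "ContentType": content_type,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


# Assumed layout: images under the bucket root, list files under lst/ (hypothetical).
BUCKET = "s3://dbimgraces"
input_data_config = [
    s3_channel("train", f"{BUCKET}/"),
    s3_channel("validation", f"{BUCKET}/"),
    s3_channel("train_lst", f"{BUCKET}/lst/train.lst"),
    s3_channel("validation_lst", f"{BUCKET}/lst/validation.lst"),
]
```

This uses four channels in total instead of 24, because the classes are expressed in the list files rather than in the channel names. Note the underscores: `train_lst` passes the check, while `train-lst` raises `ValueError` here just as it fails validation in SageMaker.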

AWS
answered a year ago
  • Thanks for your comment. Suppose I use the image classification algorithm for training. How do I configure my training script? Inside my bucket, under s3://dbimgraces/output/, there are 6 folders for the 6 classes and 12 label files (6 for training labels + 6 for validation labels).

  • My validation and training labels look like this:

    1 1 ./output/af/af05.tif
    2 1 ./output/af/af08.tif
    3 1 ./output/af/af11.tif
    4 1 ./output/af/af09.tif
    5 1 ./output/af/af12.tif
    6 1 ./output/af/af07.tif
    7 1 ./output/af/af10.tif
    8 1 ./output/af/af02.tif
    9 1 ./output/af/af06.tif
    10 0 ./output/am/am01.tif
    11 0 ./output/am/am06.tif
    12 0 ./output/am/am07.tif
    13 0 ./output/am/am03.tif
    14 0 ./output/am/am05.tif
    15 3 ./output/bf/bf12.tif
    16 3 ./output/bf/bf09.tif
    17 3 ./output/bf/bf04.tif
    18 3 ./output/bf/bf15.tif
    19 3 ./output/bf/bf06.tif
    20 3 ./output/bf/bf10.tif
    ...
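Since the label entries above already carry the class in the second column, the per-class label files could be merged into one list for training and one for validation. The sketch below is one possible way to do that, assuming each entry is "index label path" separated by whitespace and that paths contain no spaces. The function name `merge_lst_lines` is hypothetical. It renumbers the index column and writes tab-separated fields, which is the layout the image classification documentation describes for .lst files.

```python
def merge_lst_lines(lines):
    """Merge 'index label path' entries into one tab-separated, renumbered list.

    Hypothetical helper: the old index is discarded, the label and path are
    kept, and blank or malformed lines are skipped.
    """
    merged = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and anything that is not index/label/path
        _old_index, label, path = parts
        merged.append(f"{len(merged) + 1}\t{int(label)}\t{path}")
    return merged


# Usage with a few entries taken from the label sample above.
sample = [
    "1 1 ./output/af/af05.tif",
    "10 0 ./output/am/am01.tif",
    "15 3 ./output/bf/bf12.tif",
]
for row in merge_lst_lines(sample):
    print(row)
```

Running this once over all training label files and once over all validation label files would give two list files in total, which matches the `train_lst` and `validation_lst` channel names that appear in the error schema.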
