Verbose Information About Why Training Data Is Ineligible in AWS SageMaker Canvas

Hello,

I have been experimenting with SageMaker Canvas to build a text classification model. I built my first two models with the same approach and they worked fine, but when I tried to build another one with a different dataset, I got stuck selecting the target column: the target field is disabled and marked as ineligible. The message is too vague for me to work out what to fix. Notes:

  • The CSV file consists of 2 columns, reason and reason_label
  • The target will be the reason_label column and the source will be reason
  • Basically, when I give the model a text input, I want it to produce the relevant label based on the training data
  • I made sure there are no empty strings in the CSV file. Splitting the file into chunks (one chunk of 7k rows instead of all 70k rows in one go) did not work either
  • For additional information, the reason_label training data has at most 250 unique values
  • I made sure there are no empty strings and no missing values in the dataset
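
For anyone who wants to reproduce the checks described in the notes above, here is a minimal sketch of how the two columns could be profiled locally. The function name `profile_label_column` and the `min_rows_per_label` threshold are assumptions made up for illustration; they are not taken from Canvas's eligibility rules, which the console message does not spell out.

```python
import csv
import io
from collections import Counter


def profile_label_column(csv_text, text_col="reason", label_col="reason_label",
                         min_rows_per_label=2):
    """Summarise a two-column text classification CSV.

    min_rows_per_label is an illustrative threshold chosen for this sketch,
    not a documented SageMaker Canvas limit.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    label_counts = Counter()
    total_rows = 0
    blank_text = 0
    blank_label = 0

    for row in reader:
        total_rows += 1
        # Treat missing cells and whitespace-only cells as blank.
        text = (row.get(text_col) or "").strip()
        label = (row.get(label_col) or "").strip()
        if not text:
            blank_text += 1
        if not label:
            blank_label += 1
        else:
            label_counts[label] += 1

    # Labels that appear fewer times than the illustrative threshold.
    rare_labels = sorted(
        name for name, count in label_counts.items() if count < min_rows_per_label
    )
    return {
        "total_rows": total_rows,
        "unique_labels": len(label_counts),
        "blank_text": blank_text,
        "blank_label": blank_label,
        "rare_labels": rare_labels,
    }
```

To run it on a real file, read the file into a string first, for example `profile_label_column(open("reasons.csv", encoding="utf-8", newline="").read())`, where `reasons.csv` is a placeholder name. The report only shows whether blanks or very rare labels exist; whether any of these actually makes a column ineligible in Canvas is still the open question.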

My questions are:

  1. What is causing the column to be marked as ineligible? I can send a sample of the data if necessary
  2. Is there any way to make this message more verbose (e.g. ineligible due to too many distinct values, or some other reason), so that we are not looking for a needle in a haystack when debugging this?

Thanks and best regards

esanto
asked 3 months ago, 92 views
No Answers