SageMaker - dataset target column / label - yes / no


Hi there,

I have a sample "incident" dataset. We are planning to use it in SageMaker, and the intention is to find out what kind of predictions SageMaker can make with this dataset, including possible future incident locations. In all the videos I have seen, the last column is the target column or label, and its values are things like yes/no or 0/1.

The dataset I have contains only incidents: every row is an incident.

Can someone give some details on this dataset's target column or label (yes or no) and tell me how to prepare the values for that column?

Thanks, Ram

asked 9 months ago, 216 views
2 Answers

Hey Ram,

The target is often the last column, but that is only a convention. The target is simply whatever you want to predict from the other columns.

You are right not to predict "is incident", because every row is an incident. To predict that, you would also need rows for the "non-incidents" so that the algorithm could compare the two classes.
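As a rough illustration of that comparison, here is a minimal sketch that turns an incident-only table into labeled rows by treating every location and day with no recorded incident as a "non-incident". The field names (`location`, `date`), the daily granularity, and the sample rows are assumptions for illustration, not taken from your dataset.

```python
from datetime import date, timedelta
from itertools import product

# Hypothetical incident log: one row per incident (field names are assumed).
incidents = [
    {"location": "A", "date": date(2023, 1, 1)},
    {"location": "A", "date": date(2023, 1, 3)},
    {"location": "B", "date": date(2023, 1, 2)},
]


def label_location_days(rows):
    """Return one row per (location, day): incident=1 if one was recorded, else 0."""
    seen = {(r["location"], r["date"]) for r in rows}
    locations = sorted({r["location"] for r in rows})
    first = min(r["date"] for r in rows)
    last = max(r["date"] for r in rows)
    # Every calendar day between the first and last recorded incident.
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return [
        {"location": loc, "date": day, "incident": int((loc, day) in seen)}
        for loc, day in product(locations, days)
    ]


labeled = label_location_days(incidents)
for row in labeled:
    print(row["location"], row["date"].isoformat(), row["incident"])
```

Whether a missing row really means "no incident" depends on how the data was collected, so treat the zeros as an assumption to check.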

It sounds like you want to predict probabilities for the next incident. Take a look at the SageMaker sample notebook "Predicting driving speed violations with the Amazon SageMaker DeepAR algorithm", which uses past traffic violations to predict future ones. You can open it from the "SageMaker JumpStart" menu under "Models, notebooks, solutions". You can read more about it here (about halfway down the page):
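To give a sense of the data preparation involved, here is a minimal sketch that aggregates incident rows into one daily count series per location and writes each series as a JSON line with the `start`, `target`, and `cat` fields that DeepAR reads. The sample rows, the daily frequency, and the helper name `to_deepar_lines` are assumptions for illustration; the notebook itself may prepare its data differently.

```python
import json
from collections import Counter
from datetime import date, timedelta

# Hypothetical incident log as (location, day) pairs; one pair per incident.
incidents = [
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 3)),
    ("B", date(2023, 1, 2)),
]


def to_deepar_lines(rows):
    """Build one JSON line per location holding its daily incident counts."""
    counts = Counter(rows)  # missing (location, day) keys count as 0
    first = min(day for _, day in rows)
    last = max(day for _, day in rows)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    lines = []
    for index, loc in enumerate(sorted({loc for loc, _ in rows})):
        series = {
            "start": first.strftime("%Y-%m-%d 00:00:00"),
            "target": [counts[(loc, day)] for day in days],
            "cat": [index],  # location encoded as an integer category
        }
        lines.append(json.dumps(series))
    return lines


deepar_lines = to_deepar_lines(incidents)
for line in deepar_lines:
    print(line)
```

Days with no incident become zeros in `target`, which is how an incident-only dataset ends up with something to forecast.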

AWS
answered 9 months ago

Thanks, Jim, for your reply. Let me have a look. One side question: I can see that SageMaker is creating feature10, 11, 12 on the Model explainability page. Where can I find the mapping for feature10, 11, 12?

answered 9 months ago
