SageMaker - dataset target column / label - yes / no

Hi there,

I have a sample "incident" dataset. We plan to use it in SageMaker to find out what kinds of predictions SageMaker can make from it, including the likely locations of future incidents. In all the videos I have seen, the last column is the target column (the label), and its values are things like yes/no or 0/1.

The dataset I have contains only incidents: every row is an incident.

Can someone explain what the target column or label (yes or no) should be for this dataset, and how to prepare the values for that column?

Thanks, Ram

Ram
Asked 1 year ago · 263 views
2 Answers

Hey Ram,

Your target is usually the last column, but that is only a convention. The target is whatever you want to predict from the other columns.

You are correct that you don't want to predict "is incident", because every row is an incident. To predict that, you would also need the "non-incident" cases so that the algorithm could compare the two.
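To illustrate what "non-incidents" could look like, here is a minimal sketch that expands incident-only rows into one row per location and day, labeled 1 if an incident happened and 0 otherwise. The sample records, the daily granularity, and the `label_grid` helper are all assumptions made for illustration; your real columns and time resolution may differ.

```python
from datetime import date, timedelta
from itertools import product

# Hypothetical incident records: one (location, day) tuple per incident.
incidents = [
    ("north", date(2023, 1, 1)),
    ("north", date(2023, 1, 3)),
    ("south", date(2023, 1, 2)),
]

def label_grid(incidents, start, end):
    """Return one (location, day, label) row per location and day.

    label is 1 if at least one incident occurred at that location on
    that day, otherwise 0 (an implied "non-incident").
    """
    seen = set(incidents)
    locations = sorted({loc for loc, _ in incidents})
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return [
        (loc, day, 1 if (loc, day) in seen else 0)
        for loc, day in product(locations, days)
    ]

rows = label_grid(incidents, date(2023, 1, 1), date(2023, 1, 3))
for row in rows:
    print(row)
```

With a grid like this, the label column holds both classes, so a binary classifier has something to compare. Whether a day-by-location grid is the right definition of a "non-incident" depends on your data.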

It sounds like you want to predict probabilities for the next incident. Take a look at the SageMaker sample notebook "Predicting driving speed violations with the Amazon SageMaker DeepAR algorithm". In that example, they use past traffic violations to predict future violations. You can get to it from the "SageMaker JumpStart" menu under "Models, notebooks, solutions". You can read more about it here (about halfway down the page): https://aws.amazon.com/blogs/machine-learning/illustrative-notebooks-in-amazon-sagemaker-jumpstart/
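As a rough sketch of how incident rows might be prepared for a forecasting approach like that, the snippet below counts incidents per location per day and writes one JSON object per location with `start` and `target` fields, which is the JSON Lines layout DeepAR accepts. The sample records, the daily frequency, and the `to_deepar_lines` helper are assumptions for illustration and are not taken from the notebook.

```python
import json
from collections import Counter
from datetime import date, timedelta

# Hypothetical incident records: one (location, day) tuple per incident.
incidents = [
    ("north", date(2023, 1, 1)),
    ("north", date(2023, 1, 1)),
    ("north", date(2023, 1, 3)),
    ("south", date(2023, 1, 2)),
]

def to_deepar_lines(incidents, start, end):
    """Build one JSON line per location with daily incident counts.

    Days without incidents get a count of 0, so every series covers
    the full range from start to end.
    """
    counts = Counter(incidents)
    n_days = (end - start).days + 1
    lines = []
    for loc in sorted({loc for loc, _ in incidents}):
        target = [counts[(loc, start + timedelta(days=i))] for i in range(n_days)]
        record = {"start": start.strftime("%Y-%m-%d 00:00:00"), "target": target}
        lines.append(json.dumps(record))
    return lines

lines = to_deepar_lines(incidents, date(2023, 1, 1), date(2023, 1, 3))
print("\n".join(lines))
```

Here the target is the incident count itself rather than a yes/no column. The counts are daily, so the training job's `time_freq` hyperparameter would need to match (for example `D`).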

AWS
Expert
Answered 1 year ago

Thanks for your reply, Jim. Let me have a look. One side question: I can see that SageMaker is creating features named feature10, feature11, and feature12 on the Model explainability page. Where can I find the mapping for these features?

Ram
Answered 1 year ago
