SageMaker - dataset target column / label - yes / no

Hi there,

I have a sample "incident" dataset. We plan to use it in SageMaker, and we want to find out what kinds of predictions SageMaker can make from it, including the likely locations of future incidents. In all the videos I have seen, the last column is the target column (label), with values such as yes/no or 0/1.

The dataset I have contains only incidents: every row is an incident.

Can someone explain what the target column or label (yes or no) should be for this dataset, and how to prepare the values for that column?

Thanks,
Ram

Ram
Asked 1 year ago · 263 views
2 answers

Hey Ram,

Your target is usually the last column, but that is only a convention. The target is whatever you want to predict from the rest of the information.

You are correct that you don't want to predict "is incident", because every row is an incident. To do that, you would also need the "non-incident" rows so that the algorithm could compare the two.
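To make the "non-incidents" point concrete, here is a minimal sketch of one possible way to get a yes/no label out of an incident-only dataset: enumerate every (location, day) pair in the observed date range and mark it 1 if an incident was recorded, otherwise 0. The field names `location`, `date`, and `incident`, the sample rows, and the helper `build_labeled_rows` are all hypothetical; your dataset will have its own columns and may need a different granularity (hour, week, grid cell).

```python
from datetime import date, timedelta
from itertools import product

# Hypothetical incident records: each tuple is one incident (location, day).
incidents = [
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 3)),
    ("B", date(2023, 1, 2)),
]


def build_labeled_rows(incidents):
    """Return one row per (location, day) with a 0/1 'incident' label."""
    seen = set(incidents)
    locations = sorted({loc for loc, _ in incidents})
    first = min(day for _, day in incidents)
    last = max(day for _, day in incidents)
    # Every calendar day between the first and last incident, inclusive.
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return [
        {
            "location": loc,
            "date": day.isoformat(),
            # 1 = an incident was recorded, 0 = implied non-incident.
            "incident": int((loc, day) in seen),
        }
        for loc, day in product(locations, days)
    ]


rows = build_labeled_rows(incidents)
for row in rows:
    print(row)
```

Note that the 0 rows are only implied by the absence of a record, so this works only if you can assume that every incident in the period was actually logged.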

It sounds like you want to predict probabilities for the next incident. Take a look at the SageMaker sample notebook "Predicting driving speed violations with the Amazon SageMaker DeepAR algorithm". That example uses past traffic violations to predict future violations. You can open it from the "SageMaker JumpStart" menu under "Models, notebooks, solutions". You can read more about it here (about halfway down the page): https://aws.amazon.com/blogs/machine-learning/illustrative-notebooks-in-amazon-sagemaker-jumpstart/
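As a rough illustration of how incident rows could be reshaped for a forecasting approach like DeepAR, the sketch below counts incidents per location per day and writes one JSON object per location with `start`, `target`, and `cat` fields, which is the JSON Lines layout DeepAR accepts for training data. The sample incidents, the daily frequency, and the helper `to_deepar_lines` are assumptions made for this example; they are not taken from the notebook.

```python
import json
from collections import Counter
from datetime import date, timedelta

# Hypothetical incident records: each tuple is one incident (location, day).
incidents = [
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 1)),
    ("A", date(2023, 1, 3)),
    ("B", date(2023, 1, 2)),
]


def to_deepar_lines(incidents):
    """Build one JSON line per location holding its daily incident counts."""
    counts = Counter(incidents)  # (location, day) -> number of incidents
    first = min(day for _, day in incidents)
    last = max(day for _, day in incidents)
    n_days = (last - first).days + 1
    lines = []
    for idx, loc in enumerate(sorted({loc for loc, _ in incidents})):
        # Days with no recorded incident get a count of 0.
        target = [counts[(loc, first + timedelta(days=i))] for i in range(n_days)]
        record = {
            "start": f"{first.isoformat()} 00:00:00",  # timestamp of first value
            "target": target,                          # the series to forecast
            "cat": [idx],                              # location as a category index
        }
        lines.append(json.dumps(record))
    return lines


lines = to_deepar_lines(incidents)
print("\n".join(lines))
```

With this framing the target is the incident count itself, so no yes/no column is needed; the model forecasts future counts for each location.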

AWS
Expert
Answered 1 year ago
Thanks, Jim, for your reply. I'll have a look. One side question: SageMaker shows features named feature10, 11, 12 on the Model explainability page. Where can I find the mapping for feature10, 11, 12?

Ram
Answered 1 year ago
