How to specify target feature in Sagemaker XGBoost?

0

I am considering migrating a data science project from Datarobot to Sagemaker. I am familiar with writing Python and have been going through one of the tutorial Jupyter notebooks to see how to explore the data and to build and deploy and estimator. But, I cannot see how to specify the target feature. I have entirely numerical data in a csv file. One of the fields in that file is the intended target for estimation, the rest are information from which the estimate is to be made.

How do I specify the column that is to be estimated? The code I expect should have this is ...

container = sm.image_uris.retrieve("xgboost", session.boto_region_name, "1.5-1")

xgb = sm.estimator.Estimator(
    container,
    role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path="s3://xxxxxx001/",
    sagemaker_session=session,
)

xgb.set_hyperparameters(
    max_depth=5,
    eta=0.2,
    gamma=4,
    min_child_weight=6,
    subsample=0.8,
    verbosity=0,
    num_round=100,
)
s3_input_train = TrainingInput(
    s3_data="s3://xxxxxx001/data.csv", content_type="csv"
)
xgb.fit({"train": s3_input_train})

1개 답변
0
수락된 답변

On a badly formatted page on the AWS documentation, I found a statement that - the CSV file must have no headers and the target field must be the first field. So, apparently, it is not possible to specify the target. So primitive, yeah?

답변함 2년 전
profile picture
전문가
검토됨 18일 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠