Can't generate XGBoost training report in SageMaker, only the profiler report

I am trying to generate the XGBoost training report so I can see feature importances. However, the following code only generates the profiler report.

import boto3, re, sys, math, json, os, sagemaker, urllib.request
from sagemaker import get_execution_role
import numpy as np
import pandas as pd
from sagemaker.predictor import csv_serializer
from sagemaker.debugger import Rule, rule_configs

# Debugger rule that should produce the XGBoost training report
rules=[
    Rule.sagemaker(rule_configs.create_xgboost_report())
]

# Define IAM role
role = get_execution_role()
prefix = 'sagemaker/models'
my_region = boto3.session.Session().region_name

# This line looks up the XGBoost image URI for the region; it does not build a container.
xgboost_container = sagemaker.image_uris.retrieve("xgboost", my_region, "latest")

bucket_name = 'binary-base'
s3 = boto3.resource('s3')
try:
    if my_region == 'us-east-1':
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={'LocationConstraint': my_region})
    print('S3 bucket created successfully')
except Exception as e:
    print('S3 error: ', e)

boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'train/train.csv')).upload_file('../Data/Base_Model_Data_No_Labels/train.csv')
boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'validation/val.csv')).upload_file('../Data/Base_Model_Data_No_Labels/val.csv')
boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'test/test.csv')).upload_file('../Data/Base_Model_Data/test.csv')


sess = sagemaker.Session()
xgb = sagemaker.estimator.Estimator(xgboost_container,
                                    role,
                                    volume_size=5,
                                    instance_count=1,
                                    instance_type='ml.m4.xlarge',
                                    output_path='s3://{}/{}/output'.format(bucket_name, prefix),
                                    sagemaker_session=sess,
                                    rules=rules)

xgb.set_hyperparameters(objective='binary:logistic',
                        num_round=100, 
                        scale_pos_weight=8.5)

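# Input channels for the CSVs uploaded above (assumed definitions; adjust to your setup)
s3_input_train = sagemaker.inputs.TrainingInput(s3_data='s3://{}/{}/train'.format(bucket_name, prefix), content_type='csv')
s3_input_val = sagemaker.inputs.TrainingInput(s3_data='s3://{}/{}/validation'.format(bucket_name, prefix), content_type='csv')

xgb.fit({'train': s3_input_train, "validation": s3_input_val}, wait=True)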

I then list the rule output path:

rule_output_path = xgb.output_path + "/" + xgb.latest_training_job.job_name + "/rule-output"
! aws s3 ls {rule_output_path} --recursive

This prints:

2022-07-07 18:40:27     329715 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-report.html
2022-07-07 18:40:26     171087 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-report.ipynb
2022-07-07 18:40:23        191 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/BatchSize.json
2022-07-07 18:40:23        199 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/CPUBottleneck.json
2022-07-07 18:40:23        126 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/Dataloader.json
2022-07-07 18:40:23        127 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/GPUMemoryIncrease.json
2022-07-07 18:40:23        198 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/IOBottleneck.json
2022-07-07 18:40:23        119 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/LoadBalancing.json
2022-07-07 18:40:23        151 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/LowGPUUtilization.json
2022-07-07 18:40:23        179 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/MaxInitializationTime.json
2022-07-07 18:40:23        133 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/OverallFrameworkMetrics.json
2022-07-07 18:40:23        465 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/OverallSystemUsage.json
2022-07-07 18:40:23        156 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/StepOutlier.json
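
For longer listings, a quick way to confirm which rule folders exist is to group the keys by the folder directly under rule-output/. The following is only a minimal sketch: the helper name reports_by_rule is hypothetical, the sample is two lines copied from the listing above, and the folder name is matched case-insensitively because its exact spelling is an assumption.

```python
from collections import defaultdict


def reports_by_rule(listing: str) -> dict:
    """Group `aws s3 ls --recursive` lines by the folder directly under rule-output/."""
    found = defaultdict(list)
    marker = "/rule-output/"
    for line in listing.splitlines():
        # Each line is: date, time, size, key
        parts = line.strip().split(None, 3)
        if len(parts) < 4 or marker not in parts[3]:
            continue
        after_marker = parts[3].split(marker, 1)[1]
        rule_folder, _, rest = after_marker.partition("/")
        found[rule_folder].append(rest)
    return dict(found)


# Two lines copied from the listing above
sample = """\
2022-07-07 18:40:27     329715 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-report.html
2022-07-07 18:40:23        191 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/BatchSize.json
"""

rules_found = reports_by_rule(sample)

# Case-insensitive match, since the exact folder spelling is an assumption
has_xgboost_report = any(
    name.lower().startswith("createxgboostreport") for name in rules_found
)

print(sorted(rules_found))
print("XGBoost report folder present:", has_xgboost_report)
```

In a notebook, the listing string could come from capturing the output of the aws s3 ls command above instead of the pasted sample.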

As you can see, only the profiler report is created, which does not interest me. Why isn't a CreateXGBoostReport folder generated with the training report? How do I generate it, and what am I missing?
