Can't generate XGBoost training report in SageMaker, only the profiler report

I am trying to generate the XGBoost training report so that I can see feature importances. However, the following code only generates the profiler report.

import boto3, re, sys, math, json, os, sagemaker, urllib.request
from sagemaker import get_execution_role
import numpy as np
import pandas as pd
from sagemaker.predictor import csv_serializer
from sagemaker.debugger import Rule, rule_configs

# Debugger rule that requests the XGBoost training report
rules = [
    Rule.sagemaker(rule_configs.create_xgboost_report())
]

# Define IAM role
role = get_execution_role()
prefix = 'sagemaker/models'
my_region = boto3.session.Session().region_name

# this line automatically looks for the XGBoost image URI and builds an XGBoost container.
xgboost_container = sagemaker.image_uris.retrieve("xgboost", my_region, "latest")



bucket_name = 'binary-base' 
s3 = boto3.resource('s3')
try:
    if my_region == 'us-east-1':
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={'LocationConstraint': my_region})
    print('S3 bucket created successfully')
except Exception as e:
    print('S3 error: ', e)

boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'train/train.csv')).upload_file('../Data/Base_Model_Data_No_Labels/train.csv')
boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'validation/val.csv')).upload_file('../Data/Base_Model_Data_No_Labels/val.csv')
boto3.Session().resource('s3').Bucket(bucket_name).Object(os.path.join(prefix, 'test/test.csv')).upload_file('../Data/Base_Model_Data/test.csv')


sess = sagemaker.Session()
xgb = sagemaker.estimator.Estimator(xgboost_container,
                                    role,
                                    volume_size=5,
                                    instance_count=1,
                                    instance_type='ml.m4.xlarge',
                                    output_path='s3://{}/{}/output'.format(bucket_name, prefix),
                                    sagemaker_session=sess,
                                    rules=rules)

xgb.set_hyperparameters(objective='binary:logistic',
                        num_round=100, 
                        scale_pos_weight=8.5)

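# Assumed definitions: the post does not show how these inputs are created;
# the paths follow the uploads above.
s3_input_train = sagemaker.inputs.TrainingInput(s3_data='s3://{}/{}/train'.format(bucket_name, prefix), content_type='csv')
s3_input_val = sagemaker.inputs.TrainingInput(s3_data='s3://{}/{}/validation'.format(bucket_name, prefix), content_type='csv')

xgb.fit({'train': s3_input_train, 'validation': s3_input_val}, wait=True)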

After training, I list the rule output path with:

rule_output_path = xgb.output_path + "/" + xgb.latest_training_job.job_name + "/rule-output"
! aws s3 ls {rule_output_path} --recursive

This outputs:

2022-07-07 18:40:27     329715 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-report.html
2022-07-07 18:40:26     171087 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-report.ipynb
2022-07-07 18:40:23        191 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/BatchSize.json
2022-07-07 18:40:23        199 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/CPUBottleneck.json
2022-07-07 18:40:23        126 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/Dataloader.json
2022-07-07 18:40:23        127 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/GPUMemoryIncrease.json
2022-07-07 18:40:23        198 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/IOBottleneck.json
2022-07-07 18:40:23        119 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/LoadBalancing.json
2022-07-07 18:40:23        151 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/LowGPUUtilization.json
2022-07-07 18:40:23        179 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/MaxInitializationTime.json
2022-07-07 18:40:23        133 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/OverallFrameworkMetrics.json
2022-07-07 18:40:23        465 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/OverallSystemUsage.json
2022-07-07 18:40:23        156 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/StepOutlier.json
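
To make the listing easier to read, it can be reduced to the rule folders that appear directly under rule-output. The following is only a minimal sketch: `rule_output_folders` is a hypothetical helper, not part of the SageMaker SDK or the AWS CLI, and it assumes the four-column `aws s3 ls --recursive` format shown above (date, time, size, key).

```python
def rule_output_folders(listing_lines):
    """Return the sorted folder names found directly under '/rule-output/'."""
    marker = "/rule-output/"
    folders = set()
    for line in listing_lines:
        # Columns of `aws s3 ls --recursive`: date, time, size, key
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        key = parts[3]
        if marker in key:
            remainder = key.split(marker, 1)[1]
            folders.add(remainder.split("/", 1)[0])
    return sorted(folders)


# Two lines copied from the listing above
sample = [
    "2022-07-07 18:40:27     329715 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-report.html",
    "2022-07-07 18:40:23        191 sagemaker/models/output/xgboost-2022-07-07-18-35-55-436/rule-output/ProfilerReport-1657218955/profiler-output/profiler-reports/BatchSize.json",
]
print(rule_output_folders(sample))  # ['ProfilerReport-1657218955']
```

Applied to the full listing, the only folder returned is the ProfilerReport one.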

As you can see, only the profiler report is created, which does not interest me. Why isn't a CreateXGBoostReport folder generated with the training report? How do I generate it, and what am I missing?
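
One further diagnostic that may help narrow this down is to check whether the Debugger rule was evaluated at all. The sketch below assumes the `DebugRuleEvaluationStatuses` and `ProfilerRuleEvaluationStatuses` fields of the `describe_training_job` response. `summarize_rule_statuses` and `describe_job` are hypothetical helper names introduced here for illustration.

```python
def summarize_rule_statuses(description):
    """Collect (rule name, status, details) tuples from a describe_training_job response."""
    rows = []
    for field in ("DebugRuleEvaluationStatuses", "ProfilerRuleEvaluationStatuses"):
        for status in description.get(field, []):
            rows.append((
                status.get("RuleConfigurationName"),
                status.get("RuleEvaluationStatus"),
                status.get("StatusDetails", ""),
            ))
    return rows


def describe_job(job_name):
    """Fetch the training job description (requires AWS credentials)."""
    import boto3  # imported lazily so the summary helper works without AWS access
    return boto3.client("sagemaker").describe_training_job(TrainingJobName=job_name)


# Intended use in the notebook above (not executed here):
# for row in summarize_rule_statuses(describe_job(xgb.latest_training_job.job_name)):
#     print(row)
```

If the XGBoost report rule is missing from the result, or is listed with an error status, the status details are a reasonable place to start looking.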
