SageMaker Debugger: cannot load training information of estimator


I am using a SageMaker notebook to train an ML model. After I created and trained the estimator with the following script, I could load the debugging information (s3_output_path) as expected:

import sagemaker
from sagemaker.pytorch import PyTorch
from sagemaker.debugger import Rule, DebuggerHookConfig, CollectionConfig, rule_configs

rules = [
    Rule.sagemaker(rule_configs.loss_not_decreasing()),
    Rule.sagemaker(rule_configs.vanishing_gradient()),
    Rule.sagemaker(rule_configs.overfit()),
    Rule.sagemaker(rule_configs.overtraining()),
    Rule.sagemaker(rule_configs.poor_weight_initialization())]

collection_configs = [
    CollectionConfig(
        name="CrossEntropyLoss_output_0",
        parameters={
            "include_regex": "CrossEntropyLoss_output_0",
            "train.save_interval": "100",
            "eval.save_interval": "10",
        },
    )
]

debugger_config = DebuggerHookConfig(
    collection_configs=collection_configs)

estimator = PyTorch(
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
    # instance_type="ml.g4dn.2xlarge",
    entry_point="train.py",
    framework_version="1.8",
    py_version="py36",
    hyperparameters=hyperparameters,
    debugger_hook_config=debugger_config,
    rules=rules,
)

estimator.fit({"training": inputs})

s3_output_path = estimator.latest_job_debugger_artifacts_path()

After the kernel died, I attached to the training job and tried to access its debugging information:

estimator = sagemaker.estimator.Estimator.attach('pytorch-training-2022-06-07-11-07-09-804')

s3_output_path = estimator.latest_job_debugger_artifacts_path()
rules_path = estimator.debugger_rules

Both s3_output_path and rules_path were None. Could this be a problem with the attach function? And how can I access the Debugger information of a training job after the kernel has shut down?

1 Answer

Hello,

It is difficult to say why both values were None without more information. I would suggest opening a support case with AWS PS; one of our SageMaker engineers should be able to assist you.
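
In the meantime, one thing that may be worth trying is to read the Debugger settings directly from the training job description instead of from the attached estimator. The sketch below is not a confirmed fix. It assumes the job description returned by the boto3 `describe_training_job` call contains a `DebugHookConfig` entry, and that the artifacts live under `<S3OutputPath>/<job name>/debug-output`, which is how the SageMaker Python SDK appears to build the path. The helper names are hypothetical.

```python
def debugger_artifacts_path(description):
    """Build the Debugger output path from a DescribeTrainingJob response dict.

    Assumption: artifacts are stored under
    <S3OutputPath>/<TrainingJobName>/debug-output.
    """
    hook_config = description.get("DebugHookConfig")
    if not hook_config:
        # The job was started without a Debugger hook configuration.
        return None
    base = hook_config["S3OutputPath"].rstrip("/")
    return f"{base}/{description['TrainingJobName']}/debug-output"


def rule_statuses(description):
    """Map each Debugger rule configuration name to its evaluation status."""
    return {
        status["RuleConfigurationName"]: status["RuleEvaluationStatus"]
        for status in description.get("DebugRuleEvaluationStatuses", [])
    }


def describe_job(job_name):
    """Fetch the training job description (needs AWS credentials and boto3)."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("sagemaker")
    return client.describe_training_job(TrainingJobName=job_name)
```

With the job from the question, this would be used as `desc = describe_job('pytorch-training-2022-06-07-11-07-09-804')` followed by `debugger_artifacts_path(desc)` and `rule_statuses(desc)`. If the path comes back as expected, it can be passed to `smdebug.trials.create_trial` to load the saved tensors.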

Thank you.

SUPPORT ENGINEER
answered 2 years ago
