How can I tell if my Notebook instance is frozen?


I am running a SageMaker Notebook instance. How can I tell if my notebook is frozen or just taking a long time? I am using a 24xlarge and querying Athena in parallel, and it seems to be stuck on the same query for a long time. How can I tell if I need more memory or more vCPUs?

asked a year ago, 539 views
2 Answers
Accepted Answer

Hi there,

Greetings for the day!

I understand that you want to know how to tell whether your SageMaker Notebook instance is frozen or just slow, and whether you need more vCPUs or more memory.

If your SageMaker Notebook instance is taking a long time, you can check whether it is frozen or still running by monitoring CPU and memory usage. If CPU usage is low or zero, the notebook may be frozen, or it may simply be waiting for Athena to return results.

If CPU usage is high but memory usage is low, you may need more vCPUs. If memory usage is high, you may need more memory.
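The rule of thumb above can be sketched as a small shell function. This is only an illustration: `classify_usage` is a hypothetical helper, and the 10% and 80% thresholds are assumptions chosen for the example, not AWS guidance. It takes whole-number CPU and memory usage percentages that you read off the tools described below.

```shell
# Hypothetical helper: turn CPU% and memory% into a rough hint.
# The 10 and 80 thresholds are illustrative assumptions only.
classify_usage() {
  cpu_pct=$1
  mem_pct=$2
  if [ "$mem_pct" -ge 80 ]; then
    echo "memory-bound: consider an instance type with more memory"
  elif [ "$cpu_pct" -ge 80 ]; then
    echo "cpu-bound: consider an instance type with more vCPUs"
  elif [ "$cpu_pct" -le 10 ]; then
    echo "mostly idle: possibly frozen, or waiting on an external query"
  else
    echo "busy but not saturated: probably still working"
  fi
}

# Example: 5% CPU and 20% memory in use.
classify_usage 5 20
```

A single reading can mislead, so it is worth repeating the check a few times over several minutes before deciding to change the instance type.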

You can check both from a terminal on the SageMaker Notebook instance. To open one:

[1] Start your Notebook instance.
[2] Go to the Jupyter home page.
[3] On the right-hand side, click the “New” drop-down.
[4] Select “Terminal”.

In the Jupyter terminal, run the commands below to see memory and CPU information.

[+] To see the memory information:

$ free -h

=> The output of “free -h” shows total, used, free, and shared memory, among other fields, in human-readable form.
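If you would rather have one number than a table, the output of free can be reduced to a used-memory percentage. This is a minimal sketch: `mem_used_pct` is a hypothetical helper, and it assumes the common layout in which the "Mem:" row lists total in the second column and used in the third. Column meanings vary between versions of free, so check the header row on your instance first.

```shell
# Hypothetical helper: read `free -m` output on stdin and print used memory
# as a whole-number percentage of total.
# Assumes the "Mem:" row has total in column 2 and used in column 3.
mem_used_pct() {
  awk '/^Mem:/ { printf "%d\n", ($3 * 100) / $2 }'
}

free -m | mem_used_pct
```

A value that stays close to 100 while the query runs suggests memory pressure. A low value points away from memory as the bottleneck.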

[+] To see the CPU information:

$ mpstat -u

=> The output of “mpstat -u” includes fields such as %guest, %gnice, %steal, and %idle:

%steal Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.

%guest Show the percentage of time spent by the CPU or CPUs to run a virtual processor.

%idle Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.

There are several more fields. You can find details about each one in the mpstat manual page, which you can open with “$ man mpstat”.

Along with “mpstat -u”, you can also try the commands below to get information about the CPU:

$ lscpu
$ cat /proc/cpuinfo
$ top
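To see which processes are actually busy, and whether the notebook kernel is running or waiting, ps can list the heaviest CPU consumers together with their state. The sketch below assumes the procps version of ps found on most Linux systems. `describe_state` is a hypothetical helper that explains the first letter of the STAT column.

```shell
# Hypothetical helper: explain the first letter of a ps STAT code.
describe_state() {
  case "$1" in
    R*) echo "running or runnable" ;;
    S*) echo "sleeping (often waiting on network or input)" ;;
    D*) echo "uninterruptible sleep (usually disk I/O)" ;;
    T*) echo "stopped" ;;
    Z*) echo "zombie" ;;
    *)  echo "other" ;;
  esac
}

# Header row plus the five processes using the most CPU, with elapsed time.
ps -eo pid,stat,pcpu,pmem,etime,comm --sort=-pcpu | head -n 6

# Example: look up a STAT value taken from the listing.
describe_state Ss
```

A kernel process that stays in a sleeping state with near-zero %CPU for a long time is more likely waiting on a response than computing.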

Additionally, you can check the CloudWatch logs for errors or warnings that may indicate the cause of the issue. CloudWatch logs often help to find the root cause.

You can find the CloudWatch logs under CloudWatch → Log Groups → /aws/sagemaker/NotebookInstances → Notebook Instance Name
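The same logs can also be searched from a terminal with the AWS CLI, assuming it is installed and your credentials allow reading CloudWatch Logs. In this sketch, `notebook_errors` is a hypothetical wrapper, `my-notebook` is a placeholder for your instance name, and the filter pattern is only an example. It also assumes that log stream names begin with the notebook instance name, as the console path above suggests.

```shell
# Hypothetical wrapper around `aws logs filter-log-events`.
# Set DRY_RUN=1 to print the command instead of calling AWS.
notebook_errors() {
  name=$1
  set -- aws logs filter-log-events \
    --log-group-name /aws/sagemaker/NotebookInstances \
    --log-stream-name-prefix "$name" \
    --filter-pattern '?ERROR ?WARN'
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Preview the command for a placeholder instance name.
DRY_RUN=1 notebook_errors my-notebook
```

Running it without DRY_RUN returns matching log events as JSON. Adjust the filter pattern to whatever messages your notebook actually logs.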

Based on this analysis, you can select a different Notebook instance type. You can find more details on SageMaker instance types at [1].

Please try the steps suggested above.

If you have any difficulty or run into any issue, please reach out to AWS Support [+] (SageMaker) with the details of your issue and use case, and we will be happy to assist you further.

[+] Creating support cases and case management - https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

I hope this information is useful to you.

Thank you!

REFERENCES:

[1] https://aws.amazon.com/sagemaker/pricing/

AWS
SUPPORT ENGINEER
answered a year ago
EXPERT
reviewed a year ago

How about opening an SSH connection to the SageMaker Notebook instance to check the CPU and memory load?
If the load turns out to be too high, you will need to consider changing the instance specifications.

EXPERT
answered a year ago
