The issue could be attributed either to your Python code or to the SageMaker setup (instance type selection, etc.). Please try the suggestions below (high-level tips based on limited context).
Python Code Issues:
- Global Variables: Unintended modifications to global variables can lead to memory accumulation.
- Object References: Objects that are no longer needed might still be referenced, preventing garbage collection.
- Inefficient Data Structures: Using inappropriate data structures can consume excessive memory.
- Library-Specific Issues: Certain libraries might have memory leaks or inefficient implementations.

Use Python's memory_profiler or tracemalloc to identify memory-intensive parts of your code, or consider using del or gc.collect() to explicitly release memory.
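As a starting point, tracemalloc from the standard library can show which lines of code are allocating the most memory. A minimal sketch (the list comprehension stands in for whatever workload you suspect):

```python
import tracemalloc

# Start tracing Python memory allocations
tracemalloc.start()

# ... run the suspect code; this list stands in for your workload
data = [str(i) * 10 for i in range(100_000)]

# Take a snapshot and print the top allocation sites by line
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:5]:
    print(stat)

tracemalloc.stop()
```

Running this in a notebook cell before and after the suspect code makes it easy to see which lines account for the growth between snapshots.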
SageMaker JupyterLab Environment:
1. Kernel Configuration: The kernel's memory limits and configuration might be affecting behavior.
2. Instance Type: The chosen instance type might not have sufficient resources for your workload.
3. Environment Variables: Incorrectly set environment variables could impact memory usage.
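To check whether the instance type is the bottleneck, you can compare the instance's total RAM against the kernel's peak memory use from inside a notebook cell. A minimal sketch using only the standard library (POSIX-only; ru_maxrss is reported in KiB on Linux):

```python
import os
import resource

# Total physical memory on the instance (POSIX-only)
total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
print(f"Instance RAM: {total_bytes / 1024**3:.1f} GiB")

# Peak resident set size of this kernel process so far
# (on Linux, ru_maxrss is in KiB)
peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"Peak RSS of this process: {peak_kib / 1024:.1f} MiB")
```

If peak RSS approaches the instance's RAM, moving to a larger instance type (or processing data in smaller chunks) is the likely fix.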

I doubled back to the code and pulled things apart more. It turns out the memory growth was related to use of Pandas concat(). My test code was reading CSV input into pandas, while the subject code was using ndjson. I'm not sure what differs inside Pandas between those two, but they seem to handle memory differently: the test code was fine, but the subject code showed continual memory growth. I had to add explicit 'del <concatted df>' statements to the cell, and that has stabilized memory usage. Thanks for the guidance.
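For anyone hitting the same issue, the pattern that worked can be sketched roughly like this (combine_chunks is a hypothetical helper, not from the original code; pd.concat copies its inputs, so the intermediate frames are safe to drop afterwards):

```python
import gc
import pandas as pd

def combine_chunks(chunks):
    """Concatenate a list of DataFrames, then drop the intermediates."""
    combined = pd.concat(chunks, ignore_index=True)
    # pd.concat copies the data, so release the source frames
    # explicitly instead of leaving them referenced by the cell.
    chunks.clear()
    gc.collect()
    return combined

frames = [pd.DataFrame({"a": range(1000)}) for _ in range(10)]
result = combine_chunks(frames)
print(len(result))  # 10000 rows
```

In a notebook, the key point is that cell-level names keep objects alive between runs, so deleting (or clearing) the list of intermediate frames is what lets the garbage collector actually reclaim the memory.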