The error RuntimeError: This class is NOT fork-safe occurs because you are initializing the s3fs.S3FileSystem object globally (outside the function).
SageMaker inference containers use Gunicorn to manage worker processes. Gunicorn initializes the script in a main process and then "forks" it into multiple worker processes to handle incoming requests. Since s3fs is built on top of asyncio, it creates an event loop upon initialization. A running event loop cannot be safely inherited by a child process (the fork), leading to the crash when the worker tries to access the global object.
Option 1 (Using s3fs correctly)
To fix this while keeping s3fs, you must initialize the filesystem object inside the input_handler function. This ensures each worker process has its own independent instance.
import pandas as pd
import s3fs

def input_handler(data, context):
    # Initialize INSIDE the function to ensure fork-safety
    s3 = s3fs.S3FileSystem(anon=False)
    input_bucket_name = "your-bucket"  # Extract from data/context
    file_key = "your-file.csv"  # Extract from data/context
    s3_path = f's3://{input_bucket_name}/{file_key}'
    print(f'Reading from s3_path: {s3_path}')
    with s3.open(s3_path, 'rb') as f:
        new_df = pd.read_csv(f)
    # Your remaining logic here...
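If creating the filesystem on every request turns out to be expensive, one possible variation on Option 1 is to cache one instance per process ID, so that a forked worker never reuses an object built in the parent. The following is only a minimal sketch: the helper name get_per_process and the module-level dictionary are hypothetical and not part of s3fs or SageMaker, and a stand-in factory is used so the snippet runs without AWS access.

```python
import os

# Maps a process ID to the object that was created inside that process.
_instances = {}

def get_per_process(factory):
    """Return the object built by `factory` in the current process, creating it on first use."""
    pid = os.getpid()
    if pid not in _instances:
        # First call in this process (for example, a freshly forked worker)
        _instances[pid] = factory()
    return _instances[pid]

# Stand-in factory; inside input_handler the call might instead look like
#   s3 = get_per_process(lambda: s3fs.S3FileSystem(anon=False))
created = []

def make_resource():
    created.append(os.getpid())
    return object()

first = get_per_process(make_resource)
second = get_per_process(make_resource)
print(first is second, len(created))
```

A forked worker has a different process ID, so its first call builds a fresh instance instead of picking up the entry inherited from the parent.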
Option 2 (Using boto3 - Recommended)
Since you mentioned the data is "big," using boto3 (which is pre-installed in SageMaker environments) is often more robust as it doesn't rely on complex async loops for simple synchronous reads.
import boto3
import pandas as pd

def input_handler(data, context):
    s3_client = boto3.client('s3')
    input_bucket_name = "your-bucket"  # Extract from data/context
    file_key = "your-file.csv"  # Extract from data/context
    # Streaming the body directly into pandas
    response = s3_client.get_object(Bucket=input_bucket_name, Key=file_key)
    # pd.read_csv can handle the StreamingBody directly
    new_df = pd.read_csv(response['Body'])
    # Your remaining logic here...
Important Considerations for "Big Data"
If your CSV files are significantly large, keep these two constraints in mind:
1. Memory Limits: Every SageMaker instance type has a RAM limit. If your CSV is larger than the available memory, the container will crash with an OOM (Out of Memory) error. You might need to process the file in chunks using chunksize in pd.read_csv.
2. Inference Timeouts: SageMaker endpoints usually have a default timeout (60 seconds). If downloading and parsing a massive CSV takes longer than this, the client will receive a 504 Gateway Timeout, even if the code eventually finishes.
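As an illustration of the chunked approach mentioned in the first point, here is a minimal sketch. The function name summarize_in_chunks and the in-memory sample data are hypothetical; in the handler, the source would be the S3 file object or response body instead.

```python
import io
import pandas as pd

def summarize_in_chunks(source, chunksize=100_000):
    """Count rows and collect column names while holding only one chunk in memory."""
    total_rows = 0
    columns = None
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    for chunk in pd.read_csv(source, chunksize=chunksize):
        if columns is None:
            columns = list(chunk.columns)
        total_rows += len(chunk)
    return total_rows, columns

# Stand-in for a large CSV; a real handler would pass the opened S3 object
sample = io.StringIO("id,score\n1,0.1\n2,0.2\n3,0.3\n4,0.4\n5,0.5\n")
rows, cols = summarize_in_chunks(sample, chunksize=2)
print(rows, cols)
```

Chunking only helps when the per-chunk work (filtering, aggregating, scoring) does not need the whole file at once; concatenating every chunk back into one DataFrame brings the memory problem back.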
I tried the boto3 approach (Option 2) in my inference.py file:
def input_handler(data, context):
    ...
    request_body = data.read().decode('utf-8')
    request_data = json.loads(request_body)
    input_bucket_name = str(request_data['input_bucket_name'])
    output_bucket_name = str(request_data['output_bucket_name'])
    file_key = str(request_data['file_key'])
    group_num = request_data['group_num']
    test = request_data['test']
    print("group num:", group_num)
    print("input_bucket_name:", input_bucket_name)
    print("file_key:", file_key)
    try:
        s3_client = boto3.client('s3')
        # Streaming the body directly into pandas
        response = s3_client.get_object(Bucket=input_bucket_name, Key=file_key)
        new_df = pd.read_csv(response['Body'])
        print('new_df:', new_df.shape)
It failed with the following errors in CloudWatch:
2026-05-04T18:47:58.484Z
1670 2026-05-04 18:47:58,280 ERROR exception handling request: maximum recursion depth exceeded while calling a Python object
2026-05-04T18:47:58.484Z
Traceback (most recent call last):
  File "/sagemaker/python_service.py", line 423, in _handle_invocation_post
    res.body, res.content_type = handlers(data, context)
  File "/sagemaker/python_service.py", line 455, in handler
    processed_input = custom_input_handler(data, context)
  File "/opt/ml/model/code/inference.py", line 79, in input_handler
    raise e
  File "/opt/ml/model/code/inference.py", line 43, in input_handler
    s3_client = boto3.client('s3')
  File "/usr/local/lib/python3.10/site-packages/boto3/__init__.py", line 92, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/boto3/session.py", line 299, in client
    return self._session.create_client(
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 951, in create_client
    credentials = self.get_credentials()
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 507, in get_credentials
    self._credentials = self._components.get_component(
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 1108, in get_component
    self._components[name] = factory()
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 186, in _create_credential_resolver
    return botocore.credentials.create_credential_resolver(
  File "/usr/local/lib/python3.10/site-packages/botocore/credentials.py", line 92, in create_credential_resolver
    container_provider = ContainerProvider()
  File "/usr/local/lib/python3.10/site-packages/botocore/credentials.py", line 1893, in __init__
    fetcher = ContainerMetadataFetcher()
  File "/usr/local/lib/python3.10/site-packages/botocore/utils.py", line 2872, in __init__
    session = botocore.httpsession.URLLib3Session(
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 323, in __init__
    self._manager = PoolManager(**self._get_pool_manager_kwargs())
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 341, in _get_pool_manager_kwargs
    'ssl_context': self._get_ssl_context(),
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 350, in _get_ssl_context
    return create_urllib3_context()
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 139, in create_urllib3_context
    context.options |= options
  File "/usr/local/lib/python3.10/ssl.py", line 620, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/usr/local/lib/python3.10/ssl.py", line 620, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/usr/local/lib/python3.10/ssl.py", line 620, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  [Previous line repeated 477 more times]
I searched for this error and found the following explanation:
The error maximum recursion depth exceeded while calling a Python object in your SageMaker inference script indicates an infinite recursion loop, typically triggered by an SSL configuration issue when boto3 tries to connect to S3 within the SageMaker container, often linked to a conflict with asynchronous libraries like gevent.
Their solutions:
Recommended Solutions
1. Set Gunicorn Worker Class to "sync" (Most Common Fix): Add the following environment variable to your SageMaker Model/Endpoint configuration: SAGEMAKER_GUNICORN_WORKER_CLASS = "sync"
2. Monkey Patch gevent: If you are using gevent or it is imported implicitly, add the patch at the absolute top of your inference.py script, before importing boto3 or requests.
3. Use a Global or Persistent Boto3 Client: Do not create a new boto3.client('s3') inside every input_handler call. Create it once outside the function (globally) or use a session object to avoid repeated initialization costs and potential resource issues.
4. Update Botocore/Boto3: This error was confirmed as a bug in certain botocore versions. Ensure you are using the latest versions in your container or requirements file.
Here is my testing for the above solutions:
Solution 4 did NOT work. I put the following in requirements.txt:
boto3>=1.34.0
botocore>=1.34.0
Solution 3: a global client alone did not work. It only worked when I moved the entire S3 read above the input_handler function, to module level:
import json
import io
import boto3
import pandas as pd

print("Creating S3 client...")
s3_client = boto3.client("s3")
print("S3 client created")

prefix = 'consumer-attrition'
input_bucket_name = 'bb-dev-inputdata'
file_key = prefix + "/new_dataset.csv"
# Read the whole object into memory at import time, before any request arrives
response = s3_client.get_object(Bucket=input_bucket_name, Key=file_key)
new_df = pd.read_csv(io.BytesIO(response['Body'].read()))
print('new_df :', new_df.shape)
group_num = 10
output_bucket_name = 'sagemaker-us-east-1-047618027998'

def input_handler(data, context):
    ...
But this solution requires me to hard-code the S3 path.
I found that solution 1 works. When deploying the endpoint with the TensorFlowModel container, set the environment variable as follows:
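from sagemaker.tensorflow import TensorFlowModel

sagemaker_model = TensorFlowModel(
    model_data=model_artifact,
    source_dir='code/',
    entry_point=script_path,
    role=role,
    framework_version="2.12",
    env={
        # Use synchronous Gunicorn workers instead of gevent
        'SAGEMAKER_GUNICORN_WORKER_CLASS': "sync"
    }
)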
In this container the 'SAGEMAKER_GUNICORN_WORKER_CLASS' variable appears to be set to 'gevent' by default, which seems to conflict with the SSL context that boto3 creates when connecting to S3.
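To confirm which worker class a deployed container is actually using, a small check like the one below could be placed at the top of inference.py so the value shows up in CloudWatch. This is only a sketch: the helper name gunicorn_worker_class is hypothetical, and the fallback value "gevent" is an assumption based on the behavior described in this thread, not a documented guarantee.

```python
import os

def gunicorn_worker_class(environ=None, default="gevent"):
    """Return the configured Gunicorn worker class, or `default` if the variable is unset.

    The "gevent" fallback is an assumption taken from this thread.
    """
    env = os.environ if environ is None else environ
    return env.get("SAGEMAKER_GUNICORN_WORKER_CLASS", default)

# Logged once when the module is imported by the serving process
print("Gunicorn worker class:", gunicorn_worker_class())
```

If the log still shows gevent after redeploying, the env setting most likely did not reach the container.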

If my answer helped solve your problem, I would appreciate it if you click on “accepted answer”