
Read an S3 CSV file in the input_handler function of a TensorFlowModel container custom script

0

Hi, I deployed my model as an endpoint using the TensorFlowModel container, with a requirements.txt that installs the following libraries:

pandas s3fs

I used the s3fs library to read a CSV file from S3 in the input_handler function of my custom inference.py script. The relevant part of the code:

import pandas as pd
import s3fs

# Moved out of input_handler because creating it inside the function raised:
# "Calling sync() from within a running loop"
s3 = s3fs.S3FileSystem(anon=False)

def input_handler(data, context):
    .....

    s3_path = 's3://' + input_bucket_name + '/' + file_key
    print('s3_path:', s3_path)
    with s3.open(s3_path, 'rb') as f:
        new_df = pd.read_csv(f)
    ......

But I got the following error in CloudWatch:

2026-03-30T17:33:12.625Z
s3_path: s3://bb-dev-inputdata/consumer-attrition/new_dataset.csv
2026-03-30T17:33:12.625Z
Error reading from S3: This class is not fork-safe

2026-03-30T17:33:12.625Z
1683 2026-03-30 17:33:12,496 ERROR    exception handling request: This class is not fork-safe

2026-03-30T17:33:12.626Z
Traceback (most recent call last):
  File "/sagemaker/python_service.py", line 423, in _handle_invocation_post
    res.body, res.content_type = handlers(data, context)
  File "/sagemaker/python_service.py", line 455, in handler
    processed_input = custom_input_handler(data, context)
  File "/opt/ml/model/code/inference.py", line 55, in input_handler
    raise e
  File "/opt/ml/model/code/inference.py", line 43, in input_handler
    with s3.open(s3_path, 'rb') as f:
  File "/usr/local/lib/python3.10/site-packages/fsspec/spec.py", line 1352, in open
    f = self._open(
  File "/usr/local/lib/python3.10/site-packages/s3fs/core.py", line 816, in _open
    return S3File(
  File "/usr/local/lib/python3.10/site-packages/s3fs/core.py", line 2556, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/site-packages/fsspec/spec.py", line 1926, in __init__
    self.size = self.details["size"]
  File "/usr/local/lib/python3.10/site-packages/fsspec/spec.py", line 1939, in details
    self._details = self.fs.info(self.path)
  File "/usr/local/lib/python3.10/site-packages/fsspec/asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/fsspec/asyn.py", line 327, in loop
    raise RuntimeError("This class is not fork-safe")

2026-03-30T17:33:12.626Z
RuntimeError: This class is not fork-safe

I don't know why this error occurs. What is the right way to read input data from S3 in the input_handler function, given that the input data is large? Thanks in advance.


3 Answers
5

The error "RuntimeError: This class is not fork-safe" occurs because you are initializing the s3fs.S3FileSystem object globally (outside the function).

SageMaker inference containers use Gunicorn to manage worker processes. Gunicorn loads the script in a main process and then forks it into multiple worker processes to handle incoming requests. Since s3fs is built on top of asyncio, it creates an event loop upon initialization. That event loop cannot be safely inherited by a forked child process, so the error is raised (from fsspec/asyn.py, as the traceback shows) when a worker tries to use the global object.

Option 1 (Using s3fs correctly)

To fix this while keeping s3fs, you must initialize the filesystem object inside the input_handler function. This ensures each worker process has its own independent instance.

import pandas as pd
import s3fs

def input_handler(data, context):
    # Initialize INSIDE the function to ensure fork-safety
    s3 = s3fs.S3FileSystem(anon=False)
    
    input_bucket_name = "your-bucket" # Extract from data/context
    file_key = "your-file.csv"        # Extract from data/context
    s3_path = f's3://{input_bucket_name}/{file_key}'
    
    print(f'Reading from s3_path: {s3_path}')
    
    with s3.open(s3_path, 'rb') as f:
        new_df = pd.read_csv(f)
        
    # Your remaining logic here...

Option 2 (Using boto3 - Recommended)

Since you mentioned the data is "big," using boto3 (which is pre-installed in SageMaker environments) is often more robust as it doesn't rely on complex async loops for simple synchronous reads.

import boto3
import pandas as pd
import io

def input_handler(data, context):
    s3_client = boto3.client('s3')

    input_bucket_name = "your-bucket" # Extract from data/context
    file_key = "your-file.csv"        # Extract from data/context

    # Streaming the body directly into pandas
    response = s3_client.get_object(Bucket=input_bucket_name, Key=file_key)

    # pd.read_csv can handle the StreamingBody directly
    new_df = pd.read_csv(response['Body'])

    # Your remaining logic here...

Important Considerations for "Big Data"

If your CSV files are significantly large, keep these two constraints in mind:

1. Memory Limits: Every SageMaker instance type has a RAM limit. If your CSV is larger than the available memory, the container will crash with an OOM (Out of Memory) error. You might need to process the file in chunks using chunksize in pd.read_csv.

2. Inference Timeouts: SageMaker endpoints usually have a default timeout (60 seconds). If downloading and parsing a massive CSV takes longer than this, the client will receive a 504 Gateway Timeout, even if the code eventually finishes.
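
As a rough illustration of the chunksize suggestion in point 1, the sketch below reads a CSV in fixed-size chunks and keeps only a running row count and the column names, so the whole file is never held in memory at once. The function name summarize_csv_in_chunks and the tiny in-memory sample are made up for the example; in the handler, the stream would presumably be the S3 response body or an open s3fs file.

```python
import io

import pandas as pd


def summarize_csv_in_chunks(stream, chunksize=100_000):
    """Count rows and collect column names while reading one chunk at a time."""
    total_rows = 0
    columns = None
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for chunk in pd.read_csv(stream, chunksize=chunksize):
        if columns is None:
            columns = list(chunk.columns)
        total_rows += len(chunk)
    return total_rows, columns


# Small in-memory sample standing in for a large S3 object.
sample = io.StringIO("id,score\n1,0.5\n2,0.7\n3,0.9\n")
rows, cols = summarize_csv_in_chunks(sample, chunksize=2)
print(rows, cols)
```

Any per-chunk processing (filtering, scoring, writing results out) would go inside the loop in place of the row count.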

EXPERT
answered 2 months ago
0

I tried the boto3 approach in my inference.py file:

def input_handler(data, context):
   ...
    request_body = data.read().decode('utf-8')
    request_data = json.loads(request_body)
    input_bucket_name = str(request_data['input_bucket_name'])
    output_bucket_name = str(request_data['output_bucket_name'])
    file_key = str(request_data['file_key'])
    group_num = request_data['group_num']
    test = request_data['test']
    print("group num:", group_num)
    print("input_bucket_name:", input_bucket_name)
    print("file_key:", file_key)
    try:

        s3_client = boto3.client('s3')        
        # Streaming the body directly into pandas
        response = s3_client.get_object(Bucket=input_bucket_name, Key=file_key)
        new_df = pd.read_csv(response['Body'])
        print('new_df:', new_df.shape)

It failed with the following error in CloudWatch:

2026-05-04T18:47:58.484Z
1670 2026-05-04 18:47:58,280 ERROR    exception handling request: maximum recursion depth exceeded while calling a Python object


2026-05-04T18:47:58.484Z
Traceback (most recent call last):
  File "/sagemaker/python_service.py", line 423, in _handle_invocation_post
    res.body, res.content_type = handlers(data, context)
  File "/sagemaker/python_service.py", line 455, in handler
    processed_input = custom_input_handler(data, context)
  File "/opt/ml/model/code/inference.py", line 79, in input_handler
    raise e
  File "/opt/ml/model/code/inference.py", line 43, in input_handler
    s3_client = boto3.client('s3')
  File "/usr/local/lib/python3.10/site-packages/boto3/__init__.py", line 92, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/boto3/session.py", line 299, in client
    return self._session.create_client(
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 951, in create_client
    credentials = self.get_credentials()
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 507, in get_credentials
    self._credentials = self._components.get_component(
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 1108, in get_component
    self._components[name] = factory()
  File "/usr/local/lib/python3.10/site-packages/botocore/session.py", line 186, in _create_credential_resolver
    return botocore.credentials.create_credential_resolver(
  File "/usr/local/lib/python3.10/site-packages/botocore/credentials.py", line 92, in create_credential_resolver
    container_provider = ContainerProvider()
  File "/usr/local/lib/python3.10/site-packages/botocore/credentials.py", line 1893, in __init__
    fetcher = ContainerMetadataFetcher()
  File "/usr/local/lib/python3.10/site-packages/botocore/utils.py", line 2872, in __init__
    session = botocore.httpsession.URLLib3Session(
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 323, in __init__
    self._manager = PoolManager(**self._get_pool_manager_kwargs())
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 341, in _get_pool_manager_kwargs
    'ssl_context': self._get_ssl_context(),
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 350, in _get_ssl_context
    return create_urllib3_context()
  File "/usr/local/lib/python3.10/site-packages/botocore/httpsession.py", line 139, in create_urllib3_context
    context.options |= options
  File "/usr/local/lib/python3.10/ssl.py", line 620, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/usr/local/lib/python3.10/ssl.py", line 620, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  File "/usr/local/lib/python3.10/ssl.py", line 620, in options
    super(SSLContext, SSLContext).options.__set__(self, value)
  [Previous line repeated 477 more times]

I searched for this error and found the following explanation:

The error "maximum recursion depth exceeded while calling a Python object" in your SageMaker inference script indicates an infinite recursion loop, typically triggered by an SSL configuration issue when boto3 tries to connect to S3 within the SageMaker container, often linked to a conflict with asynchronous libraries like gevent.

Their solutions:

Recommended Solutions

  1. Set Gunicorn Worker Class to "sync" (Most Common Fix)

     Add the following environment variable to your SageMaker Model/Endpoint configuration: SAGEMAKER_GUNICORN_WORKER_CLASS = "sync"

  2. Monkey Patch gevent

     If you are using gevent or it is imported implicitly, add the patch at the absolute top of your inference.py script, before importing boto3 or requests.

  3. Use a Global or Persistent Boto3 Client

     Do not create a new boto3.client('s3') inside every input_handler call. Create it once outside the function (globally) or use a session object to avoid repeated initialization costs and potential resource issues.

  4. Update Botocore/Boto3

     This error was confirmed as a bug in certain botocore versions. Ensure you are using the latest versions in your container or requirements file.

Here is my testing for the above solutions:

Solution 4 did NOT work. I put the following in requirements.txt:

boto3>=1.34.0
botocore>=1.34.0

Solution 3 only worked when I moved the entire S3 read to module level, before the input_handler function. Using a global client alone did not work.

import json
import io
import boto3
import pandas as pd

# Everything below runs at import time, before input_handler is ever called
print("Creating S3 client...")
s3_client = boto3.client("s3")
print("S3 client created")
prefix = 'consumer-attrition'
input_bucket_name = 'bb-dev-inputdata'
file_key = prefix + "/new_dataset.csv"
response = s3_client.get_object(Bucket=input_bucket_name, Key=file_key)
new_df = pd.read_csv(io.BytesIO(response['Body'].read()))

print('new_df :', new_df.shape)

group_num = 10
output_bucket_name = 'sagemaker-us-east-1-047618027998'

def input_handler(data, context):
    ...

But this solution requires me to hard-code the S3 path.

answered 14 days ago
0

I found that solution 1 works. When deploying the endpoint with the TensorFlowModel container, set the environment variable as follows:

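from sagemaker.tensorflow import TensorFlowModel

# model_artifact, script_path and role are defined elsewhere in the deployment script
sagemaker_model = TensorFlowModel(
    model_data=model_artifact,
    source_dir='code/',
    entry_point=script_path,
    role=role,
    framework_version="2.12",
    env={
        # Use synchronous Gunicorn workers instead of gevent
        'SAGEMAKER_GUNICORN_WORKER_CLASS': "sync"
    }
)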

Otherwise the SAGEMAKER_GUNICORN_WORKER_CLASS variable is set to 'gevent', which appears to conflict with SSL when boto3 connects to S3.
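
To check from inside the container whether the setting took effect, something like the following could be logged at the top of inference.py. It is only a diagnostic sketch: the function name describe_worker_environment is made up, the "<unset>" placeholder just means the variable was not passed explicitly, and gevent is imported only if it is installed.

```python
import importlib.util
import os


def describe_worker_environment():
    """Report the configured worker class and whether gevent has patched ssl."""
    worker_class = os.environ.get("SAGEMAKER_GUNICORN_WORKER_CLASS", "<unset>")
    ssl_patched = False
    # Only touch gevent when it is actually installed in this environment.
    if importlib.util.find_spec("gevent") is not None:
        from gevent import monkey
        ssl_patched = monkey.is_module_patched("ssl")
    return {"worker_class": worker_class, "gevent_patched_ssl": ssl_patched}


print(describe_worker_environment())
```

If the output still shows ssl as patched by gevent after the change, the recursion error in the earlier traceback may have a different cause than the worker class.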

answered 14 days ago
