
Textract is timing out

I have been using Lambda functions for the last two years. Recently one of our databases moved into a VPC, so the Lambda function was configured to connect to that VPC. Since then, calls to the Python Textract methods, such as textract.detect_document_text(Document=doc_spec), time out. I increased the Lambda function timeout to 2 minutes, but I still get the error "Task timed out after 120.03 seconds".

2 Answers

Hi Ajay,

To address the timeout issue when using Amazon Textract with a Lambda function within a VPC, you can try the following steps:

  1. Increase the Lambda Function Timeout: You have already increased the timeout to 2 minutes, but it might be necessary to increase it further depending on the size of the document you are processing. Consider increasing the timeout to a higher value, such as 5 minutes (300 seconds).

  2. Configure the Lambda Function to Access Amazon Textract: Ensure your Lambda function is configured correctly to access Amazon Textract. This includes:

    • Configuring the IAM role with the necessary permissions to call Textract.
    • Ensuring that the Lambda function can reach the Textract API, which is outside the VPC (e.g., through a NAT Gateway or a VPC endpoint).

  3. Subnets and Security Groups: Make sure your Lambda function is associated with private subnets whose route tables send outbound traffic to a NAT Gateway. A route straight to an Internet Gateway is not enough, because a Lambda function's network interfaces in a VPC have no public IP addresses. Also, configure the security groups to allow the necessary outbound traffic (HTTPS, port 443).

  4. Check the Regional Endpoint for Textract: If you are using VPC endpoints for AWS services, ensure that the endpoint for Amazon Textract is correctly configured. Add a VPC endpoint for Textract if needed.

  5. Divide the Document: If the document is very large, consider splitting the document into smaller parts and processing them separately.

  6. Logs and Metrics: Use CloudWatch Logs to get more information about the timeout error. Check the Lambda function's metrics in CloudWatch to see the average execution time and adjust accordingly.
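To tell a network problem apart from a slow document, it can help to probe the endpoint from inside the function and make the SDK fail fast instead of hanging until the Lambda timeout. The following is a minimal sketch, assuming the standard regional hostname pattern textract.<region>.amazonaws.com and that boto3 is available (as it is in the Lambda Python runtime). The function names and the timeout values are illustrative choices, not part of the original question.

```python
import socket


def textract_hostname(region: str) -> str:
    """Regional Textract hostname, following the standard AWS naming pattern."""
    return f"textract.{region}.amazonaws.com"


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and connect timeouts.
        return False


def make_textract_client(region: str):
    """Textract client with short timeouts so network failures surface quickly.

    The values below are assumptions for illustration; tune them to your documents.
    """
    import boto3  # imported lazily so the helpers above work without the SDK
    from botocore.config import Config

    cfg = Config(
        connect_timeout=5,            # seconds to wait for the TCP/TLS connection
        read_timeout=30,              # seconds to wait for a response
        retries={"max_attempts": 2},  # keep retries low while diagnosing
    )
    return boto3.client("textract", region_name=region, config=cfg)
```

If can_reach(textract_hostname("us-east-1")) returns False when called from the handler (the region here is only an example), the problem is most likely routing or DNS inside the VPC rather than document size, and raising the Lambda timeout will not help.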

If these solutions do not resolve the issue, please share some logs with us for further investigation.

Best regards.

EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago

Buried in the middle of the other answer is what I would rate as the most likely cause:

Make sure that you add a VPC endpoint for the Textract service. If you have not done this then the Lambda function (most probably) cannot reach the Textract API endpoint.
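As a rough sketch of what adding the endpoint could look like with boto3, assuming an interface endpoint with private DNS enabled: the helper names are hypothetical, and the VPC, subnet, and security group IDs are placeholders you would supply yourself.

```python
def textract_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for EC2 create_vpc_endpoint (interface endpoint for Textract)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.textract",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # With private DNS, the default Textract hostname resolves to the endpoint,
        # so existing boto3 code needs no endpoint_url change.
        "PrivateDnsEnabled": True,
    }


def create_textract_endpoint(region, vpc_id, subnet_ids, security_group_ids):
    """Create the endpoint and return its ID. Requires EC2 permissions and boto3."""
    import boto3  # imported lazily so the parameter builder can be used on its own

    ec2 = boto3.client("ec2", region_name=region)
    params = textract_endpoint_params(region, vpc_id, subnet_ids, security_group_ids)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

Two things to check if you go this way: private DNS needs DNS support and DNS hostnames enabled on the VPC, and the security group attached to the endpoint must allow inbound HTTPS (port 443) from the Lambda function's security group.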

Note that you can also use a NAT Gateway/Internet Gateway combination - so what you do here depends on the routing and other networking within your VPC.
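If you take the NAT Gateway route instead, one quick check is whether the Lambda function's subnets actually have a default route through a NAT Gateway. This is a hedged sketch built on the shape of the EC2 describe_route_tables response; the function names are hypothetical and the subnet ID is something you would pass in.

```python
def has_nat_default_route(route_table: dict) -> bool:
    """True if the route table has an active 0.0.0.0/0 route that targets a NAT Gateway."""
    for route in route_table.get("Routes", []):
        if (
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            and "NatGatewayId" in route
            and route.get("State", "active") == "active"
        ):
            return True
    return False


def subnet_route_tables(region: str, subnet_id: str) -> list:
    """Fetch route tables explicitly associated with a subnet. Requires boto3."""
    import boto3  # imported lazily so the check above can be used on saved output

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    return response["RouteTables"]
```

An empty result from subnet_route_tables does not mean there is no route: a subnet without an explicit association uses the VPC's main route table, so check that one instead. A default route that points only at an Internet Gateway will make has_nat_default_route return False, which matches how Lambda behaves, since its network interfaces in a VPC have no public IP addresses.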

For reference, here's a list of the services that can be reached via VPC endpoints: https://docs.aws.amazon.com/vpc/latest/privatelink/aws-services-privatelink-support.html

AWS
EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
