Amazon Polly Response Time

Hi, we're using Amazon Polly (boto3 Python client) for production TTS. We have a fixed test text that we use to monitor our system. Usually this text takes a few seconds to synthesize. I've noticed that it sometimes takes more than 1 minute (!!) to complete successfully. As we're using it as part of an IVR system where the user is waiting on the phone, I would like to know if there is a way to speed it up or to prioritize such operations, as well as how to audit/log/troubleshoot them (via CloudWatch or any other method). Thanks in advance!

  • We seem to be facing this issue or something very similar.

    Other interesting details:

    • We've only seen this issue occur if our service has been idle for over 24hrs
    • We always make 2 requests at once in different threads (audio & marks)
    • The slow request always takes JUST over 60s, too consistently to be a coincidence, so some sort of retry timeout is very likely
    • Only one request of the pair (audio and marks) ever seems to take a long time; we don't log which one was slow atm

    This is leading me to believe this is a client bug. A 5 second delay is unacceptable for our use-case (conversational AI), so I would rather fix the problem instead of tweaking retry timings.

    I will add some logging to try and understand if response["ResponseMetadata"]["RetryAttempts"] is non-zero for the 60s requests.

asked 4 years ago · 1.1K views
1 Answer
Hi,

The 1 minute mentioned here is very close to the default boto3 connection timeout, which is 60 seconds. I think what's happening is that the boto3 client tries to connect to the Polly endpoint but cannot because of intermittent/transient networking issues. These are normally resolved by retrying the request, but with the default settings boto3 waits the full 60 seconds before it retries.

You can enable boto3 debug logs and follow the steps in this documentation to check whether all of the requests that take more than 1 minute are due to retries -> https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#validating-retry-attempts
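As a rough illustration of that check, the sketch below turns on botocore's debug output through the standard `logging` module and reads `RetryAttempts` from the response metadata. The helper names `enable_botocore_debug_logging` and `timed_call` are made up for this example, and the commented-out Polly call at the end assumes credentials and a region are already configured.

```python
import logging
import time


def enable_botocore_debug_logging():
    """Send botocore's DEBUG records (including retry decisions) to stderr."""
    logging.basicConfig(level=logging.INFO)
    # botocore logs under the "botocore" logger namespace.
    logging.getLogger("botocore").setLevel(logging.DEBUG)


def timed_call(api_call, **kwargs):
    """Call api_call(**kwargs) and return (response, elapsed_seconds, retry_attempts).

    `api_call` is any bound client method, e.g. polly.synthesize_speech.
    The timing covers the API call itself, not reading a streamed body
    such as the AudioStream afterwards.
    """
    start = time.monotonic()
    response = api_call(**kwargs)
    elapsed = time.monotonic() - start
    # botocore records how many retries the request needed in ResponseMetadata.
    retries = response.get("ResponseMetadata", {}).get("RetryAttempts", 0)
    return response, elapsed, retries


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   enable_botocore_debug_logging()
#   polly = boto3.client("polly")
#   resp, elapsed, retries = timed_call(
#       polly.synthesize_speech, Text="test", OutputFormat="mp3", VoiceId="Joanna"
#   )
#   print(f"{elapsed:.1f}s, retries={retries}")
```

If the slow requests show a non-zero retry count and the fast ones show zero, that would support the timeout-then-retry explanation.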

Please read this page for more information -> https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-retry-timeout-sdk/. It mentions Lambda, but the same thing can happen to any client that invokes an AWS API via the SDK.

You can update the default boto3 settings by following the example in the above article:

# max_attempts: retry count / read_timeout: socket read timeout / connect_timeout: new connection timeout
# Both timeouts are in seconds.

from botocore.session import Session
from botocore.config import Config

s = Session()
# The article's example creates an 's3' client; for this question the service name is 'polly'.
c = s.create_client('polly', config=Config(connect_timeout=5, read_timeout=60, retries={'max_attempts': 2}))

The idea here is to "fail fast": boto3 times out quickly and retries the request immediately instead of waiting out the default timeout. Set the connection timeout to around 3x the time a connection normally takes, so that healthy requests are not treated as timeouts.
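One way to apply the 3x rule of thumb is sketched below. It assumes you have already measured how long connections normally take in your environment; the function name `suggest_connect_timeout`, the use of the median, and the 1-second floor are choices made for this example, not anything boto3 prescribes.

```python
import statistics


def suggest_connect_timeout(connect_times_s, factor=3.0, floor_s=1.0):
    """Return a connect timeout (seconds) of roughly `factor` x the typical connect time.

    connect_times_s: observed connection times in seconds (measured by you).
    floor_s: lower bound so the timeout is never unreasonably small (assumed value).
    """
    if not connect_times_s:
        raise ValueError("need at least one observed connection time")
    typical = statistics.median(connect_times_s)
    return max(floor_s, factor * typical)


# Example usage: the result can be passed as Config(connect_timeout=...).
#   timeout = suggest_connect_timeout([1.0, 2.0, 3.0])  # median 2.0 -> 6.0
```

The median is used here so that a single slow outlier, such as the 60-second case itself, does not inflate the suggested timeout.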

That said, this behavior should only happen for a small number of requests. If a large number of your requests are timing out, or if requests are taking more than 60 seconds without any retries, then I would recommend logging the request IDs and creating a support case with us so that a log dive can be performed on these requests.
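A minimal sketch of capturing those request IDs is shown below. It reads the `RequestId`, `RetryAttempts`, and `HTTPStatusCode` fields that boto3 places in `ResponseMetadata`; the logger name `polly_monitor`, the helper `describe_call`, and the 10-second "slow" threshold are assumptions made for this example.

```python
import logging

logger = logging.getLogger("polly_monitor")  # hypothetical logger name


def describe_call(response, elapsed_s, slow_threshold_s=10.0):
    """Log and return the request ID, retry count, and timing for one API response."""
    meta = response.get("ResponseMetadata", {})
    slow = elapsed_s >= slow_threshold_s
    record = {
        "request_id": meta.get("RequestId"),
        "retry_attempts": meta.get("RetryAttempts", 0),
        "http_status": meta.get("HTTPStatusCode"),
        "elapsed_s": round(elapsed_s, 3),
        "slow": slow,
    }
    # Slow calls are logged at WARNING so they are easy to find later.
    logger.log(logging.WARNING if slow else logging.INFO, "polly call %s", record)
    return record


# Example usage, given a response and a measured duration:
#   describe_call(resp, elapsed)
```

A slow call with `retry_attempts` of 0 would be the case worth raising with support, since it cannot be explained by a client-side timeout and retry.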

AWS
SUPPORT ENGINEER
answered 4 years ago
  • Thank you, Ryan. It makes sense. We will definitely do as you propose. Thank you.
