Skip to content

Timeout when loading table from glue catalog

0

Hi

In one of our environments, we are receiving timeouts while loading the table mapped by glue catalog, we are not sure of the reason.

we would like to know the following

  1. is it possible to increase the timeout to glue?
  2. how can we keep track the number of requests made to glue via the AWS console?

This is the error that we see in EMR:

software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: connect timed out at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:102) at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47) at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:204) at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:83) at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:167) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:82) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:175) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:76) at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45) at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:56) at software.amazon.awssdk.services.glue.DefaultGlueClient.getTable(DefaultGlueClient.java:7713)

  • Hello Roee,

    It seems like a networking issue which most likely happens when the source is unable to connect to Glue resources due to lack of internet access (access via NAT gateway is required and not via Internet gateway) or a valid interface endpoint for AWS Glue in the same VPC subnet. I recommend you to also refer to this troubleshooting article that addresses a similar issue - https://repost.aws/knowledge-center/glue-connect-time-out-error, for more information.

    You may either try adding an interface endpoint for AWS Glue or attach a NAT gateway in your VPC subnet that is executing the request to reach the Glue Data Catalog.

    Many thanks!

asked 3 years ago500 views
1 Answer
0

It's very likely that means the catalog is not reachable, it could be the security group but much more likely is because you are running on a VPC that doesn't have internet access, neither a Glue endpoint. I suggest you add the endpoint if that's the case.

AWS
EXPERT
answered 3 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.