ECS Service Connect blocks HTTP GET but allows HTTP POST to same endpoint for calls within the ECS Cluster

Hello,

I have multiple Java Spring Boot microservices deployed in an ECS cluster with ECS Service Connect, in a client/server configuration. I am debugging an interesting issue and running out of ideas for a resolution:

  1. Service A calls Service B using POST and GET REST APIs.
  2. Calls from Service A to Service B using HTTP POST succeed, but HTTP GET calls fail with a 403 Forbidden error.
  3. The same GET API is exposed by Service B through AWS API Gateway for external clients, and that works fine. The API Gateway logs show in the HTTP response headers that the host was "envoy", the Envoy proxy configured for Service B.
  4. I have verified that the Security Groups are configured correctly (otherwise the HTTP POST call would not go through).
  5. I have also verified that the HTTP method, host domain, and URI for the GET calls are correct, and that the HTTP headers and User-Agent in the request are OK. I disabled CSRF checks for debugging, but that was not the issue.
  6. I have also verified that the HTTP GET call from Service A never reaches Service B, which means the Envoy proxy between Service A and Service B is blocking it. ECS Service Connect doesn't expose the Envoy proxy configuration (except the timeout value), so I cannot verify how the underlying proxy is configured.
  7. The CloudWatch logs for ECS Service Connect contain nothing meaningful beyond an Envoy proxy warning about deprecated configuration.

The question is: why would the HTTP GET call fail while HTTP POST calls from Service A to Service B succeed? Is there an ECS Service Connect setting that I am missing? The ECS Service Connect configuration looks correct (it has the correct namespace, logDriver, service block for port mapping, dnsName, port, and discovery name). Since POST calls work fine between the same services, the configuration itself does not seem to be the issue. The Security Groups are fine too.

2 Answers
The behavior you're describing is unusual for ECS Service Connect, as it typically doesn't discriminate between HTTP methods like GET and POST. However, there are a few potential explanations and areas to investigate:

  1. Service Configuration: Double-check the Service Connect configuration for both services, especially Service B. Ensure that the port mappings and service discovery settings are correctly configured for both GET and POST methods.

  2. Application Code: Verify that Service B's application code is correctly handling GET requests. The issue might be in how the Spring Boot application is processing these requests.

  3. Proxy Configuration: Although ECS Service Connect doesn't expose much of the Envoy proxy configuration, there might be an issue with how it's handling GET requests. The fact that the API Gateway works fine suggests the problem is specific to the Service Connect setup.

  4. Network ACLs: While you've checked security groups, also verify any Network ACLs that might be in place. It's possible (though unlikely) that there's a rule blocking GET requests specifically.

  5. Service Connect Proxy Logs: Dive deeper into the CloudWatch logs for the Service Connect proxy. Look for any entries related to GET requests specifically, even if they're not immediately obvious.

  6. Client Configuration: Check Service A's client configuration. Ensure it's using the correct Service Connect endpoint for GET requests.

  7. TLS/SSL Issues: If you're using TLS, there might be a certificate issue specifically affecting GET requests. This is less likely but worth checking.

  8. Temporary Service Connect Issue: Consider if this could be a temporary issue with the Service Connect service itself. You might want to reach out to AWS support if the problem persists.

  9. Version Compatibility: Ensure both services are using compatible versions of the ECS agent and Service Connect components.

To troubleshoot further:

  • Try using a simple test application that only does GET requests to isolate the issue.
  • Temporarily disable Service Connect and test direct communication between the services to see if the problem persists.
  • Use network monitoring tools to capture the traffic between services and analyze what's happening at the network level.
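
The first suggestion above could be sketched as a tiny probe run from inside Service A's task. This is only an illustration, not an official diagnostic: the class name `MethodProbe`, the default URL `http://service-b:8080/my/path`, and the `{}` POST body are hypothetical placeholders. It uses the JDK's built-in `java.net.http.HttpClient` and repeats each method with the client pinned to HTTP/1.1 and with HTTP/2 preferred, so that differences caused by protocol negotiation show up next to differences caused by the HTTP method.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class MethodProbe {

    /** Sends one request with the given method and preferred HTTP version; returns the status code. */
    static int probe(HttpClient.Version version, String method, URI uri) throws Exception {
        HttpClient client = HttpClient.newBuilder().version(version).build();
        // GET goes out without a body; POST carries a small placeholder JSON body.
        HttpRequest.BodyPublisher body = method.equals("POST")
                ? HttpRequest.BodyPublishers.ofString("{}")
                : HttpRequest.BodyPublishers.noBody();
        HttpRequest request = HttpRequest.newBuilder(uri).method(method, body).build();
        HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical Service Connect endpoint; pass the real one as the first argument.
        URI uri = URI.create(args.length > 0 ? args[0] : "http://service-b:8080/my/path");
        List<HttpClient.Version> versions =
                List.of(HttpClient.Version.HTTP_1_1, HttpClient.Version.HTTP_2);
        for (HttpClient.Version version : versions) {
            for (String method : List.of("GET", "POST")) {
                System.out.println(version + " " + method + " -> " + probe(version, method, uri));
            }
        }
    }
}
```

If GET fails only for one of the two versions, the cause is more likely in how the client negotiates the protocol than in Service Connect's routing.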

If none of these suggestions resolve the issue, it may be necessary to engage AWS support for a more in-depth investigation of your specific Service Connect setup.
Sources
Amazon ECS Service Connect components - Amazon Elastic Container Service
Amazon ECS Service Connect configuration overview - Amazon Elastic Container Service

answered a year ago
EXPERT
reviewed a year ago
Hello,

I know my response comes a bit late, but I ran into the same issue, and after ~5 hours of debugging I finally tracked down the root cause.

If you are using Spring Cloud Gateway (WebMVC) as a proxy, the problem comes from the default RestClientProxyExchange bean. By default, the underlying RestClient may attempt a protocol upgrade (e.g. sending Upgrade: TLS/1.2 on GET requests), which Envoy (the proxy that Service Connect runs underneath) rejects.

To fix this, you need to override the RestClientProxyExchange bean and force the client to always use HTTP/1.1 with no protocol upgrade. Example in Kotlin:

@Bean
fun restClientProxyExchange(
  restClientBuilder: RestClient.Builder,
  properties: GatewayMvcProperties?,
): RestClientProxyExchange {
  // Force HTTP/1.1, disable protocol upgrade
  val httpClient = HttpClient.newBuilder()
    .version(HttpClient.Version.HTTP_1_1)
    .build()

  val restClient = restClientBuilder
    .requestFactory(JdkClientHttpRequestFactory(httpClient))
    .build()

  return RestClientProxyExchange(restClient, properties)
}

With this configuration, the client no longer tries to upgrade the connection, and Envoy accepts the request without error.
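
To check locally which headers a given client configuration actually sends, one option is to point it at a throwaway server and record the Upgrade header. The sketch below is in Java rather than Kotlin and uses only the JDK; the class and method names (`UpgradeHeaderCheck`, `upgradeHeaderSentBy`) are made up for illustration. It relies on the documented JDK behavior that a default `HttpClient` prefers HTTP/2 and attempts an upgrade on cleartext connections, so the header value it reports may differ from the one your own stack sends.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

public class UpgradeHeaderCheck {

    /** Sends one cleartext GET to a local server and returns the Upgrade header it received, if any. */
    static Optional<String> upgradeHeaderSentBy(HttpClient client) throws Exception {
        AtomicReference<String> seen = new AtomicReference<>();
        // Port 0 lets the OS pick a free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            seen.set(exchange.getRequestHeaders().getFirst("Upgrade"));
            exchange.sendResponseHeaders(204, -1); // 204 with no response body
            exchange.close();
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/my/path");
            client.send(HttpRequest.newBuilder(uri).GET().build(),
                    HttpResponse.BodyHandlers.discarding());
            return Optional.ofNullable(seen.get());
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        HttpClient defaults = HttpClient.newHttpClient();
        HttpClient pinned = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // same setting as the bean override
                .build();
        System.out.println("default client Upgrade header:  " + upgradeHeaderSentBy(defaults));
        System.out.println("HTTP/1.1 client Upgrade header: " + upgradeHeaderSentBy(pinned));
    }
}
```

The client pinned to HTTP/1.1 should report no Upgrade header at all, which is the property the override depends on.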

Example Envoy logs before the fix (notice the Upgrade header that causes upgrade_failed):

[2025-09-15 15:17:43.485][18213893][debug][http] [source/common/http/conn_manager_impl.cc:1206] [Tags: "ConnectionId":"44","StreamId":"17231060347254901479"] request headers complete (end_stream=false):
':authority', 'localhost:6319'
':path', '/my/path'
':method', 'GET'
'authorization', 'Bearer {redacted}'
'user-agent', 'PostmanRuntime/7.45.0'
'accept', '*/*'
'postman-token', '{redacted}'
'accept-encoding', 'gzip, deflate, br'
'connection', 'keep-alive,Upgrade'
'upgrade', 'TLS/1.2'

[2025-09-15 15:17:43.485][18213893][debug][connection] [./source/common/network/connection_impl.h:99] [Tags: "ConnectionId":"44"] current connecting state: false
[2025-09-15 15:17:43.485][18213893][debug][http] [source/common/http/filter_manager.cc:1118] [Tags: "ConnectionId":"44","StreamId":"17231060347254901479"] Sending local reply with details upgrade_failed
[2025-09-15 15:17:43.485][18213893][debug][http] [source/common/http/conn_manager_impl.cc:1850] [Tags: "ConnectionId":"44","StreamId":"17231060347254901479"] closing connection due to connection close header
[2025-09-15 15:17:43.485][18213893][debug][http] [source/common/http/conn_manager_impl.cc:1916] [Tags: "ConnectionId":"44","StreamId":"17231060347254901479"] encoding headers via codec (end_stream=true):
':status', '403'
'date', 'Mon, 15 Sep 2025 13:17:43 GMT'
'server', 'envoy'
'connection', 'close'

After applying the override, those headers are gone and the request goes through normally.
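
If you have debug-level Envoy output like the above, a small filter can pull out the reason Envoy gives for each locally generated reply. This is a rough sketch that assumes the log line format shown in this answer; the class name `EnvoyLocalReplyScan` is hypothetical, and the Service Connect proxy logs in CloudWatch may not include debug-level lines at all.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnvoyLocalReplyScan {

    // Matches the message seen in the log excerpt, e.g. "Sending local reply with details upgrade_failed".
    private static final Pattern LOCAL_REPLY =
            Pattern.compile("Sending local reply with details (\\S+)");

    /** Returns the details token (such as upgrade_failed) if the line reports a local reply. */
    static Optional<String> localReplyDetails(String logLine) {
        Matcher matcher = LOCAL_REPLY.matcher(logLine);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        // Reads log lines from standard input and prints only the local-reply reasons.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                localReplyDetails(line).ifPresent(details ->
                        System.out.println("local reply: " + details));
            }
        }
    }
}
```

A local reply means Envoy answered the request itself instead of forwarding it, which matches the observation in the question that the GET never reached Service B.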

Cheers!

answered 7 months ago
