AWS Transfer Family FTPS (Public Endpoint, Custom Hostname) - Connection Reset During TLS Handshake - No CloudWatch Logs

Hello AWS Community,

I'm struggling with a persistent issue connecting to an AWS Transfer Family server via FTPS and would appreciate any insights or suggestions.

Goal: Connect clients (initially testing with curl, ultimately a Reolink camera) to an AWS Transfer Family server using FTPS (Explicit TLS) on port 21 via a custom hostname.

Configuration:

  • Service: AWS Transfer Family
  • Region: us-east-1
  • Endpoint Type: Public
  • Protocols Enabled: FTPS
  • Custom Hostname: Configured (e.g., ftp.example-domain.com) using a CNAME record in my external DNS provider (Squarespace) pointing to the default Transfer Family endpoint hostname (s-abcdef1234567890.server.transfer.us-east-1.amazonaws.com).
  • Certificate: Using a valid, issued wildcard certificate (*.example-domain.com) from ACM (in us-east-1), Key Algorithm: RSA (SHA-256 signature). This certificate is correctly associated with the custom hostname in the Transfer Family server settings.
  • Security Policy: Using a standard recent policy (TransferSecurityPolicy-2025-03). Compatible with RSA keys.
  • Client: Testing primarily with curl on macOS (using system LibreSSL) and also tested from a Conda environment (using OpenSSL). Client machine local IP e.g., 192.168.X.Y.
  • Local Router: Huawei HG8245W5.

Problem:

When attempting to connect using curl, the connection consistently fails during the TLS handshake, immediately after the Client Hello is sent. Crucially, no logs whatsoever (errors or successes) appear in the Transfer Family server's configured CloudWatch Logs for these attempts.

Example curl command: curl -v --ssl-reqd -u [username]:[password] ftp://ftp.example-domain.com:21/

Typical Failing Output:

* Host ftp.example-domain.com:21 was resolved.
* IPv6: (none)
* IPv4: [A.B.C.D]  <-- Correct Public EIP Resolved
* Trying [A.B.C.D]:21...
* Connected to ftp.example-domain.com ([A.B.C.D]) port 21 (#0) <-- TCP Connect OK
* TLSv1.x (OUT), TLS handshake, Client hello (1): <-- Starting TLS Handshake
* CAfile: /etc/ssl/cert.pem
* CApath: none
* Recv failure: Connection reset by peer <-- Reset DURING Handshake
* [OpenSSL/LibreSSL specific error]: Connection reset by peer
* Closing connection 0
curl: (35) Recv failure: Connection reset by peer

(Note: Earlier in troubleshooting, and perhaps intermittently, the failure instead showed up as a connection timeout, curl: (28), before the TCP connection completed. The most recent consistent behavior is the reset during the handshake.)
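
Whether the connection dies before the FTP greeting, at AUTH TLS, or inside the TLS handshake can be checked directly. The following is a minimal Python sketch (standard library only) that walks the explicit FTPS sequence by hand and reports the first stage that fails. The function names and stage labels are illustrative and not part of any AWS tooling. It assumes the server sends a 220 greeting and answers AUTH TLS with 234, as RFC 4217 describes.

```python
import socket
import ssl


def reply_code(reply: bytes) -> int:
    """Return the 3-digit code of the final line of an FTP reply, or -1."""
    lines = reply.splitlines()
    if not lines:
        return -1
    head = lines[-1][:3]
    return int(head) if len(head) == 3 and head.isdigit() else -1


def read_reply(sock: socket.socket) -> bytes:
    """Read until a final reply line ("NNN text"); "NNN-" marks a continuation."""
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # peer closed before sending a complete reply
            return buf
        buf += chunk
        last = buf.splitlines()[-1]
        if buf.endswith(b"\n") and last[:3].isdigit() and last[3:4] == b" ":
            return buf


def probe_explicit_ftps(host: str, port: int = 21, timeout: float = 10.0) -> str:
    """Return the first stage that fails, or "ok" if the TLS handshake completes."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp-connect"
    with sock:
        try:
            if reply_code(read_reply(sock)) != 220:
                return "banner"
            sock.sendall(b"AUTH TLS\r\n")
            if reply_code(read_reply(sock)) != 234:
                return "auth-tls"
        except OSError:  # includes timeouts and connection resets
            return "control-channel"
        ctx = ssl.create_default_context()  # verifies the chain and the hostname
        try:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
        except (ssl.SSLError, OSError):
            return "tls-handshake"
```

Calling probe_explicit_ftps("ftp.example-domain.com") would return "tls-handshake" if the reset happens where the curl output suggests. A result of "banner", "auth-tls", or "control-channel" would mean the connection is dropped earlier than the trace implies.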

Troubleshooting Steps Performed:

  1. DNS Verified: Confirmed ftp.example-domain.com resolves correctly to the public AWS EIP (A.B.C.D) using nslookup, dig, and online checkers. (An earlier issue resolving to CGNAT space on one network was isolated and bypassed by testing from other networks).
  2. General Port 21 Outbound Works: Successfully connected using curl -v ftp://test.rebex.net:21. This indicates no general block on outbound TCP port 21 from my network/ISP.
  3. Consistent Failure Across Networks: The same FTPS handshake reset error occurs when testing from my primary ISP network AND from a mobile hotspot (different carrier/network path).
  4. Consistent Failure Across TLS Libraries: The same error occurs using system curl (LibreSSL) and curl within Conda (OpenSSL).
  5. Forced TLS 1.2: Tried curl -v --tlsv1.2 --ssl-reqd ... - resulted in the same reset error.
  6. Plain FTP Test: Cannot be performed, as Transfer Family returned an error stating plain FTP is only supported on VPC endpoints, not Public endpoints.
  7. Router Firewall Rules: Added explicit ALLOW, LAN to WAN, TCP rules on the Huawei HG8245W5 for Destination Port 21 and 1024-65535 from the client's source IP (192.168.X.Y). Confirmed direction is LAN to WAN. Rules were applied/saved. Issue persists.
  8. Router FTP ALG: Could not find a setting to disable FTP ALG on the Huawei HG8245W5 interface (logged in as telecomadmin). This remains a suspect for interfering with the handshake.
  9. Traceroute: Traceroute from my primary network to the AWS EIP (A.B.C.D) successfully traverses my ISP and enters the Cogent network (JFK area) but then consistently times out (* * *) for all subsequent hops before reaching AWS networks or the destination. (Sanitized output below).
  10. AWS Health Dashboard: No relevant service issues reported for Transfer Family or Networking in us-east-1.
  11. Ping: Times out (understood this is often blocked and not conclusive).
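
The DNS check in step 1 can be scripted so that it is easy to repeat from each network. This is a small sketch assuming Python 3.9 or later; classify and resolve_and_classify are hypothetical helper names. It flags addresses in the CGNAT range 100.64.0.0/10 (RFC 6598), which is the kind of bad resolution seen earlier on one network.

```python
import ipaddress
import socket

# Shared address space used for carrier-grade NAT (RFC 6598).
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def classify(addr: str) -> str:
    """Label an IP address as "cgnat", "public", or "non-routable"."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4 and ip in CGNAT:
        return "cgnat"
    if ip.is_global:
        return "public"
    return "non-routable"


def resolve_and_classify(host: str, port: int = 21) -> dict[str, str]:
    """Resolve host with the system resolver and classify every address returned."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return {info[4][0]: classify(info[4][0]) for info in infos}
```

For a correctly configured custom hostname, resolve_and_classify("ftp.example-domain.com") should contain only "public" entries that match the Elastic IP shown for the server.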

Sanitized Traceroute:

traceroute to A.B.C.D (A.B.C.D), 64 hops max, 40 byte packets
 1  192.168.X.1 (192.168.X.1)  ... ms ... ms ... ms
 2  [ISP Hop IP] ([ISP Hop IP]) ... ms ... ms ... ms
 3  * * *
 4  * * *
 5  [Transit Hop IP 1] ([Transit Hop IP 1]) ... ms ... ms
    [Transit Hop IP 2] ([Transit Hop IP 2]) ... ms
 6  [Cogent Hop 1] (xyz.cogentco.com) ... ms ... ms
    [Cogent Hop 2] (xyz.cogentco.com) ... ms
 7  [Cogent Hop 3] (xyz.cogentco.com) ... ms
    [Cogent Hop 4] (xyz.cogentco.com) ... ms
    [Cogent Hop 5] (xyz.cogentco.com) ... ms
 8  * * *
 9  * * *
... (repeated timeouts until max hops) ...

Question:

Given that the TCP connection establishes, but the FTPS TLS handshake is consistently reset immediately after the Client Hello (before any CloudWatch logs), and this happens across different networks:

  • What could be causing this reset before the Transfer Family application layer seems to log anything?
  • Could this still be the router's (un-disable-able) FTP ALG or stateful inspection, even if basic TCP works?
  • Could this indicate an issue with the specific Transfer Family endpoint instance, the AWS network edge handling TLS for this EIP/service, or the custom certificate/hostname binding?
  • Are there any other diagnostic steps recommended, especially without access to paid AWS Support?

Any help or pointers would be greatly appreciated!

1 Answer

Based on the information provided, there are several potential causes for the FTPS connection issue you're experiencing with AWS Transfer Family. Here are some insights and suggestions:

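  1. TLS Handshake Reset: The connection is reset right after the Client Hello, before any Server Hello arrives, so the server side (or something in front of it) is dropping the connection rather than negotiating. A mismatch in supported TLS versions or cipher suites is possible, but a server that rejects the offered parameters usually answers with a TLS alert rather than a TCP reset, and forcing TLS 1.2 did not change the result. Also note that the trace you posted shows no 220 greeting or AUTH exchange before the Client Hello. With explicit FTPS the client normally reads the greeting and sends AUTH TLS first, so confirm that those lines were only trimmed from the output and that the client is not starting TLS immediately, as it would for implicit FTPS.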

  2. Router FTP ALG: An FTP ALG (Application Layer Gateway) can interfere with FTPS control connections, and not being able to disable it on the Huawei HG8245W5 is a concern. However, you see the same reset from a mobile hotspot, which does not pass through that router, so the ALG is unlikely to be the only cause.

  3. Network Path: The traceroute results showing timeouts after entering the Cogent network could indicate network path issues. However, since you've tested from multiple networks with the same result, this is less likely to be the root cause.

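  4. AWS Transfer Family Configuration: The lack of CloudWatch logs suggests the connection is dropped before the Transfer Family application layer processes it. That points toward the endpoint configuration, particularly the custom hostname and certificate setup. Transfer Family offers FTPS only on VPC-hosted endpoints, so an internet-reachable FTPS server is a VPC endpoint with Elastic IP addresses attached. Confirm the endpoint type in the console, and check that the endpoint's security group allows inbound TCP 21 and the passive data ports 8192-8200 from your client's address.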

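  5. Certificate Mismatch: A wildcard certificate for *.example-domain.com covers exactly one label to the left of the domain, so it matches ftp.example-domain.com but not example-domain.com itself or a deeper name such as a.ftp.example-domain.com. Confirm that clients connect to the custom hostname and not to the default server.transfer.us-east-1.amazonaws.com endpoint name, which the certificate does not cover. Keep in mind that a name mismatch is normally reported by the client after it receives the server certificate, so on its own it would not explain a reset before the Server Hello.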
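
The wildcard rule in point 5 can be illustrated with a few lines of Python. This is a simplified sketch of left-most-label wildcard matching, not the full algorithm a TLS library applies; wildcard_covers is a hypothetical helper name.

```python
def wildcard_covers(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name such as "*.example.com" covers hostname.

    Simplification: "*" is accepted only as the entire left-most label and
    stands for exactly one non-empty label.
    """
    p = pattern.lower().rstrip(".").split(".")
    h = hostname.lower().rstrip(".").split(".")
    if len(p) != len(h):
        return False
    if p[0] == "*":
        return bool(h[0]) and p[1:] == h[1:]
    return p == h
```

With the names from the question, wildcard_covers("*.example-domain.com", "ftp.example-domain.com") is true, while the default endpoint hostname s-abcdef1234567890.server.transfer.us-east-1.amazonaws.com is not covered.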

To further troubleshoot:

  1. You have already forced TLS 1.2 with --tlsv1.2. Also try TLS 1.3 (curl --tlsv1.3, if your curl build supports it) to rule out a version compatibility issue.

  2. Your mobile hotspot test already bypasses the Huawei router. To rule out ALG or NAT interference completely, repeat the test from a host with a direct public IP address, for example a cloud VM.

  3. Double-check the custom hostname configuration in AWS Transfer Family, ensuring it exactly matches your certificate and DNS settings.

  4. If possible, temporarily set up a test Transfer Family server without a custom hostname to see if the issue persists. This could help determine if the problem is related to the custom hostname configuration.

  5. Use a network packet capture tool like Wireshark to analyze the TLS handshake in more detail. This might provide more insights into why the handshake is failing.

  6. Consider testing with a different FTPS client, such as lftp, FileZilla, or openssl s_client with -starttls ftp, to rule out any client-specific issues.
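
Steps 1 and 6 can be combined in a short script: a second FTPS client that pins one TLS version per attempt. The sketch below uses Python's standard ftplib and ssl modules. pinned_context and try_explicit_ftps are hypothetical helper names, and the host, user, and password values are placeholders you would supply.

```python
import ftplib
import ssl


def pinned_context(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a verifying client context that offers exactly one TLS version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def try_explicit_ftps(host: str, user: str, password: str,
                      version: ssl.TLSVersion, timeout: float = 15.0) -> str:
    """Attempt an explicit FTPS login on port 21 and describe the outcome."""
    ftps = ftplib.FTP_TLS(context=pinned_context(version), timeout=timeout)
    try:
        ftps.connect(host, 21)        # plain TCP connect, then the 220 greeting
        ftps.auth()                   # sends AUTH TLS and upgrades the control channel
        negotiated = ftps.sock.version()
        ftps.login(user, password)
        ftps.prot_p()                 # request an encrypted data channel
        return f"ok ({negotiated})"
    except ftplib.all_errors as exc:  # FTP errors, OSError (incl. SSL errors), EOFError
        return f"failed: {exc!r}"
    finally:
        ftps.close()
```

Running try_explicit_ftps once with ssl.TLSVersion.TLSv1_2 and once with ssl.TLSVersion.TLSv1_3 shows whether the outcome depends on the TLS version, and the exception text indicates which FTP step raised it.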

If these steps don't resolve the issue, it may indeed indicate a problem with the specific Transfer Family endpoint or the AWS network edge handling for this service. In this case, reaching out to AWS support might be necessary for a more in-depth investigation.

answered a month ago
