API Gateway websockets don't stay alive

0

My websocket connection works fine until the keepalive timeout triggers on the client, and then it closes even though messages are being sent successfully in both directions. I have tried both .NET ClientWebSocket and the Python websockets library. Does API Gateway not handle websocket pings? Are there any settings related to this? I could set the client keepalive interval very high, but then real network failures would go undetected for a long time.

Playing around with the Python websockets client, the connection closes as soon as it sends a ping:

import asyncio
import websockets

async def ws():
    print('connecting...')
    # Send a ping every second; treat the connection as dead if no pong
    # arrives within 5 seconds.
    async with websockets.connect('wss://my-websocket-url',
            ping_interval=1, ping_timeout=5) as socket:
        print('connected')
        # Block until the server sends one message.
        message = await socket.recv()
        print(f"< {message}")
        print('exiting')

asyncio.run(ws())

If no messages are received, the above code exits after 1 second with:

asyncio.streams.IncompleteReadError: 0 bytes read on a total of 2 expected bytes

The above exception was the direct cause of the following exception:

websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason

This implies to me that AWS is responding to the ping with something weird? Interestingly, if I don't recv() at all then the connection does not fail. Why would receiving messages conflict with websocket pings?
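
For what it's worth, the first exception can be reproduced with no network involved. This is a minimal sketch, assuming the client reads each frame's 2-byte header with readexactly(2) (the error message suggests this, but it is an assumption about the library's internals); read_frame_header is a made-up name:

```python
import asyncio

async def read_frame_header():
    # Hypothetical stand-in for the client's frame reader.
    reader = asyncio.StreamReader()
    # Simulate the peer closing the TCP stream without sending any bytes.
    reader.feed_eof()
    try:
        # A WebSocket frame starts with at least a 2-byte header.
        return await reader.readexactly(2)
    except asyncio.IncompleteReadError as exc:
        return exc

err = asyncio.run(read_frame_header())
print(err)  # prints: 0 bytes read on a total of 2 expected bytes
```

So the error only says that the stream ended before any frame header arrived, i.e. the other side dropped the connection without sending a close frame, which is what code 1006 reports. It does not by itself show what the server did with the ping.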

Edited by: F. Spiral on Apr 3, 2019 12:21 PM

Edited by: F. Spiral on Apr 3, 2019 12:25 PM

Edited by: F. Spiral on Apr 3, 2019 12:29 PM

asked 5 years ago, 3286 views
4 Answers
1

Okay, this latest issue was actually my fault - apparently if you have a timeout on a ReceiveAsync() or SendAsync() then it aborts the whole websocket, by design, so my 30s timeout was making it seem like a keepalive issue. False alarm!
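
In case it helps anyone else, the pattern that avoids this is to keep a single receive pending and wait on it with a timeout, rather than cancelling it when the timeout fires. Below is a rough sketch of that idea in Python (matching the code in the question). The asyncio.Queue is only a stand-in for a socket, and receive_with_idle_checks is a made-up helper, not part of any library. In .NET the rough equivalent would be awaiting Task.WhenAny over the pending ReceiveAsync task and a Task.Delay.

```python
import asyncio

async def receive_with_idle_checks(recv, idle_timeout, max_idle_checks):
    """Wait for one message; count idle periods without cancelling the receive."""
    pending_recv = asyncio.create_task(recv())
    idle_checks = 0
    while idle_checks < max_idle_checks:
        # asyncio.wait() returns on timeout but leaves the task running.
        done, _ = await asyncio.wait({pending_recv}, timeout=idle_timeout)
        if done:
            return pending_recv.result(), idle_checks
        idle_checks += 1  # idle period elapsed; the receive is still pending
    pending_recv.cancel()  # give up for real only after too many idle periods
    return None, idle_checks

async def demo():
    inbox = asyncio.Queue()  # hypothetical stand-in for the websocket

    async def late_sender():
        await asyncio.sleep(0.25)
        await inbox.put("hello")

    sender = asyncio.create_task(late_sender())
    result = await receive_with_idle_checks(inbox.get, 0.1, 10)
    await sender
    return result

message, idle_checks = asyncio.run(demo())
print(message, idle_checks)
```

The message still arrives after the idle timeout has fired, because the timeout never touched the pending receive.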

answered 4 years ago
0

I have discovered that this happens in the us-west-2 region but not in us-east-1. Those are the only regions I tried. Steps to reproduce:

  1. Create a new Lambda function (Python 3.7), accept all the defaults and publish it.
  2. Create a new API Gateway websocket API.
  3. Point the $connect route at your Lambda function.
  4. Deploy the API.
  5. Run my python code from above using your websocket URL.

For me, in us-west-2 the connection reliably closes after ping_interval seconds but in us-east-1 it stays open. Am I just confusing myself or is something misconfigured on Amazon's side?
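
For reference, the Lambda from step 1 amounts to roughly the following. This is a sketch: the return value mirrors the default Python template, the logging line is my addition, and sample_event is a made-up event trimmed to the two requestContext fields used here. As far as I understand, the $connect integration only needs to return a 2xx status for the connection to be accepted.

```python
import json

def lambda_handler(event, context):
    # For websocket routes, requestContext carries the route and connection id.
    request_context = event.get("requestContext", {})
    print("route:", request_context.get("routeKey"),
          "connection:", request_context.get("connectionId"))
    # Same response as the default template: a 200 with a small JSON body.
    return {"statusCode": 200, "body": json.dumps("Hello from Lambda!")}

# Hypothetical local invocation with a trimmed-down $connect event.
sample_event = {
    "requestContext": {"routeKey": "$connect", "connectionId": "hypothetical-id"}
}
response = lambda_handler(sample_event, None)
print(response["statusCode"])
```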

Edited by: F. Spiral on Apr 3, 2019 2:00 PM

answered 5 years ago
0

Well, this problem seems to have fixed itself for now. This was happening for about a day and then it suddenly started working fine.

answered 5 years ago
0

This seems to be happening again, but now I'm using C# and System.Net.WebSockets.ClientWebSocket. The KeepAliveInterval doesn't seem to matter this time. The websocket disconnects with status Aborted after 30 seconds of no activity, even if KeepAliveInterval is set to 5 seconds. How do I keep an API Gateway websocket alive without inventing random data to send? Is anyone else using ClientWebSocket and having this problem?

answered 4 years ago
