Game Session Terminates Instantly (Apparently Unhealthy?)

1

I am having a problem where my fleets can't create game sessions, or rather, they are created and almost immediately are terminated.

In the fleet events view I get a message saying that the fleet terminated a process because it didn't report healthy for 3 minutes. Yet when I download the log for the terminated game session, I can see that the game session wasn't even alive for a full minute (no crash or errors in the log either), and it took less than thirty seconds to reach the point where I call ProcessReady. My health check is just a lambda that returns true.

I don't think this is a specs issue: the fleet I'm trying to create a game session on has a one-process limit, and a single process of that build doesn't even reach 1 GB of memory usage (this is a lobby server that doesn't have a single player yet).

The same thing happens with the game server fleet when I try to create a session manually through the AWS CLI, which makes sense, since it is essentially the same build using a different map.

I am using Unreal Engine 4.22. The logic that I'm following to set up the game session in game is as follows:

  • Process starts
  • In the game mode constructor, call InitSDK for the GameLift module.
  • Bind the callbacks: OnTerminate (calls ProcessEnding()), OnHealthCheck (returns true), OnStartGameSession (calls ActivateGameSession()).
  • When the process reaches the "WaitingToStart" state, after it has loaded the map and chosen a server port to listen on for connections, I finish setting up the process parameters with this port and then call ProcessReady.

Am I doing anything blatantly wrong here? Any help would be greatly appreciated. Thanks in advance!

asked 4 years ago, 206 views
6 Answers
1

In my specific case, this turned out to be an actual crash, but since the error wasn't being output to the log file, it was not trivial to catch.

For anyone using Unreal Engine: this happened because I was using the game session creation callback to start a timer (we've had some connection issues, and the timer was meant to hard-terminate game sessions if no user had connected within X minutes). You can't start Unreal timers from a non-game thread, which is where these callbacks are invoked.

As a workaround, the callback thread now only sets a flag. The game thread polls that flag periodically and, once it sees the flag set, starts the timer itself.
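The flag handoff described above can be sketched in a language-neutral way. This is a minimal Python illustration, not Unreal or GameLift SDK code: `SessionWatchdog`, `on_start_game_session`, `tick`, and the tick-based timeout are all hypothetical names chosen for the example, and it assumes the callback arrives on a thread other than the main loop's.

```python
import threading


class SessionWatchdog:
    """Hands a 'session started' signal from a callback thread to the main loop."""

    def __init__(self, timeout_ticks):
        # The Event is the only state the callback thread ever touches.
        self._session_started = threading.Event()
        self._timeout_ticks = timeout_ticks
        self.ticks_left = None  # None means the timer has not been started yet

    def on_start_game_session(self):
        # Runs on the callback thread (assumption): set the flag and nothing else.
        self._session_started.set()

    def tick(self):
        # Runs on the main (game) thread once per update.
        if self.ticks_left is None:
            if self._session_started.is_set():
                # The timer is created here, on the main thread.
                self.ticks_left = self._timeout_ticks
            return False
        if self.ticks_left > 0:
            self.ticks_left -= 1
        return self.ticks_left == 0  # True once the timeout has elapsed


watchdog = SessionWatchdog(timeout_ticks=3)

# Simulate the SDK invoking the callback from another thread.
worker = threading.Thread(target=watchdog.on_start_game_session)
worker.start()
worker.join()

# The first tick notices the flag and arms the timer; the next three count it down.
results = [watchdog.tick() for _ in range(4)]
```

The point of the design is that the callback thread never calls into anything owned by the main thread; in Unreal the equivalent of `tick` would live somewhere that is guaranteed to run on the game thread.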

answered 4 years ago
0

Some quick thoughts:

  • When you call ProcessReady, GameLift should start calling your health check callback fairly quickly. I would add some logic there to log a timestamp for each call and confirm that it actually hits your callback. As the documentation states, it calls your health check every 60 seconds, so make sure that's what you are seeing.
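One way to do the timestamp check is sketched below in Python, as an illustration of the idea rather than SDK code. `HealthCheckLog`, `late_gaps`, and the 5-second tolerance are hypothetical; the 60-second interval is the one mentioned above. The same bookkeeping would go inside the real health check callback in whatever language the server uses.

```python
import time


class HealthCheckLog:
    """Records when the health check callback fires and flags late calls."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the logic can be tested without waiting
        self.timestamps = []

    def on_health_check(self):
        # Record the call time, then report healthy as the original callback does.
        self.timestamps.append(self._clock())
        return True

    def late_gaps(self, expected_interval=60.0, tolerance=5.0):
        # Gaps between consecutive calls that exceed the expected interval.
        gaps = [b - a for a, b in zip(self.timestamps, self.timestamps[1:])]
        return [gap for gap in gaps if gap > expected_interval + tolerance]


# Example with a fake clock: the third call arrives 125 s after the second.
fake_times = iter([0.0, 60.0, 185.0])
log = HealthCheckLog(clock=lambda: next(fake_times))
for _ in range(3):
    log.on_health_check()
late = log.late_gaps()
```

An empty `timestamps` list after ProcessReady would mean the callback is never reached at all, which is a different problem from calls arriving late.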

I've seen thread lockup/starvation cause problems like this on Unreal, so I would make sure everything looks good in your server logs and that you're ticking/updating as expected on the server threads.

I would also try a sanity test against GameLift Local: if you can replicate the problem there, your debug path is much healthier.

Also make sure you aren't causing a scale-down event: temporarily disable any scaling rules for your fleet, as the instance could be getting terminated.

Basically, you want to debug on a fleet with one instance and one process. Remote into the fleet to grab/tail the server logs in real time so you can see what's going on (this is why sanity testing with GameLift Local can be a real time saver).
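For the scaling part, a rough Python sketch of what that could look like with the AWS SDK is shown below. It assumes a boto3 GameLift client and uses its `stop_fleet_actions` and `update_fleet_capacity` operations; `pin_fleet_for_debugging`, `RecordingClient`, and the fleet ID `fleet-EXAMPLE` are placeholders invented for the example. Limiting the fleet to one process per instance is part of the fleet's runtime configuration and is not covered here.

```python
def pin_fleet_for_debugging(gamelift, fleet_id):
    """Suspend auto-scaling and hold the fleet at exactly one instance."""
    # Stop auto-scaling so no scale-down event can remove the instance mid-test.
    gamelift.stop_fleet_actions(FleetId=fleet_id, Actions=["AUTO_SCALING"])
    # Hold capacity at a single instance.
    gamelift.update_fleet_capacity(
        FleetId=fleet_id, DesiredInstances=1, MinSize=1, MaxSize=1
    )


class RecordingClient:
    """Stand-in for a boto3 GameLift client; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def stop_fleet_actions(self, **kwargs):
        self.calls.append(("stop_fleet_actions", kwargs))

    def update_fleet_capacity(self, **kwargs):
        self.calls.append(("update_fleet_capacity", kwargs))


# Dry run against the stand-in. With real credentials, the client would instead
# come from boto3.client("gamelift", region_name=...) and a real fleet ID.
dry_run = RecordingClient()
pin_fleet_for_debugging(dry_run, "fleet-EXAMPLE")
```

Auto-scaling can be resumed afterwards with the matching `start_fleet_actions` operation.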

If everything still looks good, if you can provide:

  • Which GameLift Server SDK version you are on and/or version of the Unreal plugin you are on
  • A fleet ID, region, and an approximate time of the most recent failures

I can then get the GameLift service team to take a look, as this could be a service/SDK problem preventing messages from flowing correctly.

answered 4 years ago
0

Hi Pip, thank you so much for responding!

I am using GameLift Server SDK version 3.3.0 (12_14_2018).

Testing with GameLift Local, the health check gets called on the Unreal server as soon as I call ProcessReady (less than 30 seconds after the process was launched), and again every minute from there. On the SDK listener side, it receives the onReportHealth message with a healthy status every time.

After uploading the build to GameLift, getting instance access, and tailing the log file, I can see that the process gets restarted successfully when I start a game session. It takes a couple of seconds to load the map and open its ports, but there's no error or network warning whatsoever.

Here is the output I get from describe-game-sessions where I can see that the game session was terminated within 5 seconds of being created:

{
    "GameSessions": [
        {
            "GameSessionId": "arn:******
            "FleetId": "fleet-******
            "CreationTime": 1579078940.086,
            "TerminationTime": 1579078944.917,
            "CurrentPlayerSessionCount": 0,
            "MaximumPlayerSessionCount": 200,
            "Status": "TERMINATED",
            "GameProperties": [],
            "IpAddress": "52.57.231.108",
            "Port": 7777,
            "PlayerSessionCreationPolicy": "ACCEPT_ALL"
        }
    ]
}
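The lifetime can be read directly off the two timestamps in the output above. A small Python sketch of that arithmetic follows; `session_lifetimes` is a hypothetical helper, and `arn:EXAMPLE` is a placeholder because the real ID is redacted in the output.

```python
import json


def session_lifetimes(describe_output):
    """Map each terminated game session ID to its lifetime in seconds."""
    lifetimes = {}
    for session in json.loads(describe_output)["GameSessions"]:
        # Sessions that are still active have no TerminationTime.
        if "TerminationTime" in session:
            lifetime = session["TerminationTime"] - session["CreationTime"]
            lifetimes[session["GameSessionId"]] = round(lifetime, 3)
    return lifetimes


# Timestamps copied from the output above; the session ID is a placeholder.
sample = json.dumps({
    "GameSessions": [
        {
            "GameSessionId": "arn:EXAMPLE",
            "CreationTime": 1579078940.086,
            "TerminationTime": 1579078944.917,
            "Status": "TERMINATED",
        }
    ]
})
lifetimes = session_lifetimes(sample)  # about 4.83 seconds for this session
```

A lifetime this short is far below the 3-minute unhealthy window reported in the fleet event, which suggests the termination is not the result of three missed health checks.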

It's worth noting that the server process is very much alive and well despite the game session having been terminated: I can connect manually if I know the IP and port. That isn't viable, though, because I rely on GameLift retrieving the game sessions of the lobby alias to connect to the lobby, and on FlexMatch for matchmaking.

answered 4 years ago
0

If you can provide your fleet ID and region, I can get the GameLift service team to take a look at what's going on.

answered 4 years ago
0

That would be an amazing help, thank you so much! The fleet is: fleet-******

Region: eu-central-1

Most recent failed game session: arn:******

answered 4 years ago
0

I ended up contacting premium AWS support, since the fleet that I linked here had already been deleted. Sorry for taking up your time, and thanks! I'll be sure to update this thread if/when I get it solved.

answered 4 years ago
