New fleet created from old build keeps crashing


Hello,

Two months ago, we integrated GameLift into our Unity project. It worked perfectly, and we tested it for a couple of weeks on multiple devices. We were able to create game and player sessions through the SDK methods "CreateGameSession" and "CreatePlayerSession".

Fast-forward two months: after a discussion with an AWS employee, who recommended that we use "StartGameSessionPlacement" together with game session queues, we decided to follow the advice. However, after uploading the new build and creating a new fleet, the instance keeps crashing, alternating between these two errors:

  • SERVER_PROCESS_SDK_INITIALIZATION_TIMEOUT - Server process started correctly but did not call InitSDK() or ProcessReady() within 5 minutes
  • SERVER_PROCESS_CRASHED - Server process exited without calling ProcessEnding(), exitCode(134)

I investigated for an hour before trying the exact same steps with an older build from two months ago, and the exact same thing happened. After playing around with the fleet settings, I reduced the number of concurrent processes from 20 to 5, and it finally worked! I am still very confused as to why, though. Two months ago, this problem never showed up in any of our tests. Since the build is exactly the same, it should behave the same way, right?

Another thing: even after reducing the number of processes, when I manually scale the instance count down to 0, this error shows up 5 times in a row (once for each process, I assume):

  • SERVER_PROCESS_FORCE_TERMINATED - Server process did not cleanly exit within 30 seconds of OnProcessTerminate(), exitCode(137)

Again, this never happened before. Does anybody know why GameLift might behave like this all of a sudden?

Best regards, Tom

  • Hi Tom,

    What are the specifications of the instance that GameLift has access to?

    Did you already have game servers running on the instances before switching over to StartGameSessionPlacement? Having to reduce the concurrent processes from 20 to 5 may indicate that the instance is running out of resources at startup.

    For the second issue, do you have a method that handles the ProcessEnding() workflow? You'll need to handle cleanup tasks such as shutting down or migrating game sessions. After calling ProcessEnding(), the server process should exit with an exit code of 0; a non-zero exit code results in an event message that the process did not exit cleanly.

  • Hi Steven,

    The instance type we're using is a c5.large, so that amounts to 2 vCPUs and 4 GB of RAM. We used to be able to run our game server on these instances just fine, even with 20 concurrent processes. We do handle ProcessEnding(): it is called whenever we deem the game/lobby to be over, and also from the OnProcessTerminate() callback.

  • Hi Tom, apologies for the slow follow-up.

    Have you opened an AWS support ticket for this one? The behaviour doesn't seem normal, because a c5.large can handle up to a maximum of 50 processes.

    As for OnProcessTerminate(), I've found a related thread:

    https://repost.aws/questions/QUt1IrMTdFSYuAatlUKsnP8g/server-process-force-terminated-event-and-testing-on-process-terminate

    Could you help confirm that:

    1. OnProcessTerminate() is being triggered by GameLift and received by your game server, and
    2. Your OnProcessTerminate() callback executes to the end of the block (no exceptions or OOM errors). In the related re:Post question, the cause appeared to be a date parse exception that was being hidden.
  • Hi Steven, thanks for your reply. We have just opened a support ticket and are waiting for a response.

    The OnProcessTerminate() callback is triggered and received properly. However, for some reason it never seems to reach the end of the method in time. Another thing we've noticed in the server log is that the OnProcessTerminate() is being triggered twice in a row.

    I cannot upload files in a comment, so I will add them as an answer below to give a clearer picture.

asked a year ago, 331 views
3 Answers
Hello Steven,

I am using On-Demand instances just to make sure the instance doesn't get recycled before I am done testing. Once this issue is fixed, we will switch back to Spot instances. Thanks for the advice!

I added a log message to the OnHealthCheck callback. No issues there: the message shows up in the log and the server is reported healthy.

As for Application.Quit(), I put an "Application is quitting" log message right above it. The messages show up in the log in this order:

  • ProcessEnding success.
  • Process has ended
  • Application is quitting

I also tried replacing the direct call to Application.Quit() with a coroutine that does the same job without blocking the OnProcessTerminate callback. Same result, unfortunately. Even removing this line of code didn't change anything: the same SERVER_PROCESS_FORCE_TERMINATED error shows up in the GameLift console.

We also checked if the game sessions were being terminated properly with the describe-game-session command. The sessions all displayed the status TERMINATED.

As for a potentially stuck network connection, the only external connection we have is to our API. We do not use connection pooling, just regular web requests.

Tom

answered a year ago
  • Thanks for the update, Tom. Can you keep me posted on what the support team responds with? I think this warrants some assistance from the support team to dive into the service logs. Hopefully it's something that's easily fixed!

This is a follow-up to the comment posted above.

Here is the code used to initialize and terminate the GameLift server process:

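using System.Collections.Generic; // List<string>
using Aws.GameLift.Server;        // GameLiftServerAPI, ProcessParameters, LogParameters
using UnityEngine;                // MonoBehaviour, Application

// Excerpt: the class lives in the SpicyParty namespace (see the stack traces below),
// so Logger refers to SpicyParty.Logger. The other callbacks are omitted here.
public class GameLiftManagerSP : MonoBehaviour
{
	// True between a successful InitSDK() and the call to ProcessEnding().
	private bool _isProcessActive;
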
	public bool ProcessReady(int port)
	{
	    var initSDKOutcome = GameLiftServerAPI.InitSDK();

	    if (initSDKOutcome.Success)
	    {
	        _isProcessActive = true;

	        ProcessParameters processParameters = new ProcessParameters(
	            OnStartGameSession,
	            OnUpdateGameSession,
	            OnProcessTerminate,
	            OnHealthCheck,
	            port,
	            new LogParameters(new List<string>()
	            {
	                "/local/game/logs/server.log"
	            }
	        ));

	        var processReadyOutcome = GameLiftServerAPI.ProcessReady(processParameters);
	        if (processReadyOutcome.Success)
	        {
	            Logger.LogMessage("ProcessReady success.");
	        }
	        else
	        {
	            Logger.LogError("ProcessReady failure : " + processReadyOutcome.Error.ToString());
	        }

	        return processReadyOutcome.Success;
	    }
	    else
	    {
	        Logger.LogError("InitSDK failure : " + initSDKOutcome.Error.ToString());
	    }

	    return initSDKOutcome.Success;
	}

	private void OnProcessTerminate()
	{
	    ProcessEnding();
	    Logger.LogMessage("Process has ended");

	    Application.Quit();
	    Logger.LogMessage("Application has quit");
	}

	public bool ProcessEnding()
	{
	    _isProcessActive = false;

	    var processEndingOutcome = GameLiftServerAPI.ProcessEnding();
	    if (processEndingOutcome.Success)
	    {
	        Logger.LogMessage("ProcessEnding success.");
	    }
	    else
	    {
	        Logger.LogError("ProcessEnding failure : " + processEndingOutcome.Error.ToString());
	    }

	    return processEndingOutcome.Success;
	}
}

And here is the server log produced by this code :

ProcessEnding success.
UnityEngine.StackTraceUtility:ExtractStackTrace ()
UnityEngine.DebugLogHandler:LogFormat (UnityEngine.LogType,UnityEngine.Object,string,object[])
UnityEngine.Logger:Log (UnityEngine.LogType,object)
UnityEngine.Debug:Log (object)
SpicyParty.Logger:LogMessage (string) (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/UI/Logger/Logger.cs:34)
SpicyParty.GameLiftManagerSP:ProcessEnding () (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/Game/Network/GameLiftManagerSP.cs:177)
SpicyParty.GameLiftManagerSP:OnProcessTerminate () (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/Game/Network/GameLiftManagerSP.cs:96)
Aws.GameLift.Server.ServerState:<OnTerminateProcess>b__31_0 ()
System.Threading.Tasks.Task:InnerInvoke ()
System.Threading.Tasks.Task:Execute ()
System.Threading.Tasks.Task:ExecutionContextCallback (object)
System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.Tasks.Task:ExecuteWithThreadLocal (System.Threading.Tasks.Task&)
System.Threading.Tasks.Task:ExecuteEntry (bool)
System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem ()
System.Threading.ThreadPoolWorkQueue:Dispatch ()
System.Threading._ThreadPoolWaitCallback:PerformWaitCallback ()

Process has ended
UnityEngine.StackTraceUtility:ExtractStackTrace ()
UnityEngine.DebugLogHandler:LogFormat (UnityEngine.LogType,UnityEngine.Object,string,object[])
UnityEngine.Logger:Log (UnityEngine.LogType,object)
UnityEngine.Debug:Log (object)
SpicyParty.Logger:LogMessage (string) (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/UI/Logger/Logger.cs:34)
SpicyParty.GameLiftManagerSP:OnProcessTerminate () (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/Game/Network/GameLiftManagerSP.cs:97)
Aws.GameLift.Server.ServerState:<OnTerminateProcess>b__31_0 ()
System.Threading.Tasks.Task:InnerInvoke ()
System.Threading.Tasks.Task:Execute ()
System.Threading.Tasks.Task:ExecutionContextCallback (object)
System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.Tasks.Task:ExecuteWithThreadLocal (System.Threading.Tasks.Task&)
System.Threading.Tasks.Task:ExecuteEntry (bool)
System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem ()
System.Threading.ThreadPoolWorkQueue:Dispatch ()
System.Threading._ThreadPoolWaitCallback:PerformWaitCallback ()

ProcessEnding success.
UnityEngine.StackTraceUtility:ExtractStackTrace ()
UnityEngine.DebugLogHandler:LogFormat (UnityEngine.LogType,UnityEngine.Object,string,object[])
UnityEngine.Logger:Log (UnityEngine.LogType,object)
UnityEngine.Debug:Log (object)
SpicyParty.Logger:LogMessage (string) (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/UI/Logger/Logger.cs:34)
SpicyParty.GameLiftManagerSP:ProcessEnding () (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/Game/Network/GameLiftManagerSP.cs:177)
SpicyParty.GameLiftManagerSP:OnProcessTerminate () (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/Game/Network/GameLiftManagerSP.cs:96)
Aws.GameLift.Server.ServerState:<OnTerminateProcess>b__31_0 ()
System.Threading.Tasks.Task:InnerInvoke ()
System.Threading.Tasks.Task:Execute ()
System.Threading.Tasks.Task:ExecutionContextCallback (object)
System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.Tasks.Task:ExecuteWithThreadLocal (System.Threading.Tasks.Task&)
System.Threading.Tasks.Task:ExecuteEntry (bool)
System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem ()
System.Threading.ThreadPoolWorkQueue:Dispatch ()
System.Threading._ThreadPoolWaitCallback:PerformWaitCallback ()

Process has ended
UnityEngine.StackTraceUtility:ExtractStackTrace ()
UnityEngine.DebugLogHandler:LogFormat (UnityEngine.LogType,UnityEngine.Object,string,object[])
UnityEngine.Logger:Log (UnityEngine.LogType,object)
UnityEngine.Debug:Log (object)
SpicyParty.Logger:LogMessage (string) (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/UI/Logger/Logger.cs:34)
SpicyParty.GameLiftManagerSP:OnProcessTerminate () (at D:/Projets/Spicy Party/Spicy_Party_Game/Assets/Scripts/Game/Network/GameLiftManagerSP.cs:97)
Aws.GameLift.Server.ServerState:<OnTerminateProcess>b__31_0 ()
System.Threading.Tasks.Task:InnerInvoke ()
System.Threading.Tasks.Task:Execute ()
System.Threading.Tasks.Task:ExecutionContextCallback (object)
System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
System.Threading.Tasks.Task:ExecuteWithThreadLocal (System.Threading.Tasks.Task&)
System.Threading.Tasks.Task:ExecuteEntry (bool)
System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem ()
System.Threading.ThreadPoolWorkQueue:Dispatch ()
System.Threading._ThreadPoolWaitCallback:PerformWaitCallback ()

Essentially, there is only one way the OnProcessTerminate() method can be called, and that is through the GameLift callback. We can see from the log that it is being triggered twice. Is that normal?

Also, the message "Application has quit" doesn't appear in the log, so we never get past Application.Quit(). Should this be invoked as a Unity coroutine, in order to allow the OnProcessTerminate() method to finish before it times out?
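
One thing the stack traces above do show is that OnProcessTerminate() runs on a thread-pool thread (System.Threading.ThreadPoolWorkQueue:Dispatch), whereas Unity APIs are generally expected to be called from the main thread. A possible way to hand the quit request over to the main thread is sketched below. MainThreadQueue is a hypothetical helper, not part of Unity or the GameLift SDK, and whether this resolves the force-termination is an untested assumption.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: work can be posted from any thread, and it runs
// on whichever thread calls Pump().
public sealed class MainThreadQueue
{
    private readonly ConcurrentQueue<Action> _pending = new ConcurrentQueue<Action>();

    // Safe to call from a background thread, such as the SDK callback thread.
    public void Post(Action work)
    {
        if (work == null) throw new ArgumentNullException(nameof(work));
        _pending.Enqueue(work);
    }

    // Runs all queued work on the calling thread and returns how many
    // actions were executed.
    public int Pump()
    {
        int ran = 0;
        while (_pending.TryDequeue(out Action work))
        {
            work();
            ran++;
        }
        return ran;
    }
}
```

In a Unity project, the assumed usage would be to call Pump() from a MonoBehaviour's Update() and to have OnProcessTerminate() call ProcessEnding() and then Post(() => Application.Quit()), so that the callback returns immediately and the quit request executes on the main thread during the next frame.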

answered a year ago
Hey Tom, sorry, I'm hitting the comment word limit, so I'm posting this as an answer.

Just a quick one to start with: if you're running GameLift during development, consider setting the fleet to use Spot Instances, which can save you up to 90% of the cost.

Diving into the code, I recreated a similar version of the game server and ran it against GameLift locally using the tool linked below. It's a local version of the service that runs as a Java application.

https://docs.aws.amazon.com/gamelift/latest/developerguide/integration-testing-local.html

Looking at OnProcessTerminate(), GameLift may call this in three cases: (1) when the server process has reported poor health or has not responded to GameLift, (2) when terminating the instance during a scale-down event, or (3) when an instance is being terminated due to a Spot interruption.

So we need to check that (1) the game server responds to the health check, which is defined in your OnHealthCheck callback handler, (2) the callback isn't being triggered by a scale-down event, and (3) the instance isn't being terminated due to a Spot interruption.

It is possible for the callback to be invoked twice if two of these three events happen within a very short timeframe.
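
Since the callback can fire more than once, one option is to make the shutdown path run only on the first call. Here is a minimal sketch of that idea. TerminationGuard and ShutdownHandler are hypothetical names, not GameLift SDK types, and the Action passed in stands for whatever cleanup the server performs (for example calling ProcessEnding() and then quitting).

```csharp
using System;
using System.Threading;

// Hypothetical helper: reports true to the first caller only, even when
// several threads race to start the shutdown.
public sealed class TerminationGuard
{
    private int _started; // 0 = shutdown not started, 1 = shutdown already begun

    public bool TryBegin()
    {
        // Atomically set the flag and check what it was before.
        return Interlocked.Exchange(ref _started, 1) == 0;
    }
}

// Hypothetical wrapper around the terminate callback.
public sealed class ShutdownHandler
{
    private readonly TerminationGuard _guard = new TerminationGuard();
    private readonly Action _shutdown;

    public ShutdownHandler(Action shutdown)
    {
        _shutdown = shutdown ?? throw new ArgumentNullException(nameof(shutdown));
    }

    // Intended to be registered as the process-terminate callback.
    public void OnProcessTerminate()
    {
        if (!_guard.TryBegin())
        {
            return; // A previous call already started the shutdown.
        }
        _shutdown();
    }
}
```

With this in place, a second OnProcessTerminate() call returns without calling ProcessEnding() again, which also keeps the log from showing the shutdown messages twice.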

As for the last part, Application.Quit(): if this line calls a function that stops the process (e.g. Environment.Exit(0)), execution will never go past that line, because the game server ceases to exist as soon as that line runs.

So you'd want to put the "Application is quitting" log message right above that line.

What I'd also check is that when you call GameLiftServerAPI.ProcessEnding(), the status of that specific game session becomes "TERMINATED". This is important because we want GameLift to have acknowledged our intention to close the process hosting the game session before we actually shut the process down. If we tell GameLift we are about to shut down the process and GameLift doesn't acknowledge it, GameLift will consider the process to have shut down in an unhealthy way. This could be related to the events about processes not exiting cleanly within 30 seconds. There is also the off chance that the cleanup steps didn't actually finish within 30 seconds, so you'll need to confirm this too (for example, a stuck network connection, such as an open connection pool to a database).

You can check the status of the specific game session with this CLI command: https://awscli.amazonaws.com/v2/documentation/api/latest/reference/gamelift/describe-game-session-details.html

Hopefully this clears things up, but if you have more questions, please send me an email!

stetseng(at)amazon.com

answered a year ago
