
How exactly does target-based auto-scaling work?


I have been experimenting recently with auto-scaling in GameLift using a target tracking policy, and I have come across a few things that I have questions on.

  1. When there are zero active game sessions on the fleet and the minimum instance count on the fleet is zero, the fleet scales down to zero instances. Why is this? And do I need a minimum instance count of at least 1 for auto-scaling to work? It seems to be that way after reading this forum post, but I just wanted to confirm.

  2. When does upscaling take effect? I ask because when the percentage of available game sessions drops below the buffer percentage, it takes about 10-15 minutes for one or more new instances to start up and for game session availability to climb back up. Are the dashboard metrics delayed, or does scaling up really take this long? And do new instances start spinning up the moment availability drops below the policy's buffer, or is there more involved?

  3. When does downscaling take effect? Similarly to the last question, it also takes about 15 minutes for idle instances to shut down when the percentage of available game sessions is "too high". Same question as before: are the dashboard metrics delayed, or does scaling down really take this long? Also, how exactly does GameLift know when to scale a fleet's instances down under target tracking? Does it continuously check whether availability would still be above the buffer if certain instances shut down, or is there more involved?

  4. Lastly, if it truly does take a while for the number of instances to scale up and down within a fleet, what can we do to best reduce player wait times? Should we avoid shutting down server processes and instead reuse them after a game session ends? Should we increase the number of concurrent server processes per instance? Should we widen the minimum and maximum instance count range for a fleet? Should we make the target-based policy's buffer bigger? I want to make sure these ideas are all valid, and I'd welcome others.

I know these questions dig into the inner workings of GameLift, so thank you in advance!

asked 2 years ago
2 Answers


  1. You asked the fleet to scale to zero, since the minimum instance count is set to zero: "Under Instance Limits, check that the minimum and maximum limits are appropriate for the fleet. With auto-scaling enabled, capacity may adjust to any level between these two limits."


There is a natural tension about what to do with fleets scaled to zero. Some developers want a fleet scaled to zero to stay there: if you know there should be no traffic to the fleet, why pay for empty instance hours during game development? Plenty of fleets are deliberately shut off like this.

Other developers, especially during alphas/betas, want fleets available at all times, so they expect traffic to drop to zero and then start up again.

For now, GameLift respects the hard stop of fleets at zero, even if target tracking is on. GameLift has been looking at how to support both behaviors, but for now you need to maintain some active buffer for GameLift to auto-scale your fleet, or a process to kick off scaling again.
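Concretely, keeping a non-zero minimum is what lets target tracking recover on its own. A minimal boto3 sketch of pinning the floor at one instance (the fleet id and limits here are hypothetical, and only the final call actually touches AWS):

```python
def min_capacity_args(fleet_id, min_size=1, max_size=20, desired=1):
    """Build UpdateFleetCapacity arguments that keep at least one
    instance warm so target tracking always has capacity to scale from."""
    if not 0 <= min_size <= max_size:
        raise ValueError("min_size must be between 0 and max_size")
    return {
        "FleetId": fleet_id,
        "MinSize": min_size,
        "MaxSize": max_size,
        # Desired can never sit below the floor.
        "DesiredInstances": max(desired, min_size),
    }

if __name__ == "__main__":
    import boto3  # AWS SDK for Python; needs credentials configured

    gamelift = boto3.client("gamelift")
    # "fleet-1234" is a placeholder; substitute your own fleet id.
    gamelift.update_fleet_capacity(**min_capacity_args("fleet-1234"))
```

With `MinSize=1` the fleet never drains completely, so a new burst of players triggers scale-up instead of hitting a dead fleet.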

  2. Metrics propagation is relatively fast, but scaling up new instances is expensive (it takes x minutes depending on instance type, region, and OS). Your build needs to be dropped onto the machine, spun up, and then connected to GameLift. I'd recommend tracking metrics on instance creation time, as this will help you understand what spare capacity you might need and also lets you tune your server launch time.

  3. I don't know the internals here, but downscaling requires stopping existing resources, which can take time, especially if resources are slow to respond. In addition, EC2 instances do not stop instantly: your game server needs to shut down when it gets the termination signal, logs need to be uploaded, and so on.

GameLift generally considers capacity gone once a server/session has been successfully terminated (or force-terminated if your process will not exit cleanly).

You should ideally track and measure metrics around your server's shutdown time, and ensure you shut down correctly (to avoid the time delays around forced shutdowns).
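A simple way to get that shutdown-time metric is to time your shutdown sequence and publish it as a CloudWatch custom metric. A sketch; the metric name, namespace, and fleet id are all hypothetical, and only the `__main__` section calls AWS:

```python
import time

class ShutdownTimer:
    """Context manager timing how long a shutdown sequence takes, so
    slow exits (which risk forced termination) show up in metrics."""

    def __enter__(self):
        self._start = time.monotonic()
        return self

    def __exit__(self, *exc):
        self.seconds = time.monotonic() - self._start
        return False  # never swallow exceptions from the shutdown path

def shutdown_metric(seconds, fleet_id):
    """Build one CloudWatch custom-metric datapoint."""
    return {
        "MetricName": "ServerShutdownSeconds",  # hypothetical name
        "Dimensions": [{"Name": "FleetId", "Value": fleet_id}],
        "Value": seconds,
        "Unit": "Seconds",
    }

if __name__ == "__main__":
    import boto3  # AWS SDK for Python; needs credentials configured

    with ShutdownTimer() as timer:
        pass  # run your real shutdown sequence here

    boto3.client("cloudwatch").put_metric_data(
        Namespace="GameServers",  # hypothetical namespace
        MetricData=[shutdown_metric(timer.seconds, "fleet-1234")],
    )
```

If that number starts creeping toward your termination grace period, you know forced shutdowns (and their delays) are coming.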

  4. You reduce player wait times by keeping some spare capacity in your fleet. First, optimize your server costs: pack as many server processes per instance as you can, use Linux servers, and use Spot instances. Second, understand the cost of scaling up new servers, based on your metrics. Third, gather data that helps drive where your scaling needs to be (i.e. expected load, peak loads, etc.).

Then you work out what spare capacity/buffer is sufficient for your needs.

"Choosing a buffer size depends on how you want to prioritize minimizing player wait time against controlling hosting costs. With a large buffer, you minimize wait time, but you also pay for extra resources that may not get used. If your players are more tolerant of wait times, you can lower costs by setting a small buffer."
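That buffer is exactly the target value of the target-based scaling policy on the `PercentAvailableGameSessions` metric. A minimal boto3 sketch of setting it (policy name, fleet id, and the 15% value are all illustrative; only the `__main__` call touches AWS):

```python
def buffer_policy(fleet_id, buffer_percent):
    """Build a target-based PutScalingPolicy request that keeps
    buffer_percent of game sessions available: a bigger buffer lowers
    player wait times, a smaller one lowers hosting costs."""
    if not 0 < buffer_percent <= 100:
        raise ValueError("buffer must be a percentage in (0, 100]")
    return {
        "Name": "available-session-buffer",  # illustrative policy name
        "FleetId": fleet_id,
        "PolicyType": "TargetBased",
        "MetricName": "PercentAvailableGameSessions",
        "TargetConfiguration": {"TargetValue": float(buffer_percent)},
    }

if __name__ == "__main__":
    import boto3  # AWS SDK for Python; needs credentials configured

    gamelift = boto3.client("gamelift")
    gamelift.put_scaling_policy(**buffer_policy("fleet-1234", 15))
```

Tuning that single number against your measured scale-up time is the main cost/wait-time lever discussed above.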

Some folks do manual scaling to really control costs (which is tricky), or use Lambdas/processes to handle some aspects of their scaling (e.g. scale up fleets on weekends in region X, or move a fleet up from zero based on some metrics).

Hope that helps somewhat.

answered 2 years ago
