What is the fastest way an EC2 Auto Scaling Group can react to inputs?

I am building a build server (GitHub self-hosted runner) that is expected to pick up jobs as soon as they become available (queued).

I am wondering what is the fastest way to have an effect on the ASG for it to invoke the scaling process and pick up the jobs.

I have full flexibility in using anything on AWS.

Right now our GitHub events come in as EventBridge events almost instantly.

I can publish these events to a high-resolution CloudWatch metric, but I am not sure if the ASG will monitor this in real time, or if it will only evaluate once a minute.

I can also force the ASG to scale out, but then I don't know how to scale it in once we have processed a job.

m0ltar
asked 10 months ago · 516 views
3 Answers
Accepted Answer

EC2 Auto Scaling is not going to be the optimal choice for this use case, due to the reaction time when scaling build servers up and down. GitHub proposes a solution based on Lambda functions to autoscale self-hosted runners. Another option, as suggested in other answers, is to look at containerization, but orchestration at scale will still require some thought and development.

AWS
answered 10 months ago
  • As mentioned in my original post, we already use a Lambda. It's part of the partner event solution that pipes these events into EventBridge.

    Manual scaling seems a bit challenging too, as the scaling value isn't reflected immediately, so there are race conditions when multiple events come in.

  • Accepting this answer as it does answer the question of "what is the fastest way". That seems to be the fastest. But it still does not address our bigger concerns, which are somewhat outside the scope of the question.

Once a minute is the best you're going to get out of the ASG; autoscaling won't be able to react to the second. Perhaps a Lambda that watches for those events and immediately bumps the ASG's desired capacity by one? Once a job is done, how about a wait period (in case new work comes in immediately), and if none arrives, have the instance call https://docs.aws.amazon.com/cli/latest/reference/autoscaling/terminate-instance-in-auto-scaling-group.html with --should-decrement-desired-capacity on itself.
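A minimal sketch of such a Lambda, untested against a real account: the group name "runner-asg", the MAX_RUNNERS cap, and the event's "queued" field are all assumptions, not real API contracts.

```python
# Sketch of a Lambda that scales the ASG out as soon as a "queued" event
# arrives. "runner-asg", MAX_RUNNERS, and event["queued"] are placeholders.

MAX_RUNNERS = 10  # hypothetical safety cap on fleet size


def next_capacity(current: int, queued_jobs: int, cap: int = MAX_RUNNERS) -> int:
    """Grow to cover the queue depth, never shrink here, never exceed the cap."""
    return min(max(current, queued_jobs), cap)


def handler(event, context):
    # boto3 is imported lazily so the module loads without AWS credentials.
    import boto3

    asg = boto3.client("autoscaling")
    group = asg.describe_auto_scaling_groups(
        AutoScalingGroupNames=["runner-asg"])["AutoScalingGroups"][0]
    desired = next_capacity(group["DesiredCapacity"], event.get("queued", 1))
    if desired != group["DesiredCapacity"]:
        asg.set_desired_capacity(
            AutoScalingGroupName="runner-asg",
            DesiredCapacity=desired,
            HonorCooldown=False,  # react immediately, skip the cooldown timer
        )
```

One design note: setting an absolute capacity derived from the queue depth, rather than blindly incrementing, keeps the call idempotent if the same event is delivered twice.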

answered 10 months ago
  • Thank you!

  • This would work well depending on volume. Too much volume and you'll end up facing API throttling issues. Depending on how many concurrent tasks you might have running, adding some logic to the Lambda so it doesn't always +1 when it's not needed might be good (maybe have it conditionally +1 based on queue size?)

  • How do you prevent race conditions? E.g. overwriting the incremented desiredInstances
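The scale-in half of the suggestion above, the instance waiting out a grace period and then removing itself from the ASG, might look like this sketch. The 120-second grace period and the IMDSv2 lookup are assumptions about the runner host, and this is untested.

```python
# Sketch: after finishing a job, the runner waits out a grace period and,
# if no new work arrived, terminates itself with a capacity decrement.
import urllib.request

GRACE_PERIOD_SECONDS = 120.0  # hypothetical wait in case new work arrives


def should_terminate(idle_seconds: float,
                     grace: float = GRACE_PERIOD_SECONDS) -> bool:
    """Terminate only after sitting idle for the full grace period."""
    return idle_seconds >= grace


def terminate_self():
    # boto3 is imported lazily; this only works on an EC2 instance whose
    # role allows autoscaling:TerminateInstanceInAutoScalingGroup.
    import boto3

    def imds(path, method="GET", headers=None):
        # Query the EC2 instance metadata service (IMDSv2).
        req = urllib.request.Request(
            "http://169.254.169.254/latest/" + path,
            method=method, headers=headers or {})
        return urllib.request.urlopen(req, timeout=2).read().decode()

    token = imds("api/token", method="PUT",
                 headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"})
    instance_id = imds("meta-data/instance-id",
                       headers={"X-aws-ec2-metadata-token": token})
    boto3.client("autoscaling").terminate_instance_in_auto_scaling_group(
        InstanceId=instance_id,
        # Decrement desired capacity so the ASG doesn't launch a replacement.
        ShouldDecrementDesiredCapacity=True,
    )
```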

Hi, would it be possible to run the server as a container? For example as an ECS task? Then you could add an AWS Lambda function as the target of the EventBridge rule and trigger the ECS task from there with minimal delay.
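A minimal sketch of that EventBridge → Lambda → ECS chain, assuming Fargate; the cluster name, task definition, and subnet ID here are hypothetical placeholders, and this is untested.

```python
# Sketch of a Lambda that launches a runner as a one-off ECS task.
# "runner-cluster", "gh-runner", and the subnet ID are placeholders.

def build_run_task_args(cluster: str, task_def: str, subnets: list) -> dict:
    """Build the keyword arguments for ecs.run_task()."""
    return {
        "cluster": cluster,
        "taskDefinition": task_def,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "ENABLED",
            }
        },
    }


def handler(event, context):
    # boto3 is imported lazily so the module loads without AWS credentials.
    import boto3

    ecs = boto3.client("ecs")
    ecs.run_task(**build_run_task_args(
        "runner-cluster", "gh-runner", ["subnet-0123456789abcdef0"]))
```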

AWS
EXPERT
answered 10 months ago
  • I wish! But we need to build Docker images, and ECS does not support Docker-in-Docker, as far as I know. Thanks for the suggestion though!

  • While I haven't tested it out, this blog looks like it addresses your Docker in Docker concerns: https://aws.amazon.com/blogs/containers/building-container-images-on-amazon-ecs-on-aws-fargate/

  • Thanks for that article! Kaniko is a great fit for building Docker images without privileged access. However, there is another use case that is not covered by it: the "services" feature of GH Actions, which is just a fancy way of saying "run a container with a service". We use this, for example, to run PostgreSQL for integration testing.
