Orchestrating single container hosts
We have a fleet of EC2 instances, each running a single Docker container. The instances are fairly large and use AWS EBS for storage: the workload needs a lot of CPU and IOPS, so EBS is a requirement. The containers are run with plain docker-compose on each instance, and the docker-compose files are managed globally by Ansible.
This is a legacy setup, and at some point we need to move away from docker-compose to something more flexible and modern. Because the workload uses one huge container per host, I don't think this is really a case for Kubernetes or ECS, especially with EBS in the mix. At least not vanilla Kubernetes, as far as I understand how Kubernetes works.
Could anyone point me in the right direction here?
It would be helpful to know what type of workload each of these Docker containers is processing. The reason you run only one container per instance could be memory, CPU, network, or EBS throughput. When modernizing infrastructure, it is often worth taking a good look at the architecture of the application too.
If you are running the application in containers but have a 1-to-1 mapping of app to host, then it might be a lot easier to run the application directly on the host, without the container layer, and use something like AWS CodeDeploy to roll out new versions.
However, if you want to stick with containers, I think K8s would be overkill here. ECS, however, is very straightforward.
You could create a capacity provider in ECS backed by an ASG, which in turn uses a Launch Template. In that Launch Template you can pre-define the EBS mounts you need, or script the attachment if your disks are to be persisted separately from the EC2 instance. Then, to ensure your application alone runs on these hosts, make your ECS service the only one using that capacity provider. Otherwise, if you are happy to share the hosts with other workloads, you can define a placement strategy, or just use the "DAEMON" scheduling strategy, which runs exactly one copy of your task on each host.
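To make the two ECS-on-EC2 options above a bit more concrete, here is a minimal sketch in Python of the request payloads one might pass to boto3's `ecs.create_service`. The helper functions and all names (cluster, service, task definition, capacity provider) are hypothetical placeholders for illustration, and nothing is sent to AWS. The first variant pins the service to a dedicated capacity provider and adds a `distinctInstance` placement constraint so that no two tasks share a host; the second uses the `DAEMON` scheduling strategy.

```python
def dedicated_capacity_provider_service(cluster, service_name, task_definition,
                                        capacity_provider, desired_count):
    """Build kwargs for ecs.create_service: one task per host on a dedicated capacity provider."""
    return {
        "cluster": cluster,
        "serviceName": service_name,
        "taskDefinition": task_definition,
        "desiredCount": desired_count,
        "schedulingStrategy": "REPLICA",
        # Only this service should reference the capacity provider (assumption:
        # the provider wraps the ASG whose Launch Template defines the EBS volumes).
        "capacityProviderStrategy": [
            {"capacityProvider": capacity_provider, "weight": 1}
        ],
        # Never place two tasks of this service on the same container instance.
        "placementConstraints": [{"type": "distinctInstance"}],
    }


def daemon_service(cluster, service_name, task_definition):
    """Build kwargs for ecs.create_service: exactly one task on every container instance."""
    return {
        "cluster": cluster,
        "serviceName": service_name,
        "taskDefinition": task_definition,
        "launchType": "EC2",
        # DAEMON services take no desiredCount; ECS runs one task per instance.
        "schedulingStrategy": "DAEMON",
    }


# Hypothetical names, for illustration only.
request = dedicated_capacity_provider_service(
    cluster="legacy-cluster",
    service_name="big-worker",
    task_definition="big-worker:1",
    capacity_provider="big-worker-asg",
    desired_count=10,
)
```

With boto3 installed and credentials configured, such a payload would be sent with `boto3.client("ecs").create_service(**request)`. Whether the dedicated capacity provider or the daemon approach fits better depends on whether the hosts are shared with other workloads.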
Otherwise, if the reason for having these EBS disks is IOPS rather than a large volume (>200GB), you can easily run one task per host with AWS Fargate. In Fargate you get an NVMe drive with 21 GB for free, and you can go up to 200GB. You also don't pay for those IOPS, whereas you probably do for your GP2/GP3 or IO EBS disks.
If you need the disks to persist beyond the lifespan of the container, then EC2 is indeed your only option. EFS probably won't have the performance you'd need.
As for doing all that, given that you use docker-compose, I'd recommend you keep it. It is very well known, dev friendly, and there are a few tools out there that make it easy to deploy from "compose to ECS", such as ECS Compose-X.
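As a rough illustration of the Fargate option, the sketch below builds a task definition payload of the kind accepted by boto3's `ecs.register_task_definition`, with the ephemeral storage size set explicitly. The helper name, family, image, and sizes are hypothetical, and the 21 to 200 GiB bounds reflect my understanding of the configurable range for the `ephemeralStorage` setting, so check the current ECS documentation before relying on them.

```python
# Assumed configurable range for Fargate ephemeral storage, in GiB.
FARGATE_MIN_EPHEMERAL_GIB = 21
FARGATE_MAX_EPHEMERAL_GIB = 200


def fargate_task_definition(family, image, cpu, memory, ephemeral_gib):
    """Build kwargs for ecs.register_task_definition with custom ephemeral storage."""
    if not FARGATE_MIN_EPHEMERAL_GIB <= ephemeral_gib <= FARGATE_MAX_EPHEMERAL_GIB:
        raise ValueError(
            f"ephemeral storage must be between {FARGATE_MIN_EPHEMERAL_GIB} "
            f"and {FARGATE_MAX_EPHEMERAL_GIB} GiB, got {ephemeral_gib}"
        )
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # required for Fargate tasks
        "cpu": str(cpu),          # task-level CPU units, passed as a string
        "memory": str(memory),    # task-level memory in MiB, passed as a string
        "ephemeralStorage": {"sizeInGiB": ephemeral_gib},
        # A real definition usually also needs an execution role, logging, etc.
        "containerDefinitions": [
            {"name": family, "image": image, "essential": True}
        ],
    }


# Hypothetical values, for illustration only.
task_def = fargate_task_definition(
    family="big-worker",
    image="example.com/big-worker:latest",
    cpu=4096,
    memory=16384,
    ephemeral_gib=200,
)
```

Anything written to that storage is lost when the task stops, which is why this only fits if the data does not need to outlive the container.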
I haven't yet had a use-case "submitted" like yours, but if you are willing, hit me up on Slack to see what to change to your compose file to simplify the deployment for you and have that added to my test cases :)
Just as an anticipation of what I think you'd need to make it work
PS: Although I do love Ansible, I'd recommend not building one image with baked-in configuration for each environment, and instead relying on env vars / SSM parameters, or even a sidecar that provides the configuration files.
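To illustrate the env vars suggestion, here is a minimal sketch of an application reading its settings from the environment, so the same image can run in every environment. The variable names (`APP_ENV`, `APP_DATA_DIR`, `APP_WORKER_THREADS`) and their defaults are made up for the example; in ECS the values could be injected through the task definition, for instance from SSM parameters.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    environment: str
    data_dir: str
    worker_threads: int


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ), falling back to defaults."""
    env = os.environ if env is None else env
    return Settings(
        environment=env.get("APP_ENV", "dev"),
        data_dir=env.get("APP_DATA_DIR", "/data"),
        worker_threads=int(env.get("APP_WORKER_THREADS", "4")),
    )


# Same code, different environment: only the injected variables change.
prod = load_settings({"APP_ENV": "prod", "APP_WORKER_THREADS": "16"})
```

Passing the mapping explicitly, as in the last line, also makes the loader easy to test without touching the real process environment.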
Thank you very much for such an extensive answer! That's a lot to consume :) I will need to do some deep digging and see what is best for us. Compose-X looks interesting too