How to configure an ALB to target an internal application running on EC2 instances that are assigned randomly within an IP range?

0

There is an internal application (Databricks) that runs on a cluster of EC2 instances acting as master/worker nodes. We have deployed an ALB in the same region/VPC/AZ as the application. External traffic will hit the ALB and then be forwarded to the Databricks application. The challenge: an ALB target can be an instance ID, an IP address, or a Lambda function. Databricks does not run on a permanent instance, which means that whenever it restarts, the host might be a different EC2 instance. So how do we forward requests from the ALB to the Databricks application?

Asked 1 year ago · 312 views
2 Answers
0

How are you managing your Databricks instances? If you can use an EC2 Auto Scaling group for the instances you want to be ALB targets then you can point the ALB at the Auto Scaling group.

Expert
Answered 1 year ago
  • Can we use an ASG as a target for an ALB? I saw only three options for the target type in the AWS documentation: 1) instance ID, 2) IP, and 3) Lambda. For reference: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-target-groups.html

  • I find that documentation page misleading too. You definitely can attach an ALB to an ASG - see https://docs.aws.amazon.com/autoscaling/ec2/userguide/attach-load-balancer-asg.html for example.

  • Hello Skinsman,

    Thanks for the quick response.

    1. Even when I attach the ASG to the ALB by following the link you shared, we still need a Target Group, and within that Target Group we need to select the instances. However, the instances are not fixed for a Databricks cluster. It is quite possible that when the cluster is idle for a certain time the instances get terminated, and once the cluster is active again we do not get the same instances. In other words, the cluster instances are dynamic, which is the challenge here. Please let me know if you find a workaround for this.

    2. Apart from this, there is one more issue I am facing. The health checks (in the Target Group) for the Databricks instances are failing. I believe the reason is the "path" attribute (default value "/"). I know the port, but I am not sure what path to specify for the Databricks application. I tested by creating a demo web page (index.html) at "/var/www/html" and setting the health check "path" to "/index.html" and the port to 80, and this worked.
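The health check settings tested in the comment above (port 80, path "/index.html") could also be applied programmatically. The following is a minimal sketch, not a confirmed Databricks configuration: `health_check_settings` is a hypothetical helper, the target group ARN is a placeholder, and the correct path for a real Databricks endpoint is still unknown here. The keys in the returned dictionary are the parameter names of the boto3 `elbv2` `modify_target_group` call.

```python
def health_check_settings(target_group_arn, port, path, ok_codes="200"):
    """Build keyword arguments for elbv2.modify_target_group (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("health check path must start with '/'")
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPort": str(port),       # the API expects the port as a string
        "HealthCheckPath": path,
        "Matcher": {"HttpCode": ok_codes},  # HTTP codes that count as healthy
    }


# The demo values from the comment; the ARN is a placeholder, not a real one.
settings = health_check_settings("arn:aws:...:targetgroup/demo", 80, "/index.html")

# Applying them requires boto3 (third-party) and AWS credentials:
#   import boto3
#   boto3.client("elbv2").modify_target_group(**settings)
```

Keeping the settings in a plain dictionary makes it easy to try a different path or port once the right health endpoint for the application is known.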

0

I suggested an EC2 Auto Scaling Group thinking that Databricks might use these for its node scaling, but I see now that it handles it internally.

It looks to me like the best option is to leverage the "Init Script" support that Databricks has: https://docs.databricks.com/clusters/init-scripts.html. You could write a script that queries the instance's metadata to get the instance ID or IP address and uses that to register the instance with the load balancer's target group.
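As a rough illustration of that idea, here is a minimal Python sketch that an init script could invoke. It makes several assumptions that are not from the thread: the target group uses the "instance" target type, the instance role is allowed to call `elasticloadbalancing:RegisterTargets`, boto3 is installed on the node, and the environment variable names `TARGET_GROUP_ARN` and `APP_PORT` are hypothetical. The instance ID is read through the EC2 instance metadata service (IMDSv2), and registration uses the boto3 `elbv2` `register_targets` call.

```python
import os
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def fetch_instance_id(opener=urllib.request.urlopen, timeout=2):
    """Return this instance's ID via IMDSv2: request a token, then the ID."""
    token_req = urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with opener(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    id_req = urllib.request.Request(
        f"{IMDS_BASE}/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with opener(id_req, timeout=timeout) as resp:
        return resp.read().decode().strip()


def register_with_target_group(elbv2, target_group_arn, instance_id, port):
    """Register one instance with an ALB target group of type 'instance'."""
    elbv2.register_targets(
        TargetGroupArn=target_group_arn,
        Targets=[{"Id": instance_id, "Port": port}],
    )


if __name__ == "__main__":
    # Hypothetical configuration: nothing happens unless TARGET_GROUP_ARN is set.
    arn = os.environ.get("TARGET_GROUP_ARN")
    if arn:
        import boto3  # third-party; the AWS region must also be configured

        port = int(os.environ.get("APP_PORT", "80"))
        register_with_target_group(
            boto3.client("elbv2"), arn, fetch_instance_id(), port
        )
```

If the script runs on every node, each node registers itself; if only one node should receive traffic, the script would need an extra check for that. Terminated instances are not removed by this sketch, so stale targets would still need to be deregistered or left to fail their health checks.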

Expert
Answered 1 year ago
