ALB LCU capacity reservation metrics confusion

0

I'm trying to deal with traffic spikes.

  1. I activated LCU capacity reservation for 100 LCU, but the CloudWatch metric reports 1.6 LCU. The "Load Balancer Capacity Units (LCU): peak vs. reserved" chart reports the proper value. Did I miss something?

  2. Also, LCU capacity reservation didn't help, as far as I can see. I still see the same "ELB 5xx" metric spike as without the reservation.


asked 11 days ago · 36 views
2 Answers
0

It’s highly likely this is not an ALB issue but the issue resides with your targets.

5xx errors are usually related to your targets. Review your target group statistics for the 5xx errors.

Depending on the error code, it’s either an internal server error from your targets or a gateway timeout where your targets can’t handle any more traffic.

Depending on the outcome you may need to scale up the number of targets or tune your application backend regarding threads, connections, memory etc.

EXPERT
answered 11 days ago
  • Thank you, Gary!

    As far as I understand, "ELB 5xx" and similar metrics relate to the ALB itself, and "Target 5xx" and similar metrics relate to my backend. So by using capacity reservation (which is easier than pre-warming) I expect to deal with the ALB errors. My next step will be dealing with my backend scaling and sizing.

  • Yes, you should review both target and ELB metrics. Are you seeing 504s? https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-troubleshooting.html#http-504-issues

    • The load balancer failed to establish a connection to the target before the connection timeout expired (10 seconds).
    • The load balancer established a connection to the target but the target did not respond before the idle timeout period elapsed.
  • I'm attaching a CloudWatch metrics screenshot to my post. As we can see, there is a small number of ELB 504 and target errors; 99% are ELB 503 responses.

0

It seems there might be some confusion regarding the LCU capacity reservation metrics and their impact on your Application Load Balancer (ALB). Let's address your concerns:

  1. The discrepancy you're seeing between your reserved capacity (100 LCU) and the CloudWatch metric reporting 1.6 LCU is likely due to the difference between reserved capacity and actual usage. The LCU reservation sets a minimum capacity, but it doesn't mean this capacity will always be fully utilized. The metric you're seeing (1.6 LCU) probably represents your actual usage, while the 100 LCU is your reserved capacity. The "Load Balancer Capacity Units (LCU): peak vs. reserved" metric is correctly showing your reservation.

To monitor your reserved LCU utilization, you should look at the PeakLCUs CloudWatch metric. This metric accounts for peaks in your traffic pattern across all scaling dimensions. You can compare the per-minute SUM(PeakLCUs) against your reserved capacity value to determine if you've reserved enough capacity to meet your traffic needs.
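The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-minute values are made-up sample data standing in for the per-minute Sum of PeakLCUs, the function names are hypothetical, and fetching the real datapoints from CloudWatch is left out.

```python
from typing import Dict, List


def minutes_over_reservation(peak_per_minute: Dict[str, float],
                             reserved_lcu: float) -> List[str]:
    """Return the timestamps whose per-minute peak exceeded the reservation."""
    return [ts for ts, peak in sorted(peak_per_minute.items())
            if peak > reserved_lcu]


def headroom(peak_per_minute: Dict[str, float], reserved_lcu: float) -> float:
    """Reserved capacity minus the highest observed peak (negative = exceeded)."""
    highest = max(peak_per_minute.values(), default=0.0)
    return reserved_lcu - highest


# Hypothetical sample data: minute -> Sum(PeakLCUs) for that minute.
samples = {"12:00": 1.6, "12:01": 42.0, "12:02": 131.5, "12:03": 87.3}
RESERVED_LCU = 100.0  # the reservation from the question

print(minutes_over_reservation(samples, RESERVED_LCU))
print(headroom(samples, RESERVED_LCU))
```

If the first list is empty and the headroom stays positive during the spike, the reservation covered the peak, and the errors most likely have another cause.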

  2. Regarding the LCU capacity reservation not helping with the "ELB 5xx" metric spike, it's important to understand that LCU reservation is primarily about scaling speed, not error prevention. The reservation ensures that your load balancer has a minimum capacity available, but it doesn't automatically resolve issues causing 5xx errors.

These 5xx errors could be caused by various factors not directly related to load balancer capacity, such as issues with your backend servers, application errors, or network problems. LCU reservation helps your load balancer scale faster to handle traffic spikes, but if the root cause of the 5xx errors is elsewhere in your application stack, you'll need to investigate further.

To address the 5xx errors, you should:

  • Check your backend server health and capacity
  • Review your application logs for any errors
  • Ensure your security groups and network ACLs are correctly configured
  • Verify that your target groups are properly set up and the targets are responding correctly
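Before working through that checklist, it can help to see which metric dominates the spike. The sketch below is a minimal illustration: the counts are invented sample values (loosely modeled on a spike that is mostly ELB 503), and the helper names are hypothetical. The metric names are the standard ALB CloudWatch metrics for load-balancer-generated and target-generated errors.

```python
from typing import Dict


def error_breakdown(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each metric's share of all 5xx responses, in percent."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


def dominant_source(counts: Dict[str, int]) -> str:
    """Name of the metric with the highest count."""
    return max(counts, key=counts.get)


# Hypothetical counts summed over the spike window.
counts = {
    "HTTPCode_ELB_503_Count": 990,    # generated by the load balancer
    "HTTPCode_ELB_504_Count": 6,      # generated by the load balancer
    "HTTPCode_Target_5XX_Count": 4,   # generated by the targets
}

print(error_breakdown(counts))
print(dominant_source(counts))
```

A breakdown dominated by one ELB status code points the investigation at the causes documented for that specific code rather than at 5xx errors in general.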

Remember, LCU reservation is a tool to help with rapid scaling, but it's not a silver bullet for all types of performance issues. You may need to combine this with other strategies like improving application performance, increasing backend capacity, or optimizing your overall architecture to fully address your traffic spike challenges.
Sources
Load balancer Capacity Unit Reservation for your Application Load Balancer - Elastic Load Balancing
Using Load Balancer Capacity Unit Reservation to prepare for sharp increases in traffic | Networking & Content Delivery
Monitor Load balancer Capacity Unit reservation for your Application Load Balancer - Elastic Load Balancing

answered 11 days ago
EXPERT
reviewed 11 days ago
  • AI Agent, thank you :-)

    As far as I understand, "ELB 5xx" and similar metrics relate to the ALB itself, and "Target 5xx" and similar metrics relate to my backend. So by using capacity reservation (which is easier than pre-warming) I expect to deal with the ALB errors. My next step will be dealing with my backend scaling and sizing.
