Learn how to troubleshoot issues that may occur when using LACP (Link Aggregation Control Protocol) with Direct Connect.
Overview
This article provides a structured approach to diagnosing and resolving LACP issues on an AWS Direct Connect link aggregation group (LAG).
Resolution
What is LACP?
LACP (Link Aggregation Control Protocol) is the negotiation protocol defined in the IEEE 802.3ad standard (later moved to IEEE 802.1AX). It lets two directly connected devices group multiple Ethernet interfaces into a single link-layer interface.
In AWS Direct Connect, you can create a LAG (Link Aggregation Group), which is a logical interface that aggregates multiple dedicated connections at a single endpoint and treats them as a single managed connection.
AWS-Side Diagnosis
Before checking vendor equipment, narrow down the problem scope using the AWS Management Console, the AWS CLI, and Amazon CloudWatch.
Check LAG and Connection State
AWS Console:
- Direct Connect Console → LAGs
- Verify LAG status and each member connection's state
- Check "Minimum Links" configuration
- Verify number of "Available" connections meets minimum requirement
AWS CLI:
aws directconnect describe-lags --lag-id dxlag-00000
aws directconnect describe-connections --connection-id dxcon-00000
CloudWatch Metrics:
- ConnectionState — Monitor connection status (0=DOWN, 1=UP). Expected: Value should be 1
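The console checks above can also be scripted. The following is a minimal sketch, not an official tool: `summarize_lag` and `fetch_lag` are hypothetical helper names, and the sketch assumes boto3 is installed with credentials that allow `directconnect:DescribeLags`. The field names (`lagState`, `minimumLinks`, `connections`, `connectionState`) are those returned by the DescribeLags API.

```python
from typing import Any


def summarize_lag(lag: dict[str, Any]) -> dict[str, Any]:
    """Compare the number of available member connections with minimumLinks."""
    members = lag.get("connections", [])
    # Member connections whose state is "available".
    available = [c["connectionId"] for c in members
                 if c.get("connectionState") == "available"]
    # Every other member, keyed by ID, with the state it reported.
    not_available = {c["connectionId"]: c.get("connectionState", "unknown")
                     for c in members
                     if c.get("connectionState") != "available"}
    minimum = lag.get("minimumLinks", 0)
    return {
        "lag_id": lag.get("lagId"),
        "lag_state": lag.get("lagState"),
        "minimum_links": minimum,
        "available": available,
        "not_available": not_available,
        "meets_minimum": len(available) >= minimum,
    }


def fetch_lag(lag_id: str) -> dict[str, Any]:
    """Fetch one LAG description (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so summarize_lag works without boto3

    client = boto3.client("directconnect")
    return client.describe_lags(lagId=lag_id)["lags"][0]
```

For example, `summarize_lag(fetch_lag("dxlag-00000"))` (using the placeholder ID from this article) returns a dictionary whose `meets_minimum` entry is `False` when fewer member connections are available than the LAG's minimum links setting.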
Check Optical Signal Levels
Verify optical power levels to rule out physical layer issues before investigating LACP.
CloudWatch Metrics:
- ConnectionLightLevelRx — Receive signal strength (dBm)
- ConnectionLightLevelTx — Transmit signal strength (dBm)
Expected: Optical power levels should be within the acceptable range (-14.4 to 2.50 dBm for 1G and 10G connections). If out of range, the issue is physical — not LACP.
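As an illustration, the range check can be automated against CloudWatch. This is a sketch under stated assumptions: `light_level_ok` and `latest_light_level` are hypothetical helper names, the thresholds are the 1G/10G values quoted above, and the caller must supply the `OpticalLaneNumber` dimension value shown for the metric in CloudWatch (no default is guessed here).

```python
from datetime import datetime, timedelta, timezone

# Range quoted in this article for 1G and 10G connections (dBm).
LIGHT_MIN_DBM = -14.4
LIGHT_MAX_DBM = 2.50


def light_level_ok(dbm: float) -> bool:
    """Return True if a light-level reading is inside the quoted range."""
    return LIGHT_MIN_DBM <= dbm <= LIGHT_MAX_DBM


def latest_light_level(connection_id: str, metric_name: str, lane: str):
    """Return the most recent 5-minute average for a light-level metric, or None.

    metric_name is ConnectionLightLevelRx or ConnectionLightLevelTx.
    Requires boto3 and AWS credentials.
    """
    import boto3  # imported lazily so light_level_ok works without boto3

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/DX",
        MetricName=metric_name,
        Dimensions=[
            {"Name": "ConnectionId", "Value": connection_id},
            {"Name": "OpticalLaneNumber", "Value": lane},
        ],
        StartTime=now - timedelta(minutes=30),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to be ordered, so sort by timestamp.
    points = sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
    return points[-1]["Average"] if points else None
```

A reading that fails `light_level_ok` points to the physical layer (fiber, transceiver, patch panel) rather than to LACP.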
Check AWS Health Dashboard
AWS Direct Connect connections can go down due to planned or emergency maintenance.
Action:
- Check the Events section of the AWS Health Dashboard
- Look for ongoing or recently completed maintenance affecting your Direct Connect connection
Note: During maintenance, BGP connections may transition to idle state, which can last from minutes to hours.
Common LACP Failure Patterns
Use the AWS-side diagnosis results above to match your issue to the patterns below:
Pattern 1: Entire LAG is Down
Possible causes:
- Both sides set to LACP Passive (neither initiates negotiation)
- CGW (customer gateway) side configured in static "on" mode instead of dynamic LACP mode
- All member links physically down (check optical levels)
- minimumLinks threshold not met
Pattern 2: Some Member Links Down, Others Up
Possible causes:
- Specific link's physical issue (fiber, transceiver, port)
- Speed mismatch on individual member (all connections must have same bandwidth)
- Vendor-side port-channel member configuration inconsistency
Pattern 3: LAG Flaps Intermittently
Possible causes:
- LACP timer mismatch: AWS uses Fast (1-second interval); if the CGW uses Slow (30-second interval), failure detection on the CGW side is delayed
- Optical signal degradation (marginal dBm levels)
- Upstream maintenance events
Pattern 4: Links Physically Up but Active Links Below Minimum
Possible causes:
- Active member count dropped below the minimumLinks setting, so the entire LAG goes down even though some links are physically up
- minimumLinks value set too high relative to the number of connections that are actually available
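The four patterns can be expressed as a small decision function. This is only a sketch for orientation: `count_transitions` and `match_pattern` are hypothetical helpers, the inputs are assumed to come from the AWS-side checks described earlier (member link states and a series of ConnectionState datapoints), and `flap_threshold` is an arbitrary illustrative value, not an AWS setting.

```python
def count_transitions(states: list[int]) -> int:
    """Count UP/DOWN changes in a ConnectionState series (0=DOWN, 1=UP)."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def match_pattern(member_up: list[bool], minimum_links: int,
                  transitions: int = 0, flap_threshold: int = 4) -> str:
    """Map a snapshot of member link states to one of the four patterns."""
    up = sum(member_up)  # number of member links currently up
    if transitions >= flap_threshold:
        return "Pattern 3: LAG flaps intermittently"
    if up == 0:
        return "Pattern 1: entire LAG is down"
    if up < minimum_links:
        return "Pattern 4: active links below minimum"
    if up < len(member_up):
        return "Pattern 2: some member links down, others up"
    return "No pattern matched: all member links are up"


print(match_pattern([True, False, False], minimum_links=2))
# prints "Pattern 4: active links below minimum"
```

The flap check runs first because an intermittently failing LAG can look like any of the other patterns at the moment the snapshot is taken.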
LACP Configuration Requirements for AWS Direct Connect
Verify that CGW-side configuration meets the following requirements:
LACP Mode
- AWS side is always Active
- CGW side must be Active or Passive (dynamic LACP)
- Both sides Passive → LAG will never come up
- Static mode ("on") → LAG will not come up, because AWS requires LACP protocol negotiation
LACP Timer
- AWS uses Fast mode (1-second interval)
- Recommendation: Set CGW side to Fast mode for faster detection and recovery during link failures
- Mismatch (AWS Fast / CGW Slow) may cause delayed failover detection
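To see why the timer setting matters, recall that the LACP standard expires a partner after three missed LACPDUs. The short calculation below uses the standard periodic intervals (1 second for Fast, 30 seconds for Slow); `detection_time_s` is a hypothetical helper name. It covers only the case where LACPDU loss is the sole failure signal; a physical link-down is normally detected sooner.

```python
# Periodic transmission intervals defined by the LACP standard (seconds).
PERIODIC_INTERVAL_S = {"fast": 1, "slow": 30}
# A partner is considered expired after three missed LACPDUs.
TIMEOUT_MULTIPLIER = 3


def detection_time_s(timer: str) -> int:
    """Worst-case time for a device using `timer` to notice that LACPDUs stopped."""
    return PERIODIC_INTERVAL_S[timer] * TIMEOUT_MULTIPLIER


for timer in ("fast", "slow"):
    print(f"{timer}: up to {detection_time_s(timer)} s")
# prints "fast: up to 3 s" then "slow: up to 90 s"
```

A CGW left on the Slow timer can therefore keep forwarding onto a failed member link for up to 90 seconds, while the AWS side, on the Fast timer, has already expired it after about 3 seconds.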
Member Link Requirements
- All connections in a LAG must have the same bandwidth (mixing 1G and 10G is not supported)
- All connections must terminate at the same AWS Direct Connect endpoint
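When reviewing an inventory of member connections, both requirements can be sanity-checked with a few lines. In this sketch, `members_differing` is a hypothetical helper; it assumes each connection is a dictionary as returned by the DescribeConnections API, where `bandwidth` holds the port speed and `awsDeviceV2` identifies the terminating Direct Connect endpoint.

```python
from collections import Counter


def members_differing(connections: list[dict], field: str) -> dict[str, str]:
    """Return {connectionId: value} for members whose `field` differs
    from the most common value among all members."""
    counts = Counter(c[field] for c in connections)
    if len(counts) <= 1:
        return {}  # all members agree (or there are no members)
    expected, _ = counts.most_common(1)[0]
    return {c["connectionId"]: c[field]
            for c in connections if c[field] != expected}
```

Calling `members_differing(conns, "bandwidth")` and `members_differing(conns, "awsDeviceV2")` should both return an empty dictionary for a valid set of LAG members.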
Minimum Links
- If the number of active links falls below the configured minimum-links value, the entire LAG goes down
- Review this setting when adding or removing connections from a LAG
CGW-Side Verification (Vendor Commands)
Note: Command syntax varies by vendor and OS version; refer to the vendor's official documentation for the exact form.
After narrowing down the issue using AWS-side diagnosis above, use the following vendor commands to verify CGW router configuration.
Verify Interface and LAG Operational Status
Junos OS:
show interfaces terse | match lag-name
show interfaces ae0
Cisco IOS:
show etherchannel summary
Verify LACP Activity Mode and Partner State
Junos OS:
show lacp interfaces interface-name
Cisco IOS:
show lacp neighbor
show etherchannel detail
Check that:
- Actor (CGW) mode is Active or Passive
- Partner (AWS) mode shows Active
- Partner System ID is consistent across all member links
Verify LACP Timer Configuration
Junos OS:
show lacp interfaces interface-name
Cisco IOS:
show lacp internal
show lacp neighbor detail
Check that the periodic timer matches the AWS side (Fast / Short).
Check Optical Signal Levels (CGW Side)
Junos OS:
show interfaces diagnostics optics interface-name | match dBm | except thre
Cisco IOS:
show interfaces transceiver