By default, AWS sets the DPD timeout to 30 seconds, whereas Azure sets it to 45 seconds. Increasing both to 120 seconds has finally produced a stable tunnel. It has been up for 18+ hours so far, which is better than the previous 2 hours.
It would be interesting to hear whether anyone has an idea why the initial configuration works on 3 of our other tunnels, while this tunnel was the only one that constantly failed every 2 hours because AWS was not responding to DPD (based on what my Azure support says).
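For anyone wanting to script the AWS side of this change, here is a minimal sketch using boto3's EC2 `modify_vpn_tunnel_options` call. The helper names, the VPN ID, and the tunnel IP are placeholders I made up for illustration, and the 30-second lower bound is my understanding of the API constraint, so please check the current EC2 API reference before relying on it. Modifying tunnel options may also briefly interrupt the tunnel.

```python
def build_dpd_update(vpn_connection_id, tunnel_outside_ip, dpd_timeout_s=120):
    """Build the request parameters for raising the DPD timeout on one tunnel.

    Hypothetical helper: only the parameter names come from the EC2 API.
    """
    # Assumption: AWS rejects DPD timeouts below 30 seconds (the default).
    if dpd_timeout_s < 30:
        raise ValueError("DPD timeout must be at least 30 seconds")
    return {
        "VpnConnectionId": vpn_connection_id,
        "VpnTunnelOutsideIpAddress": tunnel_outside_ip,
        "TunnelOptions": {"DPDTimeoutSeconds": dpd_timeout_s},
    }


def apply_dpd_update(params):
    """Send the change to AWS. Needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2")
    return ec2.modify_vpn_tunnel_options(**params)


# Placeholder identifiers, not real resources.
params = build_dpd_update("vpn-EXAMPLE", "203.0.113.10", dpd_timeout_s=120)
print(params)
# apply_dpd_update(params)  # uncomment to actually apply the change
```

The Azure side has to be changed separately on the Azure connection; that is not covered here.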
Hello Tim,
A DPD timeout is generally the symptom of a problem. The fact that there was no DPD response, combined with the fact that it only happens for certain tunnels, suggests there is potentially an underlying problem with network connectivity. Since changing the timeout to 120 seconds seems to have fixed it, the blip most likely lasts between 30 and 120 seconds. It's worth noting that network blips may not impact applications that have built-in resiliency mechanisms and can re-establish connectivity and continue exchanging packets seamlessly, which may very well be the case here.
Further, if the DPD timeout is set to 120 seconds on the AWS end, the DPD "R_U_THERE" messages are sent every 10 seconds and the tunnel will time out only if 12 consecutive messages go unanswered. This means that if you had an underlying network problem for 110 seconds, the tunnel would still remain online, since the 12th DPD message would be answered and the timer would reset. This could be problematic if you have network-sensitive applications, but may not be a problem if the application is able to recover or re-establish connectivity as explained earlier. My recommendation:
If an application using this path is seeing problems, please get in touch with AWS Support via the Support portal from the account that the VPN lives in and mention:
a) The corresponding VPN ID(s) and region
b) Timestamps (with timezone) from the last couple of times the problem was seen
c) Excerpts of the Azure logs that we can compare with our own logs
I'm confident we should be able to get to the bottom of this once we look at our logs.
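To make the DPD arithmetic above concrete, here is a minimal sketch of the timing model as described in this answer: one probe every 10 seconds, and the tunnel is declared dead once `timeout / interval` consecutive probes go unanswered. The function names are hypothetical, and the model is simplified (it assumes the outage starts right after an answered probe), so treat it as an illustration rather than the actual gateway logic.

```python
def dpd_probe_budget(timeout_s, interval_s=10):
    """Number of consecutive unanswered probes that triggers a DPD timeout."""
    return timeout_s // interval_s


def tunnel_survives(outage_s, timeout_s, interval_s=10):
    """Return True if the tunnel stays up through an outage of outage_s seconds.

    Simplifying assumption: the outage begins just after an answered probe,
    so probes are lost at t = interval_s, 2 * interval_s, ... up to outage_s.
    """
    missed = outage_s // interval_s
    return missed < dpd_probe_budget(timeout_s, interval_s)


print(dpd_probe_budget(120))      # 12 probes
print(tunnel_survives(110, 120))  # True: the 12th probe is answered
print(tunnel_survives(45, 30))    # False: 4 probes lost, budget is 3
print(tunnel_survives(45, 120))   # True: same blip, longer timeout
```

Under this model, a blip of around 45 seconds would drop the tunnel at the 30-second default but go unnoticed at 120 seconds, which is consistent with what was observed.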
NOTE: Please refrain from divulging any personal information around your AWS resources including Resource IDs, Public IPs and Security group rules to name a few since all posts are publicly available indefinitely. If you need pointed guidance, please reach out to us at AWS Support via the Support console.