Looking for guidance on DNS round-robin behavior with multi-AZ FSx for Windows File Server


We are running FSx for Windows as a multi-AZ deployment. It is functional, but a technical question has come up that I can't find details on in the documentation.

The question is about DNS round-robin behavior on an active/passive multi-AZ deployment. When we query the DNS name of the AWS FSx endpoint, two IP addresses are returned. I believe that one IP is the active share in what I'll call AZ1 and the other IP is the passive share in AZ2. If we attempt to connect directly to AZ1's IP, it works. If we attempt to connect directly to AZ2's IP, it does not. If we attempt to connect via the hostname, it seems to always work.

My question is specific to the round robin behavior and best practices. Is what I've described above a result of a proper configuration? Is the expectation that Windows SMB clients will always re-query the IP address of the name in the event of a lack of connectivity? Normally with round-robin DNS a client system will cache one IP and use that IP until the cache refreshes, so I'm surprised to see this dual IP behavior in the FSx DNS. Any help and documentation would be greatly appreciated.

2 Answers

Hi,

I will gladly address the questions you raised here.

  1. Is what I've described above a result of a proper configuration?

Answer: Seeing two IPs is normal in a Multi-AZ environment. For a Multi-AZ file system, Amazon FSx automatically provisions and maintains a standby file server in a different Availability Zone. The standby file server does not accept connections unless it is marked as active. This is by design: only one file server in a Multi-AZ file system serves customer traffic at any given time. You see two IPs because the configuration allows a smooth failover to the standby server whenever the preferred server is unavailable.

  2. Is the expectation that Windows SMB clients will always re-query the IP address of the name in the event of a lack of connectivity?

Answer: That is correct. The second IP not responding is expected behavior, as only one IP can serve SMB traffic at a time. The second IP becomes available only when the Multi-AZ file system automatically fails over from the preferred file server to the standby file server. However, accessing the file system by hostname still succeeds, because the name resolves to both IPs and the client can reach whichever one is currently active.

Furthermore, it is recommended that you connect via the DNS hostname and not an IP address, because during maintenance the secondary IP becomes active, which could result in a client that is pinned to an IP connecting to the wrong one.
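If you want to see which of the two addresses is currently serving traffic, a small probe along these lines may help. This is only a sketch, not an AWS-provided tool: the function names are made up for illustration, it assumes the file system is reached over the standard SMB port 445, and a successful TCP connect only shows that something is listening, not that an SMB session would succeed.

```python
import socket

SMB_PORT = 445  # standard SMB-over-TCP port (assumed to be what the clients use)


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses DNS returns for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def accepts_tcp(ip, port=SMB_PORT, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(hostname, port=SMB_PORT, resolve=resolve_ipv4, check=accepts_tcp):
    """Map each IP the name resolves to onto whether it accepts connections now.

    resolve and check are injectable so the logic can be exercised without
    touching the network.
    """
    return {ip: check(ip, port) for ip in resolve(hostname)}


# Example call (the DNS name is a hypothetical placeholder):
#   for ip, ok in probe("fs-example.corp.example.com").items():
#       print(ip, "accepting" if ok else "not accepting")
```

On a healthy Multi-AZ file system you would expect, based on the behavior described above, one address to accept connections and the other not to, with the roles swapping after a failover.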

I would also recommend checking the AWS documentation [+], which provides more information on the availability and durability of Multi-AZ file systems. I have linked specifically to the Multi-AZ failover process, as I believe it addresses your concern. [+] https://docs.aws.amazon.com/fsx/latest/WindowsGuide/high-availability-multiAZ.html#MultiAZ-Failover

AWS
SUPPORT ENGINEER
Johno_D
answered 9 months ago

DNS round robin is normal DNS behavior. You need to set up the SPNs so that sessions are directed to the primary node in the cluster. If you create an alias and do not set the SPN, every other session will land on the offline node. Follow the instructions found here: https://docs.aws.amazon.com/fsx/latest/WindowsGuide/walkthrough05-file-system-custom-CNAME.html#step1-assign-dns-alias

answered 6 months ago
