How do I achieve the least-access secure networking for SageMaker Training on Amazon FSx for Lustre?


I'm trying to figure out a minimally permissive yet operational network configuration for Amazon SageMaker training to train on data from Amazon FSx for Lustre. My understanding is that both the file system and the SageMaker instance can have their own security groups and that FSx uses TCP on ports 988 and 1021-1023. Therefore, I think a good network configuration for using SageMaker with FSx is the following:

  • SageMaker EC2 equipped with the security group SM-SG that allows Inbound only with TCP on 988 and 1021-1023 from FSX-SG only.
  • Amazon FSx equipped with the security group FSX-SG that allows outbound only with TCP on 988 and 1021-1023 towards SM-SG only. Is this configuration enough for the training to work? Do FSx and SageMaker need other ports and sources to be opened to operate normally?
1 Answer
Accepted Answer

For the security group for Amazon FSx (Example: FSx-SG), you need to add the following additional rules:

  1. FSx-SG needs inbound access from the security group for SageMaker (Example: SM-SG). The SageMaker instance needs to initiate a connection to the Amazon FSx file system, which is an inbound TCP packet to FSx.
  2. FSx-SG needs inbound and outbound access to itself. This is because, Amazon FSx for Lustre is a clustered file system, where each file system is typically powered by multiple file servers, and the file servers need to communicate with one another.

For more information on the minimum set of rules required for FSx-SG, see [File system access control with Amazon VPC][1]. [1]:

answered 2 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions