Let me address your questions about EFA (Elastic Fabric Adapter) on your g4dn.8xlarge instance:
- Yes, libfabric pingpong applications should work with the Nitro V3 variant of EFA on g4dn.8xlarge instances. The g4dn instance family supports EFA, and the libfabric layer is designed to work with it.
- Regarding your fi_info -p efa output: The differences between your output and the documentation are normal. The output format can vary between EFA driver versions. Your output shows "efa-direct" and "efa" fabrics with "rdmap0s29-rdm" domains, while the documentation shows an "EFA-fe80::94:3dff:fe89:1b70" fabric with an "efa_0-rdm" domain.
This doesn't necessarily mean your EFA setup is missing the GID. The output format has evolved across different EFA versions. What's important is that you're seeing the EFA provider with FI_EP_RDM type and FI_PROTO_EFA protocol, which indicates that EFA is properly detected by libfabric.
- For running libfabric applications with EFA:
- For server address: You typically use the hostname or IP address of the instance, not the EFA device itself. EFA works at a lower level but applications still use the host's identity for initial connection setup.
- The zero GIDs in your ibv_devinfo output are expected for EFA devices, as EFA uses a different addressing scheme than traditional InfiniBand.
- To run libfabric applications:
a. Make sure the security group attached to the instances allows all inbound and outbound traffic to and from the security group itself
b. Use the hostname or IP address of the target instance
c. Set the environment variable FI_PROVIDER=efa to ensure libfabric uses the EFA provider
d. For MPI applications, use FI_PROVIDER=efa mpirun -np <processes> --hostfile <hostfile> <application>
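Steps b and c can be sketched as a small shell helper around fi_pingpong, the pingpong test that ships with libfabric. This is a minimal sketch, not an official procedure: the function name run_pingpong and the DRY_RUN switch are made up for illustration, and the -e rdm flag assumes you want the reliable datagram endpoint type that your fi_info output reports.

```shell
# run_pingpong [server_addr]
# Hypothetical helper: with no argument it starts the fi_pingpong server side;
# with the server's hostname or IP as argument it starts the client side.
run_pingpong() {
    # -p selects the libfabric provider, -e the endpoint type (rdm here).
    set -- fi_pingpong -p efa -e rdm "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print the command that would be run.
        echo "FI_PROVIDER=efa $*"
    else
        # FI_PROVIDER restricts libfabric to the EFA provider for this command.
        FI_PROVIDER=efa "$@"
    fi
}
```

On the server instance you would call run_pingpong with no argument, and on the client instance run_pingpong followed by the server's hostname or private IP. Setting DRY_RUN=1 prints the command line so you can inspect it before running it.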
The EFA setup appears to be working correctly based on your fi_info output showing the EFA provider. The zero GIDs in ibv_devinfo are normal for EFA devices and don't indicate a problem with your configuration.
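If you want to script that check, something like the following could work. It is only a sketch: the function name efa_detected is made up, and it assumes fi_info prints its usual "provider:", "type:" and "protocol:" lines. It looks for the three strings anywhere in the output rather than within a single entry.

```shell
# efa_detected: read fi_info output on stdin and succeed only if it mentions
# the efa provider, the FI_EP_RDM endpoint type and the FI_PROTO_EFA protocol.
efa_detected() {
    out=$(cat)
    for pattern in 'provider: efa' 'type: FI_EP_RDM' 'protocol: FI_PROTO_EFA'; do
        # -F matches the pattern as a fixed string, -q suppresses output.
        printf '%s\n' "$out" | grep -F -q "$pattern" || return 1
    done
}
```

On the instance this would be used as: fi_info -p efa | efa_detected && echo "EFA provider detected"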
Sources
Launching a AWS Deep Learning AMIs Instance With EFA - AWS Deep Learning AMIs
