Cannot register memory region for remote RDMA read on EFA.


OS:
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
Kernel:
5.4.0-1045-aws

efadv_query_device() shows that the EFADV_DEVICE_ATTR_CAPS_RDMA_READ capability bit is not set.
I followed the steps here:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html
Is remote RDMA read supported on EFA?
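For reference, the capability check I ran looks roughly like this (a sketch assuming rdma-core with the EFA provider installed; compile with `gcc efa_caps.c -libverbs -lefa`; it needs an actual EFA device to report anything useful):

```c
#include <stdio.h>
#include <infiniband/verbs.h>
#include <infiniband/efadv.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    for (int i = 0; i < num; i++) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        if (!ctx)
            continue;

        struct efadv_device_attr attr = {0};
        /* efadv_query_device() fails with EOPNOTSUPP on non-EFA devices */
        if (efadv_query_device(ctx, &attr, sizeof(attr)) == 0) {
            printf("%s: RDMA read %s\n",
                   ibv_get_device_name(devs[i]),
                   (attr.device_caps & EFADV_DEVICE_ATTR_CAPS_RDMA_READ)
                       ? "supported" : "NOT supported");
        }
        ibv_close_device(ctx);
    }

    ibv_free_device_list(devs);
    return 0;
}
```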

crhu
Asked 3 years ago · 491 views
8 answers

You do not mention which instance type you are using for your tests, but currently RDMA semantics with EFA are only natively supported on P4d instances.

AWS
Answered 3 years ago

The instance used was c5n.18xlarge.

crhu
Answered 3 years ago

Are there plans to support RDMA on all EFA-enabled instance types in the future?

I see that Libfabric supports RMA remote read/write; is it implemented on top of send/receive, or is it also only supported on P4d instances?

Edited by: crhu on Jun 9, 2021 11:05 PM

crhu
Answered 3 years ago

Obviously, we want to grow the number of instance types that support RDMA with EFA beyond P4. Not surprisingly, I cannot comment on specifics of our plans in this forum.

The EFA provider for Libfabric does expose the FI_RMA interface. It automatically detects whether the EFA hardware supports RDMA operations and uses either the native RDMA features or an emulated send/receive path.

It's worth noting that our RDMA read operation does not conform to the InfiniBand spec (it is not InfiniBand, after all!). In particular, there is no read-once or write-once guarantee: in a retransmit case, we will re-read the source buffer and may write the data more than once. In Libfabric, this is expressed by requiring completion events for the RDMA operations. We also do not provide byte-ordered data placement, so you cannot do "poll on last byte" tricks.
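A minimal way to confirm that the provider advertises FI_RMA from an application's point of view (a sketch; assumes the libfabric development headers and linking with -lfabric; the provider name "efa" is the real provider string, but FI_VERSION(1, 9) is just an illustrative API version):

```c
#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>

int main(void)
{
    struct fi_info *hints = fi_allocinfo();
    struct fi_info *info = NULL;

    /* Ask for RMA read capability from the EFA provider specifically;
     * whether it is native RDMA or the emulated path is the provider's
     * internal decision and is not visible here. */
    hints->caps = FI_RMA | FI_READ;
    hints->ep_attr->type = FI_EP_RDM;
    hints->fabric_attr->prov_name = strdup("efa");

    int ret = fi_getinfo(FI_VERSION(1, 9), NULL, NULL, 0, hints, &info);
    if (ret) {
        fprintf(stderr, "fi_getinfo: %s\n", fi_strerror(-ret));
    } else {
        for (struct fi_info *cur = info; cur; cur = cur->next)
            printf("%s: FI_RMA available\n", cur->fabric_attr->prov_name);
        fi_freeinfo(info);
    }
    fi_freeinfo(hints);
    return ret ? 1 : 0;
}
```

On a host without an EFA device this simply reports that no matching fabric was found.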

AWS
Answered 3 years ago

Thanks for the information, Brian. If possible, could you say when we can expect RDMA read support on other EFA instance types?

crhu
Answered 3 years ago

Hi Brian, can you confirm a couple of technical details about the Libfabric EFA provider? In the emulated RDMA read, the sender sends an RXR_SHORT_RTR_PKT/RXR_LONG_RTR_PKT packet to the receiver with the requested read addresses. The receiver polls for the packet, inspects the header packet type, and memcpys the data at the requested read addresses into an RXR_READRSP_PKT, which it sends back to the requesting sender. The sender polls for the RXR_READRSP_PKT and memcpys the received data to the final destination. So instead of the zero copies of real RDMA, the emulated RDMA read incurs a memcpy on both the sender and the receiver side.

Edited by: crhu on Jun 15, 2021 11:47 PM

crhu
Answered 3 years ago

Correct. The receive side memcpy is basically required, because of all the usual memory placement issues. On the send side (the target of the read), there is a potential optimization for larger read requests to send directly from the user buffer, but given the way MPI uses the Libfabric interface, that hasn't been a priority for the team yet.

AWS
Answered 3 years ago

Hello, I recently tried to run perftest on both p4d and p3dn instances to test RDMA, but it always fails with the error "Couldn't allocate MR". I also tried setting the memory lock limit to unlimited in /etc/security/limits.conf, but that didn't help. Do you have any suggestions?

jiawei
Answered 3 years ago
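For anyone hitting the same error: "Couldn't allocate MR" usually means ibv_reg_mr() failed, most often because RLIMIT_MEMLOCK is too low in the shell that actually runs the test. A quick diagnostic sketch (the limits.conf lines are the standard pam_limits form and only take effect in new login sessions):

```shell
# Check the effective locked-memory limit in the shell running perftest;
# it should print "unlimited" (or a very large value).
ulimit -l

# Standard persistent setting in /etc/security/limits.conf
# (requires logging out and back in; it does NOT apply to daemons,
#  systemd services, or containers, which have their own limit settings,
#  e.g. systemd's LimitMEMLOCK= or docker's --ulimit memlock):
#   * soft memlock unlimited
#   * hard memlock unlimited
```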
