DDR bandwidth

I'd like to calculate the peak bandwidth available to custom logic using the AWS F1 DDR interface. Tutorial video 2 has the following statements about DDR:

  • 4 DDR channels each having 16Gb, 72-bit wide of ECC memory
  • 4 DDR channels, each 12 Gbit/sec
    • 2400 MT/sec @1.2 GHz @ 72-bits
    • maximum bandwidth of 48 Gbit/sec
  • 512-bit busses at 250 MHz
    • max bandwidth of 16 Gbit/sec

Some questions:

  1. Some of these statements seem to conflict with each other. For instance, each 512-bit data interface at 250 MHz gives 512 bits * 0.25 G/sec = 128 Gbit/sec, but the max bandwidth is stated as 16 Gbit/sec.
  2. How many outstanding reads/writes can be supported at a time per DDR interface - is it 16?
  3. What is the approximate latency of a DDR read/write relative to the 250 MHz clock?
  4. Does the DDR memory controller serialize a 512-bit read request at 250MHz clock into 8 x 64-bit requests to the DDR 1.2 GHz or 1.4 GHz memory and present the data back to the CL logic as a 512-bit access?

Thanks for your help.
Azimuth Technology

asked 3 years ago, 423 views
3 Answers

Hello,

The FPGAs in F1 systems have DDR-2133, 72-bit (ECC) DIMMs. Since ECC is enabled, only 64 bits of the bus are used for data transfers.

So the theoretical max bandwidth per DDR interface is 2133 MT/s * 8 Bytes = ~17 GBytes/s

The DDR controller wrapped inside sh_ddr.sv (https://github.com/aws/aws-fpga/blob/master/hdk/common/shell_v04261818/design/sh_ddr/synth/sh_ddr.sv) is a Xilinx IP that provides a 512-bit AXI4 interface to the user logic. This IP can run at a max speed of 250 MHz. Hence user logic should be able to move data at up to 250 MHz * 64 Bytes (512 bits) = 16 GBytes/s
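As a quick sanity check of the arithmetic above, here is a minimal Python sketch. It uses only the figures quoted in this answer (DDR-2133, 64 data bits, 512-bit AXI4 at 250 MHz); the function and variable names are made up for illustration, and taking the smaller of the two numbers as the per-interface ceiling is an assumption, not a measured result.

```python
# Peak-bandwidth arithmetic using the figures quoted in this thread.
# All names here are illustrative, not part of the AWS HDK.

DDR_TRANSFERS_PER_SEC = 2133e6  # DDR-2133: 2133 mega-transfers per second
DDR_DATA_BYTES = 8              # 64 data bits of the 72-bit ECC bus
AXI_CLOCK_HZ = 250e6            # max clock of the AXI4 interface
AXI_DATA_BYTES = 64             # 512-bit AXI4 data bus


def peak_gbytes_per_sec(rate_hz, bytes_per_transfer):
    """Return peak bandwidth in GBytes/s for a given transfer rate and width."""
    return rate_hz * bytes_per_transfer / 1e9


ddr_peak = peak_gbytes_per_sec(DDR_TRANSFERS_PER_SEC, DDR_DATA_BYTES)
axi_peak = peak_gbytes_per_sec(AXI_CLOCK_HZ, AXI_DATA_BYTES)

# Assumption: the slower side bounds what the custom logic can see.
per_interface_peak = min(ddr_peak, axi_peak)

print(f"DDR side: {ddr_peak:.3f} GBytes/s")
print(f"AXI side: {axi_peak:.3f} GBytes/s")
print(f"Ceiling per DDR interface: {per_interface_peak:.3f} GBytes/s")
```

Both numbers are theoretical peaks; refresh, bank conflicts, and read/write turnaround will lower what a real design achieves.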

The DDR controller abstracts lower-level DDR details from the user logic. Any AXI4 write transaction presented to the controller's AXI4 interface is forwarded to the DDR memory. Similarly, AXI4 read requests are answered with data from the DDR memory. sh_ddr.sv supports up to 32 outstanding read transactions.

Please let us know if you need any additional details.

Thanks!
Chakra

AWS
answered 3 years ago

Thank you for the info, Chakra. The content in the tutorial video was incorrect: it mentioned 16 Gbit/sec, whereas your calculation of 16 GByte/sec makes sense. One additional question I had is about the latency of a DDR read/write access in 250 MHz clocks. For instance, according to your reply, the custom logic could issue 32 back-to-back 512-bit reads to the DDR via AXI4 over 32 cycles. Are there any specs on how many clocks, on average, it would take for the reads to return data?

Azimuth

answered 3 years ago

Hi Azimuth,

Currently we don't benchmark the average latency of the DDR AXI buses. The reason is that many variables are involved, for example the DDR traffic pattern, burst size, etc., which depend on the customer design and are outside our control. Instead, we provide an example design (https://github.com/aws/aws-fpga/tree/master/hdk/cl/examples/cl_dram_dma) that you can use to measure the average latency with the specific traffic patterns that best fit your design. I hope this helps.
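To see why burst size and the number of outstanding reads matter, a rough Little's-law estimate can help while planning measurements. The Python sketch below is only an illustration: the latency values passed in are hypothetical placeholders, not measured F1 numbers, and the function name and parameters are made up for this example. The 32 outstanding reads, 64-byte beats, and 250 MHz clock are the figures mentioned earlier in this thread.

```python
# Little's law estimate: sustained throughput = bytes in flight / latency,
# capped by the AXI bus peak. Latency values used below are HYPOTHETICAL.

AXI_CLOCK_HZ = 250e6   # AXI4 interface clock from this thread
BEAT_BYTES = 64        # one 512-bit data beat


def sustained_gbytes_per_sec(outstanding, beats_per_burst, latency_cycles):
    """Estimate sustained read bandwidth in GBytes/s.

    outstanding     -- read transactions in flight at once
    beats_per_burst -- 512-bit beats per AXI4 read transaction
    latency_cycles  -- assumed round-trip latency in 250 MHz clocks
    """
    bytes_in_flight = outstanding * beats_per_burst * BEAT_BYTES
    # bytes / (cycles / Hz) == bytes * Hz / cycles
    unbounded = bytes_in_flight * AXI_CLOCK_HZ / latency_cycles / 1e9
    bus_peak = AXI_CLOCK_HZ * BEAT_BYTES / 1e9
    return min(unbounded, bus_peak)


# Single-beat reads with a hypothetical 64-cycle latency.
single_beat = sustained_gbytes_per_sec(32, 1, 64)
# Four-beat bursts with the same hypothetical latency.
four_beat = sustained_gbytes_per_sec(32, 4, 64)

print(f"32 x 1-beat reads, 64-cycle latency: {single_beat:.1f} GBytes/s")
print(f"32 x 4-beat reads, 64-cycle latency: {four_beat:.1f} GBytes/s")
```

Under these assumptions, single-beat reads cannot hide a latency longer than 32 cycles, while longer bursts keep more bytes in flight and can. Real latencies should come from measurements on the example design.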

Thanks,
Chen
AWS
answered 3 years ago
