Questions tagged with FPGA Development

AWS F1 Instance-- CL_SDE example: trying to use DPDK Pkt gen

I am using an AWS F1 instance with a modified CL_SDE example from the Git repo. We modified the HDL in the example so that a control register can invert the looped-back data and stop the internal loopback, and so that the control registers expose the last packet in/out and the packet counters. It is something super simple, but it exercises a lot of our libraries.

We run the testpmd tool, and our packet counters match what the tool reports. We see that testpmd sends a 64-byte packet as expected, and that seems to repeat. When I invert the data, testpmd keeps reporting everything as looping back just fine (it does no data integrity checking). When I stop the traffic via my control register, traffic stops. So the simple hardware seems to do what I want and matches the sim we created.

We are now moving over to the DPDK pktgen tool to generate more strenuous and controllable traffic. I was able to get the DPDK pktgen library to compile (I had to use the same tag for the Git repo as the DPDK version). It appears to have built and installed just fine. However, it doesn't run well. I have tried a number of things (devbind, among others), but I keep getting a message saying "did not find any ports to use".

In the Virtual Ethernet example, the end-to-end example uses the PKTGEN tool to generate traffic and send it over an elastic network interface to another VM. I am just trying to get PKTGEN to talk directly to the FPGA: in the link below, under end to end, I am trying to get the PKTGEN tool to bind to the SPP directly. Has anyone been able to get the PKTGEN tool to work directly with the FPGA?

https://github.com/aws/aws-fpga/blob/master/sdk/apps/virtual-ethernet/doc/Virtual_Ethernet_Application_Guide.md#HelloWorldLoopback

I have tried a number of things, but to no avail.

Copyright (c) <2010-2020>, Intel Corporation. All rights reserved. Powered by DPDK
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/pg/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: No legacy callbacks, legacy socket not created
*** Copyright (c) <2010-2020>, Intel Corporation. All rights reserved.
*** Pktgen created by: Keith Wiles -- >>> Powered by DPDK <<<
Port: Name IfIndex Alias NUMA PCI
!PANIC!: *** Did not find any ports to use ***
PANIC in pktgen_config_ports():
*** Did not find any ports to use ***
6: [veth_app/Pktgen-DPDK/Builddir/app/pktgen() [0x404baa]]
5: [/lib64/libc.so.6(__libc_start_main+0xea) [0x7ffa3d16013a]]
4: [veth_app/Pktgen-DPDK/Builddir/app/pktgen() [0x4047fd]]
3: [veth_app/Pktgen-DPDK/Builddir/app/pktgen() [0x42681b]]
2: [/usr/local/lib64/librte_eal.so.20.0(__rte_panic+0xba) [0x7ffa3db5aeaa]]
1: [/usr/local/lib64/librte_eal.so.20.0(rte_dump_stack+0x1b) [0x7ffa3db7a09b]]
Aborted
[ec2-user@ip-10-20-7-111 runtime]$
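One thing that may be worth ruling out when a DPDK application reports no ports is which kernel driver each candidate PCI device is currently bound to. The sketch below is only an illustration, not a diagnosis of this specific failure: it reads the standard Linux sysfs layout under `/sys/bus/pci/devices` and lists devices by vendor ID. The vendor ID `0x1d0f` is taken from the tool output in another question on this page; the function name `list_pci_devices` is made up for this example.

```python
import os
from pathlib import Path

# Assumption: 0x1d0f is the PCI vendor ID of the FPGA device, as shown in
# the fpga-describe-local-image output quoted elsewhere on this page.
AMAZON_VENDOR_ID = "0x1d0f"


def list_pci_devices(vendor_id, sysfs_root="/sys/bus/pci/devices"):
    """Return (address, device_id, driver) for each PCI device of a vendor.

    driver is None when the device has no kernel driver bound.
    """
    results = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        vendor_file = dev / "vendor"
        if not vendor_file.is_file():
            continue
        if vendor_file.read_text().strip().lower() != vendor_id.lower():
            continue
        device_id = (dev / "device").read_text().strip()
        driver_link = dev / "driver"
        # In sysfs, "driver" is a symlink to the bound driver's directory.
        if driver_link.exists():
            driver = os.path.basename(os.path.realpath(driver_link))
        else:
            driver = None
        results.append((dev.name, device_id, driver))
    return results


if __name__ == "__main__":
    for addr, device_id, driver in list_pci_devices(AMAZON_VENDOR_ID):
        print(f"{addr} device={device_id} driver={driver or 'unbound'}")
```

Since testpmd already works against the same hardware in this setup, a device that shows the same driver binding before both runs would suggest the difference lies in how pktgen is built or launched rather than in the binding itself.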
0 answers, 0 votes, 12 views, asked 23 days ago

Cannot load my FPGA Image

Hello,

Several years ago I developed an FPGA accelerator for F1 instances and set it as public. Now I want to use it again, but I cannot load it. In short: it is my image, it is set as public, and it matches the shell on the AWS instance, yet it will not load. The details are below. Can you please help me resolve the issue and load the image?

Kind Regards,
Furkan

___

These are the details of my image, with `afi-0da97a1d59bf1e558` and `agfi-05bfb2806dd7970d2`. I am the owner, with the correct Owner ID. Besides, it is set as public.

```
[centos@ip-172-31-92-65 xdma]$ aws ec2 describe-fpga-images --fpga-image-ids afi-0da97a1d59bf1e558
{
    "FpgaImages": [
        {
            "UpdateTime": "2019-12-06T13:58:03.000Z",
            "Name": "he_v1_5_afi",
            "Tags": [],
            "PciId": {
                "SubsystemVendorId": "0xfedd",
                "VendorId": "0x1d0f",
                "DeviceId": "0xf000",
                "SubsystemId": "0x1d51"
            },
            "DataRetentionSupport": false,
            "FpgaImageGlobalId": "agfi-05bfb2806dd7970d2",
            "State": {
                "Code": "available"
            },
            "ShellVersion": "0x04261818",
            "OwnerId": "210929643974",
            "FpgaImageId": "afi-0da97a1d59bf1e558",
            "Public": true,
            "Description": "he_v1_5_description"
        }
    ]
}
```

___

Here is the slot information. You can see that the shell version `0x04261818` matches the accelerator information above.

```
[centos@ip-172-31-92-65 xdma]$ sudo fpga-describe-local-image -S 0 -H
Type  FpgaImageSlot  FpgaImageId  StatusName  StatusCode  ErrorName  ErrorCode  ShVersion
AFI   0              none         cleared     1           ok         0          0x04261818
Type       FpgaImageSlot  VendorId  DeviceId  DBDF
AFIDEVICE  0              0x1d0f    0x1042    0000:00:1d.0
```

___

Now, if I try to load it, I receive an error:

```
[centos@ip-172-31-92-65 xdma]$ sudo fpga-load-local-image -S 0 -I agfi-05bfb2806dd7970d2
Error: (5) invalid-afi-id
The agfi id passed is invalid or you do not have permission to load the AFI.
```
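The by-eye comparison above (AFI state, public flag, and shell version against the slot's `ShVersion`) can be scripted. The following is a minimal sketch that only parses JSON shaped like the `describe-fpga-images` output quoted in the question; the helper name `summarize_afis` and the trimmed `SAMPLE` string are made up for illustration, and it does not call AWS or explain the `invalid-afi-id` error.

```python
import json

# Trimmed copy of the describe-fpga-images output quoted above.
SAMPLE = """
{
    "FpgaImages": [
        {
            "FpgaImageId": "afi-0da97a1d59bf1e558",
            "FpgaImageGlobalId": "agfi-05bfb2806dd7970d2",
            "State": {"Code": "available"},
            "ShellVersion": "0x04261818",
            "Public": true
        }
    ]
}
"""


def summarize_afis(describe_json, slot_sh_version):
    """Summarize each AFI and whether its shell matches the slot's ShVersion."""
    slot = int(slot_sh_version, 16)  # hex strings such as "0x04261818"
    summary = []
    for img in json.loads(describe_json)["FpgaImages"]:
        summary.append({
            "agfi": img["FpgaImageGlobalId"],
            "state": img["State"]["Code"],
            "public": img["Public"],
            "shell_match": int(img["ShellVersion"], 16) == slot,
        })
    return summary


if __name__ == "__main__":
    for row in summarize_afis(SAMPLE, "0x04261818"):
        print(row)
```

Comparing the versions as integers rather than strings avoids false mismatches from differences in case or zero padding between the two tools' output.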
1 answers, 0 votes, 46 views, asked 2 months ago

Clock gating / using "highly discouraged" constraint

I have a wide and deep shift register in my design that I want to control using a gated clock. Its size and distribution throughout the entire device make a clock enable an inferior option from a routability and resource usage perspective.

I tried the below and believe my design is working, but I have misgivings about using a constraint that is "highly discouraged" without truly understanding what I'm doing and whether there's a recommended alternative. Guidance on the suitability of my approach, and on why the constraint is "highly discouraged", would be appreciated.

I sequentially tried:

1) Inferring from Verilog RTL code in various ways. Nothing looked good.

2) Instantiating a BUFGCE primitive alone.

```
CRITICAL WARNING: [DRC HDPR-59] Clock Net Rule Violation: Illegal clock load 'WRAPPER_INST/CL/BUFGCE_unknown' found on PR boundary clock net 'WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_out1'. Boundary clock nets are not fully supported to drive loads of type BUFGCE inside a reconfigurable region. This type of connection may cause downstream tool issues. The recommended solution is to add an MMCM as the clock load driving the original BUFGCE load.
```

3) Instantiating a MMCME4_BASE primitive before a BUFGCE primitive.

```
Phase 1.2 IO Placement/ Clock Placement/ Build Placer Device
ERROR: [Place 30-718] Sub-optimal placement for an MMCM/PLL-BUFGCE-MMCM/PLL cascade pair.If this sub optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .xdc file to demote this message to a WARNING. However, the use of this override is highly discouraged. These examples can be used directly in the .xdc file to override this clock rule.
    set_property CLOCK_DEDICATED_ROUTE ANY_CMT_COLUMN [get_nets WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_out1]

    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout1_buf (BUFGCE.O) is locked to BUFGCE_X1Y181 (in SLR 1)
    The loads are distributed to 1 user pblock constraints.
    In addition, there are 0 loads not in user pblock constraints.
    Displaying the first 1 loads for pblock constraint 1
    WRAPPER_INST/CL/MMCME4_BASE_inst (MMCME4_ADV.CLKIN1) is provisionally placed by clockplacer on MMCM_X0Y5 (in SLR 1)

    The above error could possibly be related to other connected instances. Following is a list of all the related clock rules and their respective instances.

    Clock Rule: rule_bufgce_bufg_conflict
    Status: PASS
    Rule Description: Only one of the 2 available sites (BUFGCE or BUFGCE_DIV/BUFGCTRL) in a pair can be used at the same time
    WRAPPER_INST/CL/BUFGCE_unknown (BUFGCE.O) is provisionally placed by clockplacer on BUFGCE_X0Y120 (in SLR 1)

    Clock Rule: rule_mmcm_bufg
    Status: PASS
    Rule Description: A MMCM driving a BUFG must be placed in the same clock region of the device as the BUFG
    WRAPPER_INST/CL/MMCME4_BASE_inst (MMCME4_ADV.CLKOUT0) is provisionally placed by clockplacer on MMCM_X0Y5 (in SLR 1)
    WRAPPER_INST/CL/BUFGCE_unknown (BUFGCE.I) is provisionally placed by clockplacer on BUFGCE_X0Y120 (in SLR 1)

    Clock Rule: rule_bufgce_bufg_conflict
    Status: PASS
    Rule Description: Only one of the 2 available sites (BUFGCE or BUFGCE_DIV/BUFGCTRL) in a pair can be used at the same time
    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout1_buf (BUFGCE.O) is locked to BUFGCE_X1Y181 (in SLR 1)

    Clock Rule: rule_mmcm_bufg
    Status: PASS
    Rule Description: A MMCM driving a BUFG must be placed in the same clock region of the device as the BUFG
    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/mmcme3_adv_inst (MMCME4_ADV.CLKOUT0) is locked to MMCM_X1Y7 (in SLR 1)
    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout1_buf (BUFGCE.I) is locked to BUFGCE_X1Y181 (in SLR 1)

    Clock Rule: rule_bufgce_bufg_conflict
    Status: PASS
    Rule Description: Only one of the 2 available sites (BUFGCE or BUFGCE_DIV/BUFGCTRL) in a pair can be used at the same time
    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout2_buf (BUFGCE.O) is locked to BUFGCE_X1Y183 (in SLR 1)

    Clock Rule: rule_bufgce_bufg_conflict
    Status: PASS
    Rule Description: Only one of the 2 available sites (BUFGCE or BUFGCE_DIV/BUFGCTRL) in a pair can be used at the same time
    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout3_buf (BUFGCE.O) is locked to BUFGCE_X1Y172 (in SLR 1)

    Clock Rule: rule_bufgce_bufg_conflict
    Status: PASS
    Rule Description: Only one of the 2 available sites (BUFGCE or BUFGCE_DIV/BUFGCTRL) in a pair can be used at the same time
    WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout4_buf (BUFGCE.O) is locked to BUFGCE_X1Y171 (in SLR 1)

Resolution: The MMCM/PLL-BUFGCE-MMCM/PLL cascade pair can use the dedicated path between them if they are placed in vertically adjacent clock regions and in the same column (LEFT/RIGHT) of the device.
```

4) Instantiating a MMCME4_BASE before a BUFGCE primitive and constraining the design with `set_property CLOCK_DEDICATED_ROUTE ANY_CMT_COLUMN [get_nets WRAPPER_INST/SH/kernel_clks_i/clkwiz_sys_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_out1]` in cl_pnr_user.xdc. This met with success.
4 answers, 0 votes, 50 views, asked 3 months ago

Using-PCIe-Write-Combining example returns unexpected results

Hi, I'm an AWS F1 user, and I'm trying to run the PCIe Write Combining example on the cl_dram_dma example design:

https://github.com/awslabs/aws-fpga-app-notes/tree/master/Using-PCIe-Write-Combining

I'm experimenting with the wc_perf.c example to write a single line (16 DWORDs) from the host to DDRA. According to the above note, running the example with the command below:

**$ sudo ./wc_perf**

followed by:

**$ sudo fpga-describe-local-image -S 0 -C**

outputs the following:

```
DDR0
** write-count=16 **
read-count=0
```

While running with the "-w" option:

**$ sudo ./wc_perf -w**

followed by:

**$ sudo fpga-describe-local-image -S 0 -C**

outputs the following:

```
DDR0
** write-count=1 **
read-count=0
```

meaning that when we use -w (the Write Combine option), the write count should be reduced from 16 to 1. Quoting from the notes:

"The -w option tells wc_perf to use WC, and the number of write data beats was reduced from 16 to 1. This is the reason why writing a WC region with small operations is faster, because they are accumulated into larger chunks using a 64 byte buffer located in the CPU core bus interface (BIU). This is also the reason why it cannot be used for all accesses."

However, when I run the experiment I get different results from the ones described in the notes. Whether I run "sudo ./wc_perf -w" or "sudo ./wc_perf", followed by "sudo fpga-describe-local-image -S 0 -C", I get the same write count of 1600:

```
DDR0
write-count=1600
read-count=0
DDR1
```

I understand why the count is 1600 and not 16 (num_of_passes is 100 in wc_perf.c), but the count I expect when using the "-w" option is 100, not 1600, that is, 16 times lower than without "-w". Am I doing anything wrong in the experiment?
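The expected counts in the question follow from simple arithmetic, sketched below under an idealized model: without write combining, every 4-byte DWORD store arrives as its own write beat; with write combining, stores are merged into full 64-byte buffers (the buffer size quoted from the app note) before being sent. The function name `expected_write_count` is made up for this illustration, and real hardware may flush partially filled buffers, so this is a sanity check rather than a prediction.

```python
DWORD_BYTES = 4
WC_BUFFER_BYTES = 64  # write-combining buffer size quoted from the app note


def expected_write_count(dwords_per_pass, passes, write_combining):
    """Idealized number of write beats seen by the DDR counter."""
    if write_combining:
        bytes_per_pass = dwords_per_pass * DWORD_BYTES
        # Ceiling division: a partly filled buffer still costs one beat.
        beats_per_pass = -(-bytes_per_pass // WC_BUFFER_BYTES)
    else:
        # Each DWORD store is issued as a separate write.
        beats_per_pass = dwords_per_pass
    return beats_per_pass * passes


if __name__ == "__main__":
    for passes in (1, 100):
        for wc in (False, True):
            count = expected_write_count(16, passes, wc)
            print(f"passes={passes} write_combining={wc} write-count={count}")
```

With 16 DWORDs per pass this gives 16 and 1 for a single pass, matching the app note, and 1600 and 100 for 100 passes. An observed count of 1600 in both modes is what this model produces when combining is not taking effect.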
1 answers, 0 votes, 52 views, asked 4 months ago