
Error after loading the cl_sde AFI (error_number=12)


ubuntu@ip-172-31-87-240:~/aws-fpga/hdk/cl/examples/cl_sde/software/runtime$ sudo ./sde_h2c_perf_test 1 131072 0
2025-05-31T01:28:37.701466Z, undefined, ERROR, ../src/sde_lib/sde_mem.c +85: dma_buffer_init(): fpga_dma_mem_alloc_huge failed: error_number=12
2025-05-31T01:28:37.701508Z, undefined, ERROR, ../src/sde_lib/sde_mem.c +125: sde_mem_init(): dma_buffer_init h2c failed: error_number=12
2025-05-31T01:28:37.701513Z, undefined, ERROR, ../src/sde_lib/sde_mgmt.c +65: sde_mgmt_init(): failed to init mem: error_number=12
2025-05-31T01:28:37.701525Z, undefined, ERROR, ../src/sde_lib/sde_mgmt.c +107: sde_mgmt_init_and_cfg(): failed to init sde_mgmt: error_number=12
2025-05-31T01:28:37.701535Z, undefined, ERROR, ../src/sde_h2c_perf_test.c +67: main(): Unable to initialize SDE: error_number=12
Error: (24) software-problem
software-problem

asked a year ago, 228 views
2 Answers
Accepted Answer

Hello Matias,

The error message lists some of the things that can go wrong when you hit a descriptor-limit-timeout. If bus mastering is not enabled, you may see this error because the SDE cannot write the updated desc_limit to host memory. In that case desc_limit stays stuck at 64, which is its starting value.

The readme for these SDE_LIB examples has information about enabling bus mastering.

To check the bus mastering status:

  1. Load the CL_SDE: sudo fpga-load-local-image -S 0 -I agfi-0925b211f5a81b071
  2. Check the bus mastering status with lspci -d 1d0f:f002 -vv and look for BusMaster+ in the Control: section.

To enable bus mastering, run: sudo setpci -s <Domain:Bus:Device.Function> 4.w=6

You can re-run the lspci command to check that BusMaster+ now appears in the Control: section.
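As a rough illustration of what the value written by setpci means, here is a minimal Python sketch that decodes the low bits of the standard PCI Command register and scans lspci -vv text for the BusMaster+ flag. The function names are made up for this example, and the sample Control: lines in the usage note are illustrative rather than captured from a real F2 instance.

```python
# Low enable bits of the standard PCI Command register (config-space offset 0x04).
IO_SPACE = 0x1      # bit 0: respond to I/O space accesses
MEMORY_SPACE = 0x2  # bit 1: respond to memory space accesses
BUS_MASTER = 0x4    # bit 2: device may initiate transactions (bus mastering)


def decode_pci_command(value: int) -> dict:
    """Return the three low enable bits of a PCI Command register value."""
    return {
        "io_space": bool(value & IO_SPACE),
        "memory_space": bool(value & MEMORY_SPACE),
        "bus_master": bool(value & BUS_MASTER),
    }


def control_line_has_bus_master(lspci_output: str) -> bool:
    """Check the 'Control:' line of `lspci -vv` text for the BusMaster+ flag."""
    for line in lspci_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Control:"):
            return "BusMaster+" in stripped
    return False  # no Control: line found in the text


if __name__ == "__main__":
    # 6 is the value used with `4.w=6` above.
    print(decode_pci_command(0x6))
```

Under this decoding, 6 sets the memory space and bus master bits, while a value of 7 would additionally set the I/O space bit. Passing the captured output of lspci -d 1d0f:f002 -vv to control_line_has_bus_master gives a quick yes/no check.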

If you are already enabling bus mastering, please post the output of lspci -d 1d0f:f002 -vv so I can check whether anything else is going wrong.

Thanks, Steven T.

AWS
answered a year ago
  • Thanks Steven.

    I appreciate your answer. I ran the command below, which has the same meaning, and it works well: setpci -s 34:00.0 COMMAND=0x07

    Now I am customizing the SDE for my application, and EC2 F2 is a great solution for us. Thanks.

    Best, Matias


Hello,

Greetings of the day!! Thank you for contacting AWS.

I understand that you are receiving an error while running a performance test with the Amazon FPGA Image (AFI) and would like our assistance in resolving it.

You posted two separate errors. The first, from the command in your question, appears to be a DMA (Direct Memory Access) allocation failure. Error code 12 typically corresponds to "ENOMEM" on Linux systems, which means "Out of memory": the system cannot allocate the requested memory. The second, from your follow-up, is a timeout while waiting for descriptor credits during an SDE (Streaming Data Engine) H2C (Host to Card) performance test. It shows up as a descriptor credit timeout error (4099), where the device fails to update its descriptor credits, which is typically related to bus mastering configuration problems.

Looking further, the issue starts with "fpga_dma_mem_alloc_huge failed" which suggests that the application is trying to allocate a large amount of memory for DMA operations between the host and the FPGA, but this allocation is failing.
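For reference, a small Python sketch that maps the number from the log to its symbolic errno name. It assumes that error_number in the SDE log is a standard Linux errno value, which the log itself does not confirm.

```python
import errno
import os

code = 12  # error_number reported in the log

# Symbolic name of the errno value, e.g. ENOMEM.
print(errno.errorcode[code])
# Human-readable description as provided by the platform's C library.
print(os.strerror(code))
```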

This could be due to several reasons:

  1. Insufficient system memory available for the huge page allocation
  2. The AFI might not be properly loaded or in the correct state
  3. Permission issues with memory allocation (even though you're using sudo)


To troubleshoot this issue:


  1. Check the state of your AFI to ensure it's properly loaded and in the "available" state. An AFI can be in one of four states: pending (bitstream generation in progress), available (ready for use), failed (bitstream generation failed), or unavailable (no longer available for use).
  2. Verify system memory resources and huge page configuration. The application is trying to allocate 131072 bytes (as specified in your command parameters), but the system might not have enough contiguous memory available.
  3. Try reducing the buffer size in your command parameters to see if a smaller allocation succeeds.
  4. Check system logs for additional information about memory allocation failures.
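To make the huge page check in step 2 concrete, here is a minimal Python sketch that reads the huge page counters from /proc/meminfo. It assumes the failing fpga_dma_mem_alloc_huge call draws on the kernel's huge page pool, as its name suggests; the helper names are made up for this example.

```python
import os


def parse_hugepages(meminfo_text: str) -> dict:
    """Extract the huge page counters from /proc/meminfo-style text."""
    wanted = ("HugePages_Total", "HugePages_Free", "HugePages_Rsvd", "Hugepagesize")
    result = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() in wanted:
            # Values look like "0" or "2048 kB"; keep only the number.
            result[key.strip()] = int(rest.split()[0])
    return result


def free_hugepage_bytes(counters: dict) -> int:
    """Free huge page memory in bytes (Hugepagesize is reported in kB)."""
    return counters.get("HugePages_Free", 0) * counters.get("Hugepagesize", 0) * 1024


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        counters = parse_hugepages(f.read())
    print(counters)
    print("free huge page bytes:", free_hugepage_bytes(counters))
```

If HugePages_Total or HugePages_Free comes back as 0, a huge page allocation has nothing to draw from. On Linux the pool size is controlled by the vm.nr_hugepages sysctl; the readme for the SDE examples is the place to check how many pages the test expects.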


The final error "Error: (24) software-problem" is a generic indication that there's an issue with the software stack rather than with the hardware itself.

Further, coming to the follow-up error,

2025-05-31T01:36:33.602879Z, undefined, ERROR, ../src/sde_lib/sde_mgmt.c +311: sde_mgmt_wait_desc_credit(): Desc Credit Timeout credits_avail=0, num_desc=32, desc_limit=64, desc_consumed=64
2025-05-31T01:36:33.602933Z, undefined, ERROR, ../src/sde_h2c_perf_test.c +94: main(): Error waiting for descriptor credit
Error: (4099) descriptor-limit-timeout
A descriptor limit timeout was detected. The SDE logic will update the Descriptor Credit "Limit" Counter in local memory. Check that the device has bus mastering enabled.

This error appears to have occurred while running the SDE host-to-card performance test with the buffer size set to 128 bytes. The test hit a descriptor credit timeout: the available credits had dropped to zero because all 64 descriptors allowed by the limit of 64 had already been consumed, while the operation requested 32 more. With no credits left, the test timed out with error 4099, which indicates that the SDE logic failed to update the Descriptor Credit Limit Counter in local memory.

The system generated this timeout error because it was unable to obtain the necessary descriptor credits to continue operation, suggesting a potential issue with bus mastering. When functioning properly, the SDE logic should continuously update the Descriptor Credit Limit Counter in local memory to maintain data flow. This error typically indicates either a hardware configuration issue, specifically with bus mastering settings, or a communication breakdown between the host system and the FPGA device in managing descriptor credits.
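The numbers in the log line can be reproduced with a small Python sketch of the credit arithmetic. It assumes that available credits are simply desc_limit minus desc_consumed, which matches the logged values but is not taken from the SDE source, and it ignores any counter wraparound. The function names are made up for this example.

```python
def credits_available(desc_limit: int, desc_consumed: int) -> int:
    """Credits left, assuming credits = limit - consumed (matches the log values)."""
    return desc_limit - desc_consumed


def can_post(num_desc: int, desc_limit: int, desc_consumed: int) -> bool:
    """True if num_desc descriptors fit within the remaining credits."""
    return num_desc <= credits_available(desc_limit, desc_consumed)


if __name__ == "__main__":
    # Values from the log: desc_limit=64, desc_consumed=64, num_desc=32.
    print(credits_available(64, 64))
    print(can_post(32, 64, 64))
    # Hypothetical case: the device has advanced the limit in host memory.
    print(can_post(32, 96, 64))
```

With the logged values the host has no credits and keeps waiting. The wait only ends if the device writes a larger limit back to host memory, which is why a disabled bus master bit leads to the timeout.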

To troubleshoot this issue, you can attempt to run the SDE performance test with modified parameters. First, try reducing the descriptor count by executing the test with a buffer size of 64 bytes using the command 'sudo ./sde_h2c_perf_test 1 64 0'. If this doesn't resolve the issue, you can further reduce the settings to a minimal configuration by running the test with a buffer size of 32 bytes using 'sudo ./sde_h2c_perf_test 1 32 0'. These smaller buffer sizes may help identify if the issue is related to descriptor count limitations or resource constraints in the system.

I hope this information helps you troubleshoot the issue further. If the issue persists, we would need details that are non-public information, so I would ask you to open a support case with AWS using the following link - https://console.aws.amazon.com/support/home#/case/create and include a detailed description of the problem along with screenshots and error messages.

Thank you and have a nice day!!

answered a year ago
AWS
SUPPORT ENGINEER
revised a year ago
  • ubuntu@ip-172-31-87-240:~/aws-fpga/hdk/cl/examples/cl_sde/software/runtime$ sudo ./sde_h2c_perf_test 1 128 0
    2025-05-31T01:36:33.602879Z, undefined, ERROR, ../src/sde_lib/sde_mgmt.c +311: sde_mgmt_wait_desc_credit(): Desc Credit Timeout credits_avail=0, num_desc=32, desc_limit=64, desc_consumed=64
    2025-05-31T01:36:33.602933Z, undefined, ERROR, ../src/sde_h2c_perf_test.c +94: main(): Error waiting for descriptor credit
    Error: (4099) descriptor-limit-timeout
    A descriptor limit timeout was detected. The SDE logic will update the Descriptor Credit "Limit" Counter in local memory. Check that the device has bus mastering enabled.
