Hi macleonsh,
When you start an F1 instance, the FPGA is in a cleared state. The cleared FPGA presents the Device ID 0x1042.
Two main things to remember here are:
- The XRT (XOCL) driver does not bind to Device ID 0x1042.
- XRT MPD opens a file descriptor on a mailbox subdevice when it binds to a Device ID listed here: https://github.com/Xilinx/XRT/blob/master/src/runtime_src/core/pcie/driver/linux/xocl/devices.h
If you are using Vitis to run your application, whether with a Vitis RTL kernel or an OpenCL kernel, you need the XOCL driver to bind to the device. If the slot is in a cleared state, the driver will not bind to it.
To get XOCL to bind to a slot, the slot needs to have a Device ID loaded that the driver can bind to.
When you call "sudo fpga-load-local-image -S 1 -I agfi-xxx" or restart MPD, the Device ID that gets loaded is one that lets the XOCL driver bind.
Note that restarting MPD will automatically load a default AFI (in GA regions only).
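One way to see which Device ID a slot currently presents is to read the vendor and device files that Linux exposes under sysfs. The following is a minimal Python sketch, assuming the standard /sys/bus/pci/devices layout and that the FPGA functions report Amazon's PCI vendor ID 0x1d0f. The function name and the `root` parameter are illustrative and not part of XRT or the AWS FPGA tools. Other Amazon devices on the instance (network adapters, for example) will show up in the result as well.

```python
from pathlib import Path

AMAZON_VENDOR_ID = 0x1D0F   # PCI vendor ID used by Amazon devices
CLEARED_DEVICE_ID = 0x1042  # Device ID of a cleared slot, as stated above


def list_amazon_devices(root="/sys/bus/pci/devices"):
    """Return {pci_address: device_id} for every Amazon PCI function under root."""
    found = {}
    base = Path(root)
    if not base.is_dir():
        return found  # not Linux, or sysfs is not mounted at this path
    for dev in sorted(base.iterdir()):
        try:
            vendor = int((dev / "vendor").read_text().strip(), 16)
            device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        if vendor == AMAZON_VENDOR_ID:
            found[dev.name] = device
    return found


def is_cleared(device_id):
    """True when the Device ID is the one a cleared slot presents."""
    return device_id == CLEARED_DEVICE_ID
```

Calling `list_amazon_devices()` on the instance and passing each value to `is_cleared` gives a quick hint about whether a slot still needs an AFI loaded before XOCL can bind.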
In your case, are you switching between Vivado-generated AFIs and Vitis-generated AFIs? I want to understand your flow a bit more to see how we can make this easier for you.
-Deep
Thanks for the feedback, Deep,
Yes, I understand that the XOCL driver needs to bind to the device.
Something specific to the China region is that the default AFI cannot be accessed, so I manually loaded the hello-world AFI (agfi-0fcf87119b8e97bf3, which is a Vivado AFI; after loading, the Device ID is f000).
In that case, when I launch one of my applications that uses a Vitis-created AFI, XRT reports the error "ERROR: See dmesg log for details. err=-22".
It took me quite a while to find the reason, as the same application runs well on my local FPGA machine. Later I realized that I must first preload any AFI created by Vitis (after loading, the Device ID is f010); then my Vitis application runs well.
This limitation is not serious, but for someone who has two kernels (one from Vitis, another from Vivado) it becomes a potential issue, though a workaround exists (preload the relevant AFI). Maybe you should put a reminder in the help documentation to save someone else from spending time debugging it.
Hi,
I believe this is due to the difference in Device ID.
When your program wants to load an AFI with a different Device ID, the following steps happen:
- The udev monitor will trigger from a ‘remove’ event. It will cause the MPD main thread to close the 2 child threads for this slot.
- AWS fpga management library calls will load the new AFI, make system calls to remove the device from the PCI bus and rescan the PCI bus.
- This causes the XOCL driver to unbind.
- Now the slot is no longer programmable from the OpenCL application. When that changes, MPD needs to close the file descriptors while the aws-fpga library makes the call to rescan sysfs.
So to keep MPD running, it would be better to load an AFI that has the Device ID you want all your accelerators to run with. That would remove the need for a rescan, and XOCL would not unbind.
I hope this helps. Let us know if you have more questions around this and we'd be happy to help.
-Deep
Hi Deep,
"It would be better to load an afi that has the device ID you want all your accelerators to run with." I do not fully understand this statement.
In other words, as soon as I have two different Device IDs (this happens if I have two kernels, one from Vitis and another from Vivado) and I need to switch from one to the other, I must always preload the relevant AFI before I run the application. There is no alternative, is that right?
Hi,
Due to the way XRT MPD opens file descriptors into the sub-device, if the Device ID changes (basically a PCIe remove and rescan), MPD will shut down when it detects the remove event, and you would have to start MPD again after the new AFI is loaded.
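The sequence described here can be sketched as a small planning helper. This is only an illustration in Python: the fpga-load-local-image invocation is the one quoted earlier in this thread, while "systemctl restart mpd" assumes MPD runs as a systemd service named mpd on your instance, and the function name `plan_afi_switch` is hypothetical.

```python
def plan_afi_switch(slot, agfi, current_did, target_did):
    """Return the commands (as argument lists) needed to switch a slot to agfi.

    An MPD restart is appended only when the Device ID changes, because that
    is the case in which the device is removed from the PCI bus and rescanned.
    """
    commands = [["sudo", "fpga-load-local-image", "-S", str(slot), "-I", agfi]]
    if current_did != target_did:
        # Assumption: MPD is managed by systemd under the unit name "mpd".
        commands.append(["sudo", "systemctl", "restart", "mpd"])
    return commands
```

Each returned list could then be passed to `subprocess.run(cmd, check=True)`. Keeping planning separate from execution makes it easy to review the commands before anything touches the slot.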
If you could use the same Device ID for both the Vitis-based and Vivado-based kernels, loading would not require a rescan from the management libraries, which could help. However, if you are using both Vivado-generated and Vitis-generated kernels, what driver are you using with each? Are you using two different Device IDs because you are using two different drivers to bind to the different kernels?
-Deep
"If you could use the same device id for both the Vitis and Vivado based kernels" --> this is interesting to know. What should I do to get the same Device ID? I thought they were designed to be different.
"However, if you are using Vivado and Vitis generated kernels, what driver are you using with both? Are you using two different device id's as you are using two different drivers to bind to the different kernels"
I think the Device ID is generated during AFI creation. Specifically:
- For Vitis, after the FPGA binary build, use the script $VITIS_DIR/tools/create_vitis_afi.sh to convert the xclbin to an awsxclbin and generate the AFI/AGFI; a design tar package is uploaded to S3. Checking such an AFI shows the Device ID is f000.
- For Vivado, after the FPGA binary build, upload the design tar to S3 (I checked, and it actually contains just the DCP file) and call the "aws ec2 create-fpga-image" command to get the AFI/AGFI. Checking such an AFI shows the Device ID is f010.
For both builds, I do not think I used different device drivers. I use the same AWS_FPGA release and the same XRT/XOCL version, and the target platform should be the same as well.
Oh, I just realized the Vitis build was made on an Ubuntu machine and the Vivado build on a CentOS machine. Would that cause any difference in the Device ID?
You could update the Device ID in the create_vitis_afi.sh script: https://github.com/aws/aws-fpga/blob/master/Vitis/tools/create_vitis_afi.sh
We suggest keeping the Vitis Device ID at F010.
You can update the Device ID for the HDK flow in an ID defines file, for example: https://github.com/aws/aws-fpga/blob/master/hdk/cl/examples/cl_dram_dma/design/cl_id_defines.vh
That should be automatically picked up in the manifest created by this script here: https://github.com/aws/aws-fpga/blob/master/hdk/common/shell_v04261818/build/scripts/aws_build_dcp_from_cl.sh
The XRT XOCL device driver should bind to either of these once you set up the same Device ID.
I hope this helps! Let me know if you have any further questions!
-Deep
Thanks a lot, Deep,
Just to double-check, from this definition:
`define CL_SH_ID0 32'hF001_1D0F
This defines the Device ID as f001, correct? I wonder where the "f000" value of some of my AFIs came from. (I checked some older branches, and they all have the same value F001, not F000.)
If I want the same Device ID as Vitis, I can simply change this value to 32'hF010_1D0F, correct?
To change the Device ID for your Vivado-created CLs, changing the define will work.
The Device IDs are picked up by the aws_build_dcp_from_cl.sh script from that file. If you created AFIs using cl_hello_world as the template, you might see them set to F000 in there: https://github.com/aws/aws-fpga/blob/master/hdk/cl/examples/cl_hello_world/design/cl_id_defines.vh
The script reads the value from the id_defines file and adds it to the ingestion manifest. So if you modify the file, the script will use that modification when it creates the manifest file used for AFI creation.
For the Vitis flow, it is set to F010 by default.
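To double-check which Device ID a given define value encodes, the literal can be split into its two 16-bit halves. The Python sketch below assumes, as the exchange above implies, that the upper 16 bits of CL_SH_ID0 hold the Device ID and the lower 16 bits hold the Vendor ID. The helper name `parse_cl_sh_id0` is made up for illustration and is not part of the aws-fpga scripts.

```python
import re


def parse_cl_sh_id0(literal):
    """Split a Verilog literal such as 32'hF001_1D0F into (device_id, vendor_id)."""
    match = re.fullmatch(r"32'h([0-9A-Fa-f_]+)", literal.strip())
    if match is None:
        raise ValueError(f"not a 32-bit hex literal: {literal!r}")
    value = int(match.group(1).replace("_", ""), 16)
    if value > 0xFFFFFFFF:
        raise ValueError(f"value does not fit in 32 bits: {literal!r}")
    # Assumption: bits 31:16 are the Device ID, bits 15:0 the Vendor ID.
    return value >> 16, value & 0xFFFF


device_id, vendor_id = parse_cl_sh_id0("32'hF010_1D0F")
print(f"device id 0x{device_id:04x}, vendor id 0x{vendor_id:04x}")
# prints: device id 0xf010, vendor id 0x1d0f
```

Running the same helper on 32'hF001_1D0F yields Device ID 0xf001, which matches the reading of the define discussed above.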
I hope that helps.
-Deep