Setup FPGA U280 #85
Installing all required parts to have a functional FPGA is not trivial. We need to package several pieces:
The XRT pieces can be built from source with the standard level of pain. The kernel modules are distributed as DKMS modules, which I managed to build, but they insist on running depmod, so I need to patch them as well.
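As a rough illustration of the kind of patch meant here, the following sketch replaces every line that calls depmod with a no-op, so the surrounding Makefile recipe or shell script stays syntactically valid. The function name `strip_depmod` and the sample input are hypothetical; the actual files in the XRT DKMS sources that call depmod are not listed above, and indirect calls such as `$(DEPMOD)` would not be caught by this pattern.

```python
import re

# Matches a line whose command is depmod, optionally preceded by make's
# "@" / "-" recipe prefixes or by a path such as /sbin/depmod.
# Group 1 captures the indentation so Makefile tabs are preserved.
DEPMOD_LINE = re.compile(r"^([ \t]*)[@-]*[ \t]*(?:\S*/)?depmod\b")


def strip_depmod(text: str) -> str:
    """Return `text` with every depmod invocation replaced by `true`."""
    out = []
    for line in text.splitlines(keepends=True):
        match = DEPMOD_LINE.match(line)
        if match:
            newline = "\n" if line.endswith("\n") else ""
            # `true` is a valid no-op in both shell scripts and make recipes.
            out.append(f"{match.group(1)}true # depmod disabled{newline}")
        else:
            out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical Makefile fragment, only for demonstration.
    sample = "install:\n\tcp xocl.ko $(DEST)\n\t@/sbin/depmod -a\n"
    print(strip_depmod(sample), end="")
```

In a packaging recipe the same effect is usually achieved with a one-line substitution at patch time; the Python version is only meant to make the intent explicit.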
For Vivado I found a buildFHS combination that seems to make it happy, so I can run the network installer.
Then I will need to run some checks, as it probably needs some massaging to make it work.
The auxiliary power connector has a different layout than the ones provided in fox, so I installed it in raccoon for now. It may not be a bad idea to run the synthesis and the flashing on different machines, so they don't interfere with each other.
Xilinx has dropped support for the U280 in https://github.com/Xilinx/XRT/pull/7901, which makes the xclmgmt driver refuse to probe the FPGA. Reverting the commit makes it happy, and it detects the FPGA properly.
The xbmgmt utility reports a golden shell loaded in the FPGA:
This shell is properly documented here:
https://xilinx.github.io/XRT/master/html/platforms_partitions.html#shell
https://xilinx.github.io/XRT/master/html/security.html
We will need to build a firmware package to flash the shell with the latest firmware, and see if then it becomes ready.
Got all the pieces to build the partition.xsabin file, except the scheduler runtime ERT:
This needs to be either built from xilinx-xrt or downloaded from an already built XRT Debian package. It is not included as a separate firmware like the rest. From our patched XRT:
Managed to build the xsabin partition file:
Let's check everything is ok and try to flash it.
Attempting to flash the shell ends up in an error:
Not sure if this is happening because of the logic_uuids, which are failing to be read:
Rebuilding XRT with debug fails:
It looks like enabling debug in XRT gets propagated to protobuf, which in turn enables the extra abseil checks, and the link then fails. Likely this issue: https://stackoverflow.com/questions/69866575/why-cant-linker-find-absl-references.
Rebuilt with RelWithDebInfo, here is the GDB session:
Image seems to load fine.
Okay, so the problem seems to be that it tries to detect which available images are "installed", but it doesn't detect any. The problem seems to be here:
https://github.com/Xilinx/XRT/blob/bf207dcd117c5974790371abfcaa4b7b194b944b/src/runtime_src/core/tools/xbmgmt2/flash/firmware_image.h#L35
So, we don't need to flash the SC for now, and it seems the code path for flashing the shell doesn't require checking the firmware in /lib, so here it goes:
Dmesg:
Probably rebooting won't do it, but let's try.
Yep, still in golden. Let's try full power cycle via the BMC:
The BMC is not replying. This has happened in the past, so let's wait a bit and see if it comes back.
The machine doesn't seem to be booting, so I will need to check what the BIOS is printing now. The SC can also output debug information via the USB port:
https://adaptivesupport.amd.com/s/question/0D54U00007hy2mtSAA/after-flash-base-image-in-u280-host-server-is-not-booting-up?language=en_US
So, the problem seems to be that a BIOS option had to be enabled to support more than 4 GiB of memory; without it, the BIOS would complain and not boot. After enabling it, the FPGA is properly detected:
After fixing the USB permissions, we can now see the FPGA from Vivado via the hw_server:
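Regarding the BIOS option above, a quick way to see where the card's PCI regions ended up is to read the `resource` files in sysfs. This is only a sketch: it assumes the option controls whether PCI regions may be mapped above the 4 GiB address boundary (the text above only says "more than 4 GiB of memory"), and the helper names `parse_bars`, `mapped_above_4g` and `scan_xilinx_devices` are made up for illustration. The vendor ID 0x10ee is the Xilinx PCI vendor ID.

```python
from pathlib import Path

XILINX_VENDOR_ID = "0x10ee"
FOUR_GIB = 4 * 1024**3


def parse_bars(resource_text):
    """Parse a sysfs `resource` file into (index, start, size) tuples.

    Each line holds "start end flags" in hexadecimal; unused regions
    are all zeros and are skipped.
    """
    bars = []
    for index, line in enumerate(resource_text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue
        bars.append((index, start, end - start + 1))
    return bars


def mapped_above_4g(bars):
    """True if any region extends past the 4 GiB address boundary."""
    return any(start + size > FOUR_GIB for _, start, size in bars)


def scan_xilinx_devices(root=Path("/sys/bus/pci/devices")):
    """Yield (device name, regions) for every Xilinx PCI function found."""
    if not root.is_dir():
        return
    for dev in sorted(root.iterdir()):
        try:
            if (dev / "vendor").read_text().strip() != XILINX_VENDOR_ID:
                continue
            yield dev.name, parse_bars((dev / "resource").read_text())
        except OSError:
            continue  # device vanished or file not readable


if __name__ == "__main__":
    for name, bars in scan_xilinx_devices():
        for index, start, size in bars:
            print(f"{name} region {index}: start={start:#x} size={size // 2**20} MiB")
        print(f"{name} mapped above 4 GiB: {mapped_above_4g(bars)}")
```

On a machine without a Xilinx card the script prints nothing, which is itself a hint that the device was not enumerated.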