diff --git a/JOURNAL.md b/JOURNAL.md
index cb36ee4..8f4d367 100644
--- a/JOURNAL.md
+++ b/JOURNAL.md
@@ -1435,3 +1435,152 @@ with the following boot options:
 
 We will also need to run `csrtool all-in-order` to arrive to systemd.
 
+### OBSERVATION: Hangs in `switch_root` again.
+
+Technically, we cannot discard the hypothesis that only the console has crashed,
+as once we switch to stage 2 we no longer have the heartbeat counter. We may as
+well run it again before we get to systemd just to verify that userland
+crashed.
+
+On the other hand, I don't understand why we hang in such a way when we try to
+write to the `0x8000_0000` area from the kernel memtest. I've been reading the
+OpenSBI source code and it seems to have a trap handler that can emit verbose
+information to the console when a problem with the trap is detected. I would
+expect to see some error being dumped to the console in that case.
+
+From the OpenSBI information, this line:
+
+    Domain0 Region02          : 0x0000000080000000-0x000000008001ffff M: (R,X) S/U: ()
+
+suggests that it registers a region with no write permission at `0x8000_0000`,
+so it should fail right away from the kernel side. However, this is not reported
+anywhere on the console.
+
+As we have an easy way to trigger this situation, maybe we can use it as a test
+to modify OpenSBI to report that problem to the console and verify that it is
+working. With that information, we could rule out that a similar problem is
+happening when we try to run systemd. Maybe we could also try to debug other
+traps.
+
+Another observation is that the memtest lines we see on the console are printed
+*before* the actual test begins:
+
+    pr_info("  %pa - %pa pattern %016llx\n",
+            &this_start, &this_end, cpu_to_be64(pattern));
+    memtest(pattern, this_start, this_end - this_start);
+
+So when this line is shown:
+
+    [    0.000000] early_memtest: # of tests: 3
+    [    0.000000]   0x0000000080000000 - 0x0000000080013000 pattern 5555555555555555
+
+We can infer that the problem is located in that region, which agrees with the
+hypothesis that it is related to the OpenSBI regions.
+
+This is the output I get with OpenSBI 1.5:
+
+    OpenSBI v1.5
+       ____                    _____ ____ _____
+      / __ \                  / ____|  _ \_   _|
+     | |  | |_ __   ___ _ __ | (___ | |_) || |
+     | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
+     | |__| | |_) |  __/ | | |____) | |_) || |_
+      \____/| .__/ \___|_| |_|_____/|____/_____|
+            | |
+            |_|
+
+    Platform Name             : ox (Rodrigo NixOS version)
+    Platform Features         : medeleg
+    Platform HART Count       : 1
+    Platform IPI Device       : ---
+    Platform Timer Device     : axi_timer @ 50000000Hz
+    Platform Console Device   : uart8250
+    Platform HSM Device       : ---
+    Platform PMU Device       : ---
+    Platform Reboot Device    : ---
+    Platform Shutdown Device  : ---
+    Platform Suspend Device   : ---
+    Platform CPPC Device      : ---
+    Firmware Base             : 0x80000000
+    Firmware Size             : 310 KB
+    Firmware RW Offset        : 0x40000
+    Firmware RW Size          : 54 KB
+    Firmware Heap Offset      : 0x45000
+    Firmware Heap Size        : 34 KB (total), 2 KB (reserved), 11 KB (used), 20 KB (free)
+    Firmware Scratch Size     : 4096 B (total), 368 B (used), 3728 B (free)
+    Runtime SBI Version       : 2.0
+
+    Domain0 Name              : root
+    Domain0 Boot HART         : 0
+    Domain0 HARTs             : 0*
+    Domain0 Region00          : 0x0000000040000000-0x0000000040000fff M: (I,R,W) S/U: (R,W)
+    Domain0 Region01          : 0x0000000080040000-0x000000008004ffff M: (R,W) S/U: ()
+    Domain0 Region02          : 0x0000000080000000-0x000000008003ffff M: (R,X) S/U: ()
+    Domain0 Region03          : 0x0000000000000000-0xffffffffffffffff M: () S/U: (R,W,X)
+    Domain0 Next Address      : 0x0000000080200000
+    Domain0 Next Arg1         : 0x0000000080017000
+    Domain0 Next Mode         : S-mode
+    Domain0 SysReset          : yes
+    Domain0 SysSuspend        : yes
+
+    Boot HART ID              : 0
+    Boot HART Domain          : root
+    Boot HART Priv Version    : v1.10
+    Boot HART Base ISA        : rv64imafdc
+    Boot HART ISA Extensions  : zicntr,zihpm,sdtrig
+    Boot HART PMP Count       : 0
+    Boot HART PMP Granularity : 0 bits
+    Boot HART PMP Address Bits: 0
+    Boot HART MHPM Info       : 29 (0xfffffff8)
+    Boot HART Debug Triggers  : 0 triggers
+    Boot HART MIDELEG         : 0x0000000000000222
+    Boot HART MEDELEG         : 0x000000000000b109
+
+
+    Core:  12 devices, 8 uclasses, devicetree: board
+    Loading Environment from nowhere... OK
+    In:    serial,usbkbd
+    Out:   serial,vidconsole
+    Err:   serial,vidconsole
+    No working controllers found
+    Net:   No ethernet found.
+    Working FDT set to 80017000
+    Hit any key to stop autoboot:  0
+
+    Device 0: unknown device
+
+    Device 1: unknown device
+    scanning bus for devices...
+
+    Device 0: unknown device
+    starting USB...
+    No working controllers found
+    No ethernet found.
+    No ethernet found.
+    =>
+
+Now the regions are slightly off:
+
+    Domain0 Region00          : 0x0000000040000000-0x0000000040000fff M: (I,R,W) S/U: (R,W)
+    Domain0 Region01          : 0x0000000080040000-0x000000008004ffff M: (R,W) S/U: ()
+    Domain0 Region02          : 0x0000000080000000-0x000000008003ffff M: (R,X) S/U: ()
+    Domain0 Region03          : 0x0000000000000000-0xffffffffffffffff M: () S/U: (R,W,X)
+
+I would assume that region 1 is where OpenSBI places its own data, and
+region 2 is where it places its own code. Region 0 would then be the serial
+area.
+
+Interestingly, I can read and write to the 0x80000000 - 0x81000000 range from
+U-Boot without problems:
+
+    => mtest 0x80000000 0x81000000 0 4
+    Testing 80000000 ... 81000000:
+    Pattern FFFFFFFFFFFFFFFF Writing... Reading...Iteration: 4
+    Tested 4 iteration(s) with 0 errors.
+
+So I suspect that it disables those regions before jumping into U-Boot.
+
+What I don't understand is why the MMIO region 0 is starting at 0x40000000 while
+the UART port should be mapped in 0x40001000 as per the device tree. Maybe we
+could try with the generic configuration of OpenSBI and see if it can load the
+plic and the serial ports properly directly from the device tree.
diff --git a/fpga/upload.sh b/fpga/upload.sh
index 67ee05b..0c8807e 100755
--- a/fpga/upload.sh
+++ b/fpga/upload.sh
@@ -13,7 +13,7 @@ fi
 rsync -a fpga/fpgactl "$dst"
 #rsync -a fpga/boot.sh "$dst"
 rsync -a fpga/env.sh "$dst"
-rsync $OPENSBI/share/opensbi/*/fpga/*/firmware/fw_payload.bin "$dst/opensbi.bin"
+rsync $(find "$OPENSBI" -name fw_payload.bin) "$dst/opensbi.bin"
 rsync "$KERNEL/Image" "$dst/kernel.bin"
 rsync "$INITRD/initrd" "$dst/initrd.bin"
 if [ -n "$ROOTFS" ]; then
diff --git a/lagarto-ox.nix b/lagarto-ox.nix
index ab0ab8f..d88f845 100644
--- a/lagarto-ox.nix
+++ b/lagarto-ox.nix
@@ -310,7 +310,7 @@
       };
       #NIX_DEBUG=5;
       makeFlags = [
-        "PLATFORM=fpga/ox_alveo"
+        "PLATFORM=generic"
         #"CONFIG_SBI_ECALL_RFENCE=n"
         #"PLATFORM_RISCV_ISA=rv64imafd" # No compressed instructions
         #"PLATFORM_RISCV_ISA=rv64g" # No compressed instructions
@@ -318,7 +318,7 @@
         "FW_PAYLOAD_PATH=${final.uboot}/u-boot-nodtb.bin"
         "FW_FDT_PATH=${final.ox-dtb}"
       ];
-      patches = [ ./ox-alveo-platform-plic.patch ];
+      #patches = [ ./ox-alveo-platform-plic.patch ];
     });
 #    opensbi = prev.opensbi.overrideAttrs (old: {
 #      #NIX_DEBUG=5;