From d52691ca338b9d03598f2346efe1dc5440a7e9c3 Mon Sep 17 00:00:00 2001
From: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Date: Fri, 5 Jul 2024 16:53:07 +0200
Subject: [PATCH] Add journal to the repository

---
 JOURNAL.md | 649 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 649 insertions(+)
 create mode 100644 JOURNAL.md
diff --git a/JOURNAL.md b/JOURNAL.md
new file mode 100644
index 0000000..e563135
--- /dev/null
+++ b/JOURNAL.md
@@ -0,0 +1,649 @@
+## 2024-07-02
+
+Aleix provided some notes on how to trace the kernel using tracepoints that are
+printed to the console:
+
+```
+BOOT TIME TRACING
+  - compile with CONFIG_BOOTTIME_TRACING=y
+  - add to kernel parameters:
+      trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"
+      trace_options=sym-addr trace_event=sched:* tp_printk trace_buf_size=1M
+  - tp_printk sends tracepoint (TRACE_EVENT) to printk buffer
+  - trace_event=[subsytem:event]; accepts
+     - *:<event_name>
+     - <subsystem>:*
+     - *:* all
+  - many more options at https://www.kernel.org/doc/html/latest/trace/boottime-trace.html
+```
+
+Start with:
+
+```
+trace_options=sym-addr trace_event=sched:* tp_printk trace_buf_size=1M loglevel=7
+```
+
+Can it be caused by the D extension? If I set the ISA to:
+
+```
+riscv,isa = "rv64imaf";
+```
+
+It locks the stage1 script without providing any output. Let see trying again.
+Locked again.
+
+I can disable the FPU in the kernel, and then it will act as a detector of any
+floating point instruction.
+
+Then I just need to rebuild busybox without support for double instructions.
+
+Let see if I can build busybox with double instruction
+
+Nope, the u-boot is reporting the d extension is in the isa:
+
+> riscv,isa = "rv64imafd";
+
+## 2024-07-03
+
+I cannot switch to `gcc.arch = rv64ima` because rust fails to build.
+
+**Assumption**: The extensions F and D work well and don't cause a hang in the<!--{{{-->
+CPU.
+
+Let's go back and try to get the initrd shell, so we can systematically hang it
+in the `switch_root`
+<!--}}}-->
+**Observation**: The riscv-timer seems to be causing interrupts with IRQ 5:<!--{{{-->
+
+```
+[   62.439060] irq_handler_entry: irq=5 name=riscv-timer
+[   62.444980] irq_handler_exit: irq=5 ret=handled
+```
+<!--}}}-->
+**Observation**: Rohan reports the serial startup routine being running *after*{{{
+the init begins.
+<!--}}}-->
+**Observation**: Only interrupts in timer, others are zero.<!--{{{-->
+
+With:
+
+    commit 4c656bd8ddd2e41ccaa976ff8d6bd9209175a632 (HEAD -> lagarto-ox)
+    Author: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
+    Date:   Wed Jul 3 13:21:04 2024 +0200
+    
+        Add busybox patch to see debug lines
+    
+        The switch_root command seems to hang in the execv() syscall.
+
+I can see this:
+
+    ~ # cat /proc/interrupts ; sleep 10 ; cat /proc/interrupts
+               CPU0
+     10:      42926  RISC-V INTC   5 Edge      riscv-timer
+    IPI0:         0  Rescheduling interrupts
+    IPI1:         0  Function call interrupts
+    IPI2:         0  CPU stop interrupts
+    IPI3:         0  CPU stop (for crash dump) interrupts
+    IPI4:         0  IRQ work interrupts
+    IPI5:         0  Timer broadcast interrupts
+               CPU0
+     10:      46023  RISC-V INTC   5 Edge      riscv-timer
+    IPI0:         0  Rescheduling interrupts
+    IPI1:         0  Function call interrupts
+    IPI2:         0  CPU stop interrupts
+    IPI3:         0  CPU stop (for crash dump) interrupts
+    IPI4:         0  IRQ work interrupts
+    IPI5:         0  Timer broadcast interrupts
+<!--}}}-->
+**Observation**: There is a timer configured in 0x40170000 but in the device<!--{{{-->
+tree we only have one at `timer@40002000`.
+
+    #define OX_ALVEO_TIMER_BASE		0x40170000
+    #define ADDR_TIME_L 			0x0u // 32 lower bits of the time register
+    #define ADDR_TIME_H 			0x1u // 32 higher bits of the time register
+    #define ADDR_TIMECMP_L 			0x2u // 32 lower bits of the time comparator
+    #define ADDR_TIMECMP_H 			0x3u // 32 higher bits of the time comparator
+
+    https://gitlab.bsc.es/hwdesign/bsc-linux/-/blob/d6d194bd30d9a8fe49c2a278ffb3c3ae7852e75d/bsc_tree/patches/ox_alveo/opensbi/0001-opensbi-ox_alveo-platform.patch#L63
+<!--}}}-->
+**Observation**: When the serial console starts, the speed of the serial port<!--{{{-->
+changes to 9600:
+
+    [    6.845400] io scheduler mq-deadline registered
+    [    6.851500] io scheduler kyber registered
+    [   17.644460] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
+    [   18.141160] printk: console [ttyS0] disabled
+    [   18.229480] 40001000.serial: ttyS0 at MMIO 0x40001000 (irq = 11, base_baud = 3125000) is a 16550
+    
+    *** baud: 230400 ***
+    
+    *** baud: 460800 ***
+    
+    *** baud: 500000 ***
+    
+    *** baud: 576000 ***
+    
+    *** baud: 500000 ***
+    
+    *** baud: 460800 ***
+    
+    *** baud: 230400 ***
+    
+    *** baud: 115200 ***
+    
+    *** baud: 57600 ***
+    
+    *** baud: 38400 ***
+    
+    *** baud: 19200 ***
+    faaa0?xx
+            <ffa
+                fx3fÆf
+                      ff
+                        ~3xf03faaa0?~3<3<`<~<f<`<憆3~fÆf̆x3fff
+    
+    *** baud: 9600 ***
+    
+    *** baud: 19200 ***
+    
+    *** baud: 38400 ***
+    xxxxxx<xx<xx<xx<xxxxxxxx
+    *** baud: 57600 ***
+    
+    *** baud: 38400 ***
+    xxxxxxx<x<xx<xx<xx<xx<x<xxxxxxx<xxx<x<x<xxx<xxxxxx<xx<xxxxx<x<xxxxxxxxxxxxxx<xxxxxxxx<xxxxx<x
+    *** baud: 19200 ***
+    faaa??~3fx3x3憆`?3f
+                       ff3f<ffa<xÞ?<?f<x<x<`<`<f<x??`?`<怘
+
+    *** baud: 9600 ***
+    [   77.929980] stage-1-init: [Thu Jan  1 00:01:17 UTC 1970] + ln -sfn /proc/self/fd/1 /dev/stdout
+    [   79.258380] stage-1-init: [Thu Jan  1 00:01:18 UTC 1970] + ln -sfn /proc/self/fd/2 /dev/stderr
+    [   80.212360] stage-1-init: [Thu Jan  1 00:01:19 UTC 1970] + mkdir -p /etc/systemd
+    [   81.224900] stage-1-init: [Thu Jan  1 00:01:20 UTC 1970] + ln -sfn /nix/store/r4ycv0ymw9wrnalh0sdgkp39nwy3szqz-link-units /etc/systemd/network
+    [   82.270000] stage-1-init: [Thu Jan  1 00:01:21 UTC 1970] + mkdir -p /etc/udev
+    [   83.249640] stage-1-init: [Thu Jan  1 00:01:22 UTC 1970] + ln -sfn /nix/store/lx1q7j61dldj8bv0p33v42q0nldb6sd7-udev-rules /etc/udev/rules.d
+    [   84.304940] stage-1-init: [Thu Jan  1 00:01:23 UTC 1970] + mkdir -p /dev/.mdadm
+    [   85.221060] stage-1-init: [Thu Jan  1 00:01:24 UTC 1970] + systemd-udevd --daemon
+    [   86.157320] stage-1-init: [Thu Jan  1 00:01:25 UTC 1970] Starting systemd-udevd version 254.3
+    [   87.130060] stage-1-init: [Thu Jan  1 00:01:26 UTC 1970] + udevadm trigger '--action=add'
+    
+    *** break sent ***
+    [  158.281900] stage-1-init: [Thu Jan  1 00:02:37 UTC 1970] + udevadm settle
+    
+    *** break sent ***
+    
+    *** break sent ***
+    [  214.524500] stage-1-init: [Thu Jan  1 00:03:33 UTC 1970] + kbd_mode -u -C /dev/console
+    [  215.392960] stage-1-init: [Thu Jan  1 00:03:34 UTC 1970] kbd_mode: KDSKBMODE: Inappropriate ioctl for device
+
+That was my mistake as I need to put the baud speed in the ttyS0, like this:
+
+    console=ttyS0,115200n8
+<!--}}}-->
+**Observation**: Trying to read from the serial console /dev/ttyS0 causes no<!--{{{-->
+more messages in the console (or a hang).
+<!--}}}-->
+**Question**: Can we make a heartbeat for the kernel? The idea is to keep a<!--{{{-->
+counter in some memory of the kernel so we can see it from the host being moved.
+<!--}}}-->
+**Question**: Can we disable the serial driver 8250 from loading?<!--{{{-->
+
+    initcall_blacklist=<driver_init>
+
+I need to know the 8250 init function name:
+
+    drivers/tty/serial/8250/8250_core.c:static int __init serial8250_init(void)
+
+So...
+
+    initcall_blacklist=serial8250_init
+
+Yes, but that doesn't seem to do anything. It is hanging:
+
+    + modprobe dm_mod
+    [  627.473580] stage-1-init: [Thu Jan  1 00:10:26 UTC 1970] + echo init /nix/store/v6pi2mqfgshxdsbyxlvpm9nvawxrpijv-nixos-system-nixos-riscv-23.11pre-git/init
+    [  628.249440] stage-1-init: [Thu Jan  1 00:10:27 UTC 1970] + set -- init /nix/store/v6pi2mqfgshxdsbyxlvpm9nvawxrpijv-nixos-system-nixos-riscv-23.11pre-git/init
+    [  629.004840] stage-1-init: [Thu Jan  1 00:10:28 UTC 1970] + stage2Init=/nix/store/v6pi2mqfgshxdsbyxlvpm9nvawxrpijv-nixos-system-nixos-riscv-23.11pre-git/init
+    [  629.733920] stage-1-init: [Thu Jan  1 00:10:29 UTC 1970] + echo /nix/store/snvvqpxmryw1szlllk0bxpm37p8vj8sw-extra-utils/bin/modprobe
+
+<!--}}}-->
+**Question**: What happens if we remap the interruptions?<!--{{{-->
+
+- Move the serial from 0 to 1
+- Move the plic from 3 to 2 and remove 7
+
+Now we have one context only:
+
+    [    0.000000] riscv-intc: 64 local interrupts mapped
+    [    0.000000] plic: plic@40800000: mapped 3 interrupts with 0 handlers for 1 contexts.
+    [    0.000000] riscv: providing IPIs using SBI IPI extension
+
+Rather than two:
+
+    [    0.000000] riscv-intc: 64 local interrupts mapped
+    [    0.000000] plic: plic@40800000: mapped 3 interrupts with 0 handlers for 2 contexts.
+    [    0.000000] riscv: providing IPIs using SBI IPI extension
+<!--}}}-->
+**Question**: What happens if we block the `sbi_ipi` driver?<!--{{{-->
+
+    initcall_blacklist=sbi_ipi_init
+
+Nothing, it cannot be disabled it seems. I will remove SMP support so it won't
+be compiled in.
+<!--}}}-->
+**Observation**: Searching for 'riscv,plic0' only matches irq-sifive-plic driver.<!--{{{-->
+
+    hut% rg 'riscv,plic0'
+    Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml
+    72:          - const: riscv,plic0
+    
+    drivers/irqchip/irq-sifive-plic.c
+    572:IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
+
+So it looks that the only driver that setups the plic is the one used by SiFive.
+Here is the doc: https://static.dev.sifive.com/U54-MC-RVCoreIP.pdf
+<!--}}}-->
+**Observation**: The number of handlers is 0, so there are no interruptions.<!--{{{-->
+
+It seems the number next to the phandle of the interrupts-extended attribute in
+the plic follows a different convention of values. Using 9 and 11:
+
+    plic: plic@40800000: mapped 3 interrupts with 1 handlers for 2 contexts.
+<!--}}}-->
+**Remark**: The key combination to run Magic SysRq using the HVC console is<!--{{{-->
+Ctrl-O and then the SysRq key. It only works it the console is being actively
+polled, otherwise it hangs.<!--}}}-->
+
+## 2024-07-04
+
+**Observation**: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
+serial:
+
+> reg-shift = <0>; // regs are spaced on 8 bit boundary (modified from Xilinx UART16550 to be ns16550 compatible)
+
+Tested booting with debug1 and the ttyS0 console, and it goes extremely slow
+(but still outputs at 115200) and then continues to fail to read keyboard input.
+<!--}}}-->
+**Question**: Let's try setting the console in poll mode.<!--{{{-->
+
+setenv bootargs "root=/dev/ram0 loglevel=7 debug rw earlycon=uart,io,0x40001000,115200n8 boot.trace console=uart,io,0x40001000,115200n8 debug1 init=/nix/store/wavmnv6wjj8y10ha07wxd5f0sqacivj8-nixos-system-nixos-riscv-23.11pre-git/init"
+
+    [    5.909360] io scheduler mq-deadline registered
+    [    5.914900] io scheduler kyber registered
+    [   14.405980] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
+    [   14.856040] 40001000.serial: ttyS0 at MMIO 0x40001000 (irq = 2, base_baud = 3125000) is a 16550
+    [   15.059680] nd_pmem namespace0.0: unable to guarantee persistence of writes
+    ...
+    [   40.250100] clk: Disabling unused clocks
+    [   40.256220] Warning: unable to open an initial console. <---- SEE THIS
+    [   40.618300] Freeing unused kernel image (initmem) memory: 5592K
+    [   40.728300] Checked W+X mappings: passed, no W+X pages found
+    [   40.735540] Run /init as init process
+    [   40.740080]   with arguments:
+    [   40.743920]     /init
+    [   40.746660]   with environment:
+    [   40.750740]     HOME=/
+    [   40.754020]     TERM=linux
+    [   46.569960] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
+    [   46.578100] CPU: 0 PID: 1 Comm: init Not tainted 6.6.1 #1-NixOS
+    [   46.584600] Hardware name: Barcelona Supercomputing Center - Lagarto Ox (NixOS) (DT)
+    [   46.592740] Call Trace:
+    [   46.595640] [<ffffffff80006688>] dump_backtrace+0x38/0x48
+    [   46.601760] [<ffffffff80a96a04>] show_stack+0x50/0x68
+    [   46.607540] [<ffffffff80aa3f74>] dump_stack_lvl+0x3c/0x5c
+    [   46.613660] [<ffffffff80aa3fb4>] dump_stack+0x20/0x30
+    [   46.619400] [<ffffffff80a970dc>] panic+0x158/0x374
+    [   46.624900] [<ffffffff8001c18c>] do_exit+0x9e8/0x9f0
+    [   46.630580] [<ffffffff8001c3b0>] do_group_exit+0x44/0xb0
+    [   46.636600] [<ffffffff8001c454>] __wake_up_parent+0x0/0x40
+    [   46.642780] [<ffffffff80aa4eac>] do_trap_ecall_u+0x14c/0x168
+    [   46.649140] [<ffffffff800035ec>] ret_from_exception+0x0/0xac
+    [   46.655500] Kernel Offset: 0x0 from 0xffffffff80000000
+    [   46.661160] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100 ]---
+
+Also found: `no_console_suspend`
+<!--}}}-->
+**Observation**: There are messages of address space being assigned to<!--{{{-->
+registers:
+
+    Slave segment '/MEEP_uart_0/S_AXI/Reg' is being assigned into address space '/m_axi_uart0' at <0x0000_0000 [ 4K ]>.
+    Slave segment '/MEEP_uart_1/S_AXI/Reg' is being assigned into address space '/m_axi_uart1' at <0x0000_0000 [ 4K ]>.
+<!--}}}-->
+**Question**: What happens if I enable `CONFIG_CONSOLE_POLL`?<!--{{{-->
+
+With `console=ttyS0,115200n8 debug1` I cannot type.
+<!--}}}-->
+**Observation**: I can dump iomem memory with the tool devmem:<!--{{{-->
+
+But it seems I cannot dump the registers of the serial io mapped region:
+
+    ~ # cat /proc/iomem
+    40001000-400010ff : serial
+    60000000-7fffffff : Reserved
+    80000000-ffefffff : System RAM
+      80201000-81fa0b87 : Kernel image
+        80201000-80cb177f : Kernel code
+        81400000-819fffff : Kernel rodata
+        81c00000-81f18747 : Kernel data
+        81f19000-81fa0b87 : Kernel bss
+    100000000-1bfffffff : namespace0.0
+
+    ~ # devmem 0x40001000
+    devmem: mmap: Operation not permitted
+
+It looks like the following options may be required to be disabled to allow
+user-space tools read those regions.
+
+    ~ # zcat /proc/config.gz | grep CONFIG_STRICT_DEVMEM
+    CONFIG_STRICT_DEVMEM=y
+    ~ # zcat /proc/config.gz | grep CONFIG_IO_STRICT_DEVMEM
+    CONFIG_IO_STRICT_DEVMEM=y
+
+Let's try disabling `CONFIG_STRICT_DEVMEM` and see if we can read the serial
+registers.
+
+It works!
+
+    ~ # cat /proc/iomem
+    40001000-400010ff : serial
+    60000000-7fffffff : Reserved
+    80000000-ffefffff : System RAM
+      80201000-81fa0b87 : Kernel image
+        80201000-80cb159f : Kernel code
+        81400000-819fffff : Kernel rodata
+        81c00000-81f18707 : Kernel data
+        81f19000-81fa0b87 : Kernel bss
+    100000000-1bfffffff : namespace0.0
+    ~ # devmem 0x40001000
+    0x0000006E
+    ~ # devmem 0x40001000
+    0x0000006C
+    ~ # devmem 0x40001000
+    0x00000072
+    ~ # devmem 0x40001000
+    0x0000000D
+<!--}}}-->
+**Observation**: The interrupt register of the serial console is 0x0:<!--{{{-->
+
+Assuming the console registers follow AXI UART 16550, here is the IER:
+
+>  0x1004  IER  R/W  Interrupt Enable Register
+
+Which is zero:
+
+~ # devmem 0x40001004
+0x00000000
+
+The line control register is 0x3:
+
+~ # devmem 0x4000100C
+0x00000003
+<!--}}}-->
+**Question**: Can I write to some memory address and see the result from the<!--{{{-->
+host?
+
+For that  I would need to find some address that is mapped to the DMA or to the
+pmem. Xavi recommended `0x6000_0000` as it is uncached.
+
+It seems to have some content already:
+
+    ~ # devmem 0x60000000
+    0x00000093
+    ~ # devmem 0x60000004
+    0x00000113
+    ~ # devmem 0x60000008
+    0x00000193
+    ~ # devmem 0x60000010
+    0x00000293
+    ~ # devmem 0x60000014
+    0x00000313
+    ~ # devmem 0x60000018
+    0x00000393
+
+Writing test seems to work:
+
+    ~ # devmem 0x60000000 32 0x11223344
+    ~ # devmem 0x60000000
+    0x11223344
+
+So, technically we should be using the `/dev/qdma34000-MM-0` device, as we
+already use the other one to map the memory.
+
+    [bsc015557@fpgan02 nixos]$ for i in {0..16}; do addr=$(($i * 0x10000000)); \
+        printf "addr 0x%09x: " $addr; dd if=/dev/qdma34000-MM-0 count=16 bs=1 skip=$addr 2>/dev/null | xxd; done
+    addr 0x000000000: 00000000: 4444 4444 4444 4444 4444 4444 4444 4444  DDDDDDDDDDDDDDDD
+    addr 0x010000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x020000000: 00000000: 3333 3333 3333 3333 3333 3333 3333 3333  3333333333333333
+    addr 0x030000000: 00000000: cccc cccc cccc cccc cccc cccc cccc cccc  ................
+    addr 0x040000000: 00000000: cccc cccc cccc cccc cccc cccc cccc cccc  ................
+    addr 0x050000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x060000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x070000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x080000000: 00000000: 3333 3333 3333 3333 3333 3333 3333 3333  3333333333333333
+    addr 0x090000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x0a0000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x0b0000000: 00000000: 2f2f 2f2f 2f2f 2f2f 2f2f 2f2f 2f2f 2f2f  ////////////////
+    addr 0x0c0000000: 00000000: 6e6e 6e6e 6e6e 6e6e 6e6e 6e6e 6e6e 6e6e  nnnnnnnnnnnnnnnn
+    addr 0x0d0000000: 00000000: 2020 2020 2020 2020 2020 2020 2020 2020
+    addr 0x0e0000000: 00000000: 6c6c 6c6c 6c6c 6c6c 6c6c 6c6c 6c6c 6c6c  llllllllllllllll
+    addr 0x0f0000000: 00000000: 6767 6767 6767 6767 6767 6767 6767 6767  gggggggggggggggg
+    addr 0x100000000: 00000000: 2424 2424 2424 2424 2424 2424 2424 2424  $$$$$$$$$$$$$$$$
+
+    [bsc015557@fpgan02 nixos]$ for i in {0..16}; do addr=$(($i * 0x10000000)); \
+        printf "addr 0x%09x: " $addr; dd if=/dev/qdma34000-MM-1 count=16 bs=1 skip=$addr 2>/dev/null | xxd; done
+    addr 0x000000000: 00000000: 4444 4444 4444 4444 4444 4444 4444 4444  DDDDDDDDDDDDDDDD
+    addr 0x010000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x020000000: 00000000: 3333 3333 3333 3333 3333 3333 3333 3333  3333333333333333
+    addr 0x030000000: 00000000: cccc cccc cccc cccc cccc cccc cccc cccc  ................
+    addr 0x040000000: 00000000: cccc cccc cccc cccc cccc cccc cccc cccc  ................
+    addr 0x050000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x060000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x070000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x080000000: 00000000: 3333 3333 3333 3333 3333 3333 3333 3333  3333333333333333
+    addr 0x090000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x0a0000000: 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
+    addr 0x0b0000000: 00000000: 2f2f 2f2f 2f2f 2f2f 2f2f 2f2f 2f2f 2f2f  ////////////////
+    addr 0x0c0000000: 00000000: 6e6e 6e6e 6e6e 6e6e 6e6e 6e6e 6e6e 6e6e  nnnnnnnnnnnnnnnn
+    addr 0x0d0000000: 00000000: 2020 2020 2020 2020 2020 2020 2020 2020
+    addr 0x0e0000000: 00000000: 6c6c 6c6c 6c6c 6c6c 6c6c 6c6c 6c6c 6c6c  llllllllllllllll
+    addr 0x0f0000000: 00000000: 6767 6767 6767 6767 6767 6767 6767 6767  gggggggggggggggg
+    addr 0x100000000: 00000000: 2424 2424 2424 2424 2424 2424 2424 2424  $$$$$$$$$$$$$$$$
+
+None of the two queues seem to have the value 0x11223344 at any multiple of `0x1000_0000`.
+
+Let's verify first that this method works. The kernel is loaded here:
+
+    [bsc015557@fpgan02 nixos]$ printf '0x%x\n' $FPGACTL_KERNEL_ADDR
+    0x24000000
+
+So we should see the same values as the kernel file:
+
+    [bsc015557@fpgan02 nixos]$ dd if=kernel.bin count=16 bs=1 2>/dev/null| xxd
+    00000000: 6f00 400d 0000 0000 0000 2000 0000 0000  o.@....... .....
+
+But we don't see the same:
+
+    [bsc015557@fpgan02 nixos]$ dd if=/dev/qdma34000-MM-1 count=16 bs=1 skip=$FPGACTL_KERNEL_ADDR 2>/dev/null | xxd
+    00000000: 9797 9797 9797 9797 9797 9797 9797 9797  ................
+    [bsc015557@fpgan02 nixos]$ dd if=/dev/qdma34000-MM-0 count=16 bs=1 skip=$FPGACTL_KERNEL_ADDR 2>/dev/null | xxd
+    00000000: 9797 9797 9797 9797 9797 9797 9797 9797  ................<!--}}}-->
+**Question**: Missing forward M to S via Mideleg?<!--{{{-->
+
+Can it be happening that he MEDELEG is not forwarding the interruptions to the
+Supervisor (kernel)?
+
+    Boot HART MIDELEG         : 0x0000000000000222
+    Boot HART MEDELEG         : 0x000000000000b109
+<!--}}}-->
+**Question**: Can we add a timer to the PLIC to test the interrupts?<!--{{{-->
+<!--}}}-->
+**Observation**: Here is the PLIC register dump:<!--{{{-->
+
+    ~ # for i in `seq 0 16`; do addr=$((0x40600000 + $i)); printf '%08x: ' $addr; devmem $addr; done
+    40600000: 0x00010002
+    40600001: 0x09000000
+    40600002: 0x00090000
+    40600003: 0x00000900
+    40600004: 0x00010009
+    40600005: 0x00000000
+    40600006: 0x00000000
+    40600007: 0x00000000
+    40600008: 0x00000000
+    40600009: 0x00000000
+    4060000a: 0x00000000
+    4060000b: 0x00000000
+    4060000c: 0x00000000
+    4060000d: 0x00000000
+    4060000e: 0x00000000
+    4060000f: 0x00000000
+    40600010: 0x00000000
+<!--}}}-->
+**Question**: Can we boot with the new bitstream that includes the second UART?<!--{{{-->
+
+The interruptions are enabled for the UART 1, not the default UART 0.
+
+<!--}}}-->
+**Observation**: I'm using 0x100 not 0x1000 in the serial range:<!--{{{-->
+
+	reg = <0x0 0x40003000 0x0 0x100>;
+	reg = <0x0 0x40003000 0x0 0x1000>;
+
+Can this produce any problem?
+
+It doesn't seem to change anything, still unable to send any bytes.
+<!--}}}-->
+**Question**: Can we use virtio to mount a FS in the DMA shared memory?
+
+## 2024-07-05
+
+**Observation**: The kernel continues working when the console hangs.<!--{{{-->
+
+Switching to 0x100000000 as 0x60000000 shows:
+
+    ~ # devmem 0x6000000
+    0xBADCAB1E
+
+With the following loop:
+
+    ~ # i=0; while [ 1 ]; do let i=$i+1; devmem 0x100000000 32 $i; done &
+    ~ # cat /dev/ttyS0
+    (hangs)
+
+Shows the kernel works:
+
+    [bsc015557@fpgan02 nixos]$ while [ 1 ]; do xxd -s $((0x100000000 - 0x60000000)) -l 4 /dev/qdma34000-MM-1; sleep 0.2; done
+    ...
+    a0000000: 6400 0000                                d...
+    a0000000: 6500 0000                                e...
+    a0000000: 6600 0000                                f...
+    a0000000: 6700 0000                                g...
+    a0000000: 6800 0000                                h...
+    a0000000: 6900 0000                                i...
+<!--}}}-->
+**Question**: Can we reproduce it with `switch_root`?<!--{{{-->
+
+For that I would have to ensure the process continues to operate, even if we
+exit the console. Maybe I can make a double fork?
+
+I cannot use `0x1000_0000` as that is where the pmem will be. But I can try to
+use an address in the end, as we are not filling the whole space.
+
+    [0x1_0000_0000, 0x1_c000_0000) -> PMEM (3072 MiB)
+
+Maybe `0x1_bfff_0000`? Let's try first from the initrd shell.
+
+    i=0; while [ 1 ]; do let i=$i+1; devmem 0x1bfff0000 32 $i; done &
+
+Then
+
+    while [ 1 ]; do xxd -s $((0x1bfff0000 - 0x60000000)) -l 4 /dev/qdma34000-MM-1; sleep 0.2; done
+
+Yes, it seems to be working. Let's load the rootfs too.
+
+I added a loop in the stage1 script.<!--}}}-->
+
+**Question**: Can we see any clock in memory? This will allow us to check if the
+AXI still works.
+
+**Observation**: The kernel stops updating the counter in the mount phase.<!--{{{-->
+
+Managed to reach the mount and hang there:
+
+    [  337.504740] stage-1-init: [Thu Jan  1 00:05:36 UTC 1970] + '[' -d
+    /dev/disk/by-label/NIXOS_SD ]
+    [  338.284560] stage-1-init: [Thu Jan  1 00:05:37 UTC 1970] + mkdir -m 0755 -p
+    /mnt-root/
+    [  339.017420] stage-1-init: [Thu Jan  1 00:05:38 UTC 1970] + local 'n=0'
+    [  339.752560] stage-1-init: [Thu Jan  1 00:05:39 UTC 1970] + true
+    [  340.488960] stage-1-init: [Thu Jan  1 00:05:39 UTC 1970] + mount /mnt-root/
+
+After almost 6 minutes, with 571 beats:
+
+    $ xxd -s 5905514496 -l 4 /dev/qdma34000-MM-1
+    5fff0000: 3b02 0000                                ;...
+
+It looks like the kernel is the one getting stuck *or* at least is unable to
+propagate the heartbeat changes to the host. It would be nice to monitor a
+hardware clock from the DMA region too, so we can discard problems in the AXI.<!--}}}-->
+
+
+    [  177.009540] stage-1-init: [Thu Jan  1 00:02:56 UTC 1970] + udevadm settle
+    + kbd_mode -u -C /dev/console
+    kbd_mode: KDSKBMODE: Inappropriate ioctl for device
+    + printf '\033%%G'
+    + loadkmap
+    [  266.301040] stage-1-init: [Thu Jan  1 00:04:25 UTC 1970] + kbd_mode -u -C /dev/console
+
+**Assumption**: The kernel hangs.<!--{{{-->
+
+If the kernel hangs, there must be an instruction or sequence of instructions
+that causes it. First I need to determine what is being executed by the kernel.
+For that I could use `ftrace` to see which program is running at the time it
+hangs.
+
+    trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M
+
+    (prev_comm != 2 && next_comm != 2)
+
+So, we can just enable the `tp_printk` but not the tracer. Then in the initrd
+script, I enable the function tracer and the filter.<!--}}}-->
+
+**Observation**: It takes a long time to init the pty:<!--{{{-->
+
+Interesting timing:
+
+    [   12.612620] initcall_start: func=pty_init+0x0/0x3f4
+    [   20.962640] initcall_finish: func=pty_init+0x0/0x3f4 ret=0
+<!--}}}-->
+**Observation**: The kcompactd0 daemon is using the CPU:<!--{{{-->
+
+    [  290.394920] sched_switch: prev_comm=devmem prev_pid=129 prev_prio=120 prev_state=R ==> next_comm=init next_pid=69 next_prio=120
+    [  290.408160] sched_switch: prev_comm=init prev_pid=69 prev_prio=120 prev_state=R ==> next_comm=tee next_pid=68 next_prio=120
+    [  290.420720] sched_switch: prev_comm=tee prev_pid=68 prev_prio=120 prev_state=R+ ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=120
+    [  290.433960] sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=120 prev_state=R ==> next_comm=init next_pid=1 next_prio=120
+    [  290.447100] sched_switch: prev_comm=init prev_pid=1 prev_prio=120 prev_state=R ==> next_comm=kcompactd0 next_pid=22 next_prio=120
+    [  290.460180] sched_switch: prev_comm=kcompactd0 prev_pid=22 prev_prio=120 prev_state=R ==> next_comm=khvcd next_pid=31 next_prio=120
+    [  290.473400] sched_switch: prev_comm=khvcd prev_pid=31 prev_prio=120 prev_state=R ==> next_comm=kworker/u2:2 next_pid=19 next_prio=120
+    [  290.486960] sched_switch: prev_comm=kworker/u2:2 prev_pid=19 prev_prio=120 prev_state=R ==> next_comm=khungtaskd next_pid=18 next_prio=120
+    [  290.500800] sched_switch: prev_comm=khungtaskd prev_pid=18 prev_prio=120 prev_state=R ==> next_comm=kworker/0:1 next_pid=13 next_prio=120
+    [  290.514560] sched_switch: prev_comm=kworker/0:1 prev_pid=13 prev_prio=120 prev_state=R ==> next_comm=tee next_pid=68 next_prio=120
+    [  290.527720] sched_switch: prev_comm=tee prev_pid=68 prev_prio=120 prev_state=R+ ==> next_comm=init next_pid=69 next_prio=120
+    [  290.540360] sched_switch: prev_comm=init prev_pid=69 prev_prio=120 prev_state=R ==> next_comm=devmem next_pid=129 next_prio=120
+    [  290.553280] sched_switch: prev_comm=devmem prev_pid=129 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=120
+    [  290.566780] sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=120 prev_state=R ==> next_comm=kcompactd0 next_pid=22 next_prio=120
+    [  290.580500] sched_switch: prev_comm=kcompactd0 prev_pid=22 prev_prio=120 prev_state=R ==> next_comm=init next_pid=1 next_prio=120
+    [  290.593740] sched_switch: prev_comm=init prev_pid=1 prev_prio=120 prev_state=R ==> next_comm=khvcd next_pid=31 next_prio=120
+    [  290.606340] sched_switch: prev_comm=khvcd prev_pid=31 prev_prio=120 prev_state=R ==> next_comm=kworker/u2:2 next_pid=19 next_prio=120
+    [  290.619780] sched_switch: prev_comm=kworker/u2:2 prev_pid=19 prev_prio=120 prev_state=R ==> next_comm=khungtaskd next_pid=18 next_prio=120
+    [  290.633620] sched_switch: prev_comm=khungtaskd prev_pid=18 prev_prio=120 prev_state=R ==> next_comm=tee next_pid=68 next_prio=120
+    [  290.646700] sched_switch: prev_comm=tee prev_pid=68 prev_prio=120 prev_state=R+ ==> next_comm=init next_pid=69 next_prio=120
+    [  290.659320] sched_switch: prev_comm=init prev_pid=69 prev_prio=120 prev_state=R ==> next_comm=kworker/0:1 next_pid=13 next_prio=120
+    [  290.672560] sched_switch: prev_comm=kworker/0:1 prev_pid=13 prev_prio=120 prev_state=R ==> next_comm=devmem next_pid=129 next_prio=120
+    [  290.686080] sched_switch: prev_comm=devmem prev_pid=129 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=120
+    [  290.699720] sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=120 prev_state=R ==> next_comm=init next_pid=1 next_prio=120
+    [  290.712880] sched_switch: prev_comm=init prev_pid=1 prev_prio=120 prev_state=R ==> next_comm=khvcd next_pid=31 next_prio=120
+    [  290.725500] sched_switch: prev_comm=khvcd prev_pid=31 prev_prio=120 prev_state=R ==> next_comm=kcompactd0 next_pid=22 next_prio=120
+<!--}}}-->
+**Question**: Can we reproduce this hang with 6.9.7?
+