Use headings to allow hrefs

Rodrigo Arias 2024-07-05 17:05:28 +02:00
parent 540780e508
commit a7c460b034


@ -53,17 +53,17 @@ CPU.
Let's go back and try to get the initrd shell, so we can systematically hang it
in the `switch_root`
<!--}}}-->
**Observation**: The riscv-timer seems to be causing interrupts with IRQ 5:<!--{{{-->
### OBSERVATION: The riscv-timer seems to be causing interrupts with IRQ 5:<!--{{{-->
```
[ 62.439060] irq_handler_entry: irq=5 name=riscv-timer
[ 62.444980] irq_handler_exit: irq=5 ret=handled
```
<!--}}}-->
**Observation**: Rohan reports the serial startup routine being running *after*{{{
### OBSERVATION: Rohan reports the serial startup routine being running *after* <!--{{{-->
the init begins.
<!--}}}-->
**Observation**: Only interrupts in timer, others are zero.<!--{{{-->
### OBSERVATION: Only interrupts in timer, others are zero.<!--{{{-->
With:
@ -95,7 +95,7 @@ I can see this:
IPI4: 0 IRQ work interrupts
IPI5: 0 Timer broadcast interrupts
<!--}}}-->
**Observation**: There is a timer configured in 0x40170000 but in the device<!--{{{-->
### OBSERVATION: There is a timer configured in 0x40170000 but in the device<!--{{{-->
tree we only have one at `timer@40002000`.
#define OX_ALVEO_TIMER_BASE 0x40170000
@ -106,7 +106,7 @@ tree we only have one at `timer@40002000`.
https://gitlab.bsc.es/hwdesign/bsc-linux/-/blob/d6d194bd30d9a8fe49c2a278ffb3c3ae7852e75d/bsc_tree/patches/ox_alveo/opensbi/0001-opensbi-ox_alveo-platform.patch#L63
<!--}}}-->
**Observation**: When the serial console starts, the speed of the serial port<!--{{{-->
### OBSERVATION: When the serial console starts, the speed of the serial port<!--{{{-->
changes to 9600:
[ 6.845400] io scheduler mq-deadline registered
@ -181,13 +181,14 @@ That was my mistake as I need to put the baud speed in the ttyS0, like this:
console=ttyS0,115200n8
<!--}}}-->
**Observation**: Trying to read from the serial console /dev/ttyS0 causes no<!--{{{-->
### OBSERVATION: Trying to read from the serial console /dev/ttyS0 causes no<!--{{{-->
more messages in the console (or a hang).
<!--}}}-->
**Question**: Can we make a heartbeat for the kernel? The idea is to keep a<!--{{{-->
counter in some memory of the kernel so we can see it from the host being moved.
### QUESTION: Can we make a heartbeat for the kernel? <!--{{{-->
The idea is to keep a counter in some memory of the kernel so we can see it from
the host being moved.
<!--}}}-->
**Question**: Can we disable the serial driver 8250 from loading?<!--{{{-->
### QUESTION: Can we disable the serial driver 8250 from loading?<!--{{{-->
initcall_blacklist=<driver_init>
@ -208,7 +209,7 @@ Yes, but that doesn't seem to do anything. It is hanging:
[ 629.733920] stage-1-init: [Thu Jan 1 00:10:29 UTC 1970] + echo /nix/store/snvvqpxmryw1szlllk0bxpm37p8vj8sw-extra-utils/bin/modprobe
<!--}}}-->
**Question**: What happens if we remap the interruptions?<!--{{{-->
### QUESTION: What happens if we remap the interruptions?<!--{{{-->
- Move the serial from 0 to 1
- Move the plic from 3 to 2 and remove 7
@ -225,14 +226,14 @@ Rather than two:
[ 0.000000] plic: plic@40800000: mapped 3 interrupts with 0 handlers for 2 contexts.
[ 0.000000] riscv: providing IPIs using SBI IPI extension
<!--}}}-->
**Question**: What happens if we block the `sbi_ipi` driver?<!--{{{-->
### QUESTION: What happens if we block the `sbi_ipi` driver?<!--{{{-->
initcall_blacklist=sbi_ipi_init
Nothing, it cannot be disabled it seems. I will remove SMP support so it won't
be compiled in.
<!--}}}-->
**Observation**: Searching for 'riscv,plic0' only matches irq-sifive-plic driver.<!--{{{-->
### OBSERVATION: Searching for 'riscv,plic0' only matches irq-sifive-plic driver.<!--{{{-->
hut% rg 'riscv,plic0'
Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml
@ -244,7 +245,7 @@ be compiled in.
So it looks like the only driver that sets up the plic is the one used by SiFive.
Here is the doc: https://static.dev.sifive.com/U54-MC-RVCoreIP.pdf
<!--}}}-->
**Observation**: The number of handlers is 0, so there are no interruptions.<!--{{{-->
### OBSERVATION: The number of handlers is 0, so there are no interruptions.<!--{{{-->
It seems the number next to the phandle of the interrupts-extended attribute in
the plic follows a different convention of values. Using 9 and 11:
@ -257,7 +258,7 @@ polled, otherwise it hangs.<!--}}}-->
## 2024-07-04
**Observation**: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
### OBSERVATION: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
serial:
> reg-shift = <0>; // regs are spaced on 8 bit boundary (modified from Xilinx UART16550 to be ns16550 compatible)
@ -265,7 +266,7 @@ serial:
Tested booting with debug1 and the ttyS0 console, and it goes extremely slow
(but still outputs at 115200) and then continues to fail to read keyboard input.
<!--}}}-->
**Question**: Let's try setting the console in poll mode.<!--{{{-->
### QUESTION: Let's try setting the console in poll mode.<!--{{{-->
setenv bootargs "root=/dev/ram0 loglevel=7 debug rw earlycon=uart,io,0x40001000,115200n8 boot.trace console=uart,io,0x40001000,115200n8 debug1 init=/nix/store/wavmnv6wjj8y10ha07wxd5f0sqacivj8-nixos-system-nixos-riscv-23.11pre-git/init"
@ -304,17 +305,17 @@ setenv bootargs "root=/dev/ram0 loglevel=7 debug rw earlycon=uart,io,0x40001000,
Also found: `no_console_suspend`
<!--}}}-->
**Observation**: There are messages of address space being assigned to<!--{{{-->
### OBSERVATION: There are messages of address space being assigned to<!--{{{-->
registers:
Slave segment '/MEEP_uart_0/S_AXI/Reg' is being assigned into address space '/m_axi_uart0' at <0x0000_0000 [ 4K ]>.
Slave segment '/MEEP_uart_1/S_AXI/Reg' is being assigned into address space '/m_axi_uart1' at <0x0000_0000 [ 4K ]>.
<!--}}}-->
**Question**: What happens if I enable `CONFIG_CONSOLE_POLL`?<!--{{{-->
### QUESTION: What happens if I enable `CONFIG_CONSOLE_POLL`?<!--{{{-->
With `console=ttyS0,115200n8 debug1` I cannot type.
<!--}}}-->
**Observation**: I can dump iomem memory with the tool devmem:<!--{{{-->
### OBSERVATION: I can dump iomem memory with the tool devmem:<!--{{{-->
But it seems I cannot dump the registers of the serial io mapped region:
@ -364,7 +365,7 @@ It works!
~ # devmem 0x40001000
0x0000000D
<!--}}}-->
**Observation**: The interrupt register of the serial console is 0x0:<!--{{{-->
### OBSERVATION: The interrupt register of the serial console is 0x0:<!--{{{-->
Assuming the console registers follow AXI UART 16550, here is the IER:
@ -380,8 +381,7 @@ The line control register is 0x3:
~ # devmem 0x4000100C
0x00000003
<!--}}}-->
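
The two values read above can be decoded by hand, but a small helper makes it harder to misread a bit. This is only a sketch, assuming the console really follows the standard 16550 register layout (the helper names `decode_ier` and `decode_lcr` are made up for illustration):

```python
# Standard 16550 Interrupt Enable Register bits (low nibble).
IER_BITS = {
    0: "received data available",
    1: "transmitter holding register empty",
    2: "receiver line status",
    3: "modem status",
}


def decode_ier(ier):
    """Return the names of the interrupt sources enabled in the IER."""
    return [name for bit, name in IER_BITS.items() if (ier >> bit) & 1]


def decode_lcr(lcr):
    """Split the Line Control Register into its main fields."""
    return {
        "word_length": 5 + (lcr & 0x3),      # 0b11 means 8 data bits
        "extra_stop_bit": bool(lcr & 0x4),   # clear means 1 stop bit
        "parity_enabled": bool(lcr & 0x8),
        "dlab": bool(lcr & 0x80),            # divisor latch access bit
    }


print(decode_ier(0x0))   # IER value read with devmem
print(decode_lcr(0x3))   # LCR value read with devmem
```

Under that assumption, an IER of 0x0 means no UART interrupt source is enabled, and an LCR of 0x3 is the usual 8 data bits, no parity, 1 stop bit.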
**Question**: Can I write to some memory address and see the result from the<!--{{{-->
host?
### QUESTION: Can I write to some memory address and see the result from the host?<!--{{{-->
For that I would need to find some address that is mapped to the DMA or to the
pmem. Xavi recommended `0x6000_0000` as it is uncached.
@ -467,18 +467,22 @@ But we don't see the same:
[bsc015557@fpgan02 nixos]$ dd if=/dev/qdma34000-MM-1 count=16 bs=1 skip=$FPGACTL_KERNEL_ADDR 2>/dev/null | xxd
00000000: 9797 9797 9797 9797 9797 9797 9797 9797 ................
[bsc015557@fpgan02 nixos]$ dd if=/dev/qdma34000-MM-0 count=16 bs=1 skip=$FPGACTL_KERNEL_ADDR 2>/dev/null | xxd
00000000: 9797 9797 9797 9797 9797 9797 9797 9797 ................<!--}}}-->
**Question**: Missing forward M to S via Mideleg?<!--{{{-->
00000000: 9797 9797 9797 9797 9797 9797 9797 9797 ................
<!--}}}-->
### QUESTION: Missing forward M to S via Mideleg?<!--{{{-->
Can it be happening that the MEDELEG is not forwarding the interruptions to the
Supervisor (kernel)?
Boot HART MIDELEG : 0x0000000000000222
Boot HART MEDELEG : 0x000000000000b109
<!--}}}-->
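
To check which causes are actually delegated, the two values can be decoded bit by bit. A minimal sketch, using the standard cause numbers from the RISC-V privileged specification (the tables only list the bits that appear in these two values, and the helper names are hypothetical):

```python
# Interrupt cause numbers (bit positions in mideleg).
INTERRUPT_CAUSES = {
    1: "supervisor software interrupt",
    5: "supervisor timer interrupt",
    9: "supervisor external interrupt",
}

# Exception cause numbers (bit positions in medeleg).
EXCEPTION_CAUSES = {
    0: "instruction address misaligned",
    3: "breakpoint",
    8: "environment call from U-mode",
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def decode(value, names):
    """Map every set bit to its cause name."""
    return {bit: names.get(bit, "unknown/reserved") for bit in set_bits(value)}


mideleg = decode(0x222, INTERRUPT_CAUSES)
medeleg = decode(0xB109, EXCEPTION_CAUSES)

for bit, name in mideleg.items():
    print(f"mideleg bit {bit}: {name}")
for bit, name in medeleg.items():
    print(f"medeleg bit {bit}: {name}")
```

If this decoding is right, MIDELEG has bits 1, 5 and 9 set, so the supervisor external interrupt appears to be delegated. That alone does not say whether the PLIC ever raises it.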
**Question**: Can we add a timer to the PLIC to test the interrupts?<!--{{{-->
### QUESTION: Can we add a timer to the PLIC to test the interrupts?<!--{{{-->
<!--}}}-->
**Observation**: Here is the PLIC register dump:<!--{{{-->
### OBSERVATION: Here is the PLIC register dump:<!--{{{-->
~ # for i in `seq 0 16`; do addr=$((0x40600000 + $i)); printf '%08x: ' $addr; devmem $addr; done
40600000: 0x00010002
@ -499,12 +503,12 @@ Supervisor (kernel)?
4060000f: 0x00000000
40600010: 0x00000000
<!--}}}-->
**Question**: Can we boot with the new bitstream that includes the second UART?<!--{{{-->
### QUESTION: Can we boot with the new bitstream that includes the second UART?<!--{{{-->
The interruptions are enabled for the UART 1, not the default UART 0.
<!--}}}-->
**Observation**: I'm using 0x100 not 0x1000 in the serial range:<!--{{{-->
### OBSERVATION: I'm using 0x100 not 0x1000 in the serial range:<!--{{{-->
reg = <0x0 0x40003000 0x0 0x100>;
reg = <0x0 0x40003000 0x0 0x1000>;
@ -512,12 +516,13 @@ The interruptions are enabled for the UART 1, not the default UART 0.
Can this produce any problem?
It doesn't seem to change anything, still unable to send any bytes.
<!--}}}-->
**Question**: Can we use virtio to mount a FS in the DMA shared memory?
### QUESTION: Can we use virtio to mount a FS in the DMA shared memory?
## 2024-07-05
**Observation**: The kernel continues working when the console hangs.<!--{{{-->
### OBSERVATION: The kernel continues working when the console hangs.<!--{{{-->
Switching to 0x100000000 as 0x60000000 shows:
@ -540,8 +545,9 @@ Shows the kernel works:
a0000000: 6700 0000 g...
a0000000: 6800 0000 h...
a0000000: 6900 0000 i...
<!--}}}-->
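
The host-side check could be automated instead of reading the dumps by eye. A rough sketch, assuming the counter is a 32-bit little-endian word and that each sample is one `xxd` line like the ones above (`parse_xxd_word` and `is_alive` are hypothetical helpers, and counter wrap-around is not handled):

```python
def parse_xxd_word(line):
    """Parse one xxd line holding a 4-byte little-endian counter."""
    _, rest = line.split(":", 1)
    hex_groups = rest.split()[:2]            # e.g. ["6700", "0000"]
    raw = bytes.fromhex("".join(hex_groups))
    return int.from_bytes(raw, "little")


def is_alive(samples):
    """True if every sample is strictly larger than the previous one."""
    values = [parse_xxd_word(s) for s in samples]
    if len(values) < 2:
        return False                         # not enough data to decide
    return all(b > a for a, b in zip(values, values[1:]))


samples = [
    "a0000000: 6700 0000                                g...",
    "a0000000: 6800 0000                                h...",
    "a0000000: 6900 0000                                i...",
]
print(is_alive(samples))
```

Feeding it the same line twice returns `False`, which is the condition to watch for when the console hangs.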
**Question**: Can we reproduce it with `switch_root`?<!--{{{-->
### QUESTION: Can we reproduce it with `switch_root`?<!--{{{-->
For that I would have to ensure the process continues to operate, even if we
exit the console. Maybe I can make a double fork?
@ -563,10 +569,11 @@ Yes, it seems to be working. Let's load the rootfs too.
I added a loop in the stage1 script.<!--}}}-->
**Question**: Can we see any clock in memory? This will allow us to check if the
AXI still works.
### QUESTION: Can we see any clock in memory?
**Observation**: The kernel stops updating the counter in the mount phase.<!--{{{-->
This will allow us to check if the AXI still works.
### OBSERVATION: The kernel stops updating the counter in the mount phase.<!--{{{-->
Managed to reach the mount and hang there:
@ -595,7 +602,7 @@ hardware clock from the DMA region too, so we can discard problems in the AXI.<!
+ loadkmap
[ 266.301040] stage-1-init: [Thu Jan 1 00:04:25 UTC 1970] + kbd_mode -u -C /dev/console
**Assumption**: The kernel hangs.<!--{{{-->
### ASSUMPTION: The kernel hangs.<!--{{{-->
If the kernel hangs, there must be an instruction or sequence of instructions
that causes it. First I need to determine what is being executed by the kernel.
@ -607,16 +614,18 @@ hangs.
(prev_comm != 2 && next_comm != 2)
So, we can just enable the `tp_printk` but not the tracer. Then in the initrd
script, I enable the function tracer and the filter.<!--}}}-->
script, I enable the function tracer and the filter.
**Observation**: It takes a long time to init the pty:<!--{{{-->
<!--}}}-->
### OBSERVATION: It takes a long time to init the pty:<!--{{{-->
Interesting timing:
[ 12.612620] initcall_start: func=pty_init+0x0/0x3f4
[ 20.962640] initcall_finish: func=pty_init+0x0/0x3f4 ret=0
<!--}}}-->
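
To spot other slow initcalls without scanning the log by eye, the start and finish timestamps can be paired up. A small sketch, assuming the log lines keep the `initcall_start` / `initcall_finish` format shown above (the function name `initcall_durations` is made up):

```python
import re
from decimal import Decimal

# Matches "[   12.612620] initcall_start: func=pty_init+0x0/0x3f4".
LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+initcall_(start|finish): func=(\S+)")


def initcall_durations(log_lines):
    """Return {func: seconds} for every start/finish pair in the log."""
    started = {}
    durations = {}
    for line in log_lines:
        match = LINE.search(line)
        if not match:
            continue
        stamp = Decimal(match.group(1))
        kind, func = match.group(2), match.group(3)
        if kind == "start":
            started[func] = stamp
        elif func in started:
            durations[func] = stamp - started.pop(func)
    return durations


log = [
    "[   12.612620] initcall_start: func=pty_init+0x0/0x3f4",
    "[   20.962640] initcall_finish: func=pty_init+0x0/0x3f4 ret=0",
]
for func, seconds in initcall_durations(log).items():
    print(func, seconds)
```

`Decimal` is used so the subtraction of the printed timestamps stays exact. For the two lines above, `pty_init` takes about 8.35 s.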
**Observation**: The kcompactd0 daemon is using the CPU:<!--{{{-->
### OBSERVATION: The kcompactd0 daemon is using the CPU:<!--{{{-->
[ 290.394920] sched_switch: prev_comm=devmem prev_pid=129 prev_prio=120 prev_state=R ==> next_comm=init next_pid=69 next_prio=120
[ 290.408160] sched_switch: prev_comm=init prev_pid=69 prev_prio=120 prev_state=R ==> next_comm=tee next_pid=68 next_prio=120
@ -644,8 +653,9 @@ Interesting timing:
[ 290.699720] sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=120 prev_state=R ==> next_comm=init next_pid=1 next_prio=120
[ 290.712880] sched_switch: prev_comm=init prev_pid=1 prev_prio=120 prev_state=R ==> next_comm=khvcd next_pid=31 next_prio=120
[ 290.725500] sched_switch: prev_comm=khvcd prev_pid=31 prev_prio=120 prev_state=R ==> next_comm=kcompactd0 next_pid=22 next_prio=120
<!--}}}-->
**Question**: Can we reproduce this hang with 6.9.7?
### QUESTION: Can we reproduce this hang with 6.9.7?<!--{{{-->
Disabling clang as it is failing to build:
@ -669,3 +679,4 @@ Disabling clang as it is failing to build:
error: 1 dependencies of derivation '/nix/store/b13shgqj7128rdsdzzp4qicqbzl0wnfw-system-path.drv' failed to build
error: 1 dependencies of derivation '/nix/store/6qghlihqcyg6155309ldj5xm9m0v835i-nixos-system-nixos-riscv-24.11pre-git.drv' failed to build
error: 1 dependencies of derivation '/nix/store/l2x18cih29r1kn6vi8imwhkyk98yhw4i-nix-shell-riscv64-unknown-linux-gnu-env.drv' failed to build
<!--}}}-->