Use headings to allow hrefs
parent
540780e508
commit
a7c460b034
87
JOURNAL.md
@ -53,17 +53,17 @@ CPU.
|
||||
Let's go back and try to get the initrd shell, so we can systematically hang it
|
||||
in the `switch_root`
|
||||
<!--}}}-->
|
||||
**Observation**: The riscv-timer seems to be causing interrupts with IRQ 5:<!--{{{-->
|
||||
### OBSERVATION: The riscv-timer seems to be causing interrupts with IRQ 5:<!--{{{-->
|
||||
|
||||
```
|
||||
[ 62.439060] irq_handler_entry: irq=5 name=riscv-timer
|
||||
[ 62.444980] irq_handler_exit: irq=5 ret=handled
|
||||
```
|
||||
<!--}}}-->
|
||||
**Observation**: Rohan reports the serial startup routine running *after*{{{
|
||||
### OBSERVATION: Rohan reports the serial startup routine running *after* <!--{{{-->
|
||||
the init begins.
|
||||
<!--}}}-->
|
||||
**Observation**: Only interrupts in timer, others are zero.<!--{{{-->
|
||||
### OBSERVATION: Only interrupts in timer, others are zero.<!--{{{-->
|
||||
|
||||
With:
|
||||
|
||||
@ -95,7 +95,7 @@ I can see this:
|
||||
IPI4: 0 IRQ work interrupts
|
||||
IPI5: 0 Timer broadcast interrupts
|
||||
<!--}}}-->
|
||||
**Observation**: There is a timer configured in 0x40170000 but in the device<!--{{{-->
|
||||
### OBSERVATION: There is a timer configured in 0x40170000 but in the device<!--{{{-->
|
||||
tree we only have one at `timer@40002000`.
|
||||
|
||||
#define OX_ALVEO_TIMER_BASE 0x40170000
|
||||
@ -106,7 +106,7 @@ tree we only have one at `timer@40002000`.
|
||||
|
||||
https://gitlab.bsc.es/hwdesign/bsc-linux/-/blob/d6d194bd30d9a8fe49c2a278ffb3c3ae7852e75d/bsc_tree/patches/ox_alveo/opensbi/0001-opensbi-ox_alveo-platform.patch#L63
|
||||
<!--}}}-->
|
||||
**Observation**: When the serial console starts, the speed of the serial port<!--{{{-->
|
||||
### OBSERVATION: When the serial console starts, the speed of the serial port<!--{{{-->
|
||||
changes to 9600:
|
||||
|
||||
[ 6.845400] io scheduler mq-deadline registered
|
||||
@ -181,13 +181,14 @@ That was my mistake as I need to put the baud speed in the ttyS0, like this:
|
||||
|
||||
console=ttyS0,115200n8
|
||||
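
The option string after the device name follows the kernel's `bbbbpnf` layout (baud, parity, bits, flow control), and the serial console falls back to `9600n8` when it is omitted, which fits the speed change seen above. A minimal Python sketch of that parsing, for illustration only; the helper `parse_console` is hypothetical and not part of any tool used here:

```python
import re

def parse_console(arg, default="9600n8"):
    """Split a `console=` value such as 'ttyS0,115200n8' into its fields."""
    device, _, options = arg.partition(",")
    # baud digits, optional parity (n/o/e), optional data bits, optional 'r' for RTS/CTS
    m = re.fullmatch(r"(\d+)([noe]?)(\d?)(r?)", options or default)
    if m is None:
        raise ValueError(f"unrecognised console options: {options!r}")
    baud, parity, bits, flow = m.groups()
    return {
        "device": device,
        "baud": int(baud),
        "parity": parity or "n",
        "bits": int(bits or 8),
        "rtscts": flow == "r",
    }

print(parse_console("ttyS0,115200n8"))
print(parse_console("ttyS0"))  # no options given: falls back to the default
```

The second call shows why a bare `console=ttyS0` ends up at 9600 baud.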
<!--}}}-->
|
||||
**Observation**: Trying to read from the serial console /dev/ttyS0 causes no<!--{{{-->
|
||||
### OBSERVATION: Trying to read from the serial console /dev/ttyS0 causes no<!--{{{-->
|
||||
more messages in the console (or a hang).
|
||||
<!--}}}-->
|
||||
**Question**: Can we make a heartbeat for the kernel? The idea is to keep a<!--{{{-->
|
||||
counter in some memory of the kernel so we can see it from the host being moved.
|
||||
### QUESTION: Can we make a heartbeat for the kernel? <!--{{{-->
|
||||
The idea is to keep a counter in kernel memory so we can watch it advance from
|
||||
the host.
|
||||
<!--}}}-->
|
||||
**Question**: Can we disable the serial driver 8250 from loading?<!--{{{-->
|
||||
### QUESTION: Can we disable the serial driver 8250 from loading?<!--{{{-->
|
||||
|
||||
initcall_blacklist=<driver_init>
|
||||
|
||||
@ -208,7 +209,7 @@ Yes, but that doesn't seem to do anything. It is hanging:
|
||||
[ 629.733920] stage-1-init: [Thu Jan 1 00:10:29 UTC 1970] + echo /nix/store/snvvqpxmryw1szlllk0bxpm37p8vj8sw-extra-utils/bin/modprobe
|
||||
|
||||
<!--}}}-->
|
||||
**Question**: What happens if we remap the interruptions?<!--{{{-->
|
||||
### QUESTION: What happens if we remap the interruptions?<!--{{{-->
|
||||
|
||||
- Move the serial from 0 to 1
|
||||
- Move the plic from 3 to 2 and remove 7
|
||||
@ -225,14 +226,14 @@ Rather than two:
|
||||
[ 0.000000] plic: plic@40800000: mapped 3 interrupts with 0 handlers for 2 contexts.
|
||||
[ 0.000000] riscv: providing IPIs using SBI IPI extension
|
||||
<!--}}}-->
|
||||
**Question**: What happens if we block the `sbi_ipi` driver?<!--{{{-->
|
||||
### QUESTION: What happens if we block the `sbi_ipi` driver?<!--{{{-->
|
||||
|
||||
initcall_blacklist=sbi_ipi_init
|
||||
|
||||
Nothing; it seems it cannot be disabled. I will remove SMP support so it won't
|
||||
be compiled in.
|
||||
<!--}}}-->
|
||||
**Observation**: Searching for 'riscv,plic0' only matches irq-sifive-plic driver.<!--{{{-->
|
||||
### OBSERVATION: Searching for 'riscv,plic0' only matches irq-sifive-plic driver.<!--{{{-->
|
||||
|
||||
hut% rg 'riscv,plic0'
|
||||
Documentation/devicetree/bindings/interrupt-controller/sifive,plic-1.0.0.yaml
|
||||
@ -244,7 +245,7 @@ be compiled in.
|
||||
So it looks like the only driver that sets up the PLIC is the one used by SiFive.
|
||||
Here is the doc: https://static.dev.sifive.com/U54-MC-RVCoreIP.pdf
|
||||
<!--}}}-->
|
||||
**Observation**: The number of handlers is 0, so there are no interruptions.<!--{{{-->
|
||||
### OBSERVATION: The number of handlers is 0, so there are no interruptions.<!--{{{-->
|
||||
|
||||
It seems the number next to the phandle of the interrupts-extended attribute in
|
||||
the plic follows a different convention of values. Using 9 and 11:
|
||||
@ -257,7 +258,7 @@ polled, otherwise it hangs.<!--}}}-->
|
||||
|
||||
## 2024-07-04
|
||||
|
||||
**Observation**: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
|
||||
### OBSERVATION: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
|
||||
serial:
|
||||
|
||||
> reg-shift = <0>; // regs are spaced on 8 bit boundary (modified from Xilinx UART16550 to be ns16550 compatible)
|
||||
@ -265,7 +266,7 @@ serial:
|
||||
Tested booting with debug1 and the ttyS0 console, and it runs extremely slowly
|
||||
(but still outputs at 115200) and then continues to fail to read keyboard input.
|
||||
<!--}}}-->
|
||||
**Question**: Let's try setting the console in poll mode.<!--{{{-->
|
||||
### QUESTION: Let's try setting the console in poll mode.<!--{{{-->
|
||||
|
||||
setenv bootargs "root=/dev/ram0 loglevel=7 debug rw earlycon=uart,io,0x40001000,115200n8 boot.trace console=uart,io,0x40001000,115200n8 debug1 init=/nix/store/wavmnv6wjj8y10ha07wxd5f0sqacivj8-nixos-system-nixos-riscv-23.11pre-git/init"
|
||||
|
||||
@ -304,17 +305,17 @@ setenv bootargs "root=/dev/ram0 loglevel=7 debug rw earlycon=uart,io,0x40001000,
|
||||
|
||||
Also found: `no_console_suspend`
|
||||
<!--}}}-->
|
||||
**Observation**: There are messages of address space being assigned to<!--{{{-->
|
||||
### OBSERVATION: There are messages of address space being assigned to<!--{{{-->
|
||||
registers:
|
||||
|
||||
Slave segment '/MEEP_uart_0/S_AXI/Reg' is being assigned into address space '/m_axi_uart0' at <0x0000_0000 [ 4K ]>.
|
||||
Slave segment '/MEEP_uart_1/S_AXI/Reg' is being assigned into address space '/m_axi_uart1' at <0x0000_0000 [ 4K ]>.
|
||||
<!--}}}-->
|
||||
**Question**: What happens if I enable `CONFIG_CONSOLE_POLL`?<!--{{{-->
|
||||
### QUESTION: What happens if I enable `CONFIG_CONSOLE_POLL`?<!--{{{-->
|
||||
|
||||
With `console=ttyS0,115200n8 debug1` I cannot type.
|
||||
<!--}}}-->
|
||||
**Observation**: I can dump iomem memory with the tool devmem:<!--{{{-->
|
||||
### OBSERVATION: I can dump iomem memory with the tool devmem:<!--{{{-->
|
||||
|
||||
But it seems I cannot dump the registers of the serial io mapped region:
|
||||
|
||||
@ -364,7 +365,7 @@ It works!
|
||||
~ # devmem 0x40001000
|
||||
0x0000000D
|
||||
<!--}}}-->
|
||||
**Observation**: The interrupt register of the serial console is 0x0:<!--{{{-->
|
||||
### OBSERVATION: The interrupt register of the serial console is 0x0:<!--{{{-->
|
||||
|
||||
Assuming the console registers follow AXI UART 16550, here is the IER:
|
||||
|
||||
@ -380,8 +381,7 @@ The line control register is 0x3:
|
||||
~ # devmem 0x4000100C
|
||||
0x00000003
|
||||
<!--}}}-->
|
||||
**Question**: Can I write to some memory address and see the result from the<!--{{{-->
|
||||
host?
|
||||
### QUESTION: Can I write to some memory address and see the result from the host?<!--{{{-->
|
||||
|
||||
For that I would need to find some address that is mapped to the DMA or to the
|
||||
pmem. Xavi recommended `0x6000_0000` as it is uncached.
|
||||
@ -467,18 +467,22 @@ But we don't see the same:
|
||||
[bsc015557@fpgan02 nixos]$ dd if=/dev/qdma34000-MM-1 count=16 bs=1 skip=$FPGACTL_KERNEL_ADDR 2>/dev/null | xxd
|
||||
00000000: 9797 9797 9797 9797 9797 9797 9797 9797 ................
|
||||
[bsc015557@fpgan02 nixos]$ dd if=/dev/qdma34000-MM-0 count=16 bs=1 skip=$FPGACTL_KERNEL_ADDR 2>/dev/null | xxd
|
||||
00000000: 9797 9797 9797 9797 9797 9797 9797 9797 ................<!--}}}-->
|
||||
**Question**: Missing forward M to S via Mideleg?<!--{{{-->
|
||||
00000000: 9797 9797 9797 9797 9797 9797 9797 9797 ................
|
||||
|
||||
<!--}}}-->
|
||||
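
The `dd ... bs=1 skip=$FPGACTL_KERNEL_ADDR count=16` reads above amount to "seek to a byte offset, then read 16 bytes". A rough Python equivalent is sketched below. The helper `read_at` is hypothetical, and whether the QDMA device node accepts a plain seek like this is an assumption; the demo runs on an ordinary temporary file instead.

```python
import os
import tempfile

def read_at(path, offset, count=16):
    """Return up to `count` bytes starting at byte `offset` of `path`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(count)

# Demo on a temporary file standing in for the device node.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(64)))
demo_path = tmp.name
print(read_at(demo_path, 32, 4).hex())
os.remove(demo_path)
```

Reading both device nodes through the same helper makes it easy to compare the two results byte for byte.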
### QUESTION: Missing forward M to S via Mideleg?<!--{{{-->
|
||||
|
||||
Could it be that MIDELEG is not forwarding the interrupts to the
|
||||
Supervisor (kernel)?
|
||||
|
||||
Boot HART MIDELEG : 0x0000000000000222
|
||||
Boot HART MEDELEG : 0x000000000000b109
|
||||
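
To read the two values reported by OpenSBI, here is a small Python sketch that lists which bits are set. The interrupt names follow the standard RISC-V cause numbering; the helper names are made up for this note.

```python
# Standard RISC-V interrupt cause numbers (bit positions in mip/mie/mideleg).
INTERRUPTS = {
    1: "supervisor software (SSIP)",
    3: "machine software (MSIP)",
    5: "supervisor timer (STIP)",
    7: "machine timer (MTIP)",
    9: "supervisor external (SEIP)",
    11: "machine external (MEIP)",
}

def set_bits(value):
    """Return the positions of the bits that are set in `value`."""
    return [bit for bit in range(value.bit_length()) if value >> bit & 1]

def delegated_interrupts(mideleg):
    """Name each interrupt whose delegation bit is set in `mideleg`."""
    return [INTERRUPTS.get(bit, f"bit {bit}") for bit in set_bits(mideleg)]

print(delegated_interrupts(0x222))  # MIDELEG from the boot log
print(set_bits(0xb109))             # MEDELEG: exception (not interrupt) causes
```

With `0x222`, bits 1, 5 and 9 are set, so the supervisor software, timer and external interrupts all appear to be delegated to S-mode. If that reading is right, a missing delegation is probably not the cause.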
|
||||
<!--}}}-->
|
||||
**Question**: Can we add a timer to the PLIC to test the interrupts?<!--{{{-->
|
||||
### QUESTION: Can we add a timer to the PLIC to test the interrupts?<!--{{{-->
|
||||
|
||||
<!--}}}-->
|
||||
**Observation**: Here is the PLIC register dump:<!--{{{-->
|
||||
### OBSERVATION: Here is the PLIC register dump:<!--{{{-->
|
||||
|
||||
~ # for i in `seq 0 16`; do addr=$((0x40600000 + $i)); printf '%08x: ' $addr; devmem $addr; done
|
||||
40600000: 0x00010002
|
||||
@ -499,12 +503,12 @@ Supervisor (kernel)?
|
||||
4060000f: 0x00000000
|
||||
40600010: 0x00000000
|
||||
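
Note that the loop above advances the address by one byte per iteration, so most of the reads are not aligned to a register boundary. Assuming the registers are 32 bits wide and word-aligned (as in the PLIC specification), a sketch of the addresses one would probably want to pass to `devmem` instead; `register_addresses` is a hypothetical helper:

```python
def register_addresses(base, count, stride=4):
    """Return `count` register addresses starting at `base`, `stride` bytes apart."""
    return [base + i * stride for i in range(count)]

for addr in register_addresses(0x40600000, 4):
    print(f"{addr:08x}")
```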
<!--}}}-->
|
||||
**Question**: Can we boot with the new bitstream that includes the second UART?<!--{{{-->
|
||||
### QUESTION: Can we boot with the new bitstream that includes the second UART?<!--{{{-->
|
||||
|
||||
Interrupts are enabled for UART 1, not the default UART 0.
|
||||
|
||||
<!--}}}-->
|
||||
**Observation**: I'm using 0x100 not 0x1000 in the serial range:<!--{{{-->
|
||||
### OBSERVATION: I'm using 0x100 not 0x1000 in the serial range:<!--{{{-->
|
||||
|
||||
reg = <0x0 0x40003000 0x0 0x100>;
|
||||
reg = <0x0 0x40003000 0x0 0x1000>;
|
||||
@ -512,12 +516,13 @@ The interruptions are enabled for the UART 1, not the default UART 0.
|
||||
Can this produce any problem?
|
||||
|
||||
It doesn't seem to change anything, still unable to send any bytes.
|
||||
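
For reference, a sketch of how the four `reg` cells decode, assuming `#address-cells = <2>` and `#size-cells = <2>` (which the four-cell form suggests, but the parent node is not shown here). The helper `decode_reg` is hypothetical.

```python
def decode_reg(cells, address_cells=2, size_cells=2):
    """Combine 32-bit device tree cells into an (address, size) pair."""
    def join(parts):
        value = 0
        for part in parts:
            value = (value << 32) | part
        return value
    address = join(cells[:address_cells])
    size = join(cells[address_cells:address_cells + size_cells])
    return address, size

for cells in ([0x0, 0x40003000, 0x0, 0x100], [0x0, 0x40003000, 0x0, 0x1000]):
    address, size = decode_reg(cells)
    print(f"{address:#x}..{address + size - 1:#x} ({size} bytes)")
```

Both ranges start at the same base and only differ in length. If the eight 16550 registers sit at the start of the range with 4-byte spacing they need 32 bytes, so either size would cover them, which is consistent with seeing no change.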
|
||||
<!--}}}-->
|
||||
**Question**: Can we use virtio to mount a FS in the DMA shared memory?
|
||||
### QUESTION: Can we use virtio to mount a FS in the DMA shared memory?
|
||||
|
||||
## 2024-07-05
|
||||
|
||||
**Observation**: The kernel continues working when the console hangs.<!--{{{-->
|
||||
### OBSERVATION: The kernel continues working when the console hangs.<!--{{{-->
|
||||
|
||||
Switching to 0x100000000 as 0x60000000 shows:
|
||||
|
||||
@ -540,8 +545,9 @@ Shows the kernel works:
|
||||
a0000000: 6700 0000 g...
|
||||
a0000000: 6800 0000 h...
|
||||
a0000000: 6900 0000 i...
|
||||
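
The dumps above can be turned back into counter values to check that the heartbeat really advances. A minimal Python sketch, assuming the counter is a 32-bit little-endian word and each `xxd` line covers exactly 4 bytes; `counter_from_xxd` is a hypothetical helper:

```python
def counter_from_xxd(line):
    """Decode the little-endian value from one `xxd` line of a 4-byte read."""
    _, _, rest = line.partition(": ")
    groups = rest.split()[:2]  # two 16-bit hex groups = 4 bytes
    return int.from_bytes(bytes.fromhex("".join(groups)), "little")

samples = [
    "a0000000: 6700 0000  g...",
    "a0000000: 6800 0000  h...",
    "a0000000: 6900 0000  i...",
]
values = [counter_from_xxd(s) for s in samples]
print(values)
print(all(b > a for a, b in zip(values, values[1:])))  # strictly increasing?
```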
|
||||
<!--}}}-->
|
||||
**Question**: Can we reproduce it with `switch_root`?<!--{{{-->
|
||||
### QUESTION: Can we reproduce it with `switch_root`?<!--{{{-->
|
||||
|
||||
For that I would have to ensure the process continues to operate, even if we
|
||||
exit the console. Maybe I can make a double fork?
|
||||
@ -563,10 +569,11 @@ Yes, it seems to be working. Let's load the rootfs too.
|
||||
|
||||
I added a loop in the stage1 script.<!--}}}-->
|
||||
|
||||
**Question**: Can we see any clock in memory? This will allow us to check if the
|
||||
AXI still works.
|
||||
### QUESTION: Can we see any clock in memory?
|
||||
|
||||
**Observation**: The kernel stops updating the counter in the mount phase.<!--{{{-->
|
||||
This will allow us to check if the AXI still works.
|
||||
|
||||
### OBSERVATION: The kernel stops updating the counter in the mount phase.<!--{{{-->
|
||||
|
||||
Managed to reach the mount and hang there:
|
||||
|
||||
@ -595,7 +602,7 @@ hardware clock from the DMA region too, so we can discard problems in the AXI.<!
|
||||
+ loadkmap
|
||||
[ 266.301040] stage-1-init: [Thu Jan 1 00:04:25 UTC 1970] + kbd_mode -u -C /dev/console
|
||||
|
||||
**Assumption**: The kernel hangs.<!--{{{-->
|
||||
### ASSUMPTION: The kernel hangs.<!--{{{-->
|
||||
|
||||
If the kernel hangs, there must be an instruction or sequence of instructions
|
||||
that causes it. First I need to determine what is being executed by the kernel.
|
||||
@ -607,16 +614,18 @@ hangs.
|
||||
(prev_comm != 2 && next_comm != 2)
|
||||
|
||||
So, we can just enable the `tp_printk` but not the tracer. Then in the initrd
|
||||
script, I enable the function tracer and the filter.<!--}}}-->
|
||||
script, I enable the function tracer and the filter.
|
||||
|
||||
**Observation**: It takes a long time to init the pty:<!--{{{-->
|
||||
<!--}}}-->
|
||||
### OBSERVATION: It takes a long time to init the pty:<!--{{{-->
|
||||
|
||||
Interesting timing:
|
||||
|
||||
[ 12.612620] initcall_start: func=pty_init+0x0/0x3f4
|
||||
[ 20.962640] initcall_finish: func=pty_init+0x0/0x3f4 ret=0
|
||||
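
The gap between the two timestamps can be computed directly from the trace. A small Python sketch that pairs `initcall_start` and `initcall_finish` lines by function name (the helper name is made up; `Decimal` avoids floating-point rounding in the subtraction):

```python
import re
from decimal import Decimal

LINE = re.compile(r"\[\s*(\d+\.\d+)\] initcall_(start|finish): func=(\S+)")

def initcall_durations(lines):
    """Map each initcall function to the seconds between its start and finish."""
    started, durations = {}, {}
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        stamp, kind, func = Decimal(m.group(1)), m.group(2), m.group(3)
        if kind == "start":
            started[func] = stamp
        elif func in started:
            durations[func] = stamp - started.pop(func)
    return durations

log = [
    "[   12.612620] initcall_start: func=pty_init+0x0/0x3f4",
    "[   20.962640] initcall_finish: func=pty_init+0x0/0x3f4 ret=0",
]
for func, seconds in initcall_durations(log).items():
    print(func, seconds)
```

For the two lines above this gives about 8.35 seconds spent in `pty_init`.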
|
||||
<!--}}}-->
|
||||
**Observation**: The kcompactd0 daemon is using the CPU:<!--{{{-->
|
||||
### OBSERVATION: The kcompactd0 daemon is using the CPU:<!--{{{-->
|
||||
|
||||
[ 290.394920] sched_switch: prev_comm=devmem prev_pid=129 prev_prio=120 prev_state=R ==> next_comm=init next_pid=69 next_prio=120
|
||||
[ 290.408160] sched_switch: prev_comm=init prev_pid=69 prev_prio=120 prev_state=R ==> next_comm=tee next_pid=68 next_prio=120
|
||||
@ -644,8 +653,9 @@ Interesting timing:
|
||||
[ 290.699720] sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=120 prev_state=R ==> next_comm=init next_pid=1 next_prio=120
|
||||
[ 290.712880] sched_switch: prev_comm=init prev_pid=1 prev_prio=120 prev_state=R ==> next_comm=khvcd next_pid=31 next_prio=120
|
||||
[ 290.725500] sched_switch: prev_comm=khvcd prev_pid=31 prev_prio=120 prev_state=R ==> next_comm=kcompactd0 next_pid=22 next_prio=120
|
||||
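
To back the observation with numbers, one could tally how often each task is switched in across the whole trace. A Python sketch under the assumption that task names contain no spaces; `count_switch_ins` is a hypothetical helper and the sample is only two lines from the log above:

```python
import re
from collections import Counter

SWITCH = re.compile(r"sched_switch: .*?==> next_comm=(\S+) next_pid=(\d+)")

def count_switch_ins(lines):
    """Count how many times each (comm, pid) is scheduled in."""
    counts = Counter()
    for line in lines:
        m = SWITCH.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts

sample = [
    "[  290.712880] sched_switch: prev_comm=init prev_pid=1 prev_prio=120 prev_state=R ==> next_comm=khvcd next_pid=31 next_prio=120",
    "[  290.725500] sched_switch: prev_comm=khvcd prev_pid=31 prev_prio=120 prev_state=R ==> next_comm=kcompactd0 next_pid=22 next_prio=120",
]
for (comm, pid), n in count_switch_ins(sample).most_common():
    print(comm, pid, n)
```

A count of switch-ins is only a rough proxy for CPU time; summing the intervals between consecutive events would be closer, but that needs a trace without gaps.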
|
||||
<!--}}}-->
|
||||
**Question**: Can we reproduce this hang with 6.9.7?
|
||||
### QUESTION: Can we reproduce this hang with 6.9.7?<!--{{{-->
|
||||
|
||||
Disabling clang as it is failing to build:
|
||||
|
||||
@ -669,3 +679,4 @@ Disabling clang as it is failing to build:
|
||||
error: 1 dependencies of derivation '/nix/store/b13shgqj7128rdsdzzp4qicqbzl0wnfw-system-path.drv' failed to build
|
||||
error: 1 dependencies of derivation '/nix/store/6qghlihqcyg6155309ldj5xm9m0v835i-nixos-system-nixos-riscv-24.11pre-git.drv' failed to build
|
||||
error: 1 dependencies of derivation '/nix/store/l2x18cih29r1kn6vi8imwhkyk98yhw4i-nix-shell-riscv64-unknown-linux-gnu-env.drv' failed to build
|
||||
<!--}}}-->
|
||||
|