Document hang missing hvc_remove trace point
parent 6721e1e22c
commit 4641e0d9a0
60
JOURNAL.md
@ -43,7 +43,7 @@ Nope, the u-boot is reporting the d extension is in the isa:
|
|||||||
|
|
||||||
> riscv,isa = "rv64imafd";
|
> riscv,isa = "rv64imafd";
|
||||||
|
|
||||||
## 2024-07-03
|
## 2024-07-03<!--{{{-->
|
||||||
|
|
||||||
I cannot switch to `gcc.arch = rv64ima` because rust fails to build.
|
I cannot switch to `gcc.arch = rv64ima` because rust fails to build.
|
||||||
|
|
||||||
@ -255,8 +255,8 @@ the plic follows a different convention of values. Using 9 and 11:
|
|||||||
**Remark**: The key combination to run Magic SysRq using the HVC console is<!--{{{-->
|
**Remark**: The key combination to run Magic SysRq using the HVC console is<!--{{{-->
|
||||||
Ctrl-O and then the SysRq key. It only works if the console is being actively
|
Ctrl-O and then the SysRq key. It only works if the console is being actively
|
||||||
polled, otherwise it hangs.<!--}}}-->
|
polled, otherwise it hangs.<!--}}}-->
|
||||||
|
<!--}}}-->
|
||||||
## 2024-07-04
|
## 2024-07-04<!--{{{-->
|
||||||
|
|
||||||
### OBSERVATION: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
|
### OBSERVATION: I saw they changed this option in Cinco Ranch DTS for the<!--{{{-->
|
||||||
serial:
|
serial:
|
||||||
@ -518,9 +518,11 @@ Can this produce any problem?
|
|||||||
It doesn't seem to change anything, still unable to send any bytes.
|
It doesn't seem to change anything, still unable to send any bytes.
|
||||||
|
|
||||||
<!--}}}-->
|
<!--}}}-->
|
||||||
### QUESTION: Can we use virtio to mount a FS in the DMA shared memory?
|
### QUESTION: Can we use virtio to mount a FS in the DMA shared memory?<!--{{{-->
|
||||||
|
|
||||||
## 2024-07-05
|
<!--}}}-->
|
||||||
|
<!--}}}-->
|
||||||
|
## 2024-07-05<!--{{{-->
|
||||||
|
|
||||||
### OBSERVATION: The kernel continues working when the console hangs.<!--{{{-->
|
### OBSERVATION: The kernel continues working when the console hangs.<!--{{{-->
|
||||||
|
|
||||||
@ -568,11 +570,10 @@ Then
|
|||||||
Yes, it seems to be working. Let's load the rootfs too.
|
Yes, it seems to be working. Let's load the rootfs too.
|
||||||
|
|
||||||
I added a loop in the stage1 script.<!--}}}-->
|
I added a loop in the stage1 script.<!--}}}-->
|
||||||
|
### QUESTION: Can we see any clock in memory?<!--{{{-->
|
||||||
### QUESTION: Can we see any clock in memory?
|
|
||||||
|
|
||||||
This will allow us to check if the AXI still works.
|
This will allow us to check if the AXI still works.
|
||||||
|
<!--}}}-->
|
||||||
### OBSERVATION: The kernel stops updating the counter in the mount phase.<!--{{{-->
|
### OBSERVATION: The kernel stops updating the counter in the mount phase.<!--{{{-->
|
||||||
|
|
||||||
Managed to reach the mount and hang there:
|
Managed to reach the mount and hang there:
|
||||||
@ -592,8 +593,10 @@ After almost 6 minutes, with 571 beats:
|
|||||||
|
|
||||||
It looks like the kernel is the one getting stuck *or* at least is unable to
|
It looks like the kernel is the one getting stuck *or* at least is unable to
|
||||||
propagate the heartbeat changes to the host. It would be nice to monitor a
|
propagate the heartbeat changes to the host. It would be nice to monitor a
|
||||||
hardware clock from the DMA region too, so we can discard problems in the AXI.<!--}}}-->
|
hardware clock from the DMA region too, so we can discard problems in the AXI.
|
||||||
|
|
||||||
|
<!--}}}-->
|
||||||
|
### OBSERVATION: There is an ioctl failed for /dev/console<!--{{{-->
|
||||||
|
|
||||||
[ 177.009540] stage-1-init: [Thu Jan 1 00:02:56 UTC 1970] + udevadm settle
|
[ 177.009540] stage-1-init: [Thu Jan 1 00:02:56 UTC 1970] + udevadm settle
|
||||||
+ kbd_mode -u -C /dev/console
|
+ kbd_mode -u -C /dev/console
|
||||||
@ -602,6 +605,7 @@ hardware clock from the DMA region too, so we can discard problems in the AXI.<!
|
|||||||
+ loadkmap
|
+ loadkmap
|
||||||
[ 266.301040] stage-1-init: [Thu Jan 1 00:04:25 UTC 1970] + kbd_mode -u -C /dev/console
|
[ 266.301040] stage-1-init: [Thu Jan 1 00:04:25 UTC 1970] + kbd_mode -u -C /dev/console
|
||||||
|
|
||||||
|
<!--}}}-->
|
||||||
### ASSUMPTION: The kernel hangs.<!--{{{-->
|
### ASSUMPTION: The kernel hangs.<!--{{{-->
|
||||||
|
|
||||||
If the kernel hangs, there must be an instruction or sequence of instructions
|
If the kernel hangs, there must be an instruction or sequence of instructions
|
||||||
@ -681,7 +685,6 @@ Disabling clang as it is failing to build:
|
|||||||
error: 1 dependencies of derivation '/nix/store/l2x18cih29r1kn6vi8imwhkyk98yhw4i-nix-shell-riscv64-unknown-linux-gnu-env.drv' failed to build
|
error: 1 dependencies of derivation '/nix/store/l2x18cih29r1kn6vi8imwhkyk98yhw4i-nix-shell-riscv64-unknown-linux-gnu-env.drv' failed to build
|
||||||
|
|
||||||
<!--}}}-->
|
<!--}}}-->
|
||||||
|
|
||||||
### QUESTION: Missing cache information may affect?<!--{{{-->
|
### QUESTION: Missing cache information may affect?<!--{{{-->
|
||||||
|
|
||||||
Other CPUs report the cache details in the DT. For example this one
|
Other CPUs report the cache details in the DT. For example this one
|
||||||
@ -716,9 +719,7 @@ https://github.com/torvalds/linux/blob/master/arch/riscv/boot/dts/sifive/fu540-c
|
|||||||
};
|
};
|
||||||
|
|
||||||
We may want to add it to our DT to be sure that it has no effect.<!--}}}-->
|
We may want to add it to our DT to be sure that it has no effect.<!--}}}-->
|
||||||
|
### OBSERVATION: Arrived to stage 2!<!--{{{-->
|
||||||
|
|
||||||
### OBSERVATION: Arrived to stage 2!
|
|
||||||
|
|
||||||
+ kill -9 74
|
+ kill -9 74
|
||||||
+ readlink /proc/75/exe
|
+ readlink /proc/75/exe
|
||||||
@ -756,13 +757,17 @@ We may want to add it to our DT to be sure that it has no effect.<!--}}}-->
|
|||||||
[ 425.302000] random: perl: uninitialized urandom read (4 bytes read)
|
[ 425.302000] random: perl: uninitialized urandom read (4 bytes read)
|
||||||
|
|
||||||
But then it hangs.
|
But then it hangs.
|
||||||
|
<!--}}}-->
|
||||||
|
<!--}}}-->
|
||||||
|
## 2024-07-08
|
||||||
|
|
||||||
### QUESTION: Who sets the plic interrupts?
|
### QUESTION: Who sets the plic interrupts?<!--{{{-->
|
||||||
|
|
||||||
Shouldn't OpenSBI read the DT and do some configuration in the plic while in
|
Shouldn't OpenSBI read the DT and do some configuration in the PLIC while in
|
||||||
machine mode?
|
machine mode?
|
||||||
|
|
||||||
### OBSERVATION: Semi-stack trace from CincoRanch
|
<!--}}}-->
|
||||||
|
### OBSERVATION: Semi-stack trace from CincoRanch<!--{{{-->
|
||||||
|
|
||||||
hvc_remove?
|
hvc_remove?
|
||||||
console_unlock <-- only called from hvc_remove()
|
console_unlock <-- only called from hvc_remove()
|
||||||
@ -797,8 +802,29 @@ machine mode?
|
|||||||
no_context.part.0
|
no_context.part.0
|
||||||
die_kernel_fault <-- last frame(?)
|
die_kernel_fault <-- last frame(?)
|
||||||
|
|
||||||
### QUESTION: Can we place a tracepoint in `hvc_remove`?
|
<!--}}}-->
|
||||||
|
### QUESTION: Can we place a trace point in `hvc_remove`?<!--{{{-->
|
||||||
|
|
||||||
If we are getting stuck in the same place, we should be able to see the
|
If we are getting stuck in the same place, we should be able to see the
|
||||||
backtrace (assuming the console still works) just before we try to remove the
|
backtrace (assuming the console still works) just before we try to remove the
|
||||||
console device.
|
console device.
|
||||||
|
|
||||||
|
Placed, but still unable to see anything in any hang. Here is a hang in the
|
||||||
|
Stage 2:
|
||||||
|
|
||||||
|
<<< NixOS Stage 2 >>>
|
||||||
|
|
||||||
|
[ 404.158340] EXT4-fs (pmem0p2): re-mounted 44444444-4444-4444-8888-888888888888 r/w. Quota mode: none.
|
||||||
|
[ 404.242500] booting system configuration /nix/store/0za1vqh5alk7mxqs59qxx8izmwmf21w6-nixos-system-nixos-riscv-24.11pre-git
|
||||||
|
running activation script...
|
||||||
|
[ 408.148380] stage-2-init: running activation script...
|
||||||
|
[ 411.612240] random: perl: uninitialized urandom read (4 bytes read)
|
||||||
|
[ 411.866440] random: perl: uninitialized urandom read (4 bytes read)
|
||||||
|
[ 447.588880] random: perl: uninitialized urandom read (4 bytes read)
|
||||||
|
|
||||||
|
Still, it may be hanging in a similar way, causing a loop of page faults just
|
||||||
|
while trying to printk to the console, which would explain why we don't see
|
||||||
|
anything and why the heartbeat stops.
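
To compare hangs across boots, it may help to measure how long the console stays silent between kernel messages. The following is a minimal sketch, not part of the original journal: it assumes the console output was captured as text with the usual `[ seconds.micros]` printk prefix, and the helper names `timestamps` and `largest_gap` are hypothetical. It only measures silence; it cannot tell a page-fault loop apart from any other stall.

```python
import re

# Matches the "[  447.588880]" prefix that printk puts at the start of a line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamps(log_text):
    """Return the printk timestamps (in seconds) found at the start of each line."""
    stamps = []
    for line in log_text.splitlines():
        match = STAMP.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def largest_gap(stamps):
    """Return (gap, start, end) for the longest silence between two stamped lines."""
    if len(stamps) < 2:
        return None
    return max((end - start, start, end) for start, end in zip(stamps, stamps[1:]))


# Lines copied from the Stage 2 hang above; unstamped lines are skipped.
LOG = """\
running activation script...
[ 408.148380] stage-2-init: running activation script...
[ 411.612240] random: perl: uninitialized urandom read (4 bytes read)
[ 411.866440] random: perl: uninitialized urandom read (4 bytes read)
[ 447.588880] random: perl: uninitialized urandom read (4 bytes read)
"""

if __name__ == "__main__":
    gap, start, end = largest_gap(timestamps(LOG))
    print(f"longest silence: {gap:.3f} s (from {start:.6f} to {end:.6f})")
```

The timestamp of the last stamped line also gives a lower bound on when the kernel stopped printing, which can be compared with the moment the heartbeat stopped.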
|
||||||
|
|
||||||
|
<!--}}}-->
|
||||||
|
|
||||||
|