Update journal with PLIC experiments

2024-08-01 19:29:04 +02:00 · 2024-08-01 19:29:04 +02:00 · cd7eb7179f
commit cd7eb7179f
parent eee26f2b4d
1 changed files with 665 additions and 0 deletions
--- a/JOURNAL.md
+++ b/JOURNAL.md
@@ -2588,4 +2588,669 @@ This one yes:
    [  165.066220] mm_page_alloc: page=(____ptrval____) pfn=0x83d28 order=2 migratetype=0 gfp_flags=GFP_KERNEL
    [  165.066540] console: mm_page_alloc: page=(____ptrval____) pfn=0x83d28 order=2 migratetype=0 gfp_flags=GFP_KERNEL

+## 2024-08-01

+Now that we have a new bitstream with a CLINT connected to a PLIC input, we may
+be able to generate an interrupt.
+
+Here is the comment where I gather the pieces:
+
+---8<---
+
+From https://gitlab.bsc.es/hwdesign/rtl/core-tile/sa-fpga/ I can see that the
+auxiliary timer [is in fact another
+CLINT](https://gitlab.bsc.es/hwdesign/rtl/core-tile/sa-fpga/-/blob/10ba8b2a11ef105d7cda065e13838a3d28f3c951/fpga_core_bridge/rtl/fpga_core_bridge.sv#L685).
+
+I don't have access to the [hlib
+repository](https://gitlab.bsc.es/hwdesign/hlib.git) (@jmendoza can I get access
+to it?) to see the CLINT definition, but based on [this
+CLINT](https://github.com/openhwgroup/cva6/blob/master/corev_apu/clint/clint.sv)
+and [this one](https://github.com/pulp-platform/clint/blob/master/src/clint.sv)
+I can estimate some of the previous information:
+
+> - The information on which port number of the PLIC the timer is connected to.
+
+https://gitlab.bsc.es/hwdesign/rtl/core-tile/sa-fpga/-/blob/main/fpga_core_bridge/rtl/fpga_core_bridge.sv#L1114
+
+```
+        plic #(
+            .PARAMETER_BITWIDTH (7),
+            .NUM_TARGETS        (1),
+            .NUM_SOURCES        (4)
+        ) plic_inst (
+            .clk_i         (clk_i),
+            .rstn_i        (reset),
+            .irq_sources_i ({plic_timer_eirq,eth_irq,uart1_irq}), 
+            .eip_targets_o (irq),
+```
+
+If I read it from right to left starting at 1, it should be **at 4**, as the
+`eth_irq` has two "wires".
+
+
+> - The memory address of the timer and the mapped registers, so I can see it
+>   increasing its value. I think the `aux_timer` you had in the past would be
+>   fine.
+
+https://gitlab.bsc.es/hwdesign/rtl/core-tile/sa-fpga/-/blob/main/fpga_core_bridge/rtl/local_includes/defines.svh#L33-36
+
+```
+//Size: 64KB
+`define AUX_TIMER_XBAR_ID 2
+`define AUX_TIMER_BASE_ADDR 64'h0000_0000_4001_0000 // Need to be this space because we use a clint as aux timer
+`define AUX_TIMER_END_ADDR  64'h0000_0000_4001_FFFF
+```
+
+> - The specific operations I need to do in machine mode to configure the timer
+>   to fire at 1 Hz (probably setting two registers).
+
+Based on the source of the CLINT, **only one interrupt will be generated** after
+setting the mtimecmp register to something larger than the mtime register. Then
+I suspect I would have to make the interrupt run some code to rearm it again by
+modifying the mtimecmp register to some value in the future:
+
+```
+// -----------------------------
+// IRQ Generation
+// -----------------------------
+// The mtime register has a 64-bit precision on all RV32, RV64, and RV128 systems. Platforms provide a 64-bit
+// memory-mapped machine-mode timer compare register (mtimecmp), which causes a timer interrupt to be posted when the
+// mtime register contains a value greater than or equal (mtime >= mtimecmp) to the value in the mtimecmp register.
+// The interrupt remains posted until it is cleared by writing the mtimecmp register. The interrupt will only be taken
+// if interrupts are enabled and the MTIE bit is set in the mie register.
+always_comb begin : irq_gen
+    // check that the mtime cmp register is set to a meaningful value
+    for (int unsigned i = 0; i < NR_CORES; i++) begin
+        if (mtime_q >= mtimecmp_q[i]) begin
+            timer_irq_o[i] = 1'b1;
+        end else begin
+            timer_irq_o[i] = 1'b0;
+        end
+    end
+end
+```
+
+I could ensure that an interrupt has been fired by reading the mtime and
+mtimecmp values, and checking that mtime > mtimecmp.
+
+Now I only need to find a bitstream that has been generated with
+https://gitlab.bsc.es/hwdesign/rtl/core-tile/sa-fpga/-/commit/10ba8b2a11ef105d7cda065e13838a3d28f3c951.
+
+
+This may work:
+
+https://gitlab.bsc.es/hwdesign/fpga/integration-lab/fpga-shell/-/jobs/968583/raw
+
+> Submodule path 'sa-fpga': checked out '12b77cb50cf1c416f107d4c7ab1c52d7b5e59056'
+
+Which is based on fpga-shell https://gitlab.bsc.es/hwdesign/fpga/integration-lab/fpga-shell/-/commit/01265d197f256bce2c7e82d21c7f4bf5dcb44e68
+
+Here is the bitstream job: https://gitlab.bsc.es/hwdesign/fpga/integration-lab/fpga-shell/-/jobs/968585
+
+And the bitstream: [artifacts.zip](/uploads/d8240a779cd485771b9e3d0147e342d1/artifacts.zip)
+
+And full log: [job.log](/uploads/a4215e4d039065b77f7a2d2b1403e475/job.log)
+
+The memory map would need a bit of adjustment in the device tree, but to play with the timer in machine mode not much is needed.
+
+I think I have all the pieces now.
+
+---8<---
+
+I will try with the last bitstream that I already had compiled, as I will have
+to rebuild the required packages in nix.
+
+To compute the memory position of the registers:
+
+    `define AUX_TIMER_XBAR_ID 2
+    `define AUX_TIMER_BASE_ADDR 64'h0000_0000_4001_0000 // Need to be this space because we use a clint as aux timer
+    `define AUX_TIMER_END_ADDR  64'h0000_0000_4001_FFFF
+
+    localparam logic [15:0] MSIP_BASE     = 16'h0;
+    localparam logic [15:0] MTIMECMP_BASE = 16'h4000;
+    localparam logic [15:0] MTIME_BASE    = 16'hbff8;
+
+So, the base address 0x40010000 and the first MTIME at 0xbff8 would give us a
+timer at 0x4001bff8.
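
The same arithmetic as a quick Python check. This is only a sketch: the offsets are copied from the `localparam` lines above, the helper name `clint_regs` is made up for illustration, and it assumes a single hart, so only the first MTIMECMP slot is computed.

```python
# Offsets copied from the CLINT localparams quoted above.
AUX_TIMER_BASE = 0x4001_0000
MSIP_BASE = 0x0
MTIMECMP_BASE = 0x4000
MTIME_BASE = 0xBFF8


def clint_regs(base=AUX_TIMER_BASE):
    """Return the absolute addresses of the hart 0 CLINT registers."""
    return {
        "msip": base + MSIP_BASE,
        "mtimecmp": base + MTIMECMP_BASE,
        "mtime": base + MTIME_BASE,
    }


for name, addr in clint_regs().items():
    print(f"{name:8s} {addr:#010x}")
```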
+
+Here it is:
+
+    => md 0x4001bff8 1
+    4001bff8: 006e65b8                             .en.
+    => md 0x4001bff8 1
+    4001bff8: 006e9a26                             &.n.
+    => md 0x4001bff8 1
+    4001bff8: 006ebae1                             ..n.
+    => md 0x4001bff8 1
+    4001bff8: 006eda45                             E.n.
+    => md 0x4001bff8 1
+    4001bff8: 006ef9d4                             ..n.
+    => md 0x4001bff8 1
+    4001bff8: 006f1abb                             ..o.
+
+Now, the MTIMECMP should be at 0x40014000, which should be 0.
+
+    => md 0x40014000 1
+    40014000: 00000000                             ....
+
+Good.
+
+Now, I suspect the MSIP is not used, so it should be 0 at 0x40010000 too:
+
+    => md 0x40010000 1
+    40010000: 00000000                             ....
+
+Nice.
+
+Just for testing, let's see if I can make the timer cause any change in the MSIP
+register by setting the MTIMECMP to a value:
+
+    => mw 0x40014000 0x01700000 # Write the MTIMECMP
+    => md 0x40014000 1
+    40014000: 01700000                             ..p.
+    => md 0x4001bff8 1
+    4001bff8: 016da81a                             ..m.
+    => md 0x40010000 1
+    40010000: 00000000                             ....
+    => md 0x4001bff8 1
+    4001bff8: 016f947c                             |.o.
+    => md 0x4001bff8 1
+    4001bff8: 016fff96                             ..o.
+    => md 0x4001bff8 1
+    4001bff8: 01704367                             gCp. # Now we passed it
+    => md 0x40010000 1
+    40010000: 00000000                             .... # But MSIP is still 0
+
+As expected, nothing happens. We cannot monitor the interrupt line from the
+timer itself.
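
For reference, the comparison done by the `irq_gen` block quoted earlier can be modelled in a few lines of Python. This is a sketch under the assumption that the hlib CLINT behaves like the CVA6 one (which I could not confirm), and the function name `timer_irq` is mine. Note that `md ... 1` only shows the low 32 bits of the 64-bit registers, so the samples below assume the high words are 0.

```python
def timer_irq(mtime, mtimecmp):
    """Level of the timer interrupt line: high while mtime >= mtimecmp."""
    return mtime >= mtimecmp


MTIMECMP = 0x01700000
# mtime samples taken from the session above
for mtime in (0x016DA81A, 0x016F947C, 0x016FFF96, 0x01704367):
    print(f"{mtime:#010x} -> irq={timer_irq(mtime, MTIMECMP)}")
```

Only the last sample is past the compare value, which matches the point where I expected the line to go high.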
+
+Now, let's see if we can inspect the state of the PLIC.
+
+From the `plic_interface` I can see where the memory addresses of the
+exposed registers are.
+
+The PLIC is mapped here:
+
+    //Size: 4MB
+    `define PLIC_XBAR_ID 5
+    `define PLIC_BASE_ADDR 64'h0000_0000_4080_0000
+    `define PLIC_END_ADDR  64'h0000_0000_40BF_FFFF
+
+There are several ways in which the interrupts are not forwarded to the
+destination, and several destinations. The PLIC specification is a good resource
+to understand it:
+
+    https://github.com/riscv/riscv-plic-spec
+
+This is important:
+
+> The interrupt gateways are responsible for converting global interrupt signals
+> into a common interrupt request format, and for controlling the flow of
+> interrupt requests to the PLIC core. At most one interrupt request per
+> interrupt source can be pending in the PLIC core at any time, indicated by
+> setting the source’s IP bit. The gateway only forwards a new interrupt request
+> to the PLIC core after receiving notification that the interrupt handler
+> servicing the previous interrupt request from the same source has completed.
+
+So, there cannot be any pending interrupt, otherwise no more interrupts will be
+sent to the core.
+
+Assuming the PLIC uses the standard memory layout, we should find:
+
+    base + 0x000000: Reserved (interrupt source 0 does not exist)
+    base + 0x000004: Interrupt source 1 priority
+    base + 0x000008: Interrupt source 2 priority
+
+These should begin at 0x40800000.
+
+    => md 0x40800000 8
+    40800000: 00000000 00000000 00000000 00000000  ................
+    40800010: 00000000 00000000 00000000 00000000  ................
+
+All the priorities are set to 0.
+
+Let's see the pending interrupts:
+
+    base + 0x000FFC: Interrupt source 1023 priority
+    base + 0x001000: Interrupt Pending bit 0-31
+    base + 0x00107C: Interrupt Pending bit 992-1023
+
+They should be at 0x40801000:
+
+    => md 0x40801000 8
+    40801000: 00000010 00000000 00000000 00000000  ................
+    40801010: 00000000 00000000 00000000 00000000  ................
+
+Whoa, look at that.
+
+                 43210
+    0x00000010 = 10000
+                 |   |
+                 |   int 0 (reserved)
+                 int 4 = timer
+
+We got the interrupt 4 pending in context 0!
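
A small helper to decode a pending word without counting bits by hand. Sketch only: `pending_sources` is a made-up name, and it assumes the standard PLIC layout where bit N of word W stands for source 32*W + N.

```python
def pending_sources(word, word_index=0):
    """List the interrupt source IDs whose pending bit is set in a 32-bit word."""
    return [32 * word_index + bit for bit in range(32) if word & (1 << bit)]


print(pending_sources(0x00000010))
```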
+
+Other contexts don't seem to see anything:
+
+    => md 0x40801080 1
+    40801080: 00000000                             ....
+    => md 0x40801100 1
+    40801100: 00000000                             ....
+    => md 0x40801180 1
+    40801180: 00000000                             ....
+    => md 0x40801200 1
+    40801200: 00000000                             ....
+    => md 0x40801280 1
+    40801280: 00000000                             ....
+    => md 0x40801300 1
+    40801300: 00000000                             ....
+    => md 0x40801380 1
+    40801380: 00000000                             ....
+
+So, as the priority is 0, this means it is ignored:
+
+> If PLIC supports Interrupt Priorities, then each PLIC interrupt source can be
+> assigned a priority by writing to its 32-bit memory-mapped priority register.
+> A priority value of 0 is reserved to mean "never interrupt" and effectively
+> disables the interrupt. Priority 1 is the lowest active priority while the
+> maximum level of priority depends on PLIC implementation. Ties between global
+> interrupts of the same priority are broken by the Interrupt ID; interrupts
+> with the lowest ID have the highest effective priority.
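
To make the rule concrete, here is a toy model of which source a claim would return according to that paragraph of the spec. It is only my reading of the spec text, not the observed behaviour of this particular PLIC, and the function name `claim` and its arguments are made up for the example. It deliberately ignores the per-context threshold.

```python
def claim(pending, enabled, priorities):
    """Return the ID a claim would yield, or 0 if nothing is claimable.

    pending and enabled are bitmaps (bit N = source N); priorities maps
    source ID -> priority. Priority 0 means "never interrupt".
    """
    candidates = [
        src
        for src in range(1, 32)
        if pending & enabled & (1 << src) and priorities.get(src, 0) > 0
    ]
    # Highest priority wins; ties are broken by the lowest ID.
    return min(candidates, key=lambda s: (-priorities[s], s), default=0)


print(claim(0x10, 0x10, {4: 0}))  # priority 0: nothing to claim
print(claim(0x10, 0x10, {4: 1}))  # non-zero priority: source 4 is claimable
```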
+
+Let's claim the interrupt, by just performing a read from 0x40a00004:
+
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+    => md 0x40a00004 1
+    40a00004: 00000000                             ....
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+So, it continues to be pending.
+
+We have to signal completion by writing the interrupt ID 4 to the same
+register:
+
+    => mw 0x40a00004 4
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Still not cleared.
+
+Let's try making the MTIMECMP value much higher than MTIME:
+
+    => md 0x40014000 1
+    40014000: 01700000                             ..p.
+    => md 0x4001bff8 1
+    4001bff8: 03a4584b                             KX..
+    => mw 0x40014000 0xaaaaaaaa
+    => md 0x40014000 1
+    40014000: aaaaaaaa                             ....
+    => md 0x4001bff8 1
+    4001bff8: 03abc84d                             M...
+
+So... the ID that must be written to the completion register is not the
+interrupt number, but the value read from the claim register, which is 0.
+
+    => mw 0x40a00004 0
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Still, nothing.
+
+All interrupts are disabled:
+
+    => md 0x40802000 4
+    40802000: 00000000 00000000 00000000 00000000  ................
+
+Let's try enabling the interrupt 4, by writing:
+
+    => mw 0x40802000 0x10
+    => md 0x40802000 1
+    40802000: 00000010                             ....
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Now, let's set the priority to something other than 0.
+
+First, let's make sure that the context 0 threshold priority is set to 0, so we
+allow all interrupts:
+
+    0x200000: Priority threshold for context 0
+
+    => md 0x40a00000 1
+    40a00000: 00000007                             ....
+
+Oh, so we are only receiving interrupts with priority higher than 7. But our
+interrupt has priority 0!
+
+    => md 0x40800004 1
+    40800004: 00000000                             ....
+
+Let's make the threshold 0 and our interrupt have priority 1.
+
+    => mw 0x40a00000 0
+    => mw 0x40800004 1
+    => md 0x40800004 1
+    40800004: 00000001                             ....
+    => md 0x40a00000
+    40a00000: 00000000                             ....
+
+Now let's see the interrupt state again:
+
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Still on.
+
+Let's read the claim register again.
+
+    => md 0x40a00004
+    40a00004: 00000000                             ....
+
+Still 0, let's try to complete it:
+
+    => mw 0x40a00004 0
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Nope, still pending.
+
+What, what the hell. The threshold value has changed to 1:
+
+    => md 0x40800004 1
+    40800004: 00000001                             ....
+    => md 0x40a00000 1
+    40a00000: 00000001                             .... <-- this was 0
+
+Let's configure the interrupt priority to something bigger than 1.
+
+Wait, I put the priority in the wrong source:
+
+    0x000000: Reserved (interrupt source 0 does not exist)
+    0x000004: Interrupt source 1 priority
+    0x000008: Interrupt source 2 priority
+
+Our timer should be the source 4, so 12 or 0xc:
+
+    => md 0x4080000c 1
+    4080000c: 00000000                             ....
+
+(This is wrong, should be 0x40800010, see below)
+
+Let's make it have priority 0xd:
+
+    => mw 0x4080000c 0xd
+    => md 0x4080000c 1
+    4080000c: 0000000d                             ....
+
+Something weird is going on with the priority threshold register?
+
+    => md 0x40a00000 1
+    40a00000: 00000000                             ....
+    => md 0x40a00000 1
+    40a00000: 0000000d                             ....
+    => md 0x40a00000 1
+    40a00000: 0000000d                             ....
+    => md 0x40a00000 1
+    40a00000: 0000000d                             ....
+    => md 0x40a00000 1
+    40a00000: 0000000d                             ....
+
+Let's see the claim register, which should be in the next word:
+
+    => md 0x40a00004 1
+    40a00004: 00000004                             ....
+
+Yes! Now I can see the claim register with a proper ID. Let's complete this
+interrupt by writing the 4 back to that register:
+
+    => mw 0x40a00004 4
+    => md 0x40801000 1
+    40801000: 00000000                             ....
+
+Perfect! It properly caused the pending interrupt to disappear.
+
+Let's try now setting the MTIMECMP to something smaller than the MTIME, so it
+causes an interrupt. A value of 0 should always work, but let's choose a
+non-zero value:
+
+    => md 0x40014000
+    40014000: aaaaaaaa                             ....
+    => mw 0x40014000 00aaaaaa
+    => md 0x40014000
+    40014000: 00aaaaaa                             ....
+    => md 0x4001bff8
+    4001bff8: 06211a0c                             ..!.
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Perfect! It causes the interrupt to appear as pending.
+
+So, using the context 0, we can properly see the interrupt pending, claim it and
+complete it. But the context 0 is not used in OpenSBI, only the 9 and 11:
+
+From `include/sbi/riscv_encoding.h`:
+
+    #define IRQ_S_SOFT			1
+    #define IRQ_VS_SOFT			2
+    #define IRQ_M_SOFT			3
+    #define IRQ_S_TIMER			5
+    #define IRQ_VS_TIMER			6
+    #define IRQ_M_TIMER			7
+    #define IRQ_S_EXT			9
+    #define IRQ_VS_EXT			10
+    #define IRQ_M_EXT			11
+    #define IRQ_S_GEXT			12
+    #define IRQ_PMU_OVF			13
+
+And from `lib/utils/irqchip/fdt_irqchip_plic.c`:
+
+    static int irqchip_plic_update_hartid_table(void *fdt, int nodeoff,
+                            struct plic_data *pd)
+    {
+        const fdt32_t *val;
+        u32 phandle, hwirq, hartid;
+        struct sbi_scratch *scratch;
+        int i, err, count, cpu_offset, cpu_intc_offset;
+
+        val = fdt_getprop(fdt, nodeoff, "interrupts-extended", &count);
+        if (!val || count < sizeof(fdt32_t))
+            return SBI_EINVAL;
+        count = count / sizeof(fdt32_t);
+
+        for (i = 0; i < count; i += 2) {
+            phandle = fdt32_to_cpu(val[i]);
+            hwirq = fdt32_to_cpu(val[i + 1]);
+
+            cpu_intc_offset = fdt_node_offset_by_phandle(fdt, phandle);
+            if (cpu_intc_offset < 0)
+                continue;
+
+            cpu_offset = fdt_parent_offset(fdt, cpu_intc_offset);
+            if (cpu_offset < 0)
+                continue;
+
+            err = fdt_parse_hart_id(fdt, cpu_offset, &hartid);
+            if (err)
+                continue;
+
+            scratch = sbi_hartid_to_scratch(hartid);
+            if (!scratch)
+                continue;
+
+            plic_set_hart_data_ptr(scratch, pd);
+            switch (hwirq) {
+            case IRQ_M_EXT:
+                plic_set_hart_mcontext(scratch, i / 2);
+                break;
+            case IRQ_S_EXT:
+                plic_set_hart_scontext(scratch, i / 2);
+                break;
+            }
+        }
+
+        return 0;
+    }
+
+So, let's try to do the same, but with the context 11 for machine mode
+`IRQ_M_EXT`.
+
+Let's compute the address of the input source for context 11:
+
+    base + 0x002000: Enable bits for sources 0-31 on context 0
+    base + 0x002004: Enable bits for sources 32-63 on context 0
+    ...
+    base + 0x00207C: Enable bits for sources 992-1023 on context 0
+    base + 0x002080: Enable bits for sources 0-31 on context 1
+    base + 0x002084: Enable bits for sources 32-63 on context 1
+    ...
+    base + 0x0020FC: Enable bits for sources 992-1023 on context 1
+    base + 0x002100: Enable bits for sources 0-31 on context 2
+    base + 0x002104: Enable bits for sources 32-63 on context 2
+    ...
+    base + 0x00217C: Enable bits for sources 992-1023 on context 2
+    ...
+    base + 0x1F1F80: Enable bits for sources 0-31 on context 15871
+    base + 0x1F1F84: Enable bits for sources 32-63 on context 15871
+    base + 0x1F1FFC: Enable bits for sources 992-1023 on context 15871
+    ...
+
+It should be:
+
+    >>> hex(0x40800000 + 0x2000 + (11 * 0x80))
+    '0x40802580'
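
The same computation generalised to any context and source, as a sketch assuming the standard layout listed above (0x80 bytes of enable bits per context, 32 sources per word). The name `enable_reg` is hypothetical.

```python
PLIC_BASE = 0x4080_0000
ENABLE_BASE = 0x2000
ENABLE_STRIDE = 0x80  # bytes of enable bits per context


def enable_reg(context, source, base=PLIC_BASE):
    """Return (address, bitmask) of the enable bit for source on context."""
    addr = base + ENABLE_BASE + context * ENABLE_STRIDE + 4 * (source // 32)
    return addr, 1 << (source % 32)


addr, mask = enable_reg(11, 4)
print(f"mw {addr:#x} {mask:#x}")
```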
+
+They are all disabled:
+
+    => md 0x40802580
+    40802580: 00000000                             ....
+
+So, let's enable the source 4 by writing 0x10:
+
+    => mw 0x40802580 0x10
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+Now, let's check the context 11 priority threshold:
+
+    0x200000: Priority threshold for context 0
+    0x201000: Priority threshold for context 1
+    0x202000: Priority threshold for context 2
+    0x203000: Priority threshold for context 3
+
+The priority threshold for context 11 should be at:
+
+    >>> hex(0x40800000 + 0x200000 + (11 * 0x1000))
+    '0x40a0b000'
+
+    => md 0x40a0b000
+    40a0b000: 00000000                             ....
+
+It has value 0, so all interrupts with non-zero priority should pass:
+
+> For example, a threshold value of zero permits all interrupts with non-zero
+> priority.
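
In other words, the check applied per context is a strict comparison. A short sketch of my understanding of the spec (the function name `passes_threshold` is made up):

```python
def passes_threshold(priority, threshold):
    """True if an interrupt with this priority is not masked by the threshold."""
    return priority > threshold


print(passes_threshold(1, 0))  # non-zero priority, threshold 0
print(passes_threshold(0, 0))  # priority 0 never interrupts
print(passes_threshold(7, 7))  # equal to the threshold is masked
```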
+
+Let's see the priority of source 4 in context 11:
+
+    0x000000: Reserved (interrupt source 0 does not exist)
+    0x000004: Interrupt source 1 priority
+    0x000008: Interrupt source 2 priority
+    ...
+    0x000FFC: Interrupt source 1023 priority
+
+The address should be at:
+
+    >>> hex(0x40800000 + (4 * 0x4))
+    '0x40800010'
+
+    => md 0x40800010
+    40800010: 00000000                             ....
+
+It has priority 0, so it would never work. Let's make it priority 1:
+
+    => mw 0x40800010 1
+    => md 0x40800010 1
+    40800010: 00000001
+
+Let's check the pending interrupts:
+
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+It is still pending, so let's clear it by setting the MTIMECMP to a large value.
+
+    => md 0x40014000
+    40014000: 00aaaaaa                             ....
+    => mw 0x40014000 0xaaaaaaaa
+    => md 0x40014000
+    40014000: aaaaaaaa                             ....
+    => md 0x4001bff8
+    4001bff8: 0e8e6066                             f`..
+    => md 0x4001bff8
+    4001bff8: 0e8ea4c9                             ....
+    => md 0x4001bff8
+    4001bff8: 0e8ece24                             $...
+
+Now, let's claim and complete it for the context 0 which was already enabled
+from the test before.
+
+    => md 0x40a00004 1
+    40a00004: 00000004                             ....
+    => mw 0x40a00004 4
+    => md 0x40801000 1
+    40801000: 00000000                             ....
+
+Perfect, now it is not pending anymore.
+
+Now, the context 0 is still enabled, so the interrupts may be sent there
+instead of context 11. So let's disable the context 0 first.
+
+    => mw 0x40802000 0
+    => md 0x40802000 1
+    40802000: 00000000                             ....
+
+Now let's fire the MTIMECMP and see if OpenSBI sees a machine trap.
+
+    => md 0x40014000 1
+    40014000: aaaaaaaa                             ....
+    => mw 0x40014000 00aaaaaa
+    => md 0x40014000 1
+    40014000: 00aaaaaa                             ....
+
+Nothing happened.
+
+The interrupt is pending:
+
+    => md 0x40801000 1
+    40801000: 00000010                             ....
+
+The claim on context 0 returns 0, so no interrupt there, which is expected:
+
+    => md 0x40a00004 1
+    40a00004: 00000000                             ....
+
+Let's compute the claim register on context 11:
+
+    0x200004: Interrupt Claim Process for context 0
+    0x201004: Interrupt Claim Process for context 1
+    0x202004: Interrupt Claim Process for context 2
+    0x203004: Interrupt Claim Process for context 3
+    ...
+
+    >>> hex(0x40800000 + 0x200004 + (11 * 0x1000))
+    '0x40a0b004'
+
+    => md 0x40a0b004 1
+    40a0b004: 00000000                             ....
+
+Hmm, there is no claim ID.
+
+So, I checked again, and I cannot enable the interrupt on context 11:
+
+    => md 0x40802580 1
+    40802580: 00000000                             ....
+    => mw 0x40802580 0x10
+    => md 0x40802580 1
+    40802580: 00000000                             ....