diff --git a/doc/dev/channels.md b/doc/dev/channels.md
index c67987c..b703713 100644
--- a/doc/dev/channels.md
+++ b/doc/dev/channels.md
@@ -65,5 +65,5 @@ to write the duplicated value with no error.
A unique function can be set to each channel which will be called once a channel
becomes dirty with `chan_set_dirty_cb()`. This callback will be called before
`chan_set()`, `chan_push()` or `chan_pop()` returns. The [patch
-bay](../patchbay) uses this callback to detect when a channel is modified an run
+bay](patchbay.md) uses this callback to detect when a channel is modified an run
other callbacks.
diff --git a/doc/dev/model.md b/doc/dev/model.md
index 5d7ccc2..91ccf58 100644
--- a/doc/dev/model.md
+++ b/doc/dev/model.md
@@ -22,7 +22,7 @@ If the model is not enabled, no other function will be called.
The create function is called for each enabled model to allow them to allocate
all the required structures to perform the emulation using the
-[extend](../extend) mechanism. All the required channels must be created and
+[extend](extend.md) mechanism. All the required channels must be created and
registered in the patch bay in this function, so other models can found them in
the next stage.
diff --git a/doc/dev/mux.md b/doc/dev/mux.md
index 63b226c..59e49bd 100644
--- a/doc/dev/mux.md
+++ b/doc/dev/mux.md
@@ -1,6 +1,6 @@
# Mux
-The emulator provides a mechanism to interconnect [channels](../channels) in a
+The emulator provides a mechanism to interconnect [channels](channels.md) in a
similar way as an [analog
multiplexer](https://en.wikipedia.org/wiki/Multiplexer) by using the `mux`
module.
@@ -19,7 +19,7 @@ selected. This allows a multiplexer to act as a filter too.
The typical use of multiplexers is to implement the tracking modes of channels.
As an example, the following diagram shows two multiplexers used to implement
-the subsystem view of [Nanos6](../nanos6):
+the subsystem view of [Nanos6](../user/emulation/nanos6.md):
![Mux example](fig/mux.svg)
@@ -51,5 +51,5 @@ Multiplexers allow models to interact with each other in a controlled way. In
the example, the blue channel (*nanos6.thread0.subsystem*) is directly modified by
the Nanos6 model when a new event is received. While the red channels are
controlled by the ovni model. The rest of the channels are automatically updated
-in the propagation phase of the [bay](../patchbay) allowing the ovni model to
+in the propagation phase of the [bay](patchbay.md) allowing the ovni model to
modify the Nanos6 Paraver view of the subsystems.
diff --git a/doc/dev/paraver.md b/doc/dev/paraver.md
index 4adb1ef..301dddf 100644
--- a/doc/dev/paraver.md
+++ b/doc/dev/paraver.md
@@ -16,7 +16,7 @@ A channel can be connected to each row in a trace with `prv_register()`, so the
new values of the channel get written in the trace. Only null and int64 data
values are supported for now.
-The emission phase is controlled by the [patch bay](../patchbay) and runs all
+The emission phase is controlled by the [patch bay](patchbay.md) and runs all
the emit callbacks at once for all dirty channels.
## Duplicate values
diff --git a/doc/dev/patchbay.md b/doc/dev/patchbay.md
index dc44d93..46859e0 100644
--- a/doc/dev/patchbay.md
+++ b/doc/dev/patchbay.md
@@ -1,6 +1,6 @@
# Patch bay
-The patch bay (or simply bay) allows [channels](../channels/) to be registered
+The patch bay (or simply bay) allows [channels](channels.md) to be registered
with their name so they are visible to all parts of the emulator and provides a
way to run callbacks when the channels update their values.
diff --git a/doc/index.md b/doc/index.md
index 27016b3..df6f05b 100644
--- a/doc/index.md
+++ b/doc/index.md
@@ -13,8 +13,9 @@ The ovni project implements a fast instrumentation library that records
small events (starting at 12 bytes) during the execution of programs to
later investigate how the execution happened.
-The instrumentation process is split in two stages: [runtime](runtime)
-tracing and [emulation](emulation/).
+The instrumentation process is split in two stages:
+[runtime](user/runtime/index.md)
+tracing and [emulation](user/emulation/index.md).
During runtime, very short binary events are stored on disk which
describe what is happening. Once the execution finishes, the events are
diff --git a/doc/user/concepts.md b/doc/user/concepts.md
deleted file mode 100644
index c867319..0000000
--- a/doc/user/concepts.md
+++ /dev/null
@@ -1,31 +0,0 @@
-# Overview
-
-The objective of the ovni project is to provide insight into what
-happened at execution of a program.
-
-![Instrumentation process](fig/instrumentation.svg)
-
-The key pieces of software involved are instrumented so they emit events
-during the execution which allow the reconstruction of the execution
-later on.
-
-During the execution phase, the information gathered in the events is
-kept very short and simple, so the overhead is kept at minimum to avoid
-disturbing the execution process. Here is an example of a single event
-emitted during the execution phase, informing the current thread to
-finish the execution:
-
- 00 4f 48 65 52 c0 27 b4 d3 ec 01 00
-
-During the emulation phase, the events are read and processed in the
-emulator, reconstructing the execution. State transitions are recorded
-in a Paraver trace. Here is an example of the same thread ceasing the
-execution:
-
- 2:0:1:1:1:50105669:1:0
-
-Finally, loading the trace in the Paraver program, we can generate a
-timeline visualization of the state change. Here is the example for the
-same state transition of the thread stopping the execution:
-
-![Visualization](fig/visualization.png)
diff --git a/doc/user/concepts/part-model.md b/doc/user/concepts/part-model.md
new file mode 100644
index 0000000..5269630
--- /dev/null
+++ b/doc/user/concepts/part-model.md
@@ -0,0 +1,62 @@
+# Part model
+
+Ovni has a model to represent the hardware components as well as the software
+concepts like threads or processes. Each concept is considered to be a *part*.
+Here is an example diagram depicting the part hierarchy:
+
+![Part model](part-model.svg)
+
+Notice how a loom can restrict the CPUs of the node to its child processes.
+
+## Software parts
+
+These are not physical parts, but they abstract common concepts.
+
+### Thread
+
+A thread in ovni is directly mapped to a [POSIX
+thread](https://en.wikipedia.org/wiki/Pthreads) and they are identified by a
+`TID` which must be unique in a [node](#node). Threads in ovni have [a model with
+an internal state](../emulation/ovni.md/#thread_model) that tries to tracks the
+state of the real thread.
+
+### Process
+
+A process is directly mapped to a UNIX
+[process](https://en.wikipedia.org/wiki/Process_(computing)) and they are
+identified by a `PID` number which must be unique in a [node](#node).
+
+### Loom
+
+A loom has no direct mapping to a usual concept. It consists of a set of
+[CPUs](#cpu) from the same node and a set of processes that can *only run in
+those CPUs*. Each CPUs must belong to one and only one loom. It is often used
+to group CPUs that belong to the same process when running workloads with
+multiple processes (like with MPI).
+
+Each loom has a virtual CPU which collects running threads that are not
+exclusively assigned to a physical CPU, so we cannot determine on which CPU they
+are running.
+
+## Hardware parts
+
+These parts have a physical object assigned.
+
+### CPU
+
+A CPU is a hardware thread that can execute at most one thread at a time. Each
+CPU must have a physical ID that is unique in a node. In ovni there is also a
+virtual CPU, which simply is used to collect threads that are not tied to an
+specific physical CPU, so it cannot be easily determined where they are running.
+
+### Node
+
+A *node* refers to a compute node, often a physical machine with memory and
+network which may contain one or more
+[sockets](https://en.wikipedia.org/wiki/CPU_socket), where each socket has one
+or more CPUs.
+
+### System
+
+A system represents the complete set of hardware parts and software parts that
+are known to ovni in a given trace.
diff --git a/doc/user/concepts/part-model.svg b/doc/user/concepts/part-model.svg
new file mode 100644
index 0000000..b8c5293
--- /dev/null
+++ b/doc/user/concepts/part-model.svg
@@ -0,0 +1,516 @@
+
+
+
+
diff --git a/doc/user/concepts/trace.md b/doc/user/concepts/trace.md
new file mode 100644
index 0000000..f4cd24f
--- /dev/null
+++ b/doc/user/concepts/trace.md
@@ -0,0 +1,123 @@
+# Trace concepts
+
+When using libovni to generate traces or the emulator to process them, there are
+several concepts to keep in mind.
+
+## Trace elements
+
+The information generated by a program or later processed by other ovni tools is
+known as a trace. A runtime trace stores the information as-is in disk from a
+program execution. While a emulation trace is generated from the runtime trace
+for visualization with Paraver.
+
+Both runtime and emulation traces are always stored inside the same directory,
+by default `ovni/`, which is known as the *trace directory*.
+
+Here are the components of a runtime trace, as generated by libovni:
+
+
+
+
+
+### Stream
+
+A stream is a directory which contains a binary stream and the associated stream
+metadata file. Each stream is associated with a given part of a system. As of
+now, libovni can only generate streams associated to [threads](part-model.md#thread).
+
+### Stream metadata
+
+The stream metadata is a JSON file named `stream.json` which holds information
+about the stream itself.
+
+### Binary stream
+
+A binary stream is a file named `stream.obs` (.obs stands for Ovni Binary
+Stream) composed of a header and a concatenated array of events without padding.
+Notice that each event may have different length.
+
+### Event
+
+An event is a point in time that has some information associated. Events written
+at runtime by libovni have at MCV, a clock and a optional payload. The list of
+all events recognized by the emulator can be found [here](../emulation/events.md).
+
+Events can be displayed by ovnidump, which shows an explanation of what the
+event means:
+
+```txt
+$ ovnidump ovni/loom.hop.nosv-u1000/proc.1121064 | grep -A 10 VTx | head
+517267929632815 VTx thread.1121064 executes the task 1 with bodyid 0
+517267930261672 VYc thread.1121064 creates task type 2 with label "task"
+517267930875858 VTC thread.1121064 creates parallel task 2 with type 2
+517267930877789 VU[ thread.1121064 starts submitting a task
+517267930877990 VU] thread.1121064 stops submitting a task
+517267930878098 VTC thread.1121064 creates parallel task 3 with type 2
+517267930878196 VU[ thread.1121064 starts submitting a task
+517267930878349 VU] thread.1121064 stops submitting a task
+517267930878432 VTC thread.1121064 creates parallel task 4 with type 2
+517267930878494 VU[ thread.1121064 starts submitting a task
+```
+
+There are two types or events: normal and jumbo events, the latter can hold
+large attached payloads.
+
+### MCV
+
+The MCV acronym is the abbreviation of Model-Class-Value, which are a three
+characters that identify any event. The MCV is shown in the ovnitop and ovnidump
+tools and allows easy filtering with grep, for a single or related events:
+
+```
+$ ovnitop ovni | grep VT
+VTe 20002
+VTx 20002
+VTC 200
+VTc 2
+VTp 1
+VTr 1
+```
+
+### Clock
+
+A clock is a 64 bit counter, which counts the number of nanoseconds from an
+arbitrary point in time in the past. Each event has the value of the clock
+stored inside, to indicate when that event happened. In a given trace there can
+be multiple clocks which don't refer to the same point in the past and must be
+corrected so they all produce an ordered sequence of events. The ovnisync
+program performs this correction by measuring the difference across clocks of
+different nodes.
+
+### Payload
+
+Events may have associated additional information which is stored in the stream.
+Normal events can hold up to 16 bytes, otherwise the jumbo events must be used
+to hold additional payload.
+
+## Other related concepts
+
+Apart from the trace itself, there are other concepts to keep in mind when the
+trace is being processed by the emulator.
+
+### Event model
+
+Each event belongs to an event model, as identified by the model character in
+the MCV. An event model is composed of several components:
+
+- A set of [events](#event) all with the same model identifier in the
+ [MCV](#mcv)
+- The emulator code that processes those events.
+- A human readable name, like `ovni` or `nanos6`.
+- A semantic version.
+
+### State
+
+A state is a discrete value that can change over time based on the events the
+emulator receives. Usually a single event causes a single state change, which is
+then written to the Paraver traces. An example is the thread state, which can
+change over time based on the events `OH*` that indicate a state transition
+of the current thread.
+
+In contrast with an event, states have a duration associated which can usually
+be observed in Paraver. Notice that the trace only contains events, the states
+are computed at emulation.
diff --git a/doc/user/concepts/trace.svg b/doc/user/concepts/trace.svg
new file mode 100644
index 0000000..73f3f05
--- /dev/null
+++ b/doc/user/concepts/trace.svg
@@ -0,0 +1,474 @@
+
+
+
+
diff --git a/doc/user/emulation/index.md b/doc/user/emulation/index.md
index 203e377..d014fac 100644
--- a/doc/user/emulation/index.md
+++ b/doc/user/emulation/index.md
@@ -47,7 +47,7 @@ the following elements:
- A single byte model identification (for example `O`).
- A set of runtime events with that model identification (see the [list
- of events](events)).
+ of events](events.md)).
- Rules that determine which sequences of events are valid.
- The emulation hooks that process each event and modify the state of
the emulator.
diff --git a/doc/user/emulation/nosv.md b/doc/user/emulation/nosv.md
index a810d2a..21a5977 100644
--- a/doc/user/emulation/nosv.md
+++ b/doc/user/emulation/nosv.md
@@ -60,7 +60,7 @@ For more details, see [this MR][1].
The subsystem view provides a simplified view on what is the nOS-V
runtime doing over time. The view follows the same rules described in
-the [subsystem view of Nanos6](../nanos6/#subsystem_view).
+the [subsystem view of Nanos6](nanos6.md/#subsystem_view).
## Idle view
diff --git a/doc/user/emulation/ovni.md b/doc/user/emulation/ovni.md
index c945d03..8b0dc8c 100644
--- a/doc/user/emulation/ovni.md
+++ b/doc/user/emulation/ovni.md
@@ -60,4 +60,4 @@ will set all the channels to an error state.
The emulator automatically switches the channels from one thread to
another when a thread is switched from the CPU. So the different models
don't need to worry about thread transitions. See the
-[channels](../channels) section for more information.
+[channels](../../dev/channels.md) section for more information.
diff --git a/doc/user/runtime/env.md b/doc/user/runtime/env.md
new file mode 100644
index 0000000..5fb2982
--- /dev/null
+++ b/doc/user/runtime/env.md
@@ -0,0 +1,56 @@
+# Environment variables
+
+Some environment variables can be used to adjust settings during the execution
+of libovni, they all begin with the `OVNI_` prefix. Be sure that all threads of
+the same node use the same environment variables.
+
+## OVNI_TMPDIR
+
+During the execution of your program, a per-thread buffer is kept where the new
+events are being recorded. When this buffer is full, it is written to disk and
+emptied, an operation known as flush. This may take a while depending on the
+underliying filesystem.
+
+Keep in mind that the thread will be blocked until the flush ends, so if your
+filesystem is slow it would interrupt the execution of your program for a long
+time. It is advisable to use the fastest filesystem available (see the tmpfs(5)
+and df(1) manual pages).
+
+You can select a temporary trace directory where the buffers will be flushed
+during the execution by setting the environment variable `OVNI_TMPDIR`. The last
+directory will be created if doesn't exist. In that case, as soon as a process
+calls `ovni_proc_fini()`, the traces of all its threads will be moved to the
+final directory at `$PWD/ovni`. Example:
+
+ OVNI_TMPDIR=$(mktemp -u /dev/shm/ovni.XXXXXX) srun ./your-app
+
+To test the different filesystem speeds, you can use hyperfine and dd. Take a
+closer look at the max time:
+
+```
+$ hyperfine 'dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10'
+Benchmark 1: dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10
+ Time (mean ± σ): 71.7 ms ± 130.4 ms [User: 0.8 ms, System: 10.2 ms]
+ Range (min … max): 14.7 ms … 1113.2 ms 162 runs
+
+ Warning: Statistical outliers were detected. Consider re-running this
+ benchmark on a quiet PC without any interferences from other programs. It
+ might help to use the '--warmup' or '--prepare' options.
+
+$ hyperfine 'dd if=/dev/zero of=/tmp/kk bs=2M count=10'
+Benchmark 1: dd if=/dev/zero of=/tmp/kk bs=2M count=10
+ Time (mean ± σ): 56.2 ms ± 5.7 ms [User: 0.6 ms, System: 14.8 ms]
+ Range (min … max): 45.8 ms … 77.8 ms 63 runs
+
+$ hyperfine 'dd if=/dev/zero of=/dev/shm/kk bs=2M count=10'
+Benchmark 1: dd if=/dev/zero of=/dev/shm/kk bs=2M count=10
+ Time (mean ± σ): 11.4 ms ± 0.4 ms [User: 0.5 ms, System: 11.1 ms]
+ Range (min … max): 9.7 ms … 12.5 ms 269 runs
+```
+
+## OVNI_TRACEDIR
+
+By default, the runtime trace will be placed in the `ovni` directory, inside the
+working directory. You can specify a different location to place the trace by
+setting the `OVNI_TRACEDIR` environment variable. It accepts a relative or
+absolute path, which will be created if it doesn't exist.
diff --git a/doc/user/runtime/fig/mark.png b/doc/user/runtime/fig/mark.png
new file mode 100644
index 0000000..860da9b
Binary files /dev/null and b/doc/user/runtime/fig/mark.png differ
diff --git a/doc/user/runtime/index.md b/doc/user/runtime/index.md
new file mode 100644
index 0000000..d18b09a
--- /dev/null
+++ b/doc/user/runtime/index.md
@@ -0,0 +1,109 @@
+# Introduction
+
+To use *libovni* to instrument a program, follow the next instructions
+carefully, or you may end up with an incomplete trace that is rejected at
+emulation.
+
+You can also generate a valid trace from your own software or hardware
+directly, but be sure to follow the [trace specification](trace_spec.md).
+
+## Initialization
+
+To initialize libovni follow these steps in all threads:
+
+1. **Check the version**. Call `ovni_version_check()` once before calling any
+ ovni function. It can be called multiple times from any thread, but only one
+ is required.
+
+2. **Init the process**. Call `ovni_proc_init()` to initialize the process. It
+ can only be called **once per process** and it must be called before the
+ thread is initialized.
+
+3. **Init the thread**. Call `ovni_thread_init()` to initialize the thread.
+ Multiple attempts to initialize the same thread are ignored with a warning.
+
+The `ovni_proc_init()` arguments are as follows:
+
+```c
+void ovni_proc_init(int app, const char *loom, int pid);
+```
+
+The `app` defines the "appid" of the program, which must be a number >0. This is
+useful to run multiple processes some of which run the same "app", so you can
+tell which one is which. The `loom` argument defines the
+[loom](../concepts/part-model.md#loom) name and maps the process to that
+loom. It must be compose of the host name, a dot and a suffix. The PID is the
+one obtained by `getpid(2)`.
+
+The `ovni_thread_init()` function only accepts one argument, the TID as returned
+by `gettid(2)`.
+
+## Setup metadata
+
+Once the process and thread are initialized, you can begin adding metadata to
+the thread stream.
+
+1. **Require models**. Call `ovni_thread_require()` with the required model
+ version before emitting events for a given model. Only required once from a
+ thread in a given trace.
+
+2. **Emit loom CPUs**. Call `ovni_add_cpu()` to register each CPU in the loom. It can
+ be done from a single thread or multiple threads, in the latter the list of
+ CPUs is merged.
+
+3. **Set the rank**. If you use MPI, call `ovni_proc_set_rank()` to register the
+ rank and number of ranks of the current execution. Only once per process.
+
+## Start the execution
+
+The current thread must switch to the "Running" state before any event can be
+processed by the emulator. Do so by emitting a [`OHx`
+event](../emulation/events.md#OHx) in the stream with the appropriate payload:
+
+```c
+static void thread_execute(int32_t cpu, int32_t ctid, uint64_t tag)
+{
+ struct ovni_ev ev = {0};
+ ovni_ev_set_clock(&ev, ovni_clock_now());
+ ovni_ev_set_mcv(&ev, "OHx");
+ ovni_payload_add(&ev, (uint8_t *) &cpu, sizeof(cpu));
+ ovni_payload_add(&ev, (uint8_t *) &ctid, sizeof(ctid));
+ ovni_payload_add(&ev, (uint8_t *) &tag, sizeof(tag));
+ ovni_ev_emit(&ev);
+}
+```
+
+The `cpu` is the logical index (not the physical ID) of the loom CPU at which
+this thread will begin the execution. Use -1 if it is not known. The `ctid` and
+`tag` allow you to track the exact point at which a given thread was created and
+by which thread but they are not relevant for the first thread, so they can be
+set to -1.
+
+## Emit events
+
+After this point you can emit any other event from this thread. Use the
+`ovni_ev_*` set of functions to create and emit events. Notice that all events
+are refer to the current thread that emits them.
+
+If you need to store metadata information, use the `ovni_attr_*` set of
+functions. The metadata is stored in disk by `ovni_attr_fluch()` and when the
+thread is freed by `ovni_thread_free()`.
+
+Attempting to emit events or writing metadata without having a thread
+initialized will cause your program to abort.
+
+## Finishing the execution
+
+To finalize the execution **every thread** must perform the following steps,
+otherwise the trace **will be rejected**.
+
+1. **End the current thread**. Emit a [`OHe` event](../emulation/events.md#OHe) to inform the current thread ends.
+2. **Flush the buffer**. Call `ovni_flush()` to be sure all events are written
+ to disk.
+3. **Free the thread**. Call `ovni_thread_free()` to complete the stream and
+ free the memory used by the buffer.
+4. **Finish the process**. If this is the last thread, call `ovni_proc_fini()`
+ to set the process state to finished.
+
+If a thread fails to perform these steps, the complete trace will be rejected by
+the emulator as it cannot guarantee the trace to be consistent.
diff --git a/doc/user/runtime/mark.md b/doc/user/runtime/mark.md
index 8ac48bc..a37ffe9 100644
--- a/doc/user/runtime/mark.md
+++ b/doc/user/runtime/mark.md
@@ -1,4 +1,4 @@
-# Mark API
+# Mark events
The mark API allows you to add arbitrary events in a trace to mark regions of
interest while debugging or developing a new program or library. The events are
@@ -80,6 +80,68 @@ void ovni_mark_pop(int32_t type, int64_t value);
The value in the pop call must match the previous pushed value.
+
+Example OmpSs-2 program
+
+
+
+Here is a dummy program showing how to use the mark API with an OmpSs-2 program.
+Notice that there is no initialization of the current thread or process, as it
+already occurs inside the OmpSs-2 runtime before reaching the main.
+
+```c
+/* Build with:
+ * $ clang -fompss-2 -lovni dummy.c -o dummy
+ * Enable instrumentation in nanos6:
+ * $ echo 'version.instrument = "ovni"' > nanos6.toml
+ * Run:
+ * $ ./dummy
+ * Emulate:
+ * $ ovniemu ovni
+ * View timeline:
+ * $ wxparaver ovni/cpu.prv ovni/cfg/cpu/ovni/mark.cfg
+ */
+#include
+#include
+
+enum { INDEX = 0, RUN = 1 };
+
+static void process(int run, int i)
+{
+ ovni_mark_push(RUN, run + 1);
+ ovni_mark_push(INDEX, i + 1);
+ usleep(10000); // Dummy operation for 10 ms
+ ovni_mark_pop(INDEX, i + 1);
+ ovni_mark_pop(RUN, run + 1);
+}
+
+int main(void)
+{
+ ovni_mark_type(INDEX, OVNI_MARK_STACK, "Index");
+ ovni_mark_type(RUN, OVNI_MARK_STACK, "Run");
+
+ for (int run = 0; run < 10; run++) {
+ for (int i = 0; i < 50; i++) {
+ #pragma oss task
+ process(run, i);
+ }
+ }
+
+ #pragma oss taskwait
+
+ return 0;
+}
+```
+
+
+
Here is the resulting timeline loaded in Paraver with the gradient color
+configuration, showing the first mark type (the index):
+
+
+
+
+
+
## Usage in Paraver
Each thread holds a channel for each mark type that you have defined. The
diff --git a/doc/user/runtime/trace_spec.md b/doc/user/runtime/trace_spec.md
index eb2fde2..f877470 100644
--- a/doc/user/runtime/trace_spec.md
+++ b/doc/user/runtime/trace_spec.md
@@ -1,127 +1,149 @@
-# Trace specification
+# Trace specification v3
!!! Important
This document refers to the trace specification for
- the version 2
+ the version 3
-The ovni instrumentation library stores the information collected in a
-trace following the specification of this document.
+The ovni instrumentation library libovni stores the information
+collected in a runtime trace following the specification of this document.
+
+## Structure
+
+An ovni runtime trace (or simply, a trace) is composed of one or more
+[streams](../concepts/trace.md#stream), which are directories containing
+two mandatory files:
+
+- `stream.json` the stream metadata in JSON format.
+- `stream.obs` the binary stream with events.
+
+Each stream is assigned to a single *part* in the [part
+model](../concepts/part-model.md), usually assigned to a given thread.
+
+There are no imposed rules on how to organize the several streams into
+directories, but libovni uses the following approach for thread streams:
The complete trace is stored in a top-level directory named `ovni`.
-Inside this directory you will find the loom directories with the prefix
-`loom.`. The name of the loom is built from the `loom` parameter of
-`ovni_proc_init()`, prefixing it with `loom.`.
+Inside this directory you will find the loom directories. The name of
+the loom directory is built from the `loom` parameter of `ovni_proc_init()`,
+prefixing it with `loom.`.
Each loom directory contains one directory per process of that loom. The
name is composed of the `proc.` prefix and the PID of the process
specified in the `pid` argument to `ovni_proc_init()`.
-Each process directory contains:
+Inside each process there is one directory for each thread, composed by
+the `thread.` prefix and the TID, which are the streams. The files
+`stream.json` and `stream.obs` reside inside. Example:
-- The process metadata file `metadata.json`.
-- The thread streams, composed of:
- - The binary stream like `thread.123.obs`
- - The thread metadata like `thread.123.json`
+```
+ovni/loom.mio.nosv-u1000/proc.89719/thread.89719/stream.json
+ovni/loom.mio.nosv-u1000/proc.89719/thread.89719/stream.obs
+```
-## Process metadata
+This structure prevents collisions among threads with the same TID among nodes,
+while allowing dumping events from a single thread, process or loom with
+ovnidump.
-!!! Important
+## Stream metadata
- Process metadata has version 2
+The `stream.json` metadata file contains information about the part that
+the stream is assigned to. This is generally used to determine the
+hierarchy of the part model.
-The process metadata file contains important information about the trace
-that is invariant during the complete execution, and generally is
-required to be available prior to processing the events in the trace.
-
-The metadata is stored in the JSON file `metadata.json` inside each
-process directory and contains the following keys:
+The JSON must be an object (dictionary) with the following mandatory
+keys:
- `version`: a number specifying the version of the metadata format.
- Must have the value 2 for this version.
-- `app_id`: the application ID, used to distinguish between applications
- running on the same loom.
-- `rank`: the rank of the MPI process (optional).
-- `nranks`: number of total MPI processes (optional).
-- `cpus`: the array of $`N_c`$ CPUs available in the loom. Only one
- process in the loom must contain this mandatory key. Each element is a
- dictionary with the keys:
- - `index`: containing the logical CPU index from 0 to $`N_c - 1`$.
- - `phyid`: the number of the CPU as given by the operating system
- (which can exceed $`N_c`$).
+ Must have the value 3 for this version.
-Here is an example of the `metadata.json` file:
+The rest of information is stored for each model.
-```
-{
- "version": 2,
- "app_id": 1,
- "rank": 0,
- "nranks": 4,
- "cpus": [
- {
- "index": 0,
- "phyid": 0
- },
- {
- "index": 1,
- "phyid": 1
- },
- {
- "index": 2,
- "phyid": 2
- },
- {
- "index": 3,
- "phyid": 3
- }
- ]
-}
-```
+In particular, the `ovni` model enforces the use of:
-## Thread metadata
+- `ovni.part`: the type of part this stream is assigned to, usually
+ `thread`.
+- `ovni.require`: a dictionary of model name and version which will
+ determine which models are enabled at emulation and the required
+ version.
+- `ovni.finished`: must be 1 to ensure the stream is complete (mandatory
+ in all streams).
-!!! Important
+### Thread stream metadata
- Thread metadata has version 2
+For `thread` streams, the following attributes are used.
-The thread metadata stores constant information per thread, like the
-process metadata. The information is stored in a dictionary, where the
-name of the emulation models are used as keys. In particular, the
-libovni library writes information in the "ovni" key, such as the
-model requirements, and other information like the version of libovni
-used. Example:
+- `ovni.tid`: the TID of the thread (mandatory, per-thread).
+- `ovni.pid`: the PID of the process that the thread belongs to (mandatory, per-thread).
+- `ovni.app_id`: the application ID of the process (optional, per-process).
+- `ovni.rank`: the rank of the MPI process (optional, per-process).
+- `ovni.nranks`: number of total MPI processes (optional, per-process).
+- `ovni.loom`: the name of the loom that the process belongs to (mandatory, per-process).
+- `ovni.loom_cpus`: the array of N CPUs available in the loom
+ (mandatory, per-loom). Each element is a dictionary with the keys:
+ - `index`: containing the logical CPU index from 0 to N - 1.
+ - `phyid`: the number of the CPU as given by the operating system
+ (which can exceed N).
+
+Notice that some attributes don't need to be present in all thread
+streams. For example, per-process requires that at least one thread
+contains the attribute for each process. Similarly, per-loom requires
+that at least one thread of the loom emits the attribute.
+
+The final attribute value will be computed by merging all the values from the
+children metadata. Simple values like numbers or strings must match exactly if
+they appear duplicated, arrays are appended.
+
+Other attributes can be used for other models.
+
+Here is an example of the `stream.json` file for a thread of a nOS-V
+program:
```json
{
- "version": 2,
+ "version": 3,
"ovni": {
"lib": {
- "version": "1.4.0",
- "commit": "unknown"
+ "version": "1.10.0",
+ "commit": "dirty"
},
+ "part": "thread",
+ "tid": 89719,
+ "pid": 89719,
+ "loom": "mio.nosv-u1000",
+ "app_id": 1,
"require": {
- "ovni": "1.0.0"
- }
+ "ovni": "1.1.0",
+ "nosv": "2.3.0"
+ },
+ "loom_cpus": [
+ { "index": 0, "phyid": 0 },
+ { "index": 1, "phyid": 1 },
+ { "index": 2, "phyid": 2 },
+ { "index": 3, "phyid": 3 }
+ ],
+ "finished": 1
+ },
+ "nosv": {
+ "can_breakdown": false,
+ "lib_version": "2.3.1"
}
}
```
-The metadata is written to disk when the thread is first initialized
-and when the thread finishes.
-
-## Thread binary streams
+## Binary stream
!!! Important
- Thread binary stream has version 1
+ Binary streams have version 1
-Streams are a binary files that contains a succession of events with
-monotonically increasing clock values. Streams have a small header and
-the variable size events just after the header.
+A binary stream is a binary file named `stream.obs` that contains a
+succession of events with monotonically increasing clock values. They
+have a small header and the variable size events just after the header.
The header contains the magic 4 bytes of "ovni" and a version number of
-4 bytes too. Here is a figure of the data stored in disk:
+4 bytes too. Here is a figure of the data stored in disk on a little
+endian machine:
![Stream](fig/stream.svg)
@@ -145,7 +167,7 @@ payload:
- Normal events: with a payload up to 16 bytes
- Jumbo events: with a payload up to $`2^{32}`$ bytes
-## Normal events
+### Normal events
The normal events are composed of:
@@ -178,7 +200,7 @@ In the following figure you can see each field annotated:
![Normal event with payload content](fig/event-normal-payload.svg)
-## Jumbo events
+### Jumbo events
The jumbo events are just like normal events but they can hold large
data. The size of the jumbo data is stored as a 32 bits integer as a
@@ -203,10 +225,10 @@ In the following figure you can see each field annotated:
![Jumbo event](fig/event-jumbo.svg)
-## Design considerations
+### Design considerations
-The stream format has been designed to be very simple, so writing a
-parser library would take no more than 2 days for a single developer.
+The binary stream format has been designed to be very simple, so writing
+a parser library would take no more than 2 days for a single developer.
The size of the events has been designed to be small, with 12 bytes per
event when no payload is used.
@@ -239,11 +261,7 @@ raw stream in binary, as the MCV codes can be read as ASCII characters:
This allows a human to detect signs of corruption by visually inspecting
the streams.
-## Limitations
+### Limitations
The streams are designed to be read only forward, as they only contain
the size of each event in the header.
-
-Currently, we only support using the threads as sources of events, using
-one stream per thread. However, adding support for more streams from
-multiple sources is planned for the future.
diff --git a/doc/user/runtime/tracing.md b/doc/user/runtime/tracing.md
deleted file mode 100644
index 9a69e0b..0000000
--- a/doc/user/runtime/tracing.md
+++ /dev/null
@@ -1,94 +0,0 @@
-# Tracing a new program
-
-Read carefully this document before using libovni to instrument a new
-component. There are a few rules you must follow to ensure the runtime
-trace is correct.
-
-## Trace processes and threads
-
-- Call `ovni_version_check()` once before calling any ovni function.
-
-- Call `ovni_proc_init()` when a new process begins the execution.
-
-- Call `ovni_thread_init()` when a new thread begins the execution
- (including the main process thread).
-
-- Call `ovni_thread_require()` with the required model version before
- emitting events for that model.
-
-- Call `ovni_flush()` and `ovni_thread_free()` when it finishes (in that
- order).
-
-- Call `ovni_proc_fini()` when a process ends, after all threads have
- finished.
-
-You can use `ovni_ev_emit()` to record a new event. If you need more
-than 16 bytes of payload, use `ovni_ev_jumbo_emit()`. See the [trace
-specification](../trace_spec) for more details.
-
-Compile and link with libovni. When you run your program, a new
-directory ovni will be created in the current directory `$PWD/ovni`
-which contains the execution trace.
-
-You can change the trace directory by defining the `OVNI_TRACEDIR`
-environment variable. The envar accepts a trace directory name, a
-relative path to the trace directory, or its absolute path. In the
-first case, the trace directory will be created in the current
-directory `$PWD`.
-
-## Rules
-
-Follow these rules to avoid losing events:
-
-1. No event may be emitted until the process is initialized with
-`ovni_proc_init()` and the thread with `ovni_thread_init()`.
-
-2. When a thread ends the execution, it must call `ovni_flush()` to write the
-events in the buffer to disk.
-
-3. All threads must have flushed its buffers before calling `ovni_proc_fini()`.
-
-## Select a fast directory
-
-During the execution of your program, a per-thread buffer is kept where the new
-events are being recorded. When this buffer is full, it is written to disk and
-emptied, an operation known as flush. This may take a while depending on the
-underliying filesystem.
-
-Keep in mind that the thread will be blocked until the flush ends, so if your
-filesystem is slow it would interrupt the execution of your program for a long
-time. It is advisable to use the fastest filesystem available (see the tmpfs(5)
-and df(1) manual pages).
-
-You can select the trace directory where the buffers will be flushed during the
-execution by setting the environment variable `OVNI_TMPDIR`. The last directory
-will be created if doesn't exist. In that case, as soon as a process calls
-`ovni_proc_fini()`, the traces of all its threads will be moved to the final
-directory at `$PWD/ovni`. Example:
-
- OVNI_TMPDIR=$(mktemp -u /dev/shm/ovni.XXXXXX) srun ./your-app
-
-To test the different filesystem speeds, you can use hyperfine and dd. Take a
-closer look at the max time:
-
-```
-$ hyperfine 'dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10'
-Benchmark 1: dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10
- Time (mean ± σ): 71.7 ms ± 130.4 ms [User: 0.8 ms, System: 10.2 ms]
- Range (min … max): 14.7 ms … 1113.2 ms 162 runs
-
- Warning: Statistical outliers were detected. Consider re-running this
- benchmark on a quiet PC without any interferences from other programs. It
- might help to use the '--warmup' or '--prepare' options.
-
-$ hyperfine 'dd if=/dev/zero of=/tmp/kk bs=2M count=10'
-Benchmark 1: dd if=/dev/zero of=/tmp/kk bs=2M count=10
- Time (mean ± σ): 56.2 ms ± 5.7 ms [User: 0.6 ms, System: 14.8 ms]
- Range (min … max): 45.8 ms … 77.8 ms 63 runs
-
-$ hyperfine 'dd if=/dev/zero of=/dev/shm/kk bs=2M count=10'
-Benchmark 1: dd if=/dev/zero of=/dev/shm/kk bs=2M count=10
- Time (mean ± σ): 11.4 ms ± 0.4 ms [User: 0.5 ms, System: 11.1 ms]
- Range (min … max): 9.7 ms … 12.5 ms 269 runs
-```
-
diff --git a/include/ovni.h.in b/include/ovni.h.in
index 6844855..2684e2b 100644
--- a/include/ovni.h.in
+++ b/include/ovni.h.in
@@ -18,7 +18,7 @@ extern "C" {
#include
#include
-#define OVNI_METADATA_VERSION 2
+#define OVNI_METADATA_VERSION 3
#define OVNI_TRACEDIR "ovni"
#define OVNI_MAX_HOSTNAME 512
@@ -31,6 +31,9 @@ extern "C" {
#define OVNI_STREAM_EXT ".obs"
+/* Version of the ovni model for events */
+#define OVNI_MODEL_VERSION "1.1.0"
+
/* Follow https://semver.org rules for versioning */
#define OVNI_LIB_VERSION "@PROJECT_VERSION@"
#define OVNI_GIT_COMMIT "@OVNI_GIT_COMMIT@"
diff --git a/mkdocs.yml b/mkdocs.yml
index 2938067..f37d060 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -23,10 +23,13 @@ markdown_extensions:
nav:
- index.md
- 'User guide':
- - user/concepts.md
- user/installation.md
+ - 'Concepts':
+ - user/concepts/part-model.md
+ - user/concepts/trace.md
- 'Runtime':
- - user/runtime/tracing.md
+ - user/runtime/index.md
+ - user/runtime/env.md
- user/runtime/mark.md
- user/runtime/distributed.md
- user/runtime/kernel.md
diff --git a/src/emu/CMakeLists.txt b/src/emu/CMakeLists.txt
index 97d696b..871acf6 100644
--- a/src/emu/CMakeLists.txt
+++ b/src/emu/CMakeLists.txt
@@ -30,7 +30,6 @@ add_library(emu STATIC
stream.c
trace.c
loom.c
- metadata.c
mux.c
sort.c
path.c
diff --git a/src/emu/loom.c b/src/emu/loom.c
index 9e2081f..62b1a38 100644
--- a/src/emu/loom.c
+++ b/src/emu/loom.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include "loom.h"
@@ -8,15 +8,13 @@
#include "cpu.h"
#include "path.h"
#include "proc.h"
+#include "stream.h"
#include "uthash.h"
-static const char *loom_prefix = "loom.";
-
static void
set_hostname(char host[PATH_MAX], const char name[PATH_MAX])
{
- /* Skip prefix */
- const char *start = name + strlen(loom_prefix);
+ const char *start = name;
/* Copy until dot or end */
int i;
@@ -30,10 +28,19 @@ set_hostname(char host[PATH_MAX], const char name[PATH_MAX])
host[i] = '\0';
}
-int
-loom_matches(const char *path)
+const char *
+loom_name(struct stream *s)
{
- return path_has_prefix(path, loom_prefix);
+ JSON_Object *meta = stream_metadata(s);
+ const char *loom = json_object_dotget_string(meta, "ovni.loom");
+
+ if (loom == NULL) {
+ err("cannot get attribute ovni.loom for stream: %s",
+ s->relpath);
+ return NULL;
+ }
+
+ return loom;
}
int
@@ -41,11 +48,6 @@ loom_init_begin(struct loom *loom, const char *name)
{
memset(loom, 0, sizeof(struct loom));
- if (!path_has_prefix(name, loom_prefix)) {
- err("loom name must start with '%s': %s", loom_prefix, name);
- return -1;
- }
-
if (strchr(name, '/') != NULL) {
err("loom name cannot contain '/': %s", name);
return -1;
@@ -68,6 +70,84 @@ loom_init_begin(struct loom *loom, const char *name)
return 0;
}
+/* Merges the metadata CPUs with the ones in the loom */
+static int
+load_cpus(struct loom *loom, JSON_Object *meta)
+{
+ JSON_Array *cpuarray = json_object_dotget_array(meta, "ovni.loom_cpus");
+
+ /* It may not have the CPUs defined */
+ if (cpuarray == NULL)
+ return 0;
+
+ size_t ncpus = json_array_get_count(cpuarray);
+ if (ncpus == 0) {
+ err("empty 'cpus' array in metadata");
+ return -1;
+ }
+
+ for (size_t i = 0; i < ncpus; i++) {
+ JSON_Object *jcpu = json_array_get_object(cpuarray, i);
+ if (jcpu == NULL) {
+ err("json_array_get_object() failed for cpu");
+ return -1;
+ }
+
+ /* Cast from double */
+ int index = (int) json_object_get_number(jcpu, "index");
+ int phyid = (int) json_object_get_number(jcpu, "phyid");
+
+ struct cpu *cpu = loom_find_cpu(loom, phyid);
+
+ if (cpu) {
+ /* Ensure they have the same index */
+ if (cpu->index != index) {
+ err("mismatch index in existing cpu: %d", index);
+ return -1;
+ }
+
+ /* Duplicated, ignore */
+ continue;
+ }
+
+ cpu = calloc(1, sizeof(struct cpu));
+ if (cpu == NULL) {
+ err("calloc failed:");
+ return -1;
+ }
+
+ cpu_init_begin(cpu, index, phyid, 0);
+
+ if (loom_add_cpu(loom, cpu) != 0) {
+ err("loom_add_cpu() failed");
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
+/** Merges the given metadata with the one stored.
+ *
+ * It is an error to provide metadata that doesn't match with the already stored
+ * in the process.
+ *
+ * Precondition: The stream ovni.part must be "thread".
+ * Precondition: The stream version must be ok.
+ */
+int
+loom_load_metadata(struct loom *loom, struct stream *s)
+{
+ JSON_Object *meta = stream_metadata(s);
+
+ if (load_cpus(loom, meta) != 0) {
+ err("cannot load loom cpus");
+ return -1;
+ }
+
+ return 0;
+}
+
void
loom_set_gindex(struct loom *loom, int64_t gindex)
{
@@ -181,6 +261,44 @@ by_phyid(struct cpu *c1, struct cpu *c2)
return 0;
}
+int
+loom_set_rank_min(struct loom *loom)
+{
+ if (loom->rank_min != INT_MAX) {
+ err("rank_min already set");
+ return -1;
+ }
+
+ /* Ensure that all processes have a rank */
+ for (struct proc *p = loom->procs; p; p = p->hh.next) {
+ if (p->rank >= 0) {
+ loom->rank_enabled = 1;
+ break;
+ }
+ }
+
+ if (!loom->rank_enabled) {
+ dbg("loom %s has no rank information", loom->name);
+ return 0;
+ }
+
+ /* Ensure that all processes have a rank */
+ for (struct proc *p = loom->procs; p; p = p->hh.next) {
+ if (p->rank < 0) {
+ err("process %s has no rank information", p->id);
+ return -1;
+ }
+
+ /* Compute rank_min for CPU sorting */
+ if (p->rank < loom->rank_min)
+ loom->rank_min = p->rank;
+ }
+
+ dbg("loom %s has rank_min %d", loom->name, loom->rank_min);
+
+ return 0;
+}
+
void
loom_sort(struct loom *loom)
{
@@ -198,22 +316,22 @@ loom_sort(struct loom *loom)
int
loom_init_end(struct loom *loom)
{
- /* Set rank enabled */
- for (struct proc *p = loom->procs; p; p = p->hh.next) {
- if (p->rank >= 0) {
- loom->rank_enabled = 1;
- break;
- }
+ /* rank_min must be set */
+ if (loom->rank_enabled && loom->rank_min == INT_MAX) {
+ err("rank_min not set");
+ return -1;
}
- /* Ensure that all processes have a rank */
- if (loom->rank_enabled) {
- for (struct proc *p = loom->procs; p; p = p->hh.next) {
- if (p->rank < 0) {
- err("process %s has no rank information", p->id);
- return -1;
- }
- }
+ /* It is not valid to define a loom without CPUs */
+ if (loom->ncpus == 0) {
+ err("loom %s has no physical CPUs", loom->name);
+ return -1;
+ }
+
+ /* Or without processes */
+ if (loom->nprocs == 0) {
+ err("loom %s has no processes", loom->name);
+ return -1;
}
/* Populate cpus_array */
@@ -222,6 +340,7 @@ loom_init_end(struct loom *loom)
err("calloc failed:");
return -1;
}
+
for (struct cpu *c = loom->cpus; c; c = c->hh.next) {
int index = cpu_get_index(c);
if (index < 0 || (size_t) index >= loom->ncpus) {
@@ -277,34 +396,6 @@ loom_add_proc(struct loom *loom, struct proc *proc)
return -1;
}
- if (!proc->metadata_loaded) {
- err("process %d hasn't loaded metadata", pid);
- return -1;
- }
-
- if (loom->rank_enabled && proc->rank < 0) {
- err("missing rank in process %d", pid);
- return -1;
- }
-
- /* Check previous ranks if any */
- if (!loom->rank_enabled && proc->rank >= 0) {
- loom->rank_enabled = 1;
-
- for (struct proc *p = loom->procs; p; p = p->hh.next) {
- if (p->rank < 0) {
- err("missing rank in process %d", p->pid);
- return -1;
- }
-
- if (p->rank < loom->rank_min)
- loom->rank_min = p->rank;
- }
- }
-
- if (loom->rank_enabled && proc->rank < loom->rank_min)
- loom->rank_min = proc->rank;
-
HASH_ADD_INT(loom->procs, pid, proc);
loom->nprocs++;
diff --git a/src/emu/loom.h b/src/emu/loom.h
index d175cf1..93575c2 100644
--- a/src/emu/loom.h
+++ b/src/emu/loom.h
@@ -12,6 +12,7 @@
#include "cpu.h"
#include "extend.h"
struct proc;
+struct stream;
struct loom {
int64_t gindex;
@@ -50,8 +51,10 @@ struct loom {
struct extend ext;
};
-USE_RET int loom_matches(const char *relpath);
+USE_RET const char *loom_name(struct stream *s);
USE_RET int loom_init_begin(struct loom *loom, const char *name);
+USE_RET int loom_load_metadata(struct loom *loom, struct stream *s);
+USE_RET int loom_set_rank_min(struct loom *loom);
USE_RET int loom_init_end(struct loom *loom);
USE_RET int loom_add_cpu(struct loom *loom, struct cpu *cpu);
USE_RET int64_t loom_get_gindex(struct loom *loom);
diff --git a/src/emu/metadata.c b/src/emu/metadata.c
deleted file mode 100644
index 37706f8..0000000
--- a/src/emu/metadata.c
+++ /dev/null
@@ -1,159 +0,0 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
- * SPDX-License-Identifier: GPL-3.0-or-later */
-
-#include "metadata.h"
-#include
-#include
-#include "cpu.h"
-#include "loom.h"
-#include "ovni.h"
-#include "parson.h"
-#include "proc.h"
-#include "thread.h"
-
-static JSON_Object *
-load_json(const char *path)
-{
- JSON_Value *vmeta = json_parse_file_with_comments(path);
- if (vmeta == NULL) {
- err("json_parse_file_with_comments() failed");
- return NULL;
- }
-
- JSON_Object *meta = json_value_get_object(vmeta);
- if (meta == NULL) {
- err("json_value_get_object() failed");
- return NULL;
- }
-
- return meta;
-}
-
-static int
-check_version(JSON_Object *meta)
-{
- JSON_Value *version_val = json_object_get_value(meta, "version");
- if (version_val == NULL) {
- err("missing attribute \"version\"");
- return -1;
- }
-
- int version = (int) json_number(version_val);
-
- if (version != OVNI_METADATA_VERSION) {
- err("metadata version mismatch %d (expected %d)",
- version, OVNI_METADATA_VERSION);
- return -1;
- }
-
- return 0;
-}
-
-static int
-has_cpus(JSON_Object *meta)
-{
- /* Only check for the "cpus" key, if it has zero elements is an error
- * that will be reported later */
- if (json_object_get_array(meta, "cpus") != NULL)
- return 1;
-
- return 0;
-}
-
-static int
-load_cpus(struct loom *loom, JSON_Object *meta)
-{
- JSON_Array *cpuarray = json_object_get_array(meta, "cpus");
- if (cpuarray == NULL) {
- err("cannot find 'cpus' array");
- return -1;
- }
-
- size_t ncpus = json_array_get_count(cpuarray);
- if (ncpus == 0) {
- err("empty 'cpus' array in metadata");
- return -1;
- }
-
- if (loom->ncpus > 0) {
- err("loom %s already has cpus", loom->id);
- return -1;
- }
-
- for (size_t i = 0; i < ncpus; i++) {
- JSON_Object *jcpu = json_array_get_object(cpuarray, i);
- if (jcpu == NULL) {
- err("json_array_get_object() failed for cpu");
- return -1;
- }
-
- /* Cast from double */
- int index = (int) json_object_get_number(jcpu, "index");
- int phyid = (int) json_object_get_number(jcpu, "phyid");
-
- struct cpu *cpu = calloc(1, sizeof(struct cpu));
- if (cpu == NULL) {
- err("calloc failed:");
- return -1;
- }
-
- cpu_init_begin(cpu, index, phyid, 0);
-
- if (loom_add_cpu(loom, cpu) != 0) {
- err("loom_add_cpu() failed");
- return -1;
- }
- }
-
- return 0;
-}
-
-int
-metadata_load_proc(const char *path, struct loom *loom, struct proc *proc)
-{
- JSON_Object *meta = load_json(path);
- if (meta == NULL) {
- err("cannot load proc metadata from file %s", path);
- return -1;
- }
-
- if (check_version(meta) != 0) {
- err("version check failed");
- return -1;
- }
-
- /* The appid is populated from the metadata */
- if (proc_load_metadata(proc, meta) != 0) {
- err("cannot load process attributes");
- return -1;
- }
-
- if (has_cpus(meta) && load_cpus(loom, meta) != 0) {
- err("cannot load loom cpus");
- return -1;
- }
-
- return 0;
-}
-
-int
-metadata_load_thread(const char *path, struct thread *thread)
-{
- JSON_Object *meta = load_json(path);
- if (meta == NULL) {
- err("cannot load thread metadata from file %s", path);
- return -1;
- }
-
- if (check_version(meta) != 0) {
- err("version check failed");
- return -1;
- }
-
- if (thread_load_metadata(thread, meta) != 0) {
- err("cannot load thread attributes");
- return -1;
- }
-
- return 0;
-}
diff --git a/src/emu/metadata.h b/src/emu/metadata.h
deleted file mode 100644
index 3d6e9a9..0000000
--- a/src/emu/metadata.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
- * SPDX-License-Identifier: GPL-3.0-or-later */
-
-#ifndef METADATA_H
-#define METADATA_H
-
-#include "common.h"
-struct loom;
-struct proc;
-struct thread;
-
-USE_RET int metadata_load_proc(const char *path, struct loom *loom, struct proc *proc);
-USE_RET int metadata_load_thread(const char *path, struct thread *thread);
-
-#endif /* METADATA_H */
diff --git a/src/emu/model.c b/src/emu/model.c
index f750784..8e5a96e 100644
--- a/src/emu/model.c
+++ b/src/emu/model.c
@@ -235,13 +235,6 @@ model_finish(struct model *model, struct emu *emu)
static int
should_enable(int have[3], struct model_spec *spec, struct thread *t)
{
- static int compat = 0;
-
- /* Enable all models if we are in compatibility model. Don't check other
- * threads metadata */
- if (compat)
- return 1;
-
if (t->meta == NULL) {
err("missing metadata for thread %s", t->id);
return -1;
@@ -249,12 +242,8 @@ should_enable(int have[3], struct model_spec *spec, struct thread *t)
JSON_Object *require = json_object_dotget_object(t->meta, "ovni.require");
if (require == NULL) {
- warn("missing 'ovni.require' key in thread %s", t->id);
- warn("loading trace in compatibility mode");
- warn("all models will be enabled (expect slowdown)");
- warn("use ovni_thread_require() to enable only required models");
- compat = 1;
- return 1;
+ err("missing 'ovni.require' key in thread %s", t->id);
+ return -1;
}
/* May not have the current model */
diff --git a/src/emu/ovni/setup.c b/src/emu/ovni/setup.c
index 32b1980..c9d596a 100644
--- a/src/emu/ovni/setup.c
+++ b/src/emu/ovni/setup.c
@@ -13,6 +13,7 @@
#include "model_cpu.h"
#include "model_pvt.h"
#include "model_thread.h"
+#include "ovni.h"
#include "pv/pcf.h"
#include "pv/prv.h"
#include "system.h"
@@ -47,7 +48,7 @@ static struct ev_decl model_evlist[] = {
struct model_spec model_ovni = {
.name = model_name,
- .version = "1.1.0",
+ .version = OVNI_MODEL_VERSION,
.evlist = model_evlist,
.model = model_id,
.create = model_ovni_create,
diff --git a/src/emu/ovnisort.c b/src/emu/ovnisort.c
index 812f091..1af8205 100644
--- a/src/emu/ovnisort.c
+++ b/src/emu/ovnisort.c
@@ -361,7 +361,7 @@ execute_sort_plan(struct sortplan *sp)
static int
stream_winsort(struct stream *stream, struct ring *r)
{
- char *fn = stream->path;
+ char *fn = stream->obspath;
int fd = open(fn, O_WRONLY);
if (fd < 0)
diff --git a/src/emu/path.c b/src/emu/path.c
index 1161c2d..340a873 100644
--- a/src/emu/path.c
+++ b/src/emu/path.c
@@ -112,3 +112,47 @@ path_filename(const char *path)
return start;
}
+
+int
+path_append(char dst[PATH_MAX], const char *src, const char *extra)
+{
+ if (snprintf(dst, PATH_MAX, "%s/%s", src, extra) >= PATH_MAX) {
+ err("path too long: %s/%s", src, extra);
+ return -1;
+ }
+
+ return 0;
+}
+
+/** Copy the path src into dst. */
+int
+path_copy(char dst[PATH_MAX], const char *src)
+{
+ if (snprintf(dst, PATH_MAX, "%s", src) >= PATH_MAX) {
+ err("path too long: %s", src);
+ return -1;
+ }
+
+ return 0;
+}
+
+/** Strip last component from path */
+void
+path_dirname(char path[PATH_MAX])
+{
+ path_remove_trailing(path);
+ int n = (int) strlen(path);
+ int i;
+ for (i = n - 1; i >= 0; i--) {
+ if (path[i] == '/') {
+ break;
+ }
+ }
+ /* Remove all '/' */
+ for (; i >= 0; i--) {
+ if (path[i] != '/')
+ break;
+ else
+ path[i] = '\0';
+ }
+}
diff --git a/src/emu/path.h b/src/emu/path.h
index 1d845a7..9b66673 100644
--- a/src/emu/path.h
+++ b/src/emu/path.h
@@ -1,9 +1,10 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#ifndef PATH_H
#define PATH_H
+#include
#include "common.h"
USE_RET int path_has_prefix(const char *path, const char *prefix);
@@ -13,5 +14,8 @@ USE_RET int path_keep(char *path, int n);
USE_RET int path_strip(const char *path, int n, const char (**next));
void path_remove_trailing(char *path);
USE_RET const char *path_filename(const char *path);
+USE_RET int path_append(char dst[PATH_MAX], const char *src, const char *extra);
+USE_RET int path_copy(char dst[PATH_MAX], const char *src);
+ void path_dirname(char path[PATH_MAX]);
#endif /* PATH_H */
diff --git a/src/emu/proc.c b/src/emu/proc.c
index 7e4f363..b33a5d9 100644
--- a/src/emu/proc.c
+++ b/src/emu/proc.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include "proc.h"
@@ -6,85 +6,42 @@
#include
#include
#include "path.h"
+#include "stream.h"
#include "thread.h"
-static int
-get_pid(const char *id, int *pid)
+int
+proc_stream_get_pid(struct stream *s)
{
- /* TODO: Store the PID the metadata.json instead */
+ JSON_Object *meta = stream_metadata(s);
- /* The id must be like "loom.host01.123/proc.345" */
- if (path_count(id, '/') != 1) {
- err("proc id can only contain one '/': %s", id);
+ double pid = json_object_dotget_number(meta, "ovni.pid");
+
+ /* Zero is used for errors, so forbidden for pid too */
+ if (pid == 0) {
+ err("cannot get attribute ovni.pid for stream: %s",
+ s->relpath);
return -1;
}
- /* Get the proc.345 part */
- const char *procname;
- if (path_next(id, '/', &procname) != 0) {
- err("cannot get proc name");
- return -1;
- }
-
- /* Ensure the prefix is ok */
- const char prefix[] = "proc.";
- if (!path_has_prefix(procname, prefix)) {
- err("proc name must start with '%s': %s", prefix, id);
- return -1;
- }
-
- /* Get the 345 part */
- const char *pidstr;
- if (path_next(procname, '.', &pidstr) != 0) {
- err("cannot find proc dot in '%s'", id);
- return -1;
- }
-
- *pid = atoi(pidstr);
-
- return 0;
+ return (int) pid;
}
int
-proc_relpath_get_pid(const char *relpath, int *pid)
-{
- char id[PATH_MAX];
-
- if (snprintf(id, PATH_MAX, "%s", relpath) >= PATH_MAX) {
- err("path too long");
- return -1;
- }
-
- if (path_keep(id, 2) != 0) {
- err("cannot delimite proc dir");
- return -1;
- }
-
- return get_pid(id, pid);
-}
-
-int
-proc_init_begin(struct proc *proc, const char *relpath)
+proc_init_begin(struct proc *proc, int pid)
{
memset(proc, 0, sizeof(struct proc));
proc->gindex = -1;
+ proc->appid = 0;
+ proc->rank = -1;
+ proc->nranks = 0;
+ proc->pid = pid;
- if (snprintf(proc->id, PATH_MAX, "%s", relpath) >= PATH_MAX) {
+ if (snprintf(proc->id, PATH_MAX, "proc.%d", pid) >= PATH_MAX) {
err("path too long");
return -1;
}
- if (path_keep(proc->id, 2) != 0) {
- err("cannot delimite proc dir");
- return -1;
- }
-
- if (get_pid(proc->id, &proc->pid) != 0) {
- err("cannot parse proc pid");
- return -1;
- }
-
dbg("created proc %s", proc->id);
return 0;
@@ -102,38 +59,105 @@ proc_set_loom(struct proc *proc, struct loom *loom)
proc->loom = loom;
}
-int
-proc_load_metadata(struct proc *proc, JSON_Object *meta)
+static int
+load_appid(struct proc *proc, struct stream *s)
{
- if (proc->metadata_loaded) {
- err("process %s already loaded metadata", proc->id);
+ JSON_Object *meta = stream_metadata(s);
+ JSON_Value *appid_val = json_object_dotget_value(meta, "ovni.app_id");
+
+ /* May not be present in all thread streams */
+ if (appid_val == NULL)
+ return 0;
+
+ int appid = (int) json_number(appid_val);
+ if (proc->appid && proc->appid != appid) {
+ err("mismatch previous appid %d with stream: %s",
+ proc->appid, s->relpath);
return -1;
}
- JSON_Value *version_val = json_object_get_value(meta, "version");
- if (version_val == NULL) {
- err("missing attribute 'version' in metadata");
+ if (appid <= 0) {
+ err("appid must be >0, stream: %s", s->relpath);
return -1;
}
- proc->metadata_version = (int) json_number(version_val);
+ proc->appid = appid;
+ return 0;
+}
- JSON_Value *appid_val = json_object_get_value(meta, "app_id");
- if (appid_val == NULL) {
- err("missing attribute 'app_id' in metadata");
+static int
+load_rank(struct proc *proc, struct stream *s)
+{
+ JSON_Object *meta = stream_metadata(s);
+ JSON_Value *rank_val = json_object_dotget_value(meta, "ovni.rank");
+
+ /* Optional */
+ if (rank_val == NULL) {
+ dbg("process %s has no rank", proc->id);
+ return 0;
+ }
+
+ int rank = (int) json_number(rank_val);
+
+ if (rank < 0) {
+ err("rank %d must be >=0, stream: %s", rank, s->relpath);
return -1;
}
- proc->appid = (int) json_number(appid_val);
+ if (proc->rank >= 0 && proc->rank != rank) {
+ err("mismatch previous rank %d with stream: %s",
+ proc->rank, s->relpath);
+ return -1;
+ }
- JSON_Value *rank_val = json_object_get_value(meta, "rank");
+ /* Same with nranks, but it is not optional now */
+ JSON_Value *nranks_val = json_object_dotget_value(meta, "ovni.nranks");
+ if (nranks_val == NULL) {
+ err("missing ovni.nranks attribute: %s", s->relpath);
+ return -1;
+ }
- if (rank_val != NULL)
- proc->rank = (int) json_number(rank_val);
- else
- proc->rank = -1;
+ int nranks = (int) json_number(nranks_val);
- proc->metadata_loaded = 1;
+ if (nranks <= 0) {
+ err("nranks %d must be >0, stream: %s", nranks, s->relpath);
+ return -1;
+ }
+
+ if (proc->nranks > 0 && proc->nranks != nranks) {
+ err("mismatch previous nranks %d with stream: %s",
+ proc->nranks, s->relpath);
+ return -1;
+ }
+
+ /* Ensure rank fits in nranks */
+ if (rank >= nranks) {
+ err("rank %d must be lower than nranks %d: %s",
+ rank, nranks, s->relpath);
+ return -1;
+ }
+
+ dbg("process %s rank=%d nranks=%d",
+ proc->id, rank, nranks);
+ proc->rank = rank;
+ proc->nranks = nranks;
+
+ return 0;
+}
+
+/** Merges the metadata from the stream in the process. */
+int
+proc_load_metadata(struct proc *proc, struct stream *s)
+{
+ if (load_appid(proc, s) != 0) {
+ err("load_appid failed for stream: %s", s->relpath);
+ return -1;
+ }
+
+ if (load_rank(proc, s) != 0) {
+ err("load_rank failed for stream: %s", s->relpath);
+ return -1;
+ }
return 0;
}
@@ -197,8 +221,8 @@ proc_init_end(struct proc *proc)
return -1;
}
- if (!proc->metadata_loaded) {
- err("metadata not loaded");
+ if (proc->appid <= 0) {
+ err("appid not set");
return -1;
}
diff --git a/src/emu/proc.h b/src/emu/proc.h
index e60113b..7964bff 100644
--- a/src/emu/proc.h
+++ b/src/emu/proc.h
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#ifndef PROC_H
@@ -8,9 +8,9 @@
#include
#include "common.h"
#include "extend.h"
-#include "parson.h"
#include "uthash.h"
struct loom;
+struct stream;
struct thread;
struct proc {
@@ -18,12 +18,11 @@ struct proc {
char id[PATH_MAX];
int is_init;
- int metadata_loaded;
- int metadata_version;
int pid;
int index;
int appid;
int rank;
+ int nranks;
int nthreads;
struct thread *threads;
@@ -45,14 +44,14 @@ struct proc {
struct extend ext;
};
-USE_RET int proc_relpath_get_pid(const char *relpath, int *pid);
-USE_RET int proc_init_begin(struct proc *proc, const char *id);
+USE_RET int proc_stream_get_pid(struct stream *s);
+USE_RET int proc_init_begin(struct proc *proc, int pid);
USE_RET int proc_init_end(struct proc *proc);
USE_RET int proc_get_pid(struct proc *proc);
void proc_set_gindex(struct proc *proc, int64_t gindex);
void proc_set_loom(struct proc *proc, struct loom *loom);
void proc_sort(struct proc *proc);
-USE_RET int proc_load_metadata(struct proc *proc, JSON_Object *meta);
+USE_RET int proc_load_metadata(struct proc *proc, struct stream *s);
USE_RET struct thread *proc_find_thread(struct proc *proc, int tid);
USE_RET int proc_add_thread(struct proc *proc, struct thread *thread);
void proc_sort(struct proc *proc);
diff --git a/src/emu/stream.c b/src/emu/stream.c
index ddfe2b6..faaf309 100644
--- a/src/emu/stream.c
+++ b/src/emu/stream.c
@@ -71,41 +71,22 @@ load_stream_fd(struct stream *stream, int fd)
return 0;
}
-int
-stream_load(struct stream *stream, const char *tracedir, const char *relpath)
+static int
+load_obs(struct stream *stream, const char *path)
{
- memset(stream, 0, sizeof(struct stream));
-
- if (snprintf(stream->path, PATH_MAX, "%s/%s", tracedir, relpath) >= PATH_MAX) {
- err("path too long: %s/%s", tracedir, relpath);
- return -1;
- }
-
- /* Allow loading a trace with empty relpath */
- path_remove_trailing(stream->path);
-
- if (snprintf(stream->relpath, PATH_MAX, "%s", relpath) >= PATH_MAX) {
- err("path too long: %s", relpath);
- return -1;
- }
-
- dbg("loading %s", stream->relpath);
-
int fd;
- if ((fd = open(stream->path, O_RDWR)) == -1) {
- err("open %s failed:", stream->path);
+ if ((fd = open(path, O_RDWR)) == -1) {
+ err("open %s failed:", path);
return -1;
}
if (load_stream_fd(stream, fd) != 0) {
- err("load_stream_fd failed for stream '%s'",
- stream->path);
+ err("load_stream_fd failed for: %s", path);
return -1;
}
if (check_stream_header(stream) != 0) {
- err("stream '%s' has bad header",
- stream->path);
+ err("stream has bad header: %s", path);
return -1;
}
@@ -132,6 +113,97 @@ stream_load(struct stream *stream, const char *tracedir, const char *relpath)
return 0;
}
+static int
+check_version(JSON_Object *meta)
+{
+ JSON_Value *version_val = json_object_get_value(meta, "version");
+ if (version_val == NULL) {
+ err("missing attribute \"version\"");
+ return -1;
+ }
+
+ int version = (int) json_number(version_val);
+
+ if (version != OVNI_METADATA_VERSION) {
+ err("metadata version mismatch %d (expected %d)",
+ version, OVNI_METADATA_VERSION);
+ return -1;
+ }
+
+ return 0;
+}
+
+static JSON_Object *
+load_json(const char *path)
+{
+ JSON_Value *vmeta = json_parse_file_with_comments(path);
+ if (vmeta == NULL) {
+ err("json_parse_file_with_comments() failed");
+ return NULL;
+ }
+
+ JSON_Object *meta = json_value_get_object(vmeta);
+ if (meta == NULL) {
+ err("json_value_get_object() failed");
+ return NULL;
+ }
+
+ if (check_version(meta) != 0) {
+ err("check_version failed");
+ return NULL;
+ }
+
+ return meta;
+}
+
+/** Loads a stream from disk.
+ *
+ * The relpath must be pointing to a directory with the stream.json and
+ * stream.obs files.
+ */
+int
+stream_load(struct stream *stream, const char *tracedir, const char *relpath)
+{
+ memset(stream, 0, sizeof(struct stream));
+
+ if (snprintf(stream->path, PATH_MAX, "%s/%s", tracedir, relpath) >= PATH_MAX) {
+ err("path too long: %s/%s", tracedir, relpath);
+ return -1;
+ }
+
+ /* Allow loading a trace with empty relpath */
+ path_remove_trailing(stream->path);
+
+ if (snprintf(stream->relpath, PATH_MAX, "%s", relpath) >= PATH_MAX) {
+ err("path too long: %s", relpath);
+ return -1;
+ }
+
+ dbg("loading %s", stream->relpath);
+
+ if (path_append(stream->jsonpath, stream->path, "stream.json") != 0) {
+ err("path_append failed");
+ return -1;
+ }
+
+ if ((stream->meta = load_json(stream->jsonpath)) == NULL) {
+ err("load_json failed for: %s", stream->jsonpath);
+ return -1;
+ }
+
+ if (path_append(stream->obspath, stream->path, "stream.obs") != 0) {
+ err("path_append failed");
+ return -1;
+ }
+
+ if (load_obs(stream, stream->obspath) != 0) {
+ err("load_obs failed");
+ return -1;
+ }
+
+ return 0;
+}
+
void
stream_data_set(struct stream *stream, void *data)
{
@@ -144,6 +216,16 @@ stream_data_get(struct stream *stream)
return stream->data;
}
+/* Is never NULL */
+JSON_Object *
+stream_metadata(struct stream *stream)
+{
+ if (stream->meta == NULL)
+ die("stream metadata is NULL: %s", stream->relpath);
+
+ return stream->meta;
+}
+
int
stream_clkoff_set(struct stream *stream, int64_t clkoff)
{
diff --git a/src/emu/stream.h b/src/emu/stream.h
index 2bbfd13..16cb60a 100644
--- a/src/emu/stream.h
+++ b/src/emu/stream.h
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#ifndef STREAM_H
@@ -8,6 +8,7 @@
#include
#include "common.h"
#include "heap.h"
+#include "parson.h"
struct ovni_ev;
struct stream {
@@ -27,13 +28,17 @@ struct stream {
int active;
int unsorted;
- char path[PATH_MAX];
+ char path[PATH_MAX]; /* To stream dir */
char relpath[PATH_MAX]; /* To tracedir */
+ char obspath[PATH_MAX]; /* To obs file */
+ char jsonpath[PATH_MAX]; /* To json file */
int64_t usize; /* Useful size for events */
int64_t offset;
double progress;
+
+ JSON_Object *meta;
};
USE_RET int stream_load(struct stream *stream, const char *tracedir, const char *relpath);
@@ -46,5 +51,6 @@ USE_RET int64_t stream_lastclock(struct stream *stream);
void stream_allow_unsorted(struct stream *stream);
void stream_data_set(struct stream *stream, void *data);
USE_RET void *stream_data_get(struct stream *stream);
+USE_RET JSON_Object *stream_metadata(struct stream *stream);
#endif /* STREAM_H */
diff --git a/src/emu/system.c b/src/emu/system.c
index 3b30d85..86fdccc 100644
--- a/src/emu/system.c
+++ b/src/emu/system.c
@@ -11,7 +11,6 @@
#include "cpu.h"
#include "emu_args.h"
#include "loom.h"
-#include "metadata.h"
#include "proc.h"
#include "pv/prf.h"
#include "pv/pvt.h"
@@ -24,11 +23,11 @@
struct bay;
static struct thread *
-create_thread(struct proc *proc, const char *tracedir, const char *relpath)
+create_thread(struct proc *proc, struct stream *s)
{
int tid;
- if (thread_relpath_get_tid(relpath, &tid) != 0) {
- err("cannot get thread tid from %s", relpath);
+ if ((tid = thread_stream_get_tid(s)) < 0) {
+ err("cannot get thread tid from stream: %s", s->relpath);
return NULL;
}
@@ -45,21 +44,13 @@ create_thread(struct proc *proc, const char *tracedir, const char *relpath)
return NULL;
}
- if (thread_init_begin(thread, relpath) != 0) {
- err("cannot init thread");
+ if (thread_init_begin(thread, tid) != 0) {
+ err("thread_init_begin failed: %s", s->relpath);
return NULL;
}
- /* Build metadata path */
- char mpath[PATH_MAX];
- if (snprintf(mpath, PATH_MAX, "%s/%s/thread.%d.json",
- tracedir, proc->id, tid) >= PATH_MAX) {
- err("path too long");
- return NULL;
- }
-
- if (metadata_load_thread(mpath, thread) != 0) {
- err("cannot load metadata from %s", mpath);
+ if (thread_load_metadata(thread, s) != 0) {
+ err("thread_load_metadata failed: %s", s->relpath);
return NULL;
}
@@ -72,48 +63,39 @@ create_thread(struct proc *proc, const char *tracedir, const char *relpath)
}
static struct proc *
-create_proc(struct loom *loom, const char *tracedir, const char *relpath)
+create_proc(struct loom *loom, struct stream *s)
{
- int pid;
- if (proc_relpath_get_pid(relpath, &pid) != 0) {
- err("cannot get proc pid from %s", relpath);
+ int pid = proc_stream_get_pid(s);
+ if (pid < 0) {
+ err("cannot get proc pid from stream: %s", s->relpath);
return NULL;
}
struct proc *proc = loom_find_proc(loom, pid);
-
- if (proc != NULL)
- return proc;
-
- proc = malloc(sizeof(struct proc));
-
if (proc == NULL) {
- err("malloc failed:");
- return NULL;
+ /* Create a new process */
+
+ proc = malloc(sizeof(struct proc));
+
+ if (proc == NULL) {
+ err("malloc failed:");
+ return NULL;
+ }
+
+ if (proc_init_begin(proc, pid) != 0) {
+ err("proc_init_begin failed: %s", s->relpath);
+ return NULL;
+ }
+
+ if (loom_add_proc(loom, proc) != 0) {
+ err("loom_add_proc failed");
+ return NULL;
+ }
}
- if (proc_init_begin(proc, relpath) != 0) {
- err("proc_init_begin failed");
- return NULL;
- }
-
- /* Build metadata path */
- char mpath[PATH_MAX];
-
- if (snprintf(mpath, PATH_MAX, "%s/%s/metadata.json",
- tracedir, proc->id) >= PATH_MAX) {
- err("path too long");
- return NULL;
- }
-
- /* Load metadata too */
- if (metadata_load_proc(mpath, loom, proc) != 0) {
- err("cannot load metadata from %s", mpath);
- return NULL;
- }
-
- if (loom_add_proc(loom, proc) != 0) {
- err("loom_add_proc failed");
+ /* The appid is populated from the metadata */
+ if (proc_load_metadata(proc, s) != 0) {
+ err("proc_load_metadata failed");
return NULL;
}
@@ -131,16 +113,11 @@ find_loom(struct system *sys, const char *id)
}
static struct loom *
-create_loom(struct system *sys, const char *relpath)
+create_loom(struct system *sys, struct stream *s)
{
- char name[PATH_MAX];
- if (snprintf(name, PATH_MAX, "%s", relpath) >= PATH_MAX) {
- err("path too long: %s", relpath);
- return NULL;
- }
-
- if (strtok(name, "/") == NULL) {
- err("cannot find first '/': %s", relpath);
+ const char *name = loom_name(s);
+ if (name == NULL) {
+ err("loom_name failed");
return NULL;
}
@@ -163,6 +140,11 @@ create_loom(struct system *sys, const char *relpath)
sys->nlooms++;
}
+ if (loom_load_metadata(loom, s) != 0) {
+ err("loom_load_metadata failed for stream: %s", s->relpath);
+ return NULL;
+ }
+
return loom;
}
@@ -220,11 +202,33 @@ report_libovni_version(struct system *sys)
return 0;
}
+static int
+is_thread_stream(struct stream *s)
+{
+ JSON_Object *meta = stream_metadata(s);
+ if (meta == NULL) {
+ err("no metadata for stream: %s", s->relpath);
+ return -1;
+ }
+
+ /* All streams must have a ovni.part attribute */
+ const char *part_type = json_object_dotget_string(meta, "ovni.part");
+
+ if (part_type == NULL) {
+ err("cannot get attribute ovni.part for stream: %s",
+ s->relpath);
+ return -1;
+ }
+
+ if (strcmp(part_type, "thread") == 0)
+ return 1;
+
+ return 0;
+}
+
static int
create_system(struct system *sys, struct trace *trace)
{
- const char *dir = trace->tracedir;
-
/* Allocate the lpt map */
sys->lpt = calloc((size_t) trace->nstreams, sizeof(struct lpt));
if (sys->lpt == NULL) {
@@ -234,25 +238,28 @@ create_system(struct system *sys, struct trace *trace)
size_t i = 0;
for (struct stream *s = trace->streams; s ; s = s->next) {
- if (!loom_matches(s->relpath)) {
+ int ok = is_thread_stream(s);
+ if (ok < 0) {
+ err("is_thread_stream failed");
+ return -1;
+ } else if (ok == 0) {
warn("ignoring unknown stream %s", s->relpath);
continue;
}
- struct loom *loom = create_loom(sys, s->relpath);
+ struct loom *loom = create_loom(sys, s);
if (loom == NULL) {
err("create_loom failed");
return -1;
}
- /* Loads metadata too */
- struct proc *proc = create_proc(loom, dir, s->relpath);
+ struct proc *proc = create_proc(loom, s);
if (proc == NULL) {
err("create_proc failed");
return -1;
}
- struct thread *thread = create_thread(proc, dir, s->relpath);
+ struct thread *thread = create_thread(proc, s);
if (thread == NULL) {
err("create_thread failed");
return -1;
@@ -267,14 +274,6 @@ create_system(struct system *sys, struct trace *trace)
stream_data_set(s, lpt);
}
- /* Ensure all looms have at least one CPU */
- for (struct loom *l = sys->looms; l; l = l->next) {
- if (l->ncpus == 0) {
- err("loom %s has no physical CPUs", l->id);
- return -1;
- }
- }
-
return 0;
}
@@ -552,6 +551,12 @@ set_sort_criteria(struct system *sys)
int some_have = 0;
int all_have = 1;
for (struct loom *l = sys->looms; l; l = l->next) {
+ /* Set the rank_min for later sorting */
+ if (loom_set_rank_min(l) != 0) {
+ err("loom_set_rank_min failed");
+ return -1;
+ }
+
if (l->rank_enabled)
some_have = 1;
else
diff --git a/src/emu/thread.c b/src/emu/thread.c
index 72995e8..9534bb8 100644
--- a/src/emu/thread.c
+++ b/src/emu/thread.c
@@ -15,6 +15,7 @@
#include "pv/prv.h"
#include "pv/pvt.h"
#include "recorder.h"
+#include "stream.h"
#include "value.h"
struct proc;
@@ -59,75 +60,37 @@ static const struct pcf_value_label (*pcf_labels[TH_CHAN_MAX])[] = {
[TH_CHAN_STATE] = &state_name,
};
-static int
-get_tid(const char *id, int *tid)
+int
+thread_stream_get_tid(struct stream *s)
{
- /* The id must be like "loom.host01.123/proc.345/thread.567" */
- if (path_count(id, '/') != 2) {
- err("proc id can only contain two '/': %s", id);
+ JSON_Object *meta = stream_metadata(s);
+
+ double tid = json_object_dotget_number(meta, "ovni.tid");
+
+ /* Zero is used for errors, so forbidden for tid too */
+ if (tid == 0) {
+ err("cannot get attribute ovni.tid for stream: %s",
+ s->relpath);
return -1;
}
- /* Get the thread.567 part */
- const char *thname;
- if (path_strip(id, 2, &thname) != 0) {
- err("cannot get thread name");
- return -1;
- }
-
- /* Ensure the prefix is ok */
- const char prefix[] = "thread.";
- if (!path_has_prefix(thname, prefix)) {
- err("thread name must start with '%s': %s", prefix, thname);
- return -1;
- }
-
- /* Get the 567 part */
- const char *tidstr;
- if (path_next(thname, '.', &tidstr) != 0) {
- err("cannot find thread dot in '%s'", id);
- return -1;
- }
-
- char *endptr;
- errno = 0;
- *tid = (int) strtol(tidstr, &endptr, 10);
- if (errno != 0) {
- err("strtol failed for '%s':", tidstr);
- return -1;
- }
- if (endptr == tidstr) {
- err("no digits in tid string '%s'", tidstr);
- return -1;
- }
-
- return 0;
+ return (int) tid;
}
int
-thread_relpath_get_tid(const char *relpath, int *tid)
-{
- return get_tid(relpath, tid);
-}
-
-int
-thread_init_begin(struct thread *thread, const char *relpath)
+thread_init_begin(struct thread *thread, int tid)
{
memset(thread, 0, sizeof(struct thread));
thread->state = TH_ST_UNKNOWN;
thread->gindex = -1;
+ thread->tid = tid;
- if (snprintf(thread->id, PATH_MAX, "%s", relpath) >= PATH_MAX) {
+ if (snprintf(thread->id, PATH_MAX, "thread.%d", tid) >= PATH_MAX) {
err("relpath too long");
return -1;
}
- if (get_tid(thread->id, &thread->tid) != 0) {
- err("cannot parse thread tid");
- return -1;
- }
-
return 0;
}
@@ -445,20 +408,19 @@ thread_migrate_cpu(struct thread *th, struct cpu *cpu)
}
int
-thread_load_metadata(struct thread *thread, JSON_Object *meta)
+thread_load_metadata(struct thread *thread, struct stream *s)
{
- if (meta == NULL) {
- err("metadata is null");
- return -1;
- }
+ JSON_Object *meta = stream_metadata(s);
if (thread->meta != NULL) {
err("thread %s already loaded metadata", thread->id);
return -1;
}
- if (json_object_dotget_number(meta, "ovni.finished") != 1)
- warn("thread didn't finish properly: %s", thread->id);
+ if (json_object_dotget_number(meta, "ovni.finished") != 1) {
+ err("missing ovni.finished: %s", s->relpath);
+ return -1;
+ }
thread->meta = meta;
diff --git a/src/emu/thread.h b/src/emu/thread.h
index dd0ac46..4e24d71 100644
--- a/src/emu/thread.h
+++ b/src/emu/thread.h
@@ -20,6 +20,7 @@ struct pcf;
struct proc;
struct recorder;
struct value;
+struct stream;
/* Emulated thread runtime status */
enum thread_state {
@@ -82,10 +83,10 @@ struct thread {
UT_hash_handle hh; /* threads in the process */
};
-USE_RET int thread_relpath_get_tid(const char *relpath, int *tid);
-USE_RET int thread_init_begin(struct thread *thread, const char *relpath);
+USE_RET int thread_stream_get_tid(struct stream *s);
+USE_RET int thread_init_begin(struct thread *thread, int tid);
USE_RET int thread_init_end(struct thread *thread);
-USE_RET int thread_load_metadata(struct thread *thread, JSON_Object *meta);
+USE_RET int thread_load_metadata(struct thread *thread, struct stream *s);
USE_RET int thread_set_state(struct thread *th, enum thread_state state);
USE_RET int thread_set_cpu(struct thread *th, struct cpu *cpu);
USE_RET int thread_unset_cpu(struct thread *th);
diff --git a/src/emu/trace.c b/src/emu/trace.c
index 4b8bd49..41bd938 100644
--- a/src/emu/trace.c
+++ b/src/emu/trace.c
@@ -27,7 +27,7 @@ add_stream(struct trace *trace, struct stream *stream)
}
static int
-load_stream(struct trace *trace, const char *path)
+load_stream(struct trace *trace, const char *json_path)
{
struct stream *stream = calloc(1, sizeof(struct stream));
@@ -36,6 +36,14 @@ load_stream(struct trace *trace, const char *path)
return -1;
}
+ /* The json_path must end in .../stream.json, so remove it */
+ char path[PATH_MAX];
+ if (path_copy(path, json_path) != 0) {
+ err("path_copy failed");
+ return -1;
+ }
+ path_dirname(path);
+
int offset = (int) strlen(trace->tracedir);
const char *relpath = path + offset;
@@ -53,46 +61,16 @@ load_stream(struct trace *trace, const char *path)
}
static int
-has_suffix(const char *str, const char *suffix)
+is_stream(const char *fpath)
{
- if (!str || !suffix)
- return 0;
+ const char *filename = path_filename(fpath);
- int lenstr = (int) strlen(str);
- int lensuffix = (int) strlen(suffix);
-
- if (lensuffix > lenstr)
- return 0;
-
- const char *p = str + lenstr - lensuffix;
- if (strncmp(p, suffix, (size_t) lensuffix) == 0)
+ if (strcmp(filename, "stream.json") == 0)
return 1;
return 0;
}
-static int
-is_stream(const char *fpath)
-{
- if (has_suffix(fpath, OVNI_STREAM_EXT))
- return 1;
-
- /* For compatibility load the old streams too */
- const char *filename = path_filename(fpath);
-
- const char prefix[] = "thread.";
- if (!path_has_prefix(filename, prefix))
- return 0;
-
- const char *tid = filename + strlen(prefix);
- for (int i = 0; tid[i]; i++) {
- if (tid[i] < '0' || tid[i] > '9')
- return 0;
- }
-
- return 1;
-}
-
static int
cb_nftw(const char *fpath, const struct stat *sb,
int typeflag, struct FTW *ftwbuf)
@@ -127,6 +105,10 @@ trace_load(struct trace *trace, const char *tracedir)
return -1;
}
+ /* Remove trailing slashes from tracedir */
+ path_remove_trailing(trace->tracedir);
+ tracedir = trace->tracedir;
+
/* Try to open the directory to catch permission errors */
DIR *dir = opendir(tracedir);
if (dir == NULL) {
diff --git a/src/rt/ovni.c b/src/rt/ovni.c
index 92a3b98..424cefb 100644
--- a/src/rt/ovni.c
+++ b/src/rt/ovni.c
@@ -5,6 +5,8 @@
#include
#include
#include
+#include
+#include
#include
#include
#include
@@ -17,6 +19,21 @@
#include "ovni.h"
#include "parson.h"
#include "version.h"
+#include "utlist.h"
+
+enum {
+ ST_UNINIT = 0,
+ ST_INIT,
+ ST_READY,
+ ST_GONE,
+};
+
+struct ovni_rcpu {
+ int index;
+ int phyid;
+ struct ovni_rcpu *next;
+ struct ovni_rcpu *prev;
+};
/* State of each thread on runtime */
struct ovni_rthread {
@@ -35,6 +52,16 @@ struct ovni_rthread {
/* Buffer to write events */
uint8_t *evbuf;
+ struct ovni_rcpu *cpus;
+
+ int rank_set;
+ int rank;
+ int nranks;
+
+ /* Where the stream dir is finally copied */
+ char thdir_final[PATH_MAX];
+ char thdir[PATH_MAX];
+
JSON_Value *meta;
};
@@ -54,10 +81,9 @@ struct ovni_rproc {
int app;
int pid;
char loom[OVNI_MAX_HOSTNAME];
- int ncpus;
clockid_t clockid;
- int ready;
+ atomic_int st;
JSON_Value *meta;
};
@@ -104,17 +130,40 @@ void ovni_version_check_str(const char *version)
/* Ignore the patch number */
}
+/* Create dir $procdir/thread.$tid and return it in path. */
+static void
+mkdir_thread(char *path, const char *procdir, int tid)
+{
+ if (snprintf(path, PATH_MAX, "%s/thread.%d",
+ procdir, tid) >= PATH_MAX) {
+ die("path too long: %s/thread.%d", procdir, tid);
+ }
+
+ if (mkpath(path, 0755, /* subdir */ 1))
+ die("mkpath %s failed:", path);
+}
+
+static void
+create_thread_dir(int tid)
+{
+ /* The procdir must have been created earlier */
+ mkdir_thread(rthread.thdir, rproc.procdir, tid);
+ if (rproc.move_to_final)
+ mkdir_thread(rthread.thdir_final, rproc.procdir_final, tid);
+}
+
static void
create_trace_stream(void)
{
char path[PATH_MAX];
- int written = snprintf(path, PATH_MAX, "%s/thread.%d%s",
- rproc.procdir, rthread.tid, OVNI_STREAM_EXT);
+ int written = snprintf(path, PATH_MAX, "%s/thread.%d/stream.obs",
+ rproc.procdir, rthread.tid);
- if (written >= PATH_MAX)
- die("thread trace path too long: %s/thread.%d%s",
- rproc.procdir, rthread.tid, OVNI_STREAM_EXT);
+ if (written >= PATH_MAX) {
+ die("path too long: %s/thread.%d/stream.obs",
+ rproc.procdir, rthread.tid);
+ }
rthread.streamfd = open(path, O_WRONLY | O_CREAT, 0644);
@@ -122,31 +171,6 @@ create_trace_stream(void)
die("open %s failed:", path);
}
-static void
-proc_metadata_init(struct ovni_rproc *proc)
-{
- proc->meta = json_value_init_object();
-
- if (proc->meta == NULL)
- die("failed to create metadata JSON object");
-}
-
-static void
-proc_metadata_store(JSON_Value *meta, const char *procdir)
-{
- char path[PATH_MAX];
-
- if (meta == NULL)
- die("process metadata not initialized");
-
- if (snprintf(path, PATH_MAX, "%s/metadata.json", procdir) >= PATH_MAX)
- die("metadata path too long: %s/metadata.json",
- procdir);
-
- if (json_serialize_to_file_pretty(meta, path) != JSONSuccess)
- die("failed to write process metadata");
-}
-
void
ovni_add_cpu(int index, int phyid)
{
@@ -156,101 +180,34 @@ ovni_add_cpu(int index, int phyid)
if (phyid < 0)
die("cannot use negative CPU id %d", phyid);
- if (!rproc.ready)
- die("process not yet initialized");
+ if (atomic_load(&rproc.st) != ST_READY)
+ die("process not ready");
- if (rproc.meta == NULL)
- die("metadata not initialized");
+ if (!rthread.ready)
+ die("thread not yet initialized");
- JSON_Object *meta = json_value_get_object(rproc.meta);
-
- if (meta == NULL)
- die("json_value_get_object() failed");
-
- int first_time = 0;
-
- /* Find the CPU array and create it if needed */
- JSON_Array *cpuarray = json_object_dotget_array(meta, "cpus");
-
- if (cpuarray == NULL) {
- JSON_Value *value = json_value_init_array();
- if (value == NULL)
- die("json_value_init_array() failed");
-
- cpuarray = json_array(value);
- if (cpuarray == NULL)
- die("json_array() failed");
-
- first_time = 1;
- }
-
- JSON_Value *valcpu = json_value_init_object();
- if (valcpu == NULL)
- die("json_value_init_object() failed");
-
- JSON_Object *cpu = json_object(valcpu);
+ struct ovni_rcpu *cpu = malloc(sizeof(*cpu));
if (cpu == NULL)
- die("json_object() failed");
+ die("malloc failed:");
- if (json_object_set_number(cpu, "index", index) != 0)
- die("json_object_set_number() failed");
+ cpu->index = index;
+ cpu->phyid = phyid;
- if (json_object_set_number(cpu, "phyid", phyid) != 0)
- die("json_object_set_number() failed");
-
- if (json_array_append_value(cpuarray, valcpu) != 0)
- die("json_array_append_value() failed");
-
- if (first_time) {
- JSON_Value *value = json_array_get_wrapping_value(cpuarray);
- if (value == NULL)
- die("json_array_get_wrapping_value() failed");
-
- if (json_object_set_value(meta, "cpus", value) != 0)
- die("json_object_set_value failed");
- }
-}
-
-static void
-proc_set_app(int appid)
-{
- JSON_Object *meta = json_value_get_object(rproc.meta);
-
- if (meta == NULL)
- die("json_value_get_object failed");
-
- if (json_object_set_number(meta, "app_id", appid) != 0)
- die("json_object_set_number for app_id failed");
-}
-
-static void
-proc_set_version(void)
-{
- JSON_Object *meta = json_value_get_object(rproc.meta);
-
- if (meta == NULL)
- die("json_value_get_object failed");
-
- if (json_object_set_number(meta, "version", OVNI_METADATA_VERSION) != 0)
- die("json_object_set_number for version failed");
+ DL_APPEND(rthread.cpus, cpu);
}
void
ovni_proc_set_rank(int rank, int nranks)
{
- if (!rproc.ready)
- die("process not yet initialized");
+ if (atomic_load(&rproc.st) != ST_READY)
+ die("process not ready");
- JSON_Object *meta = json_value_get_object(rproc.meta);
+ if (!rthread.ready)
+ die("thread not yet initialized");
- if (meta == NULL)
- die("json_value_get_object failed");
-
- if (json_object_set_number(meta, "rank", rank) != 0)
- die("json_object_set_number for rank failed");
-
- if (json_object_set_number(meta, "nranks", nranks) != 0)
- die("json_object_set_number for nranks failed");
+ rthread.rank_set = 1;
+ rthread.rank = rank;
+ rthread.nranks = nranks;
}
/* Create $tracedir/loom.$loom/proc.$pid and return it in path. */
@@ -292,10 +249,19 @@ create_proc_dir(const char *loom, int pid)
void
ovni_proc_init(int app, const char *loom, int pid)
{
- if (rproc.ready)
- die("pid %d already initialized", pid);
+ /* Protect against two threads calling at the same time */
+ int st = ST_UNINIT;
+ bool was_uninit = atomic_compare_exchange_strong(&rproc.st,
+ &st, ST_INIT);
- memset(&rproc, 0, sizeof(rproc));
+ if (!was_uninit) {
+ if (st == ST_INIT)
+ die("pid %d already being initialized", pid);
+ else if (st == ST_READY)
+ die("pid %d already initialized", pid);
+ else if (st == ST_GONE)
+ die("pid %d has finished, cannot init again", pid);
+ }
if (strlen(loom) >= OVNI_MAX_HOSTNAME)
die("loom name too long: %s", loom);
@@ -307,17 +273,13 @@ ovni_proc_init(int app, const char *loom, int pid)
create_proc_dir(loom, pid);
- proc_metadata_init(&rproc);
-
- rproc.ready = 1;
-
- proc_set_version();
- proc_set_app(app);
+ atomic_store(&rproc.st, ST_READY);
}
static int
move_thread_to_final(const char *src, const char *dst)
{
+ info("moving src=%s to dst=%s", src, dst);
char buffer[1024];
FILE *infile = fopen(src, "r");
@@ -350,38 +312,38 @@ move_thread_to_final(const char *src, const char *dst)
}
static void
-move_procdir_to_final(const char *procdir, const char *procdir_final)
+move_thdir_to_final(const char *thdir, const char *thdir_final)
{
DIR *dir;
int ret = 0;
- if ((dir = opendir(procdir)) == NULL) {
- err("opendir %s failed:", procdir);
+ if ((dir = opendir(thdir)) == NULL) {
+ err("opendir %s failed:", thdir);
return;
}
struct dirent *dirent;
- const char *prefix = "thread.";
+ const char *prefix = "stream.";
while ((dirent = readdir(dir)) != NULL) {
- /* It should only contain thread.* directories, skip others */
+ /* It should only contain stream.* directories, skip others */
if (strncmp(dirent->d_name, prefix, strlen(prefix)) != 0)
continue;
char thread[PATH_MAX];
- if (snprintf(thread, PATH_MAX, "%s/%s", procdir,
+ if (snprintf(thread, PATH_MAX, "%s/%s", thdir,
dirent->d_name)
>= PATH_MAX) {
- err("snprintf: path too large: %s/%s", procdir,
+ err("snprintf: path too large: %s/%s", thdir,
dirent->d_name);
ret = 1;
continue;
}
char thread_final[PATH_MAX];
- if (snprintf(thread_final, PATH_MAX, "%s/%s", procdir_final,
+ if (snprintf(thread_final, PATH_MAX, "%s/%s", thdir_final,
dirent->d_name)
>= PATH_MAX) {
- err("snprintf: path too large: %s/%s", procdir_final,
+ err("snprintf: path too large: %s/%s", thdir_final,
dirent->d_name);
ret = 1;
continue;
@@ -395,7 +357,7 @@ move_procdir_to_final(const char *procdir, const char *procdir_final)
/* Warn the user, but we cannot do much at this point */
if (ret)
- err("errors occurred when moving the trace to %s", procdir_final);
+ err("errors occurred when moving the thread dir to %s", thdir_final);
}
static void
@@ -409,20 +371,18 @@ try_clean_dir(const char *dir)
void
ovni_proc_fini(void)
{
- if (!rproc.ready)
- die("process not initialized");
+ /* Protect against two threads calling at the same time */
+ int st = ST_READY;
+ bool was_ready = atomic_compare_exchange_strong(&rproc.st,
+ &st, ST_GONE);
- /* Mark the process no longer ready */
- rproc.ready = 0;
+ if (!was_ready)
+ die("process not ready");
if (rproc.move_to_final) {
- proc_metadata_store(rproc.meta, rproc.procdir_final);
- move_procdir_to_final(rproc.procdir, rproc.procdir_final);
try_clean_dir(rproc.procdir);
try_clean_dir(rproc.loomdir);
try_clean_dir(rproc.tmpdir);
- } else {
- proc_metadata_store(rproc.meta, rproc.procdir);
}
}
@@ -465,11 +425,11 @@ static void
thread_metadata_store(void)
{
char path[PATH_MAX];
- int written = snprintf(path, PATH_MAX, "%s/thread.%d.json",
+ int written = snprintf(path, PATH_MAX, "%s/thread.%d/stream.json",
rproc.procdir, rthread.tid);
if (written >= PATH_MAX)
- die("thread trace path too long: %s/thread.%d.json",
+ die("thread trace path too long: %s/thread.%d/stream.json",
rproc.procdir, rthread.tid);
if (json_serialize_to_file_pretty(rthread.meta, path) != JSONSuccess)
@@ -530,6 +490,21 @@ thread_metadata_populate(void)
if (json_object_dotset_string(meta, "ovni.lib.commit", OVNI_GIT_COMMIT) != 0)
die("json_object_dotset_string failed");
+
+ if (json_object_dotset_string(meta, "ovni.part", "thread") != 0)
+ die("json_object_dotset_string failed");
+
+ if (json_object_dotset_number(meta, "ovni.tid", (double) rthread.tid) != 0)
+ die("json_object_dotset_number failed");
+
+ if (json_object_dotset_number(meta, "ovni.pid", (double) rproc.pid) != 0)
+ die("json_object_dotset_number failed");
+
+ if (json_object_dotset_string(meta, "ovni.loom", rproc.loom) != 0)
+ die("json_object_dotset_string failed");
+
+ if (json_object_dotset_number(meta, "ovni.app_id", rproc.app) != 0)
+ die("json_object_dotset_number for ovni.app_id failed");
}
static void
@@ -560,8 +535,8 @@ ovni_thread_init(pid_t tid)
if (tid == 0)
die("cannot use tid=%d", tid);
- if (!rproc.ready)
- die("process not yet initialized");
+ if (atomic_load(&rproc.st) != ST_READY)
+ die("process not ready");
memset(&rthread, 0, sizeof(rthread));
@@ -572,12 +547,59 @@ ovni_thread_init(pid_t tid)
if (rthread.evbuf == NULL)
die("malloc failed:");
+ create_thread_dir(tid);
create_trace_stream();
write_stream_header();
thread_metadata_init();
rthread.ready = 1;
+
+ ovni_thread_require("ovni", OVNI_MODEL_VERSION);
+}
+
+static void
+set_thread_rank(JSON_Object *meta)
+{
+ if (json_object_dotset_number(meta, "ovni.rank", rthread.rank) != 0)
+ die("json_object_set_number for rank failed");
+
+ if (json_object_dotset_number(meta, "ovni.nranks", rthread.nranks) != 0)
+ die("json_object_set_number for nranks failed");
+}
+
+static void
+set_thread_cpus(JSON_Object *meta)
+{
+ JSON_Value *value = json_value_init_array();
+ if (value == NULL)
+ die("json_value_init_array() failed");
+
+ JSON_Array *cpuarray = json_array(value);
+ if (cpuarray == NULL)
+ die("json_array() failed");
+
+ for (struct ovni_rcpu *c = rthread.cpus; c; c = c->next) {
+ JSON_Value *valcpu = json_value_init_object();
+ if (valcpu == NULL)
+ die("json_value_init_object() failed");
+
+ JSON_Object *cpu = json_object(valcpu);
+ if (cpu == NULL)
+ die("json_object() failed");
+
+ if (json_object_set_number(cpu, "index", c->index) != 0)
+ die("json_object_set_number() failed");
+
+ if (json_object_set_number(cpu, "phyid", c->phyid) != 0)
+ die("json_object_set_number() failed");
+
+ if (json_array_append_value(cpuarray, valcpu) != 0)
+ die("json_array_append_value() failed");
+ }
+
+ if (json_object_dotset_value(meta, "ovni.loom_cpus", value) != 0)
+ die("json_object_dotset_value failed");
}
void
@@ -594,6 +616,14 @@ ovni_thread_free(void)
if (meta == NULL)
die("json_value_get_object failed");
+ if (rthread.rank_set)
+ set_thread_rank(meta);
+
+ /* It can happen there are no CPUs defined if there is another
+ * process in the loom that defines them. */
+ if (rthread.cpus)
+ set_thread_cpus(meta);
+
/* Mark it finished so we can detect partial streams */
if (json_object_dotset_number(meta, "ovni.finished", 1) != 0)
die("json_object_dotset_string failed");
@@ -606,8 +636,14 @@ ovni_thread_free(void)
close(rthread.streamfd);
rthread.streamfd = -1;
- rthread.ready = 0;
+ if (rproc.move_to_final) {
+ /* The dir rthread.thdir_final must exist in the FS */
+ move_thdir_to_final(rthread.thdir, rthread.thdir_final);
+ try_clean_dir(rthread.thdir);
+ }
+
rthread.finished = 1;
+ rthread.ready = 0;
}
int
@@ -734,8 +770,8 @@ ovni_flush(void)
if (!rthread.ready)
die("thread is not initialized");
- if (!rproc.ready)
- die("process is not initialized");
+ if (atomic_load(&rproc.st) != ST_READY)
+ die("process not ready");
ovni_ev_set_clock(&pre, ovni_clock_now());
ovni_ev_set_mcv(&pre, "OF[");
diff --git a/test/emu/common/instr.h b/test/emu/common/instr.h
index 9ac5114..12061f1 100644
--- a/test/emu/common/instr.h
+++ b/test/emu/common/instr.h
@@ -102,9 +102,11 @@ instr_start(int rank, int nranks)
ovni_version_check();
ovni_proc_init(1, rankname, getpid());
- ovni_proc_set_rank(rank, nranks);
ovni_thread_init(get_tid());
+ if (nranks > 0)
+ ovni_proc_set_rank(rank, nranks);
+
/* All ranks inform CPUs */
for (int i = 0; i < nranks; i++)
ovni_add_cpu(i, i);
diff --git a/test/emu/nosv/multiple-segment.c b/test/emu/nosv/multiple-segment.c
index 3285fcb..8e2aee9 100644
--- a/test/emu/nosv/multiple-segment.c
+++ b/test/emu/nosv/multiple-segment.c
@@ -27,11 +27,12 @@ main(void)
int app = rank / N;
char loom[128];
- if (snprintf(loom, 128, "loom.%04d", app) >= 128)
+ if (snprintf(loom, 128, "node0.%04d", app) >= 128)
die("snprintf failed");
ovni_proc_init(1 + app, loom, getpid());
ovni_proc_set_rank(rank, nranks);
+ ovni_thread_init(get_tid());
/* Leader of the segment, must emit CPUs */
if (rank % N == 0) {
@@ -39,13 +40,13 @@ main(void)
for (int i = 0; i < N; i++) {
cpus[i] = app * N + i;
ovni_add_cpu(i, cpus[i]);
+ info("adding cpu %d to rank %d", i, rank);
}
}
int nlooms = nranks / N;
int lcpu = rank % N;
- ovni_thread_init(get_tid());
instr_require("ovni");
instr_nosv_init();
instr_thread_execute(lcpu, -1, 0);
diff --git a/test/emu/ovni/CMakeLists.txt b/test/emu/ovni/CMakeLists.txt
index e011ce2..0bd5b7b 100644
--- a/test/emu/ovni/CMakeLists.txt
+++ b/test/emu/ovni/CMakeLists.txt
@@ -11,6 +11,7 @@ test_emu(sort-first-and-full-ring.c SORT
SHOULD_FAIL REGEX "cannot find a event previous to clock")
test_emu(burst-stats.c REGEX "burst stats: median/avg/max = 33/ 33/ 33 ns")
test_emu(mp-simple.c MP)
+test_emu(merge-cpus-loom.c MP)
test_emu(version-good.c)
test_emu(version-bad.c SHOULD_FAIL REGEX "incompatible .* version")
test_emu(clockgate.c MP SHOULD_FAIL REGEX "detected large clock gate")
@@ -18,11 +19,11 @@ test_emu(no-cpus.c SHOULD_FAIL REGEX "loom .* has no physical CPUs")
test_emu(sort-cpus-by-loom.c MP)
test_emu(sort-cpus-by-rank.c MP)
test_emu(tracedir-subdir.c MP DRIVER "tracedir-subdir.driver.sh")
-test_emu(empty-stream.c SHOULD_FAIL REGEX "model_ovni_finish: thread .* is not dead")
+test_emu(empty-stream.c SHOULD_FAIL REGEX "missing ovni.finished")
test_emu(require-bad-version.c SHOULD_FAIL REGEX "unsupported ovni model version (want 666.66.6, have .*)")
-test_emu(require-compat.c REGEX "loading trace in compatibility mode")
+test_emu(require-compat.c)
test_emu(require-repeated.c)
-test_emu(thread-crash.c SHOULD_FAIL REGEX "incomplete stream")
+test_emu(thread-crash.c SHOULD_FAIL REGEX "missing ovni.finished")
test_emu(thread-free-isready.c)
test_emu(flush-tmpdir.c MP DRIVER "flush-tmpdir.driver.sh")
test_emu(tmpdir-metadata.c MP DRIVER "tmpdir-metadata.driver.sh")
diff --git a/test/emu/ovni/clockgate.c b/test/emu/ovni/clockgate.c
index 66ed98e..24c4152 100644
--- a/test/emu/ovni/clockgate.c
+++ b/test/emu/ovni/clockgate.c
@@ -37,8 +37,8 @@ start_delayed(int rank, int nranks)
ovni_version_check();
ovni_proc_init(1, rankname, getpid());
- ovni_proc_set_rank(rank, nranks);
ovni_thread_init(get_tid());
+ ovni_proc_set_rank(rank, nranks);
instr_require("ovni");
/* All ranks inform CPUs */
diff --git a/test/emu/ovni/merge-cpus-loom.c b/test/emu/ovni/merge-cpus-loom.c
new file mode 100644
index 0000000..7d6f950
--- /dev/null
+++ b/test/emu/ovni/merge-cpus-loom.c
@@ -0,0 +1,52 @@
+/* Copyright (c) 2024 Barcelona Supercomputing Center (BSC)
+ * SPDX-License-Identifier: GPL-3.0-or-later */
+
+#include
+#include "compat.h"
+#include "instr.h"
+
+/* Ensure we can emit CPUs from multiple threads of the same loom */
+
+static inline void
+start(int rank, int nranks)
+{
+ char hostname[OVNI_MAX_HOSTNAME];
+
+ if (gethostname(hostname, OVNI_MAX_HOSTNAME) != 0)
+ die("gethostname failed");
+
+ ovni_version_check();
+
+ /* Only one loom */
+ ovni_proc_init(1, hostname, getpid());
+ ovni_thread_init(get_tid());
+ ovni_proc_set_rank(rank, nranks);
+
+ /* Only emit a subset of CPUs up to the rank number */
+ for (int i = 0; i <= rank; i++)
+ ovni_add_cpu(i, i);
+
+ int curcpu = rank;
+
+ dbg("thread %d has cpu %d (ncpus=%d)",
+ get_tid(), curcpu, nranks);
+
+ instr_require("ovni");
+ instr_thread_execute(curcpu, -1, 0);
+}
+
+
+int
+main(void)
+{
+ int rank = atoi(getenv("OVNI_RANK"));
+ int nranks = atoi(getenv("OVNI_NRANKS"));
+
+ start(rank, nranks);
+
+ sleep_us(50 * 1000);
+
+ instr_end();
+
+ return 0;
+}
diff --git a/test/emu/ovni/sort-cpus-by-loom.c b/test/emu/ovni/sort-cpus-by-loom.c
index 7fe4f3f..c143a57 100644
--- a/test/emu/ovni/sort-cpus-by-loom.c
+++ b/test/emu/ovni/sort-cpus-by-loom.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2023-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include
@@ -35,6 +35,7 @@ main(void)
for (int i = 0; i < N; i++)
ovni_add_cpu(i, cpus[i]);
+ instr_require("ovni");
instr_thread_execute(-1, -1, 0);
instr_end();
diff --git a/test/emu/ovni/sort-cpus-by-rank.c b/test/emu/ovni/sort-cpus-by-rank.c
index b919cb7..ecb036e 100644
--- a/test/emu/ovni/sort-cpus-by-rank.c
+++ b/test/emu/ovni/sort-cpus-by-rank.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2023-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include
@@ -36,6 +36,7 @@ main(void)
for (int i = 0; i < N; i++)
ovni_add_cpu(i, cpus[i]);
+ instr_require("ovni");
instr_thread_execute(-1, -1, 0);
instr_end();
diff --git a/test/emu/ovni/tmpdir-metadata.driver.sh b/test/emu/ovni/tmpdir-metadata.driver.sh
index 9c7a8f2..3c855a7 100644
--- a/test/emu/ovni/tmpdir-metadata.driver.sh
+++ b/test/emu/ovni/tmpdir-metadata.driver.sh
@@ -8,9 +8,9 @@ test_files() {
test -e "$dst"
test -e "$dst/loom.node.1"
test -e "$dst/loom.node.1/proc.123"
- test -e "$dst/loom.node.1/proc.123/metadata.json"
- test -e "$dst/loom.node.1/proc.123/thread.123.json"
- test -e "$dst/loom.node.1/proc.123/thread.123.obs"
+ test -e "$dst/loom.node.1/proc.123/thread.123"
+ test -e "$dst/loom.node.1/proc.123/thread.123/stream.json"
+ test -e "$dst/loom.node.1/proc.123/thread.123/stream.obs"
}
test_no_files() {
@@ -18,9 +18,9 @@ test_no_files() {
test '!' -e "$dst"
test '!' -e "$dst/loom.node.1"
test '!' -e "$dst/loom.node.1/proc.123"
- test '!' -e "$dst/loom.node.1/proc.123/metadata.json"
- test '!' -e "$dst/loom.node.1/proc.123/thread.123.json"
- test '!' -e "$dst/loom.node.1/proc.123/thread.123.obs"
+ test '!' -e "$dst/loom.node.1/proc.123/thread.123"
+ test '!' -e "$dst/loom.node.1/proc.123/thread.123/stream.json"
+ test '!' -e "$dst/loom.node.1/proc.123/thread.123/stream.obs"
}
# Test setting OVNI_TMPDIR
diff --git a/test/rt/nanos6/spawn-task-external.c b/test/rt/nanos6/spawn-task-external.c
index 5710148..c9b7f0a 100644
--- a/test/rt/nanos6/spawn-task-external.c
+++ b/test/rt/nanos6/spawn-task-external.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
/* Spawn a task from an external thread that calls some nanos6
@@ -67,7 +67,7 @@ instr_thread_start(int32_t cpu, int32_t creator_tid, uint64_t tag)
ovni_payload_add(&ev, (uint8_t *) &tag, sizeof(tag));
ovni_ev_emit(&ev);
- /* Flush the events to disk before killing the thread */
+ /* Flush the events to disk after creating the thread */
ovni_flush();
}
@@ -82,6 +82,9 @@ instr_thread_end(void)
/* Flush the events to disk before killing the thread */
ovni_flush();
+
+ /* Finish the thread */
+ ovni_thread_free();
}
/* Call the nanos6_spawn_function from an external thread */
diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
index 8bbe3f2..efdbd6a 100644
--- a/test/unit/CMakeLists.txt
+++ b/test/unit/CMakeLists.txt
@@ -17,7 +17,6 @@ unit_test(mux.c)
unit_test(prv.c)
unit_test(stream.c)
unit_test(task.c)
-unit_test(thread.c)
unit_test(value.c)
unit_test(version.c)
unit_test(path.c)
diff --git a/test/unit/cpu.c b/test/unit/cpu.c
index b39b12b..9de24fc 100644
--- a/test/unit/cpu.c
+++ b/test/unit/cpu.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include "emu/cpu.h"
@@ -27,18 +27,16 @@ test_oversubscription(void)
struct proc proc;
- if (proc_init_begin(&proc, "loom.0/proc.0") != 0)
+ if (proc_init_begin(&proc, 1) != 0)
die("proc_init_begin failed");
+ proc.appid = 1;
+
proc_set_gindex(&proc, 0);
- /* FIXME: We shouldn't need to recreate a full process to test the CPU
- * affinity rules */
- proc.metadata_loaded = 1;
-
struct thread th0, th1;
- if (thread_init_begin(&th0, "loom.0/proc.0/thread.0.obs") != 0)
+ if (thread_init_begin(&th0, 1) != 0)
die("thread_init_begin failed");
thread_set_gindex(&th0, 0);
@@ -47,7 +45,7 @@ test_oversubscription(void)
if (thread_init_end(&th0) != 0)
die("thread_init_end failed");
- if (thread_init_begin(&th1, "loom.1/proc.1/thread.1.obs") != 0)
+ if (thread_init_begin(&th1, 2) != 0)
die("thread_init_begin failed");
thread_set_gindex(&th1, 1);
diff --git a/test/unit/loom.c b/test/unit/loom.c
index 6283f90..d2fa8cc 100644
--- a/test/unit/loom.c
+++ b/test/unit/loom.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include "emu/loom.h"
@@ -8,19 +8,17 @@
#include "emu/proc.h"
#include "unittest.h"
-char testloom[] = "loom.0";
-char testproc[] = "loom.0/proc.1";
+char testloom[] = "node1";
+int testproc = 1;
static void
test_bad_name(struct loom *loom)
{
- ERR(loom_init_begin(loom, "blah"));
ERR(loom_init_begin(loom, "loom/blah"));
- ERR(loom_init_begin(loom, "loom.123/testloom"));
- ERR(loom_init_begin(loom, "loom.123/"));
ERR(loom_init_begin(loom, "/loom.123"));
ERR(loom_init_begin(loom, "./loom.123"));
OK(loom_init_begin(loom, "loom.123"));
+ OK(loom_init_begin(loom, "foo"));
err("ok");
}
@@ -28,7 +26,7 @@ test_bad_name(struct loom *loom)
static void
test_hostname(struct loom *loom)
{
- OK(loom_init_begin(loom, "loom.node1.blah"));
+ OK(loom_init_begin(loom, "node1.blah"));
if (strcmp(loom->hostname, "node1") != 0)
die("wrong hostname: %s", loom->hostname);
@@ -67,7 +65,6 @@ test_duplicate_procs(struct loom *loom)
struct proc proc;
OK(loom_init_begin(loom, testloom));
OK(proc_init_begin(&proc, testproc));
- proc.metadata_loaded = 1;
OK(loom_add_proc(loom, &proc));
ERR(loom_add_proc(loom, &proc));
diff --git a/test/unit/stream.c b/test/unit/stream.c
index 53b8079..362e2d5 100644
--- a/test/unit/stream.c
+++ b/test/unit/stream.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
+/* Copyright (c) 2021-2024 Barcelona Supercomputing Center (BSC)
* SPDX-License-Identifier: GPL-3.0-or-later */
#include
@@ -11,10 +11,27 @@
#include "ovni.h"
#include "unittest.h"
+static void
+write_dummy_json(const char *path)
+{
+ const char *json = "{ \"version\" : 3 }";
+ FILE *f = fopen(path, "w");
+
+ if (f == NULL)
+ die("fopen json failed:");
+
+ if (fwrite(json, strlen(json), 1, f) != 1)
+ die("fwrite json failed:");
+
+ fclose(f);
+}
+
static void
test_ok(void)
{
- const char *fname = "stream-ok.obs";
+ OK(mkdir("ok", 0755));
+
+ const char *fname = "ok/stream.obs";
FILE *f = fopen(fname, "w");
if (f == NULL)
@@ -30,8 +47,10 @@ test_ok(void)
fclose(f);
+ write_dummy_json("ok/stream.json");
+
struct stream stream;
- OK(stream_load(&stream, ".", fname));
+ OK(stream_load(&stream, ".", "ok"));
if (stream.active)
die("stream is active");
@@ -42,7 +61,9 @@ test_ok(void)
static void
test_bad(void)
{
- const char *fname = "stream-bad.obs";
+ OK(mkdir("bad", 0755));
+
+ const char *fname = "bad/stream.obs";
FILE *f = fopen(fname, "w");
if (f == NULL)
@@ -58,8 +79,10 @@ test_bad(void)
fclose(f);
+ write_dummy_json("bad/stream.json");
+
struct stream stream;
- ERR(stream_load(&stream, ".", fname));
+ ERR(stream_load(&stream, ".", "bad"));
err("OK");
}
diff --git a/test/unit/thread.c b/test/unit/thread.c
deleted file mode 100644
index 05fe8ef..0000000
--- a/test/unit/thread.c
+++ /dev/null
@@ -1,35 +0,0 @@
-/* Copyright (c) 2021-2023 Barcelona Supercomputing Center (BSC)
- * SPDX-License-Identifier: GPL-3.0-or-later */
-
-#include "common.h"
-#include "emu/thread.h"
-#include "unittest.h"
-
-/* Ensure we can load the old trace format */
-static void
-test_old_trace(void)
-{
- struct thread th;
-
- OK(thread_init_begin(&th, "loom.0/proc.0/thread.1.obs"));
- if (th.tid != 1)
- die("wrong tid");
-
- OK(thread_init_begin(&th, "loom.0/proc.0/thread.2"));
- if (th.tid != 2)
- die("wrong tid");
-
- ERR(thread_init_begin(&th, "loom.0/proc.0/thread.kk"));
- ERR(thread_init_begin(&th, "loom.0/proc.0/thread."));
- ERR(thread_init_begin(&th, "loom.0/proc.0/thread"));
- ERR(thread_init_begin(&th, "thread.prv"));
-
- err("ok");
-}
-
-int main(void)
-{
- test_old_trace();
-
- return 0;
-}