Convert documentation to Markdown and mkdocs

2022-08-29 16:24:29 +02:00 · 2022-08-29 16:24:29 +02:00 · a1b668a872
commit a1b668a872
parent 6141c2e303
16 changed files with 330 additions and 281 deletions
--- a/doc/emu_chan.txt
+++ b/doc/emu_chan.txt
@ -1,157 +0,0 @@
--- Channels ---
-
-As the emulation progresses, information is written in the PRV trace to record
-the new states. The emulator has specific mechanism to handle the output of new
-states in the PRV trace via channels. A channel stores an integer that
-represents an state at a given point in time and corresponds to the value that
-will be observed in the Paraver timeline.
-
-  NOTE: In general, the emulator receives events, then performs a state
-  transition and the new state (or states) are written into the PRV file.
-
-There are two classes of channels: CPU and thread channels. Both CPU and threads
-have the same fixed number of channels, given by the enumeration `enum chan`.
-
-For example the CHAN_OVNI_STATE of the thread stores the execution state of the
-thread (running, paused ...). Whereas, the CPU channel CHAN_OVNI_NRTHREADS
-records how many running threads a given CPU has.
-
-The channels are used in the following way:
-
-1) In the "pre" phase, the emulator modifies the state of the emulator based on
-the new event. The channels are then updated accordingly in this phase, for
-example when a thread goes from running to paused it must update the
-CHAN_OVNI_STATE channel of the thread by also the CHAN_OVNI_NRTHREADS channel of
-the CPU.
-
-2) In the "emit" phase, the emulator calls the chan_emit() method on those channels
-that have been modified. Those have the dirty attribute set to 1.
-
-3) The optional "post" phase is used to perform some operations before the next
-event is loaded, but is not commonly used.
-
-Then the emulator then loads the next event and repeats the process again.
-
-- Disabling and enabling channels --------------------------------------------
-
-Some channels provide information that only makes sense in some conditions. For
-example, the CPU channel CHAN_OVNI_TID tracks the TID of the thread currently
-running in the CPU. When there is no thread running or there are multiple
-threads running in the same CPU, this channel cannot output valid information.
-
-For those cases, the channels can be enabled or disabled as to only provide
-information when it is necessary. When a channel is disabled, it will emit the
-value stored in `badst` which by default is set to 0.
-
-Notice that if a channel was in a given state A, and was disabled, it must emit
-the new state is 0. When the channel is enabled again, it will emit again the
-state A.
-
-- Thread tracking channels ----------------------------------------------------------
-
-Regarding thread channels, there are two common conditions that cause the
-channels to become disabled. When the thread is no longer running, and then the
-thread is not active.
-
-For those cases, the thread channels can be configured to automatically be
-enabled or disabled, following the execution state of the thread. The tracking
-mode specifies how the tracking must be done:
-
- CHAN_TRACK_NONE: nothing to track
- CHAN_TRACK_RUNNING_TH: enable the channel only if the thread is running
- CHAN_TRACK_ACTIVE_TH: enable the channel only if the thread is running,
-  cooling or warming.
-
-This mechanism removes the complexity of detecting when a thread stops running,
-to update a channel of a given module. As the thread state changes as handled by
-the emu_ovni.c module only.
-
-- CPU tracking channels ------------------------------------------------------
-
-Similarly, CPU channels can also be configured to track the execution state of
-the threads. They become disabled when the tracking condition is not met, but
-also copy the state of the tracking thread channel.
-
-They share the same tracking modes, but their behavior is slightly different:
-
-In the case of tracking the running thread, if the CPU has more than one thread
-running, the channel will always output the error state ST_TOO_MANY_TH.
-
-If is has no threads running, will be disabled and emit a 0 state by default.
-
-Otherwise, it will emit the same value as the running thread. If the thread
-channel is disabled, it will emit a ST_BAD error state.
-
-Regarding the active thread tracking mode, the CPU channels behave similarly,
-but with the active threads instead of running ones.
-
-The CPU tracking mechanism simplify the process of updating CPU channels, as
-the modules don't need to worry about the execution model. Only the channels
-need to be configured to follow the proper execution state.
-
-- Channel state modes --------------------------------------------------------
-
-The channels can be updated in three ways:
-
-1) A fixed state can be set to the channel using chan_set(), which overrides the
-previous state.
-
-2) The new state can be stored in a stack with chan_push() and chan_pop(), to
-remember the history of the previous states. The emitted event will be the one
-on the top.
-
-3) Using a punctual event.
-
-Setting the channel state is commonly used to track quantities such as the
-number of threads running per CPU. While the stack mode is commonly used to
-track functions or sections of code delimited with enter and exit events, which
-can call an return to the previous state.
-
-An example program may be instrumented like this:
-
-	int bar() {
-		instr("Xb[");
-		...
-		instr("Xb]");
-	}
-
-	int foo() {
-		instr("Xf[");
-		bar();
-		instr("Xf]");
-	}
-
-Then, in the emulator, when processing the events "Xf[" and "Xf]", we could track
-of the state as follows:
-
-	int hook_pre_foo(struct ovni_chan *chan, int value) {
-		switch(value) {
-			case '[': chan_push(chan, 2); break;
-			case ']': chan_pop(chan, 2); break;
-			default: break;
-		}
-	}
-
-	int hook_pre_bar(struct ovni_chan *chan, int value) {
-		switch(value) {
-			case '[': chan_push(chan, 1); break;
-			case ']': chan_pop(chan, 1); break;
-			default: break;
-		}
-	}
-
-The channel will emit the following sequence of states: 0, 1, 2, 1, 0.
-
-Notice that the chan_pop() function uses the same state being pop()'ed as
-argument. The function checks that the stack contains the expected state,
-forcing the emulator to always receive a matching pair of enter and exit events.
-
-- Punctual events ------------------------------------------------------------
-
-There are some conditions that are better mapped to events rather than to state
-transitions. For those cases, the channels provide punctual events which are
-emitted as a state than only has 1 ns of duration.
-
-When a channel is configured to emit a punctual event with chan_ev(), it will
-first output the new state at the current time minus 1 ns, then restore the
-previous channel state and emit it at the current time.
--- a/doc/emu_nanos6.txt
+++ b/doc/emu_nanos6.txt
@ -1,63 +0,0 @@
--- Nanos6 Emulation ---
-
-The Nanos6 emulator generates four different Paraver views, which are explained
-in this document.
-
--- Task id ---
-
-The task id view represents the id of the Nanos6 task instance that is currently
-executing on each thread/cpu. This id is a monotonically increasing identifier
-assigned on task creation. Lower ids correspond to tasks created at an earlier
-point than higher ids.
-
--- Task type ---
-
-Every task in Nanos6 contains a task type, which roughly corresponds to the
-actual location in the code a task was declared. For example if a function fn()
-is declared as a Nanos6 task, and it is called multiple times in a program,
-every created task will have a different id, but the same type.
-
-In the view, each type is shown with a label declared in the source with the
-label() attribute of the task. If no label was specified, one is automatically
-generated for each type.
-
-Note that in this view, event value is a hash function of the type label, so two
-distinct types (tasks declared in different parts of the code) with the same
-label will share event value and will hence be indistinguishable.
-
--- MPI Rank ---
-
-Represents the current MPI rank for the currently running task in a thread or cpu.
-
--- Subsystem ---
-
-Represents the internal Nanos6 subsystem each thread or cpu is currently
-running. Here is a summary of each possible value with its meaning:
-
- - Null or black (value 0): Either the thread is idle (blocked because there is
-   no work) or the current subsystem is not instrumented
- - "Scheduler: Waiting for tasks": Actively waiting for tasks inside the scheduler
-   subsystem, registered but not holding the scheduler lock
- - "Scheduler: Serving tasks": Inside the scheduler lock, serving tasks to other
-   threads
- - "Scheduler: Adding ready tasks": Adding tasks to the scheduler queues, but outside of
-   the scheduler lock
- - "Task: Running": Executing user task code
- - "Task: Spawning function": Registering a new spawn function (programmatically
-   created task)
- - "Task: Creating": Creating a new task, through nanos6_create_task
- - "Task: Submitting": Submitting a recently created task, through
-   nanos6_submit_task
- - "Dependency: Registering": Registering a task's dependencies
- - "Dependency: Unregistering": Releasing a task's dependencies because it has
-   ended
- - "Blocking: Taskwait": Task is blocked while inside a taskwait
- - "Blocking: Blocking current task": Task is blocked through the Nanos6
-   blocking API
- - "Blocking: Unblocking remote task": Unblocking a different task using the
-   Nanos6 blocking API
- - "Blocking: Wait For": Blocking a deadline task, which will be re-enqueued
-   when a certain amount of time has passed
- - "Threading: Attached as external thread": External/Leader thread (which has
-   registered to Nanos6) is running
-
--- a/doc/emulation/channels.md
+++ b/doc/emulation/channels.md
@ -0,0 +1,170 @@
+# Channels
+
+As the emulation progresses, information is written in the PRV trace to
+record the new states. The emulator has a specific mechanism to handle
+the output of new states in the PRV trace via channels. A channel stores
+an integer that represents a state at a given point in time and
+corresponds to the value that will be observed in the Paraver timeline.
+
+!!! Note
+
+	In general, the emulator receives events, then performs a state
+	transition and the new state (or states) are written into the
+	PRV file.
+
+There are two classes of channels: CPU and thread channels. Both CPU and
+threads have the same fixed number of channels, given by the enumeration
+`enum chan`.
+
+For example, the thread channel `CHAN_OVNI_STATE` stores the execution
+state of the thread (running, paused ...), whereas the CPU channel
+`CHAN_OVNI_NRTHREADS` records how many running threads a given CPU has.
+
+The channels are used in the following way:
+
+- In the "pre" phase, the emulator modifies its own state based on the
+  new event. The channels are then updated accordingly in this phase.
+  For example, when a thread goes from running to paused, it must update
+  the `CHAN_OVNI_STATE` channel of the thread but also the
+  `CHAN_OVNI_NRTHREADS` channel of the CPU.
+
+- In the "emit" phase, the emulator calls the `chan_emit()` method on
+  those channels that have been modified. Those have the dirty attribute
+  set to 1.
+
+- The optional "post" phase is used to perform some operations before
+  the next event is loaded, but is not commonly used.
+
+The emulator then loads the next event and repeats the
+process.
+
+## Disabling and enabling channels
+
+Some channels provide information that only makes sense in some
+conditions. For example, the CPU channel `CHAN_OVNI_TID` tracks the TID
+of the thread currently running in the CPU. When there is no thread
+running or there are multiple threads running in the same CPU, this
+channel cannot output valid information.
+
+For those cases, the channels can be enabled or disabled so as to only
+provide information when it is necessary. When a channel is disabled, it
+will emit the value stored in `badst`, which by default is set to 0.
+
+Notice that if a channel was in a given state A and was disabled, it
+must emit the new state 0. When the channel is enabled again, it will
+emit the state A again.
+
+## Thread tracking channels
+
+Regarding thread channels, there are two common conditions that cause
+the channels to become disabled: when the thread is no longer running,
+and when the thread is not active.
+
+For those cases, the thread channels can be configured to automatically
+be enabled or disabled, following the execution state of the thread. The
+tracking mode specifies how the tracking must be done:
+
+- `CHAN_TRACK_NONE`: nothing to track
+- `CHAN_TRACK_RUNNING_TH`: enable the channel only if the thread is
+  running
+- `CHAN_TRACK_ACTIVE_TH`: enable the channel only if the thread is
+  running, cooling or warming.
+
+This mechanism removes the complexity of detecting when a thread stops
+running in order to update a channel of a given module, as the thread
+state changes are handled by the `emu_ovni.c` module only.
+
+## CPU tracking channels
+
+Similarly, CPU channels can also be configured to track the execution
+state of the threads. They become disabled when the tracking condition
+is not met, but also copy the state of the tracking thread channel.
+
+They share the same tracking modes, but their behavior is slightly
+different:
+
+In the case of tracking the running thread, if the CPU has more than one
+thread running, the channel will always output the error state
+`ST_TOO_MANY_TH`.
+
+If it has no threads running, it will be disabled and emit a 0 state by
+default.
+
+Otherwise, it will emit the same value as the running thread. If the
+thread channel is disabled, it will emit an `ST_BAD` error state.
+
+Regarding the active thread tracking mode, the CPU channels behave
+similarly, but with the active threads instead of running ones.
+
+The CPU tracking mechanism simplifies the process of updating CPU
+channels, as the modules don't need to worry about the execution model.
+Only the channels need to be configured to follow the proper execution
+state.
+
+## Channel state modes
+
+The channels can be updated in three ways:
+
+1) A fixed state can be set to the channel using `chan_set()`, which
+overrides the previous state.
+
+2) The new state can be stored in a stack with `chan_push()` and
+`chan_pop()`, to remember the history of the previous states. The
+emitted state will be the one on the top.
+
+3) Using a punctual event.
+
+Setting the channel state is commonly used to track quantities such as
+the number of threads running per CPU, while the stack mode is commonly
+used to track functions or sections of code delimited with enter and
+exit events, which can nest and then return to the previous state.
+
+An example program may be instrumented like this:
+
+	int bar() {
+		instr("Xb[");
+		...
+		instr("Xb]");
+	}
+
+	int foo() {
+		instr("Xf[");
+		bar();
+		instr("Xf]");
+	}
+
+Then, in the emulator, when processing the events "Xf[", "Xf]", "Xb["
+and "Xb]", we could keep track of the state as follows:
+
+	int hook_pre_foo(struct ovni_chan *chan, int value) {
+		switch(value) {
+			case '[': chan_push(chan, 2); break;
+			case ']': chan_pop(chan, 2); break;
+			default: break;
+		}
+	}
+
+	int hook_pre_bar(struct ovni_chan *chan, int value) {
+		switch(value) {
+			case '[': chan_push(chan, 1); break;
+			case ']': chan_pop(chan, 1); break;
+			default: break;
+		}
+	}
+
+When foo() runs, the channel emits the sequence of states 0, 2, 1, 2, 0.
+
+Notice that the `chan_pop()` function takes the state being popped as
+an argument. The function checks that the stack contains the expected
+state, forcing the emulator to always receive a matching pair of enter
+and exit events.
+
+## Punctual events
+
+There are some conditions that are better mapped to events rather than
+to state transitions. For those cases, the channels provide punctual
+events, which are emitted as a state that lasts only 1 ns.
+
+When a channel is configured to emit a punctual event with `chan_ev()`,
+it will first output the new state at the current time minus 1 ns, then
+restore the previous channel state and emit it at the current time.
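
Note on doc/emulation/channels.md: the stack mode and the punctual events described in the new page can be illustrated with a small toy model. This is only a sketch under stated assumptions; `struct toy_chan`, `toy_push()`, `toy_pop()`, `toy_ev()` and `toy_replay()` are hypothetical names invented for this note and are not the emulator's `struct ovni_chan` or `chan_*()` functions. The timestamps and the punctual value 7 are made up as well.

```c
#include <assert.h>

#define TOY_MAX_DEPTH 16
#define TOY_MAX_LOG 32

/* One emitted record: a state and the time at which it becomes visible. */
struct toy_rec {
	long time;
	int state;
};

/* Hypothetical stand-in for a channel; this is NOT ovni's struct ovni_chan. */
struct toy_chan {
	int stack[TOY_MAX_DEPTH];
	int depth;
	struct toy_rec log[TOY_MAX_LOG];
	int nlog;
};

/* The visible state is the top of the stack, or 0 when the stack is empty. */
static int toy_top(const struct toy_chan *c)
{
	return c->depth > 0 ? c->stack[c->depth - 1] : 0;
}

/* Append one record to the log of emitted states. */
static void toy_emit(struct toy_chan *c, long time, int state)
{
	assert(c->nlog < TOY_MAX_LOG);
	c->log[c->nlog].time = time;
	c->log[c->nlog].state = state;
	c->nlog++;
}

/* Stack mode: entering a section pushes its state and emits it. */
static void toy_push(struct toy_chan *c, long time, int state)
{
	assert(c->depth < TOY_MAX_DEPTH);
	c->stack[c->depth++] = state;
	toy_emit(c, time, state);
}

/* Leaving a section names the state being removed, so that an
 * unbalanced enter/exit pair is detected, then emits the new top. */
static void toy_pop(struct toy_chan *c, long time, int expected)
{
	assert(c->depth > 0);
	assert(c->stack[c->depth - 1] == expected);
	c->depth--;
	toy_emit(c, time, toy_top(c));
}

/* Punctual event: emit the new state at time - 1, then emit the
 * previous (unchanged) channel state again at the current time. */
static void toy_ev(struct toy_chan *c, long time, int state)
{
	toy_emit(c, time - 1, state);
	toy_emit(c, time, toy_top(c));
}

/* Replay foo() calling bar(), with foo pushing 2 and bar pushing 1. */
void toy_replay(struct toy_chan *c)
{
	c->depth = 0;
	c->nlog = 0;
	toy_emit(c, 0, toy_top(c)); /* initial state 0 */
	toy_push(c, 10, 2);         /* Xf[ */
	toy_push(c, 20, 1);         /* Xb[ */
	toy_ev(c, 25, 7);           /* made-up punctual event inside bar() */
	toy_pop(c, 30, 1);          /* Xb] */
	toy_pop(c, 40, 2);          /* Xf] */
}
```

With foo pushing 2 and bar pushing 1, the log produced by `toy_replay()` holds the states 0, 2, 1, 7, 1, 2, 0; dropping the punctual pair (7 at time 24, then 1 again at time 25) leaves the plain stack sequence 0, 2, 1, 2, 0.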
--- a/doc/emulation/events.md
+++ b/doc/emulation/events.md
@ -1,9 +1,12 @@
+# Emulator events
+
 This file contains an exhaustive list of events supported by the emulator.

 - Punctual events don't produce a state transition.
 - All events refer to the current thread.
 - Descriptions must be kept short.

+```
 **********************************************************
 Please keep this list synchronized with the emulator code!
 **********************************************************
@ -168,3 +171,4 @@ KCI	Is back in the CPU due to a context switch

 6Bu	Begins to unblock the given task
 6BU	Ends unblocking the given task
+```
--- a/doc/emulation/index.md
+++ b/doc/emulation/index.md
@ -0,0 +1,26 @@
+# Emulation overview
+
+The emulator reads the events stored during runtime and reconstructs the
+execution, restoring the state of each thread and CPU as time evolves. During
+the emulation process, a detailed trace is generated with the state of the
+execution in the Paraver PRV format.
+
+The emulator has an execution model to represent the real execution that
+happened on the hardware. It consists of CPUs which can execute multiple threads
+at the same time.
+
+The emulator uses several models to identify how the resources are being
+used. The following diagram depicts the resource, process and task
+models.
+
+![Model](model.png)
+
+The resource model directly maps to the available hardware on the
+machine. It consists of clusters which contain nodes, where each node
+contains a set of CPUs that can execute instructions.
+
+The process model tracks the state of processes and threads. Processes
+that use the same CPUs in a single node are grouped into looms.
+
+The task model includes the information of MPI and tasks of the
+programming model (OmpSs-2).
--- a/doc/emulation/model.png
+++ b/doc/emulation/model.png
--- a/doc/emulation/nanos6.md
+++ b/doc/emulation/nanos6.md
@ -1,4 +1,4 @@
-# Nanos6 Emulation
+# Nanos6 model

 The Nanos6 emulator generates four different Paraver views, which are
 explained in this document.
@ -50,68 +50,77 @@ exiting each section), and one common section of code which is shared
 across the subsystems, U, of no interest. We also assume any other code
 not belonging to the runtime to be in the U section.

-Every instruction of the runtime belongs to *exactly one section*.
+!!! remark
+
+     Every instruction of the runtime belongs to *exactly one section*.

 To determine the state of a thread, we look into the stack to see what
 is the top-most instrumented section.

 At any given point in time, a thread may be executing code with a stack
-that spawns multiple sections, for example \[ S1, U, S2, S3, U \] (the
+that spans multiple sections, for example *S1, U, S2, S3* and *U* (the
 last is on top). The subsystem view selects the last subsystem section
-from the stack ignoring the common section U, and presents that section
-as the current state of the execution, in this case the S3.
+from the stack ignoring the common section *U*, and presents that section
+as the current state of the execution, in this case the *S3*.

-Additionally, the runtime sections are grouped together in systems,
+Additionally, the runtime sections are grouped together in subsystems,
 which form a group of closely related functions. A complete set of
-states for the subsystem view is listed below. The system is listed
-first and then the subsystem:
+states for each subsystem is listed below.

- **No subsystem**: There is no instrumented section in the stack of the
-thread.
+When there is no instrumented section in the thread stack, the state is
+set to **No subsystem**.

-The **Scheduler** system groups the actions that relate to the queueing
-and dequeueing of ready tasks. The subsystems are:
+### Task subsystem

- **Scheduler: Waiting for tasks**: Actively waiting for tasks inside the
-scheduler subsystem, registered but not holding the scheduler lock
+The **Task** subsystem contains the code that controls the lifecycle of
+tasks. It contains the following sections:

- **Scheduler: Serving tasks**: Inside the scheduler lock, serving tasks
-to other threads
+- **Running**: Executing the body of the task (user defined code).

- **Scheduler: Adding ready tasks**: Adding tasks to the scheduler queues,
-but outside of the scheduler lock.
-
-The **Task** system contains the code that controls the lifecycle of
-tasks.
-
- **Task: Running**: Executing the body of the task (user defined code).
-
- **Task: Spawning function**: Registering a new spawn function
+- **Spawning function**: Registering a new spawn function
 (programmatically created task)

- **Task: Creating**: Creating a new task, through `nanos6_create_task`
+- **Creating**: Creating a new task, through `nanos6_create_task`

- **Task: Submitting**: Submitting a recently created task, through
+- **Submitting**: Submitting a recently created task, through
 `nanos6_submit_task`

-The **Dependency** group only contains the dependency code:
+### Scheduler subsystem

- **Dependency: Registering**: Registering a task's dependencies
+The **Scheduler** subsystem groups the actions that relate to the queueing
+and dequeueing of ready tasks. It contains the following sections:

- **Dependency: Unregistering**: Releasing a task's dependencies because
+- **Waiting for tasks**: Actively waiting for tasks inside the
+scheduler subsystem, registered but not holding the scheduler lock
+
+- **Serving tasks**: Inside the scheduler lock, serving tasks
+to other threads
+
+- **Adding ready tasks**: Adding tasks to the scheduler queues,
+but outside of the scheduler lock.
+
+### Dependency subsystem
+
+The **Dependency** subsystem only contains the code that manages the
+registration of task dependencies. It contains the following sections:
+
+- **Registering**: Registering a task's dependencies
+
+- **Unregistering**: Releasing a task's dependencies because
 it has ended

- **Blocking: Taskwait**: Task is blocked while inside a taskwait
+### Blocking subsystem

- **Blocking: Blocking current task**: Task is blocked through the Nanos6
+The **Blocking** subsystem deals with the code that stops the thread
+execution. It contains the following sections:
+
+- **Taskwait**: Task is blocked while inside a taskwait
+
+- **Blocking current task**: Task is blocked through the Nanos6
 blocking API

- **Blocking: Unblocking remote task**: Unblocking a different task using
+- **Unblocking remote task**: Unblocking a different task using
 the Nanos6 blocking API

- **Blocking: Wait For**: Blocking a deadline task, which will be
+- **Wait For**: Blocking a deadline task, which will be
 re-enqueued when a certain amount of time has passed
-
- **Threading: Attached as external thread**: External/Leader thread
-(which has registered to Nanos6) is running
-
--- a/doc/nosv_type_labels.md
+++ b/doc/nosv_type_labels.md
@ -1,4 +1,6 @@
-# nOS-V task type colors
+# nOS-V model
+
+## nOS-V task type colors

 The color assigned to each nOS-V task type is computed from the task
 type label using a hash function; the task type id doesn't affect in any
--- a/doc/emulation/ovni-thread-model.png
+++ b/doc/emulation/ovni-thread-model.png
--- a/doc/emulation/ovni.md
+++ b/doc/emulation/ovni.md
@ -0,0 +1,5 @@
+# Ovni model
+
+The ovni model tracks the state of threads and CPUs.
+
+![Thread states](ovni-thread-model.png)
--- a/doc/index.md
+++ b/doc/index.md
@ -0,0 +1,29 @@
+![Ovni logo](logo2.png)
+
+This is the documentation of ovni, the Obtuse (but Versatile) Nanoscale
+Instrumentation project.
+
+!!! Note
+
+	Preferably write the name of the project as lowercase *ovni*
+	unless the grammar rules suggest otherwise, such as starting a
+	new sentence.
+
+The instrumentation process is split in two stages: [runtime](runtime)
+tracing and [emulation](emulation/).
+
+During runtime, very simple and short events are stored on disk which
+describe what is happening. Once the execution finishes, the events are
+read and processed to reproduce the execution during the emulation
+process, and the final execution trace is generated.
+
+By splitting the runtime and emulation processes we can perform
+expensive computations during the trace generation without disturbing
+the runtime process.
+
+Each event belongs to a model, which has a direct mapping to a target
+library or program. Each model is independent of other models, and they
+can be instrumented concurrently.
+
+The events are classified by using three identifiers known as *model*,
+*category* and *value* (or MCV for short).
--- a/doc/logo.png
+++ b/doc/logo.png
--- a/doc/logo2.png
+++ b/doc/logo2.png
--- a/doc/runtime/kernel.md
+++ b/doc/runtime/kernel.md
@ -1,4 +1,4 @@
--- Kernel support ---
+# Kernel support

 Currently, only context switch events are supported. The kernel events are
 usually written by the kernel into a buffer, without any action from user space.
@ -6,7 +6,7 @@ This behavior poses a problem, as the user space events and kernel events can
 leave a unsorted trace.

 The current workaround involves surounding the kernel events by two special ovni
-event markers OU[ and OU] which determine the region of events which must be
+event markers `OU[` and `OU]` which determine the region of events which must be
 sorted first. Notice that the events inside the region must be sorted!

 The `ovnisort` tool has been designed to sort the events enclosed by those
--- a/doc/runtime/tracing.md
+++ b/doc/runtime/tracing.md
@ -1,36 +1,38 @@
--- Using libovni ---
+# Tracing a program with ovni

 Read carefully this document before using libovni to generate a trace.

-- Mark the start and end of processes and threads ----------------------------
+## Mark the start and end of processes and threads

-Call ovni_proc_init() when a new program begins the execution.
+Call `ovni_proc_init()` when a new program begins the execution.

-Call ovni_thread_init() when a new thread begins the execution (including the
-main process thread). Call ovni_flush() and ovni_thread_free() when it finishes.
+Call `ovni_thread_init()` when a new thread begins the execution
+(including the main process thread). Call `ovni_flush()` and
+`ovni_thread_free()` when it finishes.

-Call ovni_proc_fini() when the program ends, after all threads have finished.
+Call `ovni_proc_fini()` when the program ends, after all threads have
+finished.

-You can use ovni_ev_emit() to record a new event. If you need more than 16 bytes
-of payload, use ovni_ev_jumbo_emit().
+You can use `ovni_ev_emit()` to record a new event. If you need more
+than 16 bytes of payload, use `ovni_ev_jumbo_emit()`.

-Compile and link with libovni. When you run your program, a new directory ovni
-will be created in the current directory ($PWD/ovni) which contains the
-execution trace.
+Compile and link with libovni. When you run your program, a new
+directory `ovni` will be created in the current directory (`$PWD/ovni`)
+which contains the execution trace.

-- Rules ----------------------------------------------------------------------
+## Rules

 Follow these rules to avoid losing events:

 1. No event may be emitted until the process is initialized with
-ovni_proc_init() and the thread with ovni_thread_init().
+`ovni_proc_init()` and the thread with `ovni_thread_init()`.

-2. When a thread ends the execution, it must call ovni_flush() to write the
+2. When a thread ends the execution, it must call `ovni_flush()` to write the
 events in the buffer to disk.

-3. All threads must have flushed its buffers before calling ovni_proc_fini().
+3. All threads must have flushed their buffers before calling `ovni_proc_fini()`.

-- Select a fast directory ----------------------------------------------------
+## Select a fast directory

 During the execution of your program, a per-thread buffer is kept where the new
 events are being recorded. When this buffer is full, it is written to disk and
@ -43,16 +45,17 @@ time. It is advisable to use the fastest filesystem available (see the tmpfs(5)
 and df(1) manual pages).

 You can select the trace directory where the buffers will be flushed during the
-execution by setting the environment variable OVNI_TMPDIR. The last directory
+execution by setting the environment variable `OVNI_TMPDIR`. The last directory
 will be created if doesn't exist. In that case, as soon as a process calls
-ovni_proc_fini(), the traces of all its threads will be moved to the final
-directory at $PWD/ovni. Example:
+`ovni_proc_fini()`, the traces of all its threads will be moved to the final
+directory at `$PWD/ovni`. Example:

-  OVNI_TMPDIR=$(mktemp -u /dev/shm/ovni.XXXXXX) srun ./your-app
+	OVNI_TMPDIR=$(mktemp -u /dev/shm/ovni.XXXXXX) srun ./your-app

 To test the different filesystem speeds, you can use hyperfine and dd. Take a
 closer look at the max time:

+```
 $ hyperfine 'dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10'
 Benchmark 1: dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10
  Time (mean ± σ):      71.7 ms ± 130.4 ms    [User: 0.8 ms, System: 10.2 ms]
@ -71,4 +74,5 @@ $ hyperfine 'dd if=/dev/zero of=/dev/shm/kk bs=2M count=10'
 Benchmark 1: dd if=/dev/zero of=/dev/shm/kk bs=2M count=10
  Time (mean ± σ):      11.4 ms ±   0.4 ms    [User: 0.5 ms, System: 11.1 ms]
  Range (min … max):     9.7 ms …  12.5 ms    269 runs
+```
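
Note on doc/runtime/tracing.md: the call order required by the three rules in that page can be modelled with a small checker. This is a sketch under stated assumptions, not libovni code; every `mock_*` name below is hypothetical and only mirrors the roles of `ovni_proc_init()`, `ovni_thread_init()`, `ovni_ev_emit()`, `ovni_flush()`, `ovni_thread_free()` and `ovni_proc_fini()` without reproducing their real signatures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; these are NOT libovni types or functions. */
struct mock_proc {
	bool ready; /* set once the process is initialized */
	int live;   /* threads initialized and not yet freed */
};

struct mock_thread {
	bool ready;   /* set once the thread is initialized */
	int buffered; /* events waiting in the per-thread buffer */
};

void mock_proc_init(struct mock_proc *p)
{
	p->ready = true;
	p->live = 0;
}

/* A thread can only be initialized inside an initialized process. */
bool mock_thread_init(struct mock_proc *p, struct mock_thread *t)
{
	if (!p->ready)
		return false;
	t->ready = true;
	t->buffered = 0;
	p->live++;
	return true;
}

/* Rule 1: no event before both the process and the thread are ready. */
bool mock_emit(const struct mock_proc *p, struct mock_thread *t)
{
	if (!p->ready || !t->ready)
		return false;
	t->buffered++;
	return true;
}

/* Writing the buffer to disk is modelled as emptying the counter. */
void mock_flush(struct mock_thread *t)
{
	t->buffered = 0;
}

/* Rule 2: a thread must flush before it ends. */
bool mock_thread_free(struct mock_proc *p, struct mock_thread *t)
{
	if (!t->ready || t->buffered > 0)
		return false;
	t->ready = false;
	p->live--;
	return true;
}

/* Rule 3: the process ends only after all its threads have finished. */
bool mock_proc_fini(struct mock_proc *p)
{
	if (!p->ready || p->live > 0)
		return false;
	p->ready = false;
	return true;
}
```

In this model, an emit before initialization, a thread that ends with buffered events, and a process that ends with live threads are all rejected, which corresponds to the situations in which the rules warn that events may be lost.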

--- a/mkdocs.yml
+++ b/mkdocs.yml
@ -0,0 +1,20 @@
+site_name: ovni
+theme: readthedocs
+docs_dir: doc
+markdown_extensions:
+  - admonition
+  - toc:
+      permalink: "#"
+      separator: "_"
+nav:
+  - index.md
+  - 'Runtime':
+    - runtime/tracing.md
+    - runtime/kernel.md
+  - 'Emulation':
+    - emulation/index.md
+    - emulation/ovni.md
+    - emulation/nosv.md
+    - emulation/nanos6.md
+    - emulation/events.md
+    - emulation/channels.md