# nOS-V model

The [nOS-V library][nosv] implements a user-space runtime that can schedule
tasks to run on multiple CPUs. The nOS-V library is instrumented to
track the internal state of the runtime and to emit information
about the tasks that are running.

[nosv]: https://github.com/bsc-pm/nos-v

## Task model

The nOS-V runtime is composed of tasks that can be scheduled to run in
threads. Tasks can be paused and resumed, leaving the CPUs free to
execute other tasks.

In nOS-V, parallel tasks can also be scheduled multiple times and the
same task may run concurrently on several CPUs. To model this scenario,
we introduce the concept of a *body*: each execution of the same task
is a separate body with a unique body id.

![Parallel tasks](fig/parallel-tasks.svg)

A normal task has only one body, while a parallel task (created with
`TASK_FLAG_PARALLEL`) can have more than one body. Each body holds its
own execution state and can transition between states following this
state diagram:

![Body model](fig/body-model.svg)

Bodies begin in the Created state and transition to Running when they
begin execution. Bodies that can be paused (created with the flag
`BODY_FLAG_PAUSE`) can transition to the Paused state.

Additionally, bodies can run multiple times if they are created with the
`BODY_FLAG_RESURRECT` flag, in which case they can transition from Dead
back to Running. This transition is required to model the tasks that
implement the taskiter in NODES, which are submitted multiple times for
execution reusing the same task id and body id. Every time a body runs
again, the iteration number is increased.

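
The body life cycle described in this section can be illustrated with a
small state machine. This is only a sketch written from the prose, not
the emulator's implementation: the `Body`, `BodyFlag` and `State` names
are hypothetical, the Running to Dead and Paused to Running transitions
are assumed from the text, and the starting iteration value of 1 is an
assumption.

```python
from enum import Enum, Flag, auto


class BodyFlag(Flag):
    """Hypothetical stand-ins for BODY_FLAG_PAUSE and BODY_FLAG_RESURRECT."""
    NONE = 0
    PAUSE = auto()
    RESURRECT = auto()


class State(Enum):
    CREATED = auto()
    RUNNING = auto()
    PAUSED = auto()
    DEAD = auto()


class Body:
    """One execution of a task, identified by (task_id, body_id)."""

    def __init__(self, task_id, body_id, flags=BodyFlag.NONE):
        self.task_id = task_id
        self.body_id = body_id
        self.flags = flags
        self.state = State.CREATED
        self.iteration = 1  # assumed starting value

    def _move(self, expected, target):
        # Reject any transition that does not start in the expected state.
        if self.state is not expected:
            raise ValueError(f"cannot go from {self.state.name} to {target.name}")
        self.state = target

    def execute(self):
        if self.state is State.DEAD:
            # Dead -> Running is only allowed for bodies that can resurrect.
            if BodyFlag.RESURRECT not in self.flags:
                raise ValueError("body was not created with RESURRECT")
            self.iteration += 1
            self.state = State.RUNNING
        else:
            self._move(State.CREATED, State.RUNNING)

    def pause(self):
        if BodyFlag.PAUSE not in self.flags:
            raise ValueError("body was not created with PAUSE")
        self._move(State.RUNNING, State.PAUSED)

    def resume(self):
        self._move(State.PAUSED, State.RUNNING)

    def end(self):
        self._move(State.RUNNING, State.DEAD)


# A taskiter-like body: it runs, dies and runs again with the same ids.
body = Body(task_id=1, body_id=1, flags=BodyFlag.RESURRECT)
body.execute()
body.end()
body.execute()
```

Rejecting invalid transitions with an exception mirrors the idea that a
trace containing, for example, a pause event for a body without the
pause flag does not fit the model.
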
## Task type colors

In the Paraver timeline, the color assigned to each nOS-V task type is
computed from the task type label using a hash function; the task type
id has no effect on how the color is assigned. This method provides two
desirable properties:

- Invariant type colors over time: the order in which task types are
  created doesn't affect their color.

- Deterministic colors among threads: task types with the same label end
  up mapped to the same color, even if they come from different threads
  located in different nodes.

For more details, see [this MR][1].

[1]: https://pm.bsc.es/gitlab/rarias/ovni/-/merge_requests/27

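
As a rough illustration of label-based coloring, the sketch below maps a
label to an index in a color palette. The function name `type_color`,
the palette size and the use of CRC32 are all assumptions made for the
example; the hash function actually used by the emulator is not stated
here.

```python
import zlib

PALETTE_SIZE = 32  # hypothetical number of colors available


def type_color(label, palette_size=PALETTE_SIZE):
    """Map a task type label to a palette index.

    Only the label is used; the task type id plays no role.
    """
    # CRC32 is a stand-in hash that gives the same value in every
    # process and on every node for the same input bytes.
    digest = zlib.crc32(label.encode("utf-8"))
    return digest % palette_size


# The same label always yields the same color, whatever the creation order.
first = [type_color(label) for label in ("compute", "halo")]
second = [type_color(label) for label in ("halo", "compute")]
```

A stable hash matters for this sketch: Python's built-in `hash()` for
strings is randomized per process by default, so it would not give the
same color in different processes.
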
## Subsystem view

The subsystem view provides a simplified view of what the nOS-V
runtime is doing over time. The view follows the same rules described in
the [subsystem view of Nanos6](../nanos6/#subsystem_view).

## Idle view

The idle view shows the progress state of the running threads, which is
either *Progressing* or *Resting*. Threads are shown as *Progressing*
while they are making useful progress and as *Resting* while they are
waiting for work. By definition, workers begin in the Progressing state
when they start running, and they transition to Resting in the following
situations:

- When workers are waiting in the delegation lock after some spins or
  when instructed to go to sleep.
- When the server is trying to serve tasks, but there are no more tasks
  available.

They will go back to Progressing as soon as they receive work. The
specific points at which they do so can be read in the [nOS-V source
code](https://gitlab.bsc.es/nos-v/nos-v) by looking at the
`instr_worker_resting()` and `instr_worker_progressing()` trace points.

This view is intended to detect parts of the execution during which the
workers don't have work, typically because the application doesn't have
enough parallelism or the scheduler is unable to serve work fast enough.

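
To make the two states concrete, the following sketch adds up the time a
single worker spends in Resting. The `(timestamp, state)` tuples and the
`resting_time` function are hypothetical and do not reflect the trace
format; the only rule taken from the text is that a worker starts in
Progressing.

```python
def resting_time(events, end_time):
    """Return the total time spent in the "resting" state.

    events: (timestamp, state) tuples sorted by timestamp, where state
    is "progressing" or "resting" (hypothetical representation).
    end_time: timestamp at which the worker stops being observed.
    """
    total = 0
    state = "progressing"  # workers begin in Progressing by definition
    last = None
    for timestamp, new_state in events:
        if state == "resting":
            total += timestamp - last
        state, last = new_state, timestamp
    # Account for a resting interval that is still open at the end.
    if state == "resting":
        total += end_time - last
    return total


events = [(10, "resting"), (25, "progressing"), (40, "resting")]
rested = resting_time(events, end_time=50)
```

A large resting total relative to the observed time suggests the
situations this view is meant to detect: too little parallelism or a
scheduler that cannot serve work fast enough.
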
## Breakdown view

The breakdown view displays a summary of what is happening in all CPUs
by combining the subsystem, idle and task type views in a single
timeline. Specifically, it shows how many CPUs are resting as defined by
the idle view, how many are inside a given task by showing the task type
label, and how many are in a particular subsystem of the runtime.

!!! Important

    You must specify *ovni.level = 3* or higher in *nosv.toml* and pass
    the *-b* option to ovniemu to generate the breakdown view.

Notice that the vertical axis shows the **number**
of CPUs in that state, not the physical CPUs as in other views.
Here is an example of the Heat mini-app:

![Breakdown example](fig/breakdown-nosv.png)

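
The counting behind the vertical axis can be sketched as follows. The
snapshot structure, the `breakdown` function and the state labels are
hypothetical and only illustrate the idea of counting CPUs per state
instead of drawing one row per CPU.

```python
from collections import Counter


def breakdown(snapshots):
    """Count how many CPUs are in each state at every timestamp.

    snapshots: list of (timestamp, {cpu_index: state_label}) pairs,
    where a label is "Resting", a task type label or a subsystem name
    (hypothetical representation).
    """
    return [(timestamp, Counter(states.values()))
            for timestamp, states in snapshots]


snapshots = [
    (0, {0: "Resting", 1: "Resting", 2: "Scheduler"}),
    (5, {0: "heat_block", 1: "heat_block", 2: "Resting"}),
]
rows = breakdown(snapshots)
```

At each timestamp the counts add up to the number of CPUs, which is why
the stacked timeline has a constant total height.
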