trace-v3 #7

Open
rarias wants to merge 36 commits from trace-v3 into master
5 changed files with 11 additions and 109 deletions
Showing only changes of commit d115ecad64


@ -13,9 +13,8 @@ The ovni project implements a fast instrumentation library that records
small events (starting at 12 bytes) during the execution of programs to
later investigate how the execution happened.
<!-- FIXME: Add an index for runtime -->
The instrumentation process is split in two stages:
[runtime](user/runtime/tracing.md)
[runtime](user/runtime/index.md)
tracing and [emulation](user/emulation/index.md).
During runtime, very short binary events are stored on disk which


@ -4,7 +4,7 @@ Ovni has a model to represent the hardware components as well as the software
concepts like threads or processes. Each concept is considered to be a *part*.
Here is an example diagram depicting the part hierarchy:
![lalala](part-model.svg "foo bar")
![Part model](part-model.svg)
Notice how a loom can restrict the CPUs of the node to its child processes.


@ -15,14 +15,12 @@ To initialize libovni follow these steps in all threads:
ovni function. It can be called multiple times from any thread, but only one
is required.
2. **Init the process**. Call `ovni_proc_init()` to initialize the process when
a new process begins the execution. It can only be called **once per
process** and it must be called before the thread is initialized.
2. **Init the process**. Call `ovni_proc_init()` to initialize the process. It
can only be called **once per process** and it must be called before the
thread is initialized.
3. **Init the thread**. Call `ovni_thread_init()` when a new thread begins the
execution (including the main process thread after the process is
initialized). Multiple attempts to initialize the thread are ignored with a
warning.
3. **Init the thread**. Call `ovni_thread_init()` to initialize the thread.
Multiple attempts to initialize the same thread are ignored with a warning.
The `ovni_proc_init()` arguments are as follows:
@ -32,8 +30,8 @@ void ovni_proc_init(int app, const char *loom, int pid);
The `app` defines the "appid" of the program, which must be a number >0. This is
useful to run multiple processes some of which run the same "app", so you can
tell which one is which. The `loom` defines the
[loom](../concepts/part-model.md#loom) name and assignes the process to that
tell which one is which. The `loom` argument defines the
[loom](../concepts/part-model.md#loom) name and maps the process to that
loom. It must be composed of the host name, a dot and a suffix. The PID is the
one obtained by `getpid(2)`.
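The loom naming rule (host name, a dot and a suffix) can be sketched as follows. This is only an illustration: `make_loom_name()` is a hypothetical helper, not part of libovni, and the choice of suffix is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libovni): writes "<hostname>.<suffix>"
 * into buf. Returns 0 on success, -1 if the name does not fit in buf. */
static int make_loom_name(char *buf, size_t size,
        const char *hostname, const char *suffix)
{
    assert(buf != NULL && hostname != NULL && suffix != NULL);

    /* snprintf returns the length the full string would need */
    int n = snprintf(buf, size, "%s.%s", hostname, suffix);

    if (n < 0 || (size_t) n >= size)
        return -1;

    return 0;
}
```

A caller would typically obtain the host name with `gethostname(2)`, build the name with this helper and pass the result as the `loom` argument of `ovni_proc_init()`, together with the PID returned by `getpid(2)`.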
@ -59,8 +57,8 @@ the thread stream.
## Start the execution
The current thread must switch to the "Running" state before any event can be
processed by the emulator. Do so by emitting a `OHx` event in the stream with
the appropriate payload:
processed by the emulator. Do so by emitting an [`OHx`
event](../emulation/events.md#OHx) in the stream with the appropriate payload:
```c
static void thread_execute(int32_t cpu, int32_t ctid, uint64_t tag)
```


@ -1,94 +0,0 @@
# Tracing a new program
Read this document carefully before using libovni to instrument a new
component. There are a few rules you must follow to ensure the runtime
trace is correct.
## Trace processes and threads
- Call `ovni_version_check()` once before calling any ovni function.
- Call `ovni_proc_init()` when a new process begins the execution.
- Call `ovni_thread_init()` when a new thread begins the execution
(including the main process thread).
- Call `ovni_thread_require()` with the required model version before
emitting events for that model.
- Call `ovni_flush()` and `ovni_thread_free()` when it finishes (in that
order).
- Call `ovni_proc_fini()` when a process ends, after all threads have
finished.
You can use `ovni_ev_emit()` to record a new event. If you need more
than 16 bytes of payload, use `ovni_ev_jumbo_emit()`. See the [trace
specification](trace_spec.md) for more details.
Compile and link with libovni. When you run your program, a new
directory named `ovni` will be created in the current directory
(`$PWD/ovni`), which contains the execution trace.
You can change the trace directory by defining the `OVNI_TRACEDIR`
environment variable. The envar accepts a trace directory name, a
relative path to the trace directory, or its absolute path. In the
first case, the trace directory will be created in the current
directory `$PWD`.
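As a rough illustration of the three accepted forms, the sketch below computes where the trace would be expected to end up. `resolve_tracedir()` is a hypothetical helper, not libovni code. It assumes that a relative path is also taken relative to `$PWD` and that an empty value behaves like an unset variable; neither assumption is confirmed by this document.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libovni code): computes the expected trace
 * directory from the current directory and the OVNI_TRACEDIR value.
 * Returns 0 on success, -1 if the path does not fit in buf. */
static int resolve_tracedir(char *buf, size_t size,
        const char *cwd, const char *tracedir)
{
    int n;

    assert(buf != NULL && cwd != NULL);

    if (tracedir == NULL || tracedir[0] == '\0') {
        /* Unset (or empty, by assumption): default to $PWD/ovni */
        n = snprintf(buf, size, "%s/ovni", cwd);
    } else if (tracedir[0] == '/') {
        /* Absolute path: used as given */
        n = snprintf(buf, size, "%s", tracedir);
    } else {
        /* Directory name or relative path: placed under $PWD */
        n = snprintf(buf, size, "%s/%s", cwd, tracedir);
    }

    if (n < 0 || (size_t) n >= size)
        return -1;

    return 0;
}
```

A program could call it with the result of `getcwd(3)` and `getenv("OVNI_TRACEDIR")` to print where to look for the trace after the run.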
## Rules
Follow these rules to avoid losing events:
1. No event may be emitted until the process is initialized with
`ovni_proc_init()` and the thread with `ovni_thread_init()`.
2. When a thread ends the execution, it must call `ovni_flush()` to write the
events in the buffer to disk.
3. All threads must have flushed their buffers before calling `ovni_proc_fini()`.
## Select a fast directory
During the execution of your program, a per-thread buffer is kept where the new
events are being recorded. When this buffer is full, it is written to disk and
emptied, an operation known as flush. This may take a while depending on the
underlying filesystem.
Keep in mind that the thread is blocked until the flush ends, so a slow
filesystem can interrupt the execution of your program for a long time. It is
advisable to use the fastest filesystem available (see the tmpfs(5) and df(1)
manual pages).
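The flush-when-full behaviour described above can be pictured with a minimal sketch. It is not libovni's implementation: `struct evbuf`, `evbuf_add()`, `evbuf_flush()` and the tiny `BUF_SIZE` are hypothetical names and values chosen only to show why the caller blocks while the buffer is written out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 64 /* deliberately tiny, illustrative value only */

/* Hypothetical per-thread event buffer */
struct evbuf {
    unsigned char data[BUF_SIZE];
    size_t used;     /* bytes currently stored */
    size_t nflushes; /* number of flushes performed */
    FILE *out;       /* destination stream on disk */
};

/* Writes the buffered bytes to disk and empties the buffer. The caller
 * is blocked for as long as the filesystem takes to accept the data. */
static int evbuf_flush(struct evbuf *b)
{
    if (b->used > 0 && fwrite(b->data, 1, b->used, b->out) != b->used)
        return -1;

    if (fflush(b->out) != 0)
        return -1;

    b->used = 0;
    b->nflushes++;
    return 0;
}

/* Appends one event, flushing first if it would not fit */
static int evbuf_add(struct evbuf *b, const void *ev, size_t len)
{
    assert(len <= BUF_SIZE);

    if (b->used + len > BUF_SIZE && evbuf_flush(b) != 0)
        return -1;

    memcpy(b->data + b->used, ev, len);
    b->used += len;
    return 0;
}
```

With 12-byte events and this 64-byte buffer, the sixth call to `evbuf_add()` triggers a flush, and that call does not return until the write completes, which is why the speed of the target filesystem matters.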
You can select the trace directory where the buffers will be flushed during the
execution by setting the environment variable `OVNI_TMPDIR`. The last directory
will be created if it doesn't exist. When `OVNI_TMPDIR` is set, as soon as a
process calls `ovni_proc_fini()`, the traces of all its threads will be moved to
the final directory at `$PWD/ovni`. Example:
OVNI_TMPDIR=$(mktemp -u /dev/shm/ovni.XXXXXX) srun ./your-app
To test the different filesystem speeds, you can use hyperfine and dd. Take a
closer look at the max time:
```
$ hyperfine 'dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10'
Benchmark 1: dd if=/dev/zero of=/gpfs/projects/bsc15/bsc15557/kk bs=2M count=10
Time (mean ± σ): 71.7 ms ± 130.4 ms [User: 0.8 ms, System: 10.2 ms]
Range (min … max): 14.7 ms … 1113.2 ms 162 runs
Warning: Statistical outliers were detected. Consider re-running this
benchmark on a quiet PC without any interferences from other programs. It
might help to use the '--warmup' or '--prepare' options.
$ hyperfine 'dd if=/dev/zero of=/tmp/kk bs=2M count=10'
Benchmark 1: dd if=/dev/zero of=/tmp/kk bs=2M count=10
Time (mean ± σ): 56.2 ms ± 5.7 ms [User: 0.6 ms, System: 14.8 ms]
Range (min … max): 45.8 ms … 77.8 ms 63 runs
$ hyperfine 'dd if=/dev/zero of=/dev/shm/kk bs=2M count=10'
Benchmark 1: dd if=/dev/zero of=/dev/shm/kk bs=2M count=10
Time (mean ± σ): 11.4 ms ± 0.4 ms [User: 0.5 ms, System: 11.1 ms]
Range (min … max): 9.7 ms … 12.5 ms 269 runs
```


@ -29,7 +29,6 @@ nav:
- user/concepts/trace-model.md
- 'Runtime':
- user/runtime/index.md
- user/runtime/tracing.md
- user/runtime/mark.md
- user/runtime/distributed.md
- user/runtime/kernel.md