Document the results and pp stages

Rodrigo Arias 2020-11-05 14:52:57 +01:00
parent 634d2040b5
commit 33682ef48d
3 changed files with 153 additions and 35 deletions


@ -4,7 +4,7 @@ TTYOPT=-rPO=4m -rLL=72m
#TTYOPT=-dpaper=a0 -rPO=4m -rLL=72m #TTYOPT=-dpaper=a0 -rPO=4m -rLL=72m
%.pdf: %.ms %.pdf: %.ms
REFER=ref.i groff -ms -t -p -R -Tpdf $^ > $@ REFER=ref.i groff -ms -dpaper=a4 -k -t -p -R -Tpdf $^ > $@
-killall -HUP mupdf -killall -HUP mupdf
%.utf8: %.ms %.utf8: %.ms


@ -1,15 +1,14 @@
.TL .TL
Garlic: the post-processing pipeline Garlic: the postprocess pipeline
.AU .AU
Rodrigo Arias Mallo Rodrigo Arias Mallo
.AI .AI
Barcelona Supercomputing Center Barcelona Supercomputing Center
.AB .AB
.LP .LP
In this document the stages posterior to the execution of the experiment This document covers the format used to store the results of the
are explained. We consider the post-processing pipeline the steps to go execution and the postprocess steps used to generate a set of
from the generated data from the experiment to a set of plots or tables figures from the results to present the data.
that present the data in a human readable form.
.AE .AE
.\"##################################################################### .\"#####################################################################
.nr GROWPS 3 .nr GROWPS 3
@ -20,6 +19,7 @@ that present the data in a human readable form.
.R1 .R1
bracket-label " [" ] ", " bracket-label " [" ] ", "
accumulate accumulate
move-punctuation
.R2 .R2
.\"##################################################################### .\"#####################################################################
.NH 1 .NH 1
@ -27,24 +27,64 @@ Introduction
.LP .LP
After the correct execution of an experiment some measurements are After the correct execution of an experiment some measurements are
recorded in the results for further investigation. Typically the time of recorded in the results for further investigation. Typically the time of
the execution is measured and presented later in a plot or a table. The the execution or other quantities are measured and presented later in a
steps to analyze the results and present them in a convenient way is figure (generally a plot or a table).
called the The
.I "post-processing pipeline" . .I "postprocess pipeline"
Similarly to the execution pipeline consists of all the steps required to create a set of figures from the
results. Similarly to the execution pipeline where several stages run
sequentially,
.[ .[
garlic execution garlic execution
.] .]
where several stages run sequentially, the the postprocess pipeline is also formed by multiple stages executed
post-processing pipeline is also formed by multiple stages executed in in order.
order.
.PP .PP
The rationale behind dividing execution and post-processing is The rationale behind dividing execution and postprocess is
that usually the experiments are costly to run (they take a long time to that usually the experiments are costly to run (they take a long time to
complete) while generating a plot is usually shorter. Refining the plots complete) while generating a figure requires less time. Refining the
multiple times reusing the same experimental results doesn't require the figures multiple times reusing the same experimental results doesn't
execution of the complete experiment, so the experimenter can try require the execution of the complete experiment, so the experimenter
multiple ways to present the data in a rapid cycle. can try multiple ways to present the data without a long wait.
.NH 1
Results
.LP
The results are generated on the same
.I "target"
machine where the experiment is executed and are stored in the garlic
.I output ,
organized into a directory structure following the experiment name, the
unit name and the run number (governed by the
.I control
stage):
.QS
.CW
|-- 6lp88vlj7m8hvvhpfz25p5mvvg7ycflb-experiment
| |-- 8lpmmfix52a8v7kfzkzih655awchl9f1-unit
| | |-- 1
| | | |-- stderr.log
| | | |-- stdout.log
| | | |-- ...
| | |-- 2
...
.QE
In order to provide easier access to the results, an index is also
created by taking the
.I expName
and
.I unitName
attributes (defined in the experiment configuration) and linking them to
the appropriate experiment and unit directories. These links are
overwritten by the last experiment with the same names so they are only
valid for the last execution. The output and index directories are
placed into a per-user directory, as we cannot guarantee the complete
execution of each unit when multiple users can share units.
.PP
The messages printed to the standard output and error are
stored in the log files with the same name inside each run
directory. Additional data is sometimes generated by the experiments,
and is found in each run directory. As the generated data can be very
large, it is ignored by default when considering the results.
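.PP
The layout above can be traversed with a few lines of code. The
following is a minimal sketch in Python, not part of garlic: the
function name list_runs is hypothetical, and it assumes only the
directory naming shown in the listing (directories ending in
-experiment and -unit, numeric run directories, and a stdout.log
file in each run).

```python
from pathlib import Path


def list_runs(output_dir):
    """Return (experiment, unit, run, stdout_path) tuples found under output_dir.

    Hypothetical helper: it relies only on the layout
    <hash>-experiment/<hash>-unit/<run>/stdout.log described in the text.
    """
    runs = []
    pattern = "*-experiment/*-unit/*/stdout.log"
    for stdout in sorted(Path(output_dir).glob(pattern)):
        run_dir = stdout.parent
        unit_dir = run_dir.parent
        exp_dir = unit_dir.parent
        # Run directories are named by the run number; skip anything else.
        if not run_dir.name.isdigit():
            continue
        runs.append((exp_dir.name, unit_dir.name, int(run_dir.name), stdout))
    return runs
```

Only the log files are listed, which matches the default of ignoring
any additional data left in the run directories.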
.NH 1 .NH 1
Fetching the results Fetching the results
.LP .LP
@ -57,8 +97,9 @@ program\[em]any changes in the program will cause nix to build it again
using the updated program. The results will also depend on the using the updated program. The results will also depend on the
execution pipeline, and the graph on the results. This chain of execution pipeline, and the graph on the results. This chain of
dependencies can be shown in the following dependency graph: dependencies can be shown in the following dependency graph:
.\"circlerad=0.22; arrowhead=7;
.PS .PS
circlerad=0.22;
linewid=0.35;
right right
circle "Prog" circle "Prog"
arrow arrow
@ -74,21 +115,23 @@ Ideally, the dependencies should be handled by nix, so it can detect any
change and rebuild the necessary parts automatically. Unfortunately, nix change and rebuild the necessary parts automatically. Unfortunately, nix
is not able to build the result as a derivation directly as it requires access is not able to build the result as a derivation directly as it requires access
to the to the
.I "target cluster" .I "target"
with several user accounts. In order to let several users reuse the same results from a cache, we machine with several user accounts. In order to let several users reuse
the same results from a cache, we
use the use the
.I "nix store" .I "nix store"
to make them available. To generate the results from the to make them available. To generate the results from the
experiment, we add some extra steps that must be executed manually. experiment, we add some extra steps that must be executed manually.
.PS .PS
right
circlerad=0.22; arrowhead=7;
circle "Prog" circle "Prog"
arrow arrow
diag=linewid + circlerad;
far=circlerad*3 + linewid*4
E: circle "EP" E: circle "EP"
RUN: circle "Run" at E + (0.8,-0.5) dashed R: circle "Result" at E + (far,0)
FETCH: circle "Fetch" at E + (1.6,-0.5) dashed RUN: circle "Run" at E + (diag,-diag) dashed
R: circle "Result" at E + (2.4,0) FETCH: circle "Fetch" at R + (-diag,-diag) dashed
move to R.e
arrow arrow
P: circle "PP" P: circle "PP"
arrow arrow
@ -101,13 +144,13 @@ arrow from E to R chop
The run and fetch steps are provided by the helper tool The run and fetch steps are provided by the helper tool
.I "garlic(1)" , .I "garlic(1)" ,
which launches the experiment using the user credentials at the which launches the experiment using the user credentials at the
.I "target cluster" .I "target"
and then fetches the results, placing them in a directory known by nix. machine and then fetches the results, placing them in a directory known
When the result derivation needs to be built, nix will look in this by nix. When the result derivation needs to be built, nix will look in
directory for the results of the execution. If the directory is not this directory for the results of the execution. If the directory is not
found, a message is printed to suggest the user to launch the found, a message is printed to suggest the user to launch the experiment
experiment and the build process is stopped. When the and the build process is stopped. When the result is successfully built
result is successfully built by any user, is stored in the by any user, it is stored in the
.I "nix store" .I "nix store"
and it won't need to be rebuilt again until the experiment changes, as and it won't need to be rebuilt again until the experiment changes, as
the hash only depends on the experiment and not on the contents of the the hash only depends on the experiment and not on the contents of the
@ -126,3 +169,73 @@ and can be incremented to create copies that only differs on that
number. The experiment hash will change but the experiment will be the number. The experiment hash will change but the experiment will be the
same, as long as the revision number is ignored along the execution same, as long as the revision number is ignored along the execution
stages. stages.
.NH 1
Postprocess stages
.LP
Once the results are completely generated in the
.I "target"
machine, several stages are required to build a set of figures:
.PP
.I fetch \[em]
waits until all the experiment units are completed and then executes the
next stage. This stage is performed by the
.I garlic(1)
tool using the
.I -F
option and also reports the current state of the execution.
.PP
.I store \[em]
copies from the
.I target
machine into the nix store all log files generated by the experiment,
keeping the same directory structure. It tracks the execution state of
each unit and only copies the results once the experiment is complete.
Other files are ignored as they are often very large and not required
for the subsequent stages.
.PP
.I timetable \[em]
converts the results of the experiment into an NDJSON file with one
line per run for each unit. Each line is a valid JSON object, containing
the
.I exp ,
.I unit
and
.I run
keys and the unit configuration (as a JSON object) in the
.I config
key. The execution time is captured from the standard output and is
added in the
.I time
key.
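.PP
As an illustration of the record format, the sketch below builds and
reads back one such line in Python. It is not the garlic
implementation: the function names, the value types (an integer run
number, a time in seconds as a float) and the sample configuration are
assumptions. How the time is extracted from the standard output is not
shown, as it depends on what each program prints.

```python
import json


def timetable_line(exp, unit, run, config, time):
    """Serialize one run as a single NDJSON line (no trailing newline).

    The key names follow the text; the value types are assumptions.
    """
    record = {
        "exp": exp,
        "unit": unit,
        "run": run,
        "config": config,  # unit configuration as a nested JSON object
        "time": time,
    }
    return json.dumps(record)


def read_timetable(text):
    """Parse NDJSON text into a list of dicts, one per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Because every line is a complete JSON object, a dataset can be read
line by line without loading the whole file first.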
.PP
.I merge \[em]
joins one or more timetable datasets by simply concatenating them.
This step allows building one dataset to compare multiple experiments in
the same figure.
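.PP
Concatenation is enough because NDJSON has no header or enclosing
structure. A minimal sketch in Python follows. The function name
merge_datasets is hypothetical, and checking that each line parses as
JSON is an extra step added here, not something the merge stage is
stated to do.

```python
import json


def merge_datasets(input_paths, output_path):
    """Concatenate NDJSON files into output_path and return the line count.

    Blank lines are dropped and each remaining line is checked to be
    valid JSON (an added safeguard; json.loads raises ValueError if not).
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            with open(path, encoding="utf-8") as f:
                for line in f:
                    line = line.strip()
                    if not line:
                        continue
                    json.loads(line)  # validate only; the line is copied as-is
                    out.write(line + "\n")
                    count += 1
    return count
```

Writing each line with its own newline avoids gluing two records
together when an input file lacks a final newline.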
.PP
.I rPlot \[em]
one or more figures are generated by a single R script
.[
r cookbook
.]
which takes as input the previously generated dataset.
The path of the dataset is recorded in the figure as well, which
contains enough information to determine all the stages in the execution
and postprocess pipelines.
.SH 1
Appendix A: Current setup
.LP
As of this moment, the
.I build
machine which contains the nix store is
.I xeon07
and the
.I "target"
machine used to run the experiments is Mare Nostrum 4 with the
.I output
directory placed at
.CW /gpfs/projects/bsc15/garlic .
By default, the experiment results are never deleted from the
.I target
so you may want to remove the ones already stored in the nix store to
free space.


@ -1,4 +1,9 @@
%A Rodrigo Arias Mallo %A Rodrigo Arias Mallo
%D 2020 %D 2020
%K garlic execution
%T Garlic: the execution pipeline %T Garlic: the execution pipeline
%A Winston Chang
%T R Graphics Cookbook: Practical Recipes for Visualizing Data
%D 2020
%I O'Reilly Media
%O 2nd edition