Document the results and pp stages
parent
634d2040b5
commit
33682ef48d
@ -4,7 +4,7 @@ TTYOPT=-rPO=4m -rLL=72m
|
|||||||
#TTYOPT=-dpaper=a0 -rPO=4m -rLL=72m
|
#TTYOPT=-dpaper=a0 -rPO=4m -rLL=72m
|
||||||
|
|
||||||
%.pdf: %.ms
|
%.pdf: %.ms
|
||||||
REFER=ref.i groff -ms -t -p -R -Tpdf $^ > $@
|
REFER=ref.i groff -ms -dpaper=a4 -k -t -p -R -Tpdf $^ > $@
|
||||||
-killall -HUP mupdf
|
-killall -HUP mupdf
|
||||||
|
|
||||||
%.utf8: %.ms
|
%.utf8: %.ms
|
||||||
|
179
garlic/doc/pp.ms
@ -1,15 +1,14 @@
|
|||||||
.TL
|
.TL
|
||||||
Garlic: the post-processing pipeline
|
Garlic: the postprocess pipeline
|
||||||
.AU
|
.AU
|
||||||
Rodrigo Arias Mallo
|
Rodrigo Arias Mallo
|
||||||
.AI
|
.AI
|
||||||
Barcelona Supercomputing Center
|
Barcelona Supercomputing Center
|
||||||
.AB
|
.AB
|
||||||
.LP
|
.LP
|
||||||
In this document the stages posterior to the execution of the experiment
|
This document covers the format used to store the results of the
|
||||||
are explained. We consider the post-processing pipeline the steps to go
|
execution and the postprocess steps used to generate a set of
|
||||||
from the generated data from the experiment to a set of plots or tables
|
figures from the results to present the data.
|
||||||
that present the data in a human readable form.
|
|
||||||
.AE
|
.AE
|
||||||
.\"#####################################################################
|
.\"#####################################################################
|
||||||
.nr GROWPS 3
|
.nr GROWPS 3
|
||||||
@ -20,6 +19,7 @@ that present the data in a human readable form.
|
|||||||
.R1
|
.R1
|
||||||
bracket-label " [" ] ", "
|
bracket-label " [" ] ", "
|
||||||
accumulate
|
accumulate
|
||||||
|
move-punctuation
|
||||||
.R2
|
.R2
|
||||||
.\"#####################################################################
|
.\"#####################################################################
|
||||||
.NH 1
|
.NH 1
|
||||||
@ -27,24 +27,64 @@ Introduction
|
|||||||
.LP
|
.LP
|
||||||
After the correct execution of an experiment some measurements are
|
After the correct execution of an experiment some measurements are
|
||||||
recorded in the results for further investigation. Typically the time of
|
recorded in the results for further investigation. Typically the time of
|
||||||
the execution is measured and presented later in a plot or a table. The
|
the execution or other quantities are measured and presented later in a
|
||||||
steps to analyze the results and present them in a convenient way is
|
figure (generally a plot or a table).
|
||||||
called the
|
The
|
||||||
.I "post-processing pipeline" .
|
.I "postprocess pipeline"
|
||||||
Similarly to the execution pipeline
|
consists of all the steps required to create a set of figures from the
|
||||||
|
results. Similarly to the execution pipeline where several stages run
|
||||||
|
sequentially,
|
||||||
.[
|
.[
|
||||||
garlic execution
|
garlic execution
|
||||||
.]
|
.]
|
||||||
where several stages run sequentially, the
|
the postprocess pipeline is also formed by multiple stages executed
|
||||||
post-processing pipeline is also formed by multiple stages executed in
|
in order.
|
||||||
order.
|
|
||||||
.PP
|
.PP
|
||||||
The rationale behind dividing execution and post-processing is
|
The rationale behind dividing execution and postprocess is
|
||||||
that usually the experiments are costly to run (they take a long time to
|
that usually the experiments are costly to run (they take a long time to
|
||||||
complete) while generating a plot is usually shorter. Refining the plots
|
complete) while generating a figure requires less time. Refining the
|
||||||
multiple times reusing the same experimental results doesn't require the
|
figures multiple times reusing the same experimental results doesn't
|
||||||
execution of the complete experiment, so the experimenter can try
|
require the execution of the complete experiment, so the experimenter
|
||||||
multiple ways to present the data in a rapid cycle.
|
can try multiple ways to present the data without a long wait.
|
||||||
|
.NH 1
|
||||||
|
Results
|
||||||
|
.LP
|
||||||
|
The results are generated in the same
|
||||||
|
.I "target"
|
||||||
|
machine where the experiment is executed and are stored in the garlic
|
||||||
|
.I output ,
|
||||||
|
organized into a directory structure following the experiment name, the
|
||||||
|
unit name and the run number (governed by the
|
||||||
|
.I control
|
||||||
|
stage):
|
||||||
|
.QS
|
||||||
|
.CW
|
||||||
|
|-- 6lp88vlj7m8hvvhpfz25p5mvvg7ycflb-experiment
|
||||||
|
| |-- 8lpmmfix52a8v7kfzkzih655awchl9f1-unit
|
||||||
|
| | |-- 1
|
||||||
|
| | | |-- stderr.log
|
||||||
|
| | | |-- stdout.log
|
||||||
|
| | | |-- ...
|
||||||
|
| | |-- 2
|
||||||
|
...
|
||||||
|
.QE
|
||||||
|
To provide easier access to the results, an index is also
|
||||||
|
created by taking the
|
||||||
|
.I expName
|
||||||
|
and
|
||||||
|
.I unitName
|
||||||
|
attributes (defined in the experiment configuration) and linking them to
|
||||||
|
the appropriate experiment and unit directories. These links are
|
||||||
|
overwritten by the last experiment with the same names so they are only
|
||||||
|
valid for the last execution. The output and index directories are
|
||||||
|
placed into a per-user directory, as we cannot guarantee the complete
|
||||||
|
execution of each unit when multiple users can share units.
|
||||||
|
.PP
|
||||||
|
The messages printed to the standard output and error are
|
||||||
|
stored in the log files with the same name inside each run
|
||||||
|
directory. Additional data is sometimes generated by the experiments,
|
||||||
|
and is found in each run directory. As the generated data can be very
|
||||||
|
large, it is ignored by default when considering the results.
|
||||||
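The directory layout described in the Results section (experiment, unit, run number) can be traversed with a short script. The following is only a sketch under assumptions: the function name `iter_runs` is hypothetical, and it assumes that experiment and unit directories end in `-experiment` and `-unit` and that run directories are named with plain integers, as in the listing above. It is not part of garlic.

```python
from pathlib import Path


def iter_runs(output_dir):
    """Yield (experiment, unit, run number, run path) for each run directory.

    Assumes the layout <hash>-experiment/<hash>-unit/<run number>/ shown in
    the listing; anything that does not match is skipped.
    """
    root = Path(output_dir)
    for exp_dir in sorted(root.glob("*-experiment")):
        for unit_dir in sorted(exp_dir.glob("*-unit")):
            # Keep only numeric subdirectories and order them by run number.
            run_dirs = [p for p in unit_dir.iterdir()
                        if p.is_dir() and p.name.isdigit()]
            for run_dir in sorted(run_dirs, key=lambda p: int(p.name)):
                yield exp_dir.name, unit_dir.name, int(run_dir.name), run_dir
```

Each yielded run path is expected to contain the `stdout.log` and `stderr.log` files mentioned above; other generated data in the run directory is left untouched.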
.NH 1
|
.NH 1
|
||||||
Fetching the results
|
Fetching the results
|
||||||
.LP
|
.LP
|
||||||
@ -57,8 +97,9 @@ program\[em]any changes in the program will cause nix to build it again
|
|||||||
using the updated program. The results will also depend on the
|
using the updated program. The results will also depend on the
|
||||||
execution pipeline, and the graph on the results. This chain of
|
execution pipeline, and the graph on the results. This chain of
|
||||||
dependencies can be shown in the following dependency graph:
|
dependencies can be shown in the following dependency graph:
|
||||||
.\"circlerad=0.22; arrowhead=7;
|
|
||||||
.PS
|
.PS
|
||||||
|
circlerad=0.22;
|
||||||
|
linewid=0.35;
|
||||||
right
|
right
|
||||||
circle "Prog"
|
circle "Prog"
|
||||||
arrow
|
arrow
|
||||||
@ -74,21 +115,23 @@ Ideally, the dependencies should be handled by nix, so it can detect any
|
|||||||
change and rebuild the necessary parts automatically. Unfortunately, nix
|
change and rebuild the necessary parts automatically. Unfortunately, nix
|
||||||
is not able to build the result as a derivation directly as it requires access
|
is not able to build the result as a derivation directly as it requires access
|
||||||
to the
|
to the
|
||||||
.I "target cluster"
|
.I "target"
|
||||||
with several user accounts. In order to let several users reuse the same results from a cache, we
|
machine with several user accounts. In order to let several users reuse
|
||||||
|
the same results from a cache, we
|
||||||
use the
|
use the
|
||||||
.I "nix store"
|
.I "nix store"
|
||||||
to make them available. To generate the results from the
|
to make them available. To generate the results from the
|
||||||
experiment, we add some extra steps that must be executed manually.
|
experiment, we add some extra steps that must be executed manually.
|
||||||
.PS
|
.PS
|
||||||
right
|
|
||||||
circlerad=0.22; arrowhead=7;
|
|
||||||
circle "Prog"
|
circle "Prog"
|
||||||
arrow
|
arrow
|
||||||
|
diag=linewid + circlerad;
|
||||||
|
far=circlerad*3 + linewid*4
|
||||||
E: circle "EP"
|
E: circle "EP"
|
||||||
RUN: circle "Run" at E + (0.8,-0.5) dashed
|
R: circle "Result" at E + (far,0)
|
||||||
FETCH: circle "Fetch" at E + (1.6,-0.5) dashed
|
RUN: circle "Run" at E + (diag,-diag) dashed
|
||||||
R: circle "Result" at E + (2.4,0)
|
FETCH: circle "Fetch" at R + (-diag,-diag) dashed
|
||||||
|
move to R.e
|
||||||
arrow
|
arrow
|
||||||
P: circle "PP"
|
P: circle "PP"
|
||||||
arrow
|
arrow
|
||||||
@ -101,13 +144,13 @@ arrow from E to R chop
|
|||||||
The run and fetch steps are provided by the helper tool
|
The run and fetch steps are provided by the helper tool
|
||||||
.I "garlic(1)" ,
|
.I "garlic(1)" ,
|
||||||
which launches the experiment using the user credentials at the
|
which launches the experiment using the user credentials at the
|
||||||
.I "target cluster"
|
.I "target"
|
||||||
and then fetches the results, placing them in a directory known by nix.
|
machine and then fetches the results, placing them in a directory known
|
||||||
When the result derivation needs to be built, nix will look in this
|
by nix. When the result derivation needs to be built, nix will look in
|
||||||
directory for the results of the execution. If the directory is not
|
this directory for the results of the execution. If the directory is not
|
||||||
found, a message is printed to suggest the user to launch the
|
found, a message is printed suggesting that the user launch the experiment
|
||||||
experiment and the build process is stopped. When the
|
and the build process is stopped. When the result is successfully built
|
||||||
result is successfully built by any user, is stored in the
|
by any user, it is stored in the
|
||||||
.I "nix store"
|
.I "nix store"
|
||||||
and it won't need to be rebuilt again until the experiment changes, as
|
and it won't need to be rebuilt again until the experiment changes, as
|
||||||
the hash only depends on the experiment and not on the contents of the
|
the hash only depends on the experiment and not on the contents of the
|
||||||
@ -126,3 +169,73 @@ and can be incremented to create copies that only differs on that
|
|||||||
number. The experiment hash will change but the experiment will be the
|
number. The experiment hash will change but the experiment will be the
|
||||||
same, as long as the revision number is ignored along the execution
|
same, as long as the revision number is ignored along the execution
|
||||||
stages.
|
stages.
|
||||||
|
.NH 1
|
||||||
|
Postprocess stages
|
||||||
|
.LP
|
||||||
|
Once the results are completely generated in the
|
||||||
|
.I "target"
|
||||||
|
machine there are several stages required to build a set of figures:
|
||||||
|
.PP
|
||||||
|
.I fetch \[em]
|
||||||
|
waits until all the experiment units are completed and then executes the
|
||||||
|
next stage. This stage is performed by the
|
||||||
|
.I garlic(1)
|
||||||
|
tool using the
|
||||||
|
.I -F
|
||||||
|
option and also reports the current state of the execution.
|
||||||
|
.PP
|
||||||
|
.I store \[em]
|
||||||
|
copies from the
|
||||||
|
.I target
|
||||||
|
machine into the nix store all log files generated by the experiment,
|
||||||
|
keeping the same directory structure. It tracks the execution state of
|
||||||
|
each unit and only copies the results once the experiment is complete.
|
||||||
|
Other files are ignored as they are often very large and not required
|
||||||
|
for the subsequent stages.
|
||||||
|
.PP
|
||||||
|
.I timetable \[em]
|
||||||
|
converts the results of the experiment into an NDJSON file with one
|
||||||
|
line per run for each unit. Each line is a valid JSON object, containing
|
||||||
|
the
|
||||||
|
.I exp ,
|
||||||
|
.I unit
|
||||||
|
and
|
||||||
|
.I run
|
||||||
|
keys and the unit configuration (as a JSON object) in the
|
||||||
|
.I config
|
||||||
|
key. The execution time is captured from the standard output and is
|
||||||
|
added in the
|
||||||
|
.I time
|
||||||
|
key.
|
||||||
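The record format produced by the timetable stage can be illustrated with a small sketch. The key names (`exp`, `unit`, `run`, `config`, `time`) come from the description above; the function names and all values used with them are hypothetical, and the real stage may be implemented quite differently.

```python
import json


def timetable_line(exp, unit, run, config, time):
    """Serialize one run as a single NDJSON line.

    json.dumps never emits raw newlines, so the whole record, including
    the nested unit configuration, stays on one line.
    """
    record = {"exp": exp, "unit": unit, "run": run,
              "config": config, "time": time}
    return json.dumps(record)


def parse_timetable(text):
    """Parse NDJSON text into a list of dicts, one per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Because every line is an independent JSON object, a consumer can process the file line by line without loading it all at once.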
|
.PP
|
||||||
|
.I merge \[em]
|
||||||
|
joins one or more timetable datasets by simply concatenating them.
|
||||||
|
This step allows building one dataset to compare multiple experiments in
|
||||||
|
the same figure.
|
||||||
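Since an NDJSON dataset is a sequence of independent lines, merging by concatenation needs no parsing. A minimal sketch, assuming each dataset is available as a string (the function name `merge_timetables` is hypothetical and not taken from garlic):

```python
def merge_timetables(datasets):
    """Concatenate NDJSON datasets, keeping one record per line.

    Blank lines are dropped, so a dataset that lacks a trailing newline
    cannot glue its last record to the first record of the next one.
    """
    merged = []
    for text in datasets:
        for line in text.splitlines():
            if line.strip():
                merged.append(line)
    return "\n".join(merged) + ("\n" if merged else "")
```

The records keep their own `exp` and `unit` keys, which is what lets a later plotting step tell the experiments apart in a single figure.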
|
.PP
|
||||||
|
.I rPlot \[em]
|
||||||
|
one or more figures are generated by a single R script
|
||||||
|
.[
|
||||||
|
r cookbook
|
||||||
|
.]
|
||||||
|
which takes as input the previously generated dataset.
|
||||||
|
The path of the dataset is recorded in the figure as well, which
|
||||||
|
contains enough information to determine all the stages in the execution
|
||||||
|
and postprocess pipelines.
|
||||||
|
.SH 1
|
||||||
|
Appendix A: Current setup
|
||||||
|
.LP
|
||||||
|
As of this moment, the
|
||||||
|
.I build
|
||||||
|
machine which contains the nix store is
|
||||||
|
.I xeon07
|
||||||
|
and the
|
||||||
|
.I "target"
|
||||||
|
machine used to run the experiments is Mare Nostrum 4 with the
|
||||||
|
.I output
|
||||||
|
directory placed at
|
||||||
|
.CW /gpfs/projects/bsc15/garlic .
|
||||||
|
By default, the experiment results are never deleted from the
|
||||||
|
.I target
|
||||||
|
so you may want to remove the ones already stored in the nix store to
|
||||||
|
free space.
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
%A Rodrigo Arias Mallo
|
%A Rodrigo Arias Mallo
|
||||||
%D 2020
|
%D 2020
|
||||||
%K garlic execution
|
|
||||||
%T Garlic: the execution pipeline
|
%T Garlic: the execution pipeline
|
||||||
|
|
||||||
|
%A Winston Chang
|
||||||
|
%T R Graphics Cookbook: Practical Recipes for Visualizing Data
|
||||||
|
%D 2020
|
||||||
|
%I O'Reilly Media
|
||||||
|
%O 2nd edition
|
||||||
|