WIP: postprocessing doc
parent
62c9da2474
commit
f0122d557f
@ -1,8 +1,14 @@
|
|||||||
all: execution.pdf execution.txt pp.pdf pp.txt
|
all: execution.pdf execution.ascii pp.pdf pp.ascii
|
||||||
|
|
||||||
|
TTYOPT=-rPO=4m -rLL=72m
|
||||||
|
#TTYOPT=-dpaper=a0 -rPO=4m -rLL=72m
|
||||||
|
|
||||||
%.pdf: %.ms
|
%.pdf: %.ms
|
||||||
groff -ms -t -p -Tpdf $^ > $@
|
REFER=ref.i groff -ms -t -p -R -Tpdf $^ > $@
|
||||||
-killall -HUP mupdf
|
-killall -HUP mupdf
|
||||||
|
|
||||||
%.txt: %.ms
|
%.utf8: %.ms
|
||||||
groff -ms -t -p -Tutf8 $^ > $@
|
REFER=ref.i groff -ms -t -p -R $(TTYOPT) -Tutf8 $^ > $@
|
||||||
|
|
||||||
|
%.ascii: %.ms
|
||||||
|
REFER=ref.i groff -ms -t -p -R $(TTYOPT) -Tascii $^ > $@
|
||||||
|
119
garlic/doc/pp.ms
@ -1,75 +1,128 @@
|
|||||||
.TL
|
.TL
|
||||||
Garlic: experiment results
|
Garlic: the post-processing pipeline
|
||||||
.AU
|
.AU
|
||||||
Rodrigo Arias Mallo
|
Rodrigo Arias Mallo
|
||||||
.AI
|
.AI
|
||||||
Barcelona Supercomputing Center
|
Barcelona Supercomputing Center
|
||||||
|
.AB
|
||||||
|
.LP
|
||||||
|
In this document the stages that follow the execution of the experiment
|
||||||
|
are explained. We define the post-processing pipeline as the steps that go
|
||||||
|
from the generated data from the experiment to a set of plots or tables
|
||||||
|
that present the data in a human-readable form.
|
||||||
|
.AE
|
||||||
.\"#####################################################################
|
.\"#####################################################################
|
||||||
.nr GROWPS 3
|
.nr GROWPS 3
|
||||||
.nr PSINCR 1.5p
|
.nr PSINCR 1.5p
|
||||||
.\".nr PD 0.5m
|
.\".nr PD 0.5m
|
||||||
.nr PI 2m
|
.nr PI 2m
|
||||||
\".2C
|
.\".2C
|
||||||
|
.R1
|
||||||
|
bracket-label " [" ] ", "
|
||||||
|
accumulate
|
||||||
|
.R2
|
||||||
.\"#####################################################################
|
.\"#####################################################################
|
||||||
|
.NH 1
|
||||||
|
Introduction
|
||||||
|
.LP
|
||||||
|
After the correct execution of an experiment some measurements are
|
||||||
|
recorded in the results for further investigation. Typically the time of
|
||||||
|
the execution is measured and presented later in a plot or a table. The
|
||||||
|
steps to analyze the results and present them in a convenient way are
|
||||||
|
called the
|
||||||
|
.I "post-processing pipeline" .
|
||||||
|
Similarly to the execution pipeline
|
||||||
|
.[
|
||||||
|
garlic execution
|
||||||
|
.]
|
||||||
|
where several stages run sequentially, the
|
||||||
|
post-processing pipeline is also formed by multiple stages executed in
|
||||||
|
order.
|
||||||
|
.PP
|
||||||
|
The rationale behind dividing execution and post-processing is
|
||||||
|
that usually the experiments are costly to run (they take a long time to
|
||||||
|
complete) while generating a plot is usually much faster. Refining the plots
|
||||||
|
multiple times reusing the same experimental results doesn't require the
|
||||||
|
execution of the complete experiment, so the experimenter can try
|
||||||
|
multiple ways to present the data in a rapid cycle.
|
||||||
|
.NH 1
|
||||||
|
Fetching the results
|
||||||
.LP
|
.LP
|
||||||
Consider a program of interest for which an experiment has been designed to
|
Consider a program of interest for which an experiment has been designed to
|
||||||
measure some properties. When the experiment is executed, it will generate some
|
measure some properties that the experimenter wants to present in a
|
||||||
results which are generally non-deterministic. The experimenter may want to
|
visual plot. When the experiment is launched, the execution
|
||||||
present some information in a visual plot or graph based on these results.
|
pipeline (EP) is completely executed and it will generate some
|
||||||
.PP
|
results. In this scenario, the execution pipeline depends on the
|
||||||
In this escenario, the experiment depends on the program\[em]any
|
program\[em]any changes in the program will cause nix to build it again
|
||||||
changes in the program will cause nix to build the experiment again using the
|
using the updated program. The results will also depend on the
|
||||||
updated program. The results will also depend on the experiment, and
|
execution pipeline, and the graph on the results. This chain of
|
||||||
the graph on the results. This chain of dependencies can be shown in
|
dependencies can be shown in the following dependency graph:
|
||||||
the following dependency tree:
|
.\"circlerad=0.22; arrowhead=7;
|
||||||
.PS
|
.PS
|
||||||
right
|
right
|
||||||
circlerad=0.22; arrowhead=7;
|
|
||||||
circle "Prog"
|
circle "Prog"
|
||||||
arrow
|
arrow
|
||||||
circle "Exp"
|
circle "EP"
|
||||||
arrow
|
arrow
|
||||||
circle "Result"
|
circle "Result"
|
||||||
arrow
|
arrow
|
||||||
circle "Graph"
|
circle "PP"
|
||||||
|
arrow
|
||||||
|
circle "Plot"
|
||||||
.PE
|
.PE
|
||||||
Ideally, the dependencies should be handled by nix, so it can detect any
|
Ideally, the dependencies should be handled by nix, so it can detect any
|
||||||
change and rebuild the necessary parts automatically. Unfortunately, nix
|
change and rebuild the necessary parts automatically. Unfortunately, nix
|
||||||
is not able to build R as a derivation directly as it requires access
|
is not able to build the result as a derivation directly as it requires access
|
||||||
to the
|
to the
|
||||||
.I "target cluster"
|
.I "target cluster"
|
||||||
with several user accounts. In addition, the results are often
|
with several user accounts. In order to let several users reuse the same results from a cache, we
|
||||||
non-deterministic so the graph G cannot depend on the content of the
|
use the
|
||||||
results.
|
|
||||||
.PP
|
|
||||||
In order to let several users use the results from a cache, we use the
|
|
||||||
.I "nix store"
|
.I "nix store"
|
||||||
to make them available for read only. To generate the results from the
|
to make them available. To generate the results from the
|
||||||
experiment, we add some extra steps that must be executed manually.
|
experiment, we add some extra steps that must be executed manually.
|
||||||
.PS
|
.PS
|
||||||
right
|
right
|
||||||
circlerad=0.22; arrowhead=7;
|
circlerad=0.22; arrowhead=7;
|
||||||
circle "Prog"
|
circle "Prog"
|
||||||
arrow
|
arrow
|
||||||
E: circle "Exp"
|
E: circle "EP"
|
||||||
RUN: circle "Run" at E + (0.8,-0.5)
|
RUN: circle "Run" at E + (0.8,-0.5) dashed
|
||||||
FETCH: circle "Fetch" at E + (1.6,-0.5)
|
FETCH: circle "Fetch" at E + (1.6,-0.5) dashed
|
||||||
R: circle "Result" at E + (2.4,0)
|
R: circle "Result" at E + (2.4,0)
|
||||||
arrow
|
arrow
|
||||||
G: circle "Graph"
|
P: circle "PP"
|
||||||
|
arrow
|
||||||
|
circle "Plot"
|
||||||
arrow dashed from E to RUN chop
|
arrow dashed from E to RUN chop
|
||||||
arrow dashed from RUN to FETCH chop
|
arrow dashed from RUN to FETCH chop
|
||||||
arrow dashed from FETCH to R chop
|
arrow dashed from FETCH to R chop
|
||||||
arrow from E to R chop
|
arrow from E to R chop
|
||||||
.PE
|
.PE
|
||||||
The run and fetch steps are provided by the helper tool
|
The run and fetch steps are provided by the helper tool
|
||||||
.I garlic ,
|
.I "garlic(1)" ,
|
||||||
which launches the experiment using the user credential at the
|
which launches the experiment using the user credentials at the
|
||||||
.I "target cluster"
|
.I "target cluster"
|
||||||
and then fetches the results, placing them in a directory known by nix.
|
and then fetches the results, placing them in a directory known by nix.
|
||||||
Is the directory is not found, nix will issue a message to suggest the
|
When the result derivation needs to be built, nix will look in this
|
||||||
user to launch the experiment and it will fail to build the result
|
directory for the results of the execution. If the directory is not
|
||||||
derivation. When the result is successfully built by any user, the
|
found, a message is printed suggesting that the user launch the
|
||||||
derivation won't need to be rebuilt again until the experiment changes,
|
experiment and the build process is stopped. When the
|
||||||
as the hash only depends on the experiment and not on the contents of
|
result is successfully built by any user, it is stored in the
|
||||||
the results.
|
.I "nix store"
|
||||||
|
and it won't need to be rebuilt again until the experiment changes, as
|
||||||
|
the hash only depends on the experiment and not on the contents of the
|
||||||
|
results.
|
||||||
|
.PP
|
||||||
|
Notice that this mechanism violates the deterministic nature of the nix
|
||||||
|
store, as from a given input (the experiment) we can generate different
|
||||||
|
outputs (each result from different executions). We knowingly relaxed
|
||||||
|
this restriction by providing a guarantee that the results are
|
||||||
|
equivalent and there is no need to execute an experiment more than once.
|
||||||
|
.PP
|
||||||
|
To force the execution of an experiment you can use the
|
||||||
|
.I rev
|
||||||
|
attribute which is a number assigned to each experiment
|
||||||
|
and can be incremented to create copies that only differ in that
|
||||||
|
number. The experiment hash will change but the experiment will be the
|
||||||
|
same, as long as the revision number is ignored throughout the execution
|
||||||
|
stages.
|
||||||
|
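The result lookup and the effect of the rev attribute described in pp.ms can be illustrated with a small sketch. This is not how nix computes derivation hashes, and it is not the garlic implementation. The names `experiment_hash`, `locate_results` and `cache_dir`, the use of SHA-256 over a JSON description, and the idea that the results directory is named after the hash are all assumptions made for illustration. The sketch only shows two properties stated in the text: the hash depends on the experiment description and never on the result contents, and a missing results directory stops the build with a hint to launch the experiment.

```python
import hashlib
import json
from pathlib import Path


def experiment_hash(config: dict) -> str:
    """Hash the experiment description only (hypothetical stand-in for the
    nix derivation hash). The result contents never enter the hash."""
    canonical = json.dumps(config, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


def locate_results(config: dict, cache_dir: Path) -> Path:
    """Return the directory holding the fetched results of an experiment.

    Assumption: results are placed in a directory named after the
    experiment hash. If it is missing, stop and suggest running it.
    """
    path = cache_dir / experiment_hash(config)
    if not path.is_dir():
        raise FileNotFoundError(
            f"results not found in {path}; "
            "launch the experiment and fetch the results first"
        )
    return path


# Incrementing rev yields a different hash for an otherwise identical
# experiment, so the cached results no longer match and a new run is needed.
base = {"program": "prog", "nodes": 2, "rev": 0}
bumped = {**base, "rev": 1}
assert experiment_hash(base) != experiment_hash(bumped)
```

Because the lookup key is derived from the experiment alone, two executions that produce different measurements still map to the same directory, which is the relaxation of determinism that the document mentions.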