user guide: expand the develop section

2021-01-25 20:02:25 +01:00 · 2021-01-25 20:02:25 +01:00 · 60cab85fc4
commit 60cab85fc4
parent 95809bd2bf
2 changed files with 180 additions and 12 deletions
--- a/garlic/doc/Makefile
+++ b/garlic/doc/Makefile
@@ -1,5 +1,5 @@
 all: execution.pdf execution.utf8 execution.ascii pp.pdf pp.utf8 pp.ascii\
-	branch.pdf blackbox.pdf
+	branch.pdf blackbox.pdf ug.mm.pdf
 TTYOPT=-rPO=4m -rLL=72m
 PDFOPT=-dpaper=a4 -rPO=4c -rLL=13c
@@ -7,6 +7,9 @@ PREPROC=-k -t -p -R
 blackbox.pdf: blackbox.ms Makefile
 	REFER=ref.i groff -ms $(PREPROC) -dpaper=a4 -rPO=2c -rLL=17c -Tpdf $< > $@
 %.mm.pdf: %.mm Makefile
 	groff -mm $(PREPROC) -Tpdf $< > $@
 	-killall -HUP mupdf
 %.pdf: %.ms Makefile
--- a/garlic/doc/ug.mm
+++ b/garlic/doc/ug.mm
@@ -1,3 +1,5 @@
 \"Header point size
 .ds HP "15 12 12 0 0 0 0 0 0 0 0 0 0 0"
 .COVER
 .TL
 Garlic: User guide
@@ -5,29 +7,183 @@ Garlic: User guide
 .AU "Rodrigo Arias Mallo"
 .COVEND
 .H 1 "Overview"
-Dependency graph of a complete experiment that produces a figure. Each box
+.P
-is a derivation and arrows represent \fBbuild dependencies\fP.
+The garlic framework is designed to fulfill all the requirements of an
 experimenter in all the steps up to publication. The experience gained
 while using it suggests that we move along three stages depicted in the
 following diagram:
 .DS CB
 .PS
 linewid=0.9;
 right
 box "Source" "code"
-arrow <-> "Develop" above
+arrow "Development" above
 box "Program"
-arrow <-> "Experiment" above
+arrow "Experiment" above
 box "Results"
-arrow <-> "Data" "exploration"
+arrow "Data" "exploration"
 box "Figures"
 .PE
 .DE
-.H 1 "Development"
+In the development phase the experimenter changes the source code in
 order to introduce new features or fix bugs. Once the program is
 considered functional, the next phase is the experimentation, where
 several experiment configurations are tested to evaluate the program. It
 is common that some problems are spotted during this phase, which lead
 the experimenter to go back to the development phase and change the
 source code.
 .P
-The development phase consists in creating a functional program by
+Finally, when the experiment is considered completed, the
-modifying the source code. This process is generally cyclic, where the
+experimenter moves to the next phase, which involves the exploration of
-developer needs to compile the program, correct mistakes and debug the
+the data generated by the experiment. During this phase, it is common to
-program.
+generate results in the form of plots or tables which provide a clear
 insight into the quantities of interest. It is also common that after
 looking at the figures, some changes in the experiment configuration
 need to be introduced (or even in the source code of the program).
 .P
-It requires to be running in the target machine.
+Therefore, the experimenter may move back and forth among the three
 phases several times. The garlic framework provides support for all
 three stages (with different degrees of maturity).
 .H 1 "Development (work in progress)"
 .P
 During the development phase, a functional program is produced by
 modifying its source code. This process is generally cyclic: the
 developer needs to compile, debug and correct mistakes. We want to
 minimize the delay times, so the programs can be executed as soon as
 needed, but under a controlled environment so that the same behavior
 occurs during the experimentation phase.
 .P
 The development phase is typically carried out directly on the target
 machine, so we need to allocate the resources first.
 .H 2 "Allocating resources for development"
 .P
 Our target machine (MareNostrum 4) provides an interactive shell that
 can be requested with the computational resources required for
 development.
 .P
 To do so, connect to it and allocate an interactive session:
 .DS I
 .VERBON
 build% ssh target
 target% salloc ...
 compute%
 .VERBOFF
 .DE
 This operation may take some minutes to complete depending on the load
 of the cluster. But once the session is ready, any subsequent execution
 will be immediate.
 .H 2 "Getting the development tools"
 .P
 In order to get the same packages provided for the experiments, we can
 use the \fInix-develop\fP utility, which creates a namespace where the
 required packages are installed. Use the build machine to generate a
 develop environment:
 .DS I
 .VERBON
 build% nix-build -A garlic.develop
 \&...
 build% grep ln result
 ln -fs /gpfs/projects/bsc15/nix/...olate/bin/stage1 .nix-develop
 .VERBOFF
 .DE
 Copy the \fIln\fP command and run it on the target machine, in a new
 directory used for your program development. The link is created as
 a hidden file named \fI.nix-develop\fP and is used to remember your
 environment. Several environments can be stored using this method, with
 different packages in each.
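As a rough sketch of how the per-directory link works, the following snippet creates two project directories, each with its own .nix-develop link, and reads the links back with readlink. The link targets and directory names are made-up placeholders, not real garlic outputs; in practice you would paste the ln command printed by the build instead.

```shell
# Sketch only: the link targets below are hypothetical placeholders.
workdir=$(mktemp -d)
mkdir -p "$workdir/proj-a" "$workdir/proj-b"

# Each project directory remembers its own environment through a symlink.
(cd "$workdir/proj-a" && ln -fs /hypothetical/env-a/bin/stage1 .nix-develop)
(cd "$workdir/proj-b" && ln -fs /hypothetical/env-b/bin/stage1 .nix-develop)

# Show which environment each directory points to.
for d in "$workdir"/proj-*; do
  printf '%s -> %s\n' "$(basename "$d")" "$(readlink "$d/.nix-develop")"
done
```

Because the link lives in the directory itself, which environment is in use depends only on the directory you are working in.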
 .P
 Now you can access the newly created environment by running:
 .DS I
 .VERBON
 compute% nix-develop
 develop%
 .VERBOFF
 .DE
 The spawned shell contains all the packages predefined in the
 \fIgarlic.develop\fP derivation, whose commands can now be run by
 typing their names:
 .DS I
 .VERBON
 develop$ which gcc
 /nix/store/azayfhqyg9...s8aqfmy-gcc-wrapper-9.3.0/bin/gcc
 develop$ which gdb
 /nix/store/1c833b2y8j...pnjn2nv9d46zv44dk-gdb-9.2/bin/gdb
 .VERBOFF
 .DE
 If you need additional packages, you can add them in the
 \fIgarlic/index.nix\fP file:
 .\" FIXME: Unify garlic.unsafeDevelop in garlic.develop, so we can
 .\" specify the packages directly
 .DS I
 .VERBON
 unsafeDevelop = callPackage ./develop/default.nix {
  extraInputs = with self; [
    coreutils htop procps-ng vim which strace
    tmux gdb kakoune universal-ctags bashInteractive
    glibcLocales ncurses git screen curl
    # Add more nixpkgs packages here...
    bsc.slurm bsc.clangOmpss2 bsc.icc bsc.mcxx bsc.perf
    # Add more bscpkgs packages here...
  ];
 };
 .VERBOFF
 .DE
 Then repeat the previous steps to build the new develop environment.
 .H 2 "Execution"
 The allocated shell can only execute tasks in the current node, which
 may be enough for some tests. In that case, you can run your program
 directly:
 .DS I
 .VERBON
 develop$ ./program
 .VERBOFF
 .DE
 If you need to run a multi-node program, typically one using MPI
 communications, you can do so with srun. Note that you must have
 allocated several nodes in the earlier salloc call. The srun
 command will execute the given program \fBoutside\fP the develop
 environment if executed as-is, so we re-enter the develop environment by
 calling nix-develop as a wrapper of the program:
 .\" FIXME: wrap srun to reenter the develop environment by its own
 .DS I
 .VERBON
 develop$ srun nix-develop ./program
 .VERBOFF
 .DE
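To avoid typing the wrapper every time, a small shell function can prepend nix-develop automatically. The following is a minimal sketch under the assumption that nix-develop accepts the program and its arguments exactly as in the command above; the name srun_develop is hypothetical and is not provided by garlic.

```shell
# Hypothetical helper (not part of garlic): run a program through srun
# while re-entering the develop environment in every task.
srun_develop() {
  if [ "$#" -eq 0 ]; then
    echo "usage: srun_develop program [args...]" >&2
    return 1
  fi
  # Quoting "$@" keeps each argument intact, including ones with spaces.
  srun nix-develop "$@"
}

# Usage, from inside the develop shell:
#   srun_develop ./program
```

This sketch passes no options to srun itself; any srun flags would have to be added before nix-develop inside the function.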
 .H 2 "Debugging"
 If the program runs on a single node, it can be executed directly under
 the debugger:
 .DS I
 .VERBON
 develop$ gdb ./program
 .VERBOFF
 .DE
 Alternatively, gdb can be attached to an already running program by
 using its pid. You first need to connect to the node running it and run
 gdb inside the nix-develop environment. Use squeue to see the compute
 nodes running your program:
 .DS I
 .VERBON
 target$ ssh compute
 compute$ cd project-develop
 compute$ nix-develop
 develop$ gdb -p $pid
 .VERBOFF
 .DE
 You can repeat this step in other nodes to control the execution in
 multiple nodes.
 .P
 If the program crashes before you can attach the debugger, you can
 enable the generation of core dumps:
 .DS I
 .VERBON
 develop$ ulimit -c unlimited
 .VERBOFF
 .DE
 Then rerun the program. When it crashes, it will generate a core file
 that can be opened with gdb and contains the state of the memory at the
 time of the crash. Beware that the core dump file can be very large,
 depending on the memory used by your program at the crash.
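The limit change can fail silently in practice if the system imposes a lower hard limit. The following sketch wraps it in a hypothetical helper, enable_core_dumps, which raises only the soft limit and reports the outcome. The core file name used in the final comment is an assumption, since the name and location of core files depend on the system configuration.

```shell
# Sketch: try to raise the soft core-file size limit and report the result.
enable_core_dumps() {
  if ulimit -S -c unlimited 2>/dev/null; then
    echo "core limit: $(ulimit -S -c)"
  else
    # The hard limit set by the system prevents raising the soft limit.
    echo "could not raise core limit (hard limit: $(ulimit -H -c))"
    return 1
  fi
}

enable_core_dumps || true

# After a crash, the core file can be loaded together with the program:
#   gdb ./program core
# ("core" is an assumed file name; check your system configuration)
```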
 .\" ===================================================================
 .H 1 "Experimentation"
 The experimentation phase begins with a functional program which is the
@@ -178,4 +334,13 @@ lastly the number of units. The rationale is that each unit that is
 shared among experiments gets assigned the same hash. Therefore, you can
 iteratively add more units to an experiment, and if they are already
 executed (and the results were generated) the results are reused.
 .SK
 .H 1 "Annex A: Branch name diagram"
 .DS CB
 .S -2
 .PS 4.6/25.4
 copy "gitbranch.pic"
 .PE
 .S P
 .DE
 .TC