From 60cab85fc4630afb45dc7e78c23030416686edb5 Mon Sep 17 00:00:00 2001
From: Rodrigo Arias Mallo
Date: Mon, 25 Jan 2021 20:02:25 +0100
Subject: [PATCH] user guide: expand the develop section

---
 garlic/doc/Makefile |   5 +-
 garlic/doc/ug.mm    | 187 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 180 insertions(+), 12 deletions(-)

diff --git a/garlic/doc/Makefile b/garlic/doc/Makefile
index f2ecc0a..77adcd0 100644
--- a/garlic/doc/Makefile
+++ b/garlic/doc/Makefile
@@ -1,5 +1,5 @@
 all: execution.pdf execution.utf8 execution.ascii pp.pdf pp.utf8 pp.ascii\
-	branch.pdf blackbox.pdf
+	branch.pdf blackbox.pdf ug.mm.pdf
 
 TTYOPT=-rPO=4m -rLL=72m
 PDFOPT=-dpaper=a4 -rPO=4c -rLL=13c
@@ -7,6 +7,9 @@ PREPROC=-k -t -p -R
 
 blackbox.pdf: blackbox.ms Makefile
 	REFER=ref.i groff -ms $(PREPROC) -dpaper=a4 -rPO=2c -rLL=17c -Tpdf $< > $@
+
+%.mm.pdf: %.mm Makefile
+	groff -mm $(PREPROC) -Tpdf $< > $@
 	-killall -HUP mupdf
 
 %.pdf: %.ms Makefile
diff --git a/garlic/doc/ug.mm b/garlic/doc/ug.mm
index b0a4c48..de1a558 100644
--- a/garlic/doc/ug.mm
+++ b/garlic/doc/ug.mm
@@ -1,3 +1,5 @@
+\"Header point size
+.ds HP "15 12 12 0 0 0 0 0 0 0 0 0 0 0"
 .COVER
 .TL
 Garlic: User guide
@@ -5,29 +7,183 @@ Garlic: User guide
 .AU "Rodrigo Arias Mallo"
 .COVEND
 .H 1 "Overview"
-Dependency graph of a complete experiment that produces a figure. Each box
-is a derivation and arrows represent \fBbuild dependencies\fP.
+.P
+The garlic framework is designed to fulfill all the requirements of an
+experimenter in all the steps up to publication. The experience gained
+while using it suggests that we move along three stages depicted in the
+following diagram:
 .DS CB
 .PS
 linewid=0.9;
 right
 box "Source" "code"
-arrow <-> "Develop" above
+arrow "Development" above
 box "Program"
-arrow <-> "Experiment" above
+arrow "Experiment" above
 box "Results"
-arrow <-> "Data" "exploration"
+arrow "Data" "exploration"
 box "Figures"
 .PE
 .DE
-.H 1 "Development"
+In the development phase the experimenter changes the source code in
+order to introduce new features or fix bugs. Once the program is
+considered functional, the next phase is the experimentation, where
+several experiment configurations are tested to evaluate the program. It
+is common that some problems are spotted during this phase, which lead
+the experimenter to go back to the development phase and change the
+source code.
 .P
-The development phase consists in creating a functional program by
-modifying the source code. This process is generally cyclic, where the
-developer needs to compile the program, correct mistakes and debug the
-program.
+Finally, when the experiment is considered completed, the
+experimenter moves to the next phase, which involves the exploration of
+the data generated by the experiment. During this phase, it is common to
+generate results in the form of plots or tables which provide a clear
+insight into the quantities of interest. It is also common that after
+looking at the figures, some changes in the experiment configuration
+need to be introduced (or even in the source code of the program).
 .P
-It requires to be running in the target machine.
+Therefore, the experimenter may move forward and backwards along the
+three phases several times. The garlic framework provides support for
+all three stages (with different degrees of maturity).
+.H 1 "Development (work in progress)"
+.P
+During the development phase, a functional program is produced by
+modifying its source code. This process is generally cyclic: the
+developer needs to compile, debug and correct mistakes. We want to
+minimize the delay times, so the programs can be executed as soon as
+needed, but under a controlled environment so that the same behavior
+occurs during the experimentation phase.
+.P
+The development phase is typically carried out directly on the target
+machine, so we need the resources first.
+.H 2 "Allocating resources for development"
+.P
+Our target machine (MareNostrum 4) provides an interactive shell that
+can be requested with the number of computational resources required for
+development.
+.P
+To do so, connect to it and allocate an interactive session:
+.DS I
+.VERBON
+build% ssh target
+target% salloc ...
+compute%
+.VERBOFF
+.DE
+This operation may take some minutes to complete depending on the load
+of the cluster. But once the session is ready, any subsequent execution
+will be immediate.
+.H 2 "Getting the development tools"
+.P
+In order to get the same packages provided for the experiments, we can
+use the \fInix-develop\fP utility, which creates a namespace where the
+required packages are installed. Use the build machine to generate a
+develop environment:
+.DS I
+.VERBON
+build% nix-build -A garlic.develop
+\&...
+build% grep ln result
+ln -fs /gpfs/projects/bsc15/nix/...olate/bin/stage1 .nix-develop
+.VERBOFF
+.DE
+Copy the \fIln\fP command and run it on the target machine, in a new
+directory used for your program development. The link will be placed in
+a hidden file named \fI.nix-develop\fP and will be used to remember your
+environment. Several environments can be stored using this method, with
+different packages on each.
+.P
+Now you can access the newly created environment by running:
+.DS I
+.VERBON
+compute% nix-develop
+develop%
+.VERBOFF
+.DE
+The spawned shell contains all the packages pre-defined in the
+\fIgarlic.develop\fP derivation, which can now be accessed by typing the
+name of the commands.
+.DS I
+.VERBON
+develop$ which gcc
+/nix/store/azayfhqyg9...s8aqfmy-gcc-wrapper-9.3.0/bin/gcc
+develop$ which gdb
+/nix/store/1c833b2y8j...pnjn2nv9d46zv44dk-gdb-9.2/bin/gdb
+.VERBOFF
+.DE
+If you need additional packages, you can add them in the
+\fIgarlic/index.nix\fP file:
+.\" FIXME: Unify garlic.unsafeDevelop in garlic.develop, so we can
+.\" specify the packages directly
+.DS I
+.VERBON
+unsafeDevelop = callPackage ./develop/default.nix {
+  extraInputs = with self; [
+    coreutils htop procps-ng vim which strace
+    tmux gdb kakoune universal-ctags bashInteractive
+    glibcLocales ncurses git screen curl
+    # Add more nixpkgs packages here...
+    bsc.slurm bsc.clangOmpss2 bsc.icc bsc.mcxx bsc.perf
+    # Add more bscpkgs packages here...
+  ];
+};
+.VERBOFF
+.DE
+Then repeat the steps above to build the new develop environment.
+.H 2 "Execution"
+The allocated shell can only execute tasks on the current node, which
+may be enough for some tests. To do so, you can directly run your
+program as:
+.DS I
+.VERBON
+develop$ ./program
+.VERBOFF
+.DE
+If you need to run a multi-node program, typically using MPI
+communications, then you can do so by using srun. Note that you must
+have allocated several nodes in the previous salloc call. The srun
+command will execute the given program \fBoutside\fP the develop
+environment if executed as-is. So we re-enter the develop environment by
+calling nix-develop as a wrapper of the program:
+.\" FIXME: wrap srun to reenter the develop environment on its own
+.DS I
+.VERBON
+develop$ srun nix-develop ./program
+.VERBOFF
+.DE
+.H 2 "Debugging"
+The debugger can be used to directly execute the program if it runs
+on only one node by using:
+.DS I
+.VERBON
+develop$ gdb ./program
+.VERBOFF
+.DE
+Or it can be attached to an already running program by using its pid.
+You will need to first connect to the node running it, and run gdb
+inside the nix-develop environment. Use squeue to see the compute nodes
+running your program:
+.DS I
+.VERBON
+target$ ssh compute
+compute$ cd project-develop
+compute$ nix-develop
+develop$ gdb -p $pid
+.VERBOFF
+.DE
+You can repeat this step on other nodes to control the execution on
+multiple nodes.
+.P
+In those cases where the program crashes before you can attach the
+debugger, you can enable the generation of core dumps:
+.DS I
+.VERBON
+develop$ ulimit -c unlimited
+.VERBOFF
+.DE
+And rerun the program, which will generate a core file that can be
+opened by gdb and contains the state of the memory when the crash
+happened. Beware that the core dump file can be very large, depending on
+the memory used by your program at the time of the crash.
 .\" ===================================================================
 .H 1 "Experimentation"
 The experimentation phase begins with a functional program which is the
@@ -178,4 +334,13 @@ lastly the number of units. The rationale is that each unit that is
 shared among experiments gets assigned the same hash. Therefore, you can
 iteratively add more units to an experiment, and if they are already
 executed (and the results were generated) is reused.
+.SK
+.H 1 "Annex A: Branch name diagram"
+.DS CB
+.S -2
+.PS 4.6/25.4
+copy "gitbranch.pic"
+.PE
+.S P
+.DE
 .TC
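
Note on the "Execution" section added by this patch: the choice between running a program directly and wrapping it with srun and nix-develop could be sketched as a small dry-run shell helper. This is only an illustration under stated assumptions. The function name develop_cmd is hypothetical and not part of garlic, and the sketch assumes that Slurm exports SLURM_JOB_NUM_NODES inside the allocation created by salloc. The helper prints the command line instead of executing it.

```shell
#!/bin/sh
# Hypothetical helper (not part of garlic): print the command line that
# would run a program from inside the develop shell.
# Assumption: inside a Slurm allocation, SLURM_JOB_NUM_NODES holds the
# number of allocated nodes; when it is unset we assume a single node.
develop_cmd() {
    nodes="${SLURM_JOB_NUM_NODES:-1}"
    if [ "$nodes" -gt 1 ]; then
        # srun starts the tasks outside the develop environment, so
        # nix-develop is used as a wrapper to re-enter it.
        echo "srun nix-develop $*"
    else
        # Single node: the program can be run directly.
        echo "$*"
    fi
}

# Dry run: only prints the command, nothing is executed.
develop_cmd ./program
```

Printing the command rather than running it keeps the sketch safe to try outside the cluster. A real wrapper, as the FIXME in the patch suggests, would live around srun itself.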