forked from rarias/bscpkgs
		
	Document the results and pp stages
		
parent 634d2040b5
commit 33682ef48d
					
				| @ -4,7 +4,7 @@ TTYOPT=-rPO=4m -rLL=72m | ||||
| #TTYOPT=-dpaper=a0 -rPO=4m -rLL=72m
 | ||||
| 
 | ||||
| %.pdf: %.ms | ||||
| 	REFER=ref.i groff -ms -t -p -R -Tpdf $^ > $@ | ||||
| 	REFER=ref.i groff -ms -dpaper=a4 -k -t -p -R -Tpdf $^ > $@ | ||||
| 	-killall -HUP mupdf | ||||
| 
 | ||||
| %.utf8: %.ms | ||||
|  | ||||
							
								
								
									
										179
									
								
								garlic/doc/pp.ms
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							| @ -1,15 +1,14 @@ | ||||
| .TL | ||||
| Garlic: the post-processing pipeline | ||||
| Garlic: the postprocess pipeline | ||||
| .AU | ||||
| Rodrigo Arias Mallo | ||||
| .AI | ||||
| Barcelona Supercomputing Center | ||||
| .AB | ||||
| .LP | ||||
| In this document the stages posterior to the execution of the experiment | ||||
| are explained. We consider the post-processing pipeline the steps to go | ||||
| from the generated data from the experiment to a set of plots or tables | ||||
| that present the data in a human readable form. | ||||
| This document covers the format used to store the results of the | ||||
| execution and the postprocess steps used to generate a set of | ||||
| figures from the results to present the data. | ||||
| .AE | ||||
| .\"##################################################################### | ||||
| .nr GROWPS 3 | ||||
| @ -20,6 +19,7 @@ that present the data in a human readable form. | ||||
| .R1 | ||||
| bracket-label " [" ] ", " | ||||
| accumulate | ||||
| move-punctuation | ||||
| .R2 | ||||
| .\"##################################################################### | ||||
| .NH 1 | ||||
| @ -27,24 +27,64 @@ Introduction | ||||
| .LP | ||||
| After the correct execution of an experiment some measurements are | ||||
| recorded in the results for further investigation. Typically the time of | ||||
| the execution is measured and presented later in a plot or a table. The | ||||
| steps to analyze the results and present them in a convenient way is | ||||
| called the | ||||
| .I "post-processing pipeline" . | ||||
| Similarly to the execution pipeline | ||||
| the execution or other quantities are measured and presented later in a | ||||
| figure (generally a plot or a table). | ||||
| The | ||||
| .I "postprocess pipeline" | ||||
| consists of all the steps required to create a set of figures from the | ||||
| results. Similarly to the execution pipeline where several stages run | ||||
| sequentially, | ||||
| .[ | ||||
| garlic execution | ||||
| .] | ||||
| where several stages run sequentially, the | ||||
| post-processing pipeline is also formed by multiple stages executed in | ||||
| order. | ||||
| the postprocess pipeline is also formed by multiple stages executed | ||||
| in order. | ||||
| .PP | ||||
| The rationale behind dividing execution and post-processing is | ||||
| The rationale behind dividing execution and postprocess is | ||||
| that usually the experiments are costly to run (they take a long time to | ||||
| complete) while generating a plot is usually shorter. Refining the plots | ||||
| multiple times reusing the same experimental results doesn't require the | ||||
| execution of the complete experiment, so the experimenter can try | ||||
| multiple ways to present the data in a rapid cycle. | ||||
| complete) while generating a figure requires less time. Refining the | ||||
| figures multiple times reusing the same experimental results doesn't | ||||
| require the execution of the complete experiment, so the experimenter | ||||
| can try multiple ways to present the data without a long wait. | ||||
| .NH 1 | ||||
| Results | ||||
| .LP | ||||
| The results are generated in the same | ||||
| .I "target" | ||||
| machine where the experiment is executed and are stored in the garlic | ||||
| .I output , | ||||
| organized into a directory structure following the experiment name, the | ||||
| unit name and the run number (governed by the | ||||
| .I control | ||||
| stage): | ||||
| .QS | ||||
| .CW | ||||
|  |-- 6lp88vlj7m8hvvhpfz25p5mvvg7ycflb-experiment | ||||
|  |   |-- 8lpmmfix52a8v7kfzkzih655awchl9f1-unit  | ||||
|  |   |   |-- 1  | ||||
|  |   |   |   |-- stderr.log | ||||
|  |   |   |   |-- stdout.log | ||||
|  |   |   |   |-- ... | ||||
|  |   |   |-- 2  | ||||
|  ... | ||||
| .QE | ||||
| To provide easier access to the results, an index is also | ||||
| created by taking the | ||||
| .I expName | ||||
| and | ||||
| .I unitName | ||||
| attributes (defined in the experiment configuration) and linking them to | ||||
| the appropriate experiment and unit directories. These links are | ||||
| overwritten by the last experiment with the same names so they are only | ||||
| valid for the last execution. The output and index directories are | ||||
| placed into a per-user directory, as we cannot guarantee the complete | ||||
| execution of each unit when multiple users can share units. | ||||
| .PP | ||||
| The messages printed to the standard output and error | ||||
| are stored in the log files with the same name inside each run | ||||
| directory. Additional data is sometimes generated by the experiments, | ||||
| and is found in each run directory. As the generated data can be very | ||||
| large, it is ignored by default when considering the results. | ||||
| .NH 1 | ||||
| Fetching the results | ||||
| .LP | ||||
| @ -57,8 +97,9 @@ program\[em]any changes in the program will cause nix to build it again | ||||
| using the updated program. The results will also depend on the | ||||
| execution pipeline, and the graph on the results. This chain of | ||||
| dependencies can be shown in the following dependency graph: | ||||
| .\"circlerad=0.22; arrowhead=7; | ||||
| .PS | ||||
| circlerad=0.22; | ||||
| linewid=0.35; | ||||
| right | ||||
| circle "Prog" | ||||
| arrow | ||||
| @ -74,21 +115,23 @@ Ideally, the dependencies should be handled by nix, so it can detect any | ||||
| change and rebuild the necessary parts automatically. Unfortunately, nix | ||||
| is not able to build the result as a derivation directly as it requires access | ||||
| to the | ||||
| .I "target cluster" | ||||
| with several user accounts. In order to let several users reuse the same results from a cache, we | ||||
| .I "target" | ||||
| machine with several user accounts. In order to let several users reuse | ||||
| the same results from a cache, we | ||||
| use the | ||||
| .I "nix store" | ||||
| to make them available. To generate the results from the | ||||
| experiment, we add some extra steps that must be executed manually. | ||||
| .PS | ||||
| right | ||||
| circlerad=0.22; arrowhead=7; | ||||
| circle "Prog" | ||||
| arrow | ||||
| diag=linewid + circlerad; | ||||
| far=circlerad*3 + linewid*4 | ||||
| E: circle "EP" | ||||
| RUN: circle "Run" at E + (0.8,-0.5) dashed | ||||
| FETCH: circle "Fetch" at E + (1.6,-0.5) dashed | ||||
| R: circle "Result" at E + (2.4,0) | ||||
| R: circle "Result" at E + (far,0) | ||||
| RUN: circle "Run" at E + (diag,-diag) dashed | ||||
| FETCH: circle "Fetch" at R + (-diag,-diag) dashed | ||||
| move to R.e | ||||
| arrow | ||||
| P: circle "PP" | ||||
| arrow | ||||
| @ -101,13 +144,13 @@ arrow from E to R chop | ||||
| The run and fetch steps are provided by the helper tool | ||||
| .I "garlic(1)" , | ||||
| which launches the experiment using the user credentials at the | ||||
| .I "target cluster" | ||||
| and then fetches the results, placing them in a directory known by nix. | ||||
| When the result derivation needs to be built, nix will look in this | ||||
| directory for the results of the execution. If the directory is not | ||||
| found, a message is printed to suggest the user to launch the | ||||
| experiment and the build process is stopped. When the | ||||
| result is successfully built by any user, is stored in the | ||||
| .I "target" | ||||
| machine and then fetches the results, placing them in a directory known | ||||
| by nix.  When the result derivation needs to be built, nix will look in | ||||
| this directory for the results of the execution. If the directory is not | ||||
| found, a message is printed to suggest the user to launch the experiment | ||||
| and the build process is stopped. When the result is successfully built | ||||
| by any user, it is stored in the | ||||
| .I "nix store" | ||||
| and it won't need to be rebuilt again until the experiment changes, as | ||||
| the hash only depends on the experiment and not on the contents of the | ||||
| @ -126,3 +169,73 @@ and can be incremented to create copies that only differs on that | ||||
| number. The experiment hash will change but the experiment will be the | ||||
| same, as long as the revision number is ignored along the execution | ||||
| stages. | ||||
| .NH 1 | ||||
| Postprocess stages | ||||
| .LP | ||||
| Once the results are completely generated in the | ||||
| .I "target" | ||||
| machine there are several stages required to build a set of figures: | ||||
| .PP | ||||
| .I fetch \[em] | ||||
| waits until all the experiment units are completed and then executes the | ||||
| next stage. This stage is performed by the | ||||
| .I garlic(1) | ||||
| tool using the | ||||
| .I -F | ||||
| option and also reports the current state of the execution. | ||||
| .PP | ||||
| .I store \[em] | ||||
| copies from the | ||||
| .I target | ||||
| machine into the nix store all log files generated by the experiment,  | ||||
| keeping the same directory structure. It tracks the execution state of | ||||
| each unit and only copies the results once the experiment is complete. | ||||
| Other files are ignored as they are often very large and not required | ||||
| for the subsequent stages. | ||||
| .PP | ||||
| .I timetable \[em] | ||||
| converts the results of the experiment into an NDJSON file with one | ||||
| line per run for each unit. Each line is a valid JSON object, containing | ||||
| the | ||||
| .I exp , | ||||
| .I unit | ||||
| and | ||||
| .I run | ||||
| keys and the unit configuration (as a JSON object) in the | ||||
| .I config | ||||
| key. The execution time is captured from the standard output and is | ||||
| added in the | ||||
| .I time | ||||
| key. | ||||
| .PP | ||||
| .I merge \[em] | ||||
| one or more timetable datasets are joined, by simply concatenating them. | ||||
| This step allows building one dataset to compare multiple experiments in | ||||
| the same figure. | ||||
| .PP | ||||
| .I rPlot \[em] | ||||
| one or more figures are generated by a single R script | ||||
| .[ | ||||
| r cookbook | ||||
| .] | ||||
| which takes as input the previously generated dataset. | ||||
| The path of the dataset is recorded in the figure as well, which | ||||
| contains enough information to determine all the stages in the execution | ||||
| and postprocess pipelines. | ||||
| .SH 1 | ||||
| Appendix A: Current setup | ||||
| .LP | ||||
| As of this moment, the | ||||
| .I build | ||||
| machine which contains the nix store is | ||||
| .I xeon07 | ||||
| and the | ||||
| .I "target" | ||||
| machine used to run the experiments is Mare Nostrum 4 with the | ||||
| .I output | ||||
| directory placed at | ||||
| .CW /gpfs/projects/bsc15/garlic . | ||||
| By default, the experiment results are never deleted from the | ||||
| .I target | ||||
| so you may want to remove the ones already stored in the nix store to | ||||
| free space. | ||||
|  | ||||
| @ -1,4 +1,9 @@ | ||||
| %A Rodrigo Arias Mallo | ||||
| %D 2020 | ||||
| %K garlic execution | ||||
| %T Garlic: the execution pipeline | ||||
| 
 | ||||
| %A Winston Chang | ||||
| %T R Graphics Cookbook: Practical Recipes for Visualizing Data | ||||
| %D 2020 | ||||
| %I O'Reilly Media | ||||
| %O 2nd edition | ||||
|  | ||||
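The timetable and merge stages described in pp.ms can be illustrated with a minimal Python sketch. It is not the garlic implementation: the function names, the example experiment and unit labels, the contents of the config object and the type of the time value (a float here) are all assumptions made for illustration. Only the key names (exp, unit, run, config, time), the one-object-per-line layout and the merge by plain concatenation come from the document.

```python
import json


def timetable_line(exp, unit, run, config, time):
    """Serialize one run as a single NDJSON line (one JSON object, no newline)."""
    record = {"exp": exp, "unit": unit, "run": run, "config": config, "time": time}
    return json.dumps(record)


def merge(*datasets):
    """Join timetable datasets by concatenating their non-empty lines."""
    lines = []
    for dataset in datasets:
        lines.extend(line for line in dataset.splitlines() if line.strip())
    return "".join(line + "\n" for line in lines)


def load(ndjson_text):
    """Parse NDJSON text into a list of dicts, one per line."""
    return [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]


# Hypothetical values: real experiment and unit names are hashes chosen by nix,
# and the real config holds the unit configuration of the experiment.
first = timetable_line("exp-a", "unit-1", 1, {"cpus": 4}, 1.25) + "\n"
second = timetable_line("exp-b", "unit-1", 1, {"cpus": 8}, 0.75) + "\n"

dataset = merge(first, second)
rows = load(dataset)
print(len(rows), rows[1]["config"]["cpus"])
```

Because every line is a complete JSON object, two datasets can be merged without parsing them, which is why the merge stage can be a plain concatenation.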