757 Commits

Author SHA1 Message Date
a4752603e9 cn6: pin commit 2021-04-20 17:34:53 +02:00
5b4bb30e55 nbody: update and simplify figures 2021-04-20 17:16:17 +02:00
e1433fedb8 nbody: refactor experiments into common.nix 2021-04-20 17:13:29 +02:00
f729fc4006 nbody: rename granularity experiment file 2021-04-19 17:27:52 +02:00
Antoni Navarro
03298228e4 nbody: add strong scaling experiment 2021-04-19 17:27:52 +02:00
Antoni Navarro
58294d4467 nbody: add "nodes or sockets" experiment 2021-04-19 17:27:52 +02:00
Antoni Navarro
48a61dc292 nbody: update indexes 2021-04-19 17:27:52 +02:00
Antoni Navarro
5815a9af09 nbody: move "old" experiments to another folder 2021-04-19 17:27:52 +02:00
Antoni Navarro
ea66d7e4e0 nbody: update granularity tests 2021-04-19 17:27:52 +02:00
3e197da8a3 hpcg: update figures and remove old ones 2021-04-19 16:05:10 +02:00
866d4561d3 hpcg: remove old experiments 2021-04-19 16:01:11 +02:00
9a88319153 hpcg: add granularity experiment 2021-04-19 16:00:55 +02:00
a96839d11a hpcg: merge weak scaling and add size experiment
The scaling.nix file defines both the strong and weak experiments by
using the parameter "enableStrong".
2021-04-19 15:57:31 +02:00
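The difference between the two experiments selected by "enableStrong" can be sketched as follows. This is only the textbook distinction between strong and weak scaling, written in Python rather than Nix; the commit does not show how scaling.nix uses the flag, and the function name `scaling_sizes` and its arguments are hypothetical.

```python
def scaling_sizes(size, ntasks, enable_strong):
    """Return (size_per_task, total_size) for one experiment unit.

    Hypothetical helper: only the flag name mirrors "enableStrong".
    """
    if enable_strong:
        # Strong scaling: the total problem size stays fixed,
        # so each task gets a smaller share as ntasks grows.
        return size / ntasks, size
    # Weak scaling: the size per task stays fixed,
    # so the total problem grows with ntasks.
    return size, size * ntasks


if __name__ == "__main__":
    for ntasks in (1, 2, 4):
        print(ntasks, scaling_sizes(1024, ntasks, True),
              scaling_sizes(1024, ntasks, False))
```

Keeping both modes behind one flag means the rest of the experiment definition is shared, which appears to be the motivation for merging the files.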
a71ae9c2c6 hpcg: avoid mismatching names for gen units 2021-04-16 16:15:16 +02:00
d490ef2694 hpcg: remove unused extrae.xml file 2021-04-16 16:14:48 +02:00
b4e37a15a9 hpcg: refactor ss and gen using a common file
- The file gen.nix now provides an experiment for each unit, to reduce
  the evaluation time.

- The pipeline is specified in the common.nix file only.

- The input dataset path is no longer symlinked, but is specified in the
  "--load" argument.

- The size is renamed to "sizePerTask" instead of "n".
2021-04-16 11:51:34 +02:00
9bb570af7f tools: add floatTruncate function 2021-04-16 11:49:37 +02:00
Raúl Peñacoba
4d629fe8f7 hpcg: remove old comments 2021-04-16 09:32:28 +02:00
Raúl Peñacoba
f5c8d0cb88 hpcg: choose a smaller strong scaling problem size 2021-04-16 09:32:28 +02:00
Raúl Peñacoba
cb6577b439 hpcg: add strongscaling
HPCG rounds the problem size of an axis when its value is < 16
2021-04-16 09:32:28 +02:00
Raúl Peñacoba
b60a46b683 hpcg: add weakscaling over some nblocks to check which axis is better 2021-04-16 09:32:28 +02:00
Raúl Peñacoba
1a6075a2b1 hpcg: add first granularity/scalability exps for tampi+isend+oss+task
- oss.nix runs valid hpcg layouts whereas slices.nix does not
2021-04-16 09:32:28 +02:00
12ff1fd506 garlicd: send logs to the builder 2021-04-16 09:29:33 +02:00
732b0c0e9c garlic tool: improve unit status information 2021-04-16 09:29:33 +02:00
64f077c4f6 stages: prepend the stage name to messages 2021-04-16 09:29:33 +02:00
7c94997023 control: add trap for bad exit 2021-04-16 09:29:33 +02:00
fb0dee4b61 exp: move exit1 experiment to slurm 2021-04-16 09:29:33 +02:00
bde54c69c5 sbatch: store queued status 2021-04-16 09:29:33 +02:00
2151e20bd6 exp: add exit1 experiment
Tests unit bad exits
2021-04-16 09:29:33 +02:00
886d16bcc6 garlic tool: add jq as dependency
So we can parse the experiment configuration in JSON
2021-04-16 09:29:33 +02:00
5c0f179830 stdexp: rename "name" to "clusterName" 2021-04-16 09:29:33 +02:00
422d359b48 script: stop on error by default 2021-04-16 09:29:33 +02:00
60248ab06b article: remove unused figures 2021-04-16 09:29:33 +02:00
1cb63b464d osu: adjust figures for publication 2021-04-16 09:29:33 +02:00
821b4f0d15 rplot: patch scales and fontconfig 2021-04-16 09:29:33 +02:00
0cf35decc5 osu: add mtu and eager experiments 2021-04-16 09:29:33 +02:00
26e3a86c78 garlic tool: check the presence of all the units
This check prevents a user from removing units between the
execution of the experiment and the fetch.
2021-04-16 09:29:33 +02:00
b96c39e0ba noise: add srun signal bug to the list 2021-04-16 09:29:33 +02:00
f842f1e01d slurm: add sigsegv experiment
Ensure that we can catch a sigsegv signal before and after the
MPI_Finalize call.
2021-04-16 09:29:33 +02:00
71c06d02da stages: add baywatch stage to check the exit code
This workaround stage prevents srun from returning 0 to the upper stages
when a signal happens after MPI_Finalize. It writes the return code to a
file named .srun.rc.$rank and later checks that it exists and contains a 0.

When the program is killed, it exits with a non-zero code and the error is
propagated to the baywatch stage, which aborts immediately without
creating the rc file.
2021-04-16 09:29:26 +02:00
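The rc-file check described in this commit can be sketched as below. This is a minimal illustration in Python, not the actual stage (which is not shown here): the function names `baywatch` and `all_ranks_ok`, the `workdir` argument and the way the rank is passed in are all assumptions. Only the file name pattern `.srun.rc.$rank` and the "exists and contains a 0" rule come from the commit message.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def rc_path(workdir, rank):
    # File name pattern taken from the commit message: .srun.rc.$rank
    return Path(workdir) / f".srun.rc.{rank}"


def baywatch(cmd, rank, workdir):
    """Run cmd and record its return code, only if it finished cleanly."""
    rc = subprocess.run(cmd).returncode
    if rc != 0:
        # Killed or failed: abort immediately, without creating the rc file.
        return rc
    rc_path(workdir, rank).write_text(f"{rc}\n")
    return rc


def all_ranks_ok(workdir, nranks):
    """Later check: every rank must have left an rc file containing 0."""
    for rank in range(nranks):
        path = rc_path(workdir, rank)
        if not path.exists() or path.read_text().strip() != "0":
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        good = [sys.executable, "-c", "pass"]
        bad = [sys.executable, "-c", "import sys; sys.exit(1)"]
        baywatch(good, 0, workdir)
        baywatch(bad, 1, workdir)
        print(all_ranks_ok(workdir, 1))  # only rank 0 considered
        print(all_ranks_ok(workdir, 2))  # rank 1 left no rc file
```

Because the verdict depends on the files rather than on the launcher's exit status, a launcher that wrongly reports 0 cannot hide a missing or non-zero rc file.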
604cfd90a3 test: add sigsegv after MPI_Finalize test
The current srun version used in MN4 returns 0 if the program crashes
after MPI_Finalize, as shown by this test.
2021-04-16 09:28:02 +02:00
07253c3fa0 fwi: update figure index 2021-04-14 17:18:46 +02:00
eab323a13a fwi: update io figure 2021-04-14 17:18:24 +02:00
8ce2a68cd7 fwi: update strong scaling figure script 2021-04-14 17:16:12 +02:00
99c6196734 fwi: update granularity figure 2021-04-14 17:05:09 +02:00
dd75a840ce fwi: use enableIO instead of ioFreq 2021-04-12 20:09:17 +02:00
e49e3b087f fwi: rename big io experiment 2021-04-12 19:49:31 +02:00
59040d9355 fwi: fix inverted resources 2021-04-12 19:31:35 +02:00
6422741cb7 fwi: merge io experiments into one file
The enableExtended parameter controls whether the experiment runs with
multiple nodes or only one.
2021-04-12 19:27:45 +02:00
99beac9b23 fwi: generate the model in every node
As we are using local storage, we need a copy of the input in every
node. The current method is to run the generator only in the rank that
has CPU 0 assigned in its mask.
2021-04-12 19:01:10 +02:00
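The "CPU 0 in the mask" rule can be sketched as below. This is an illustration in Python, not the code from the commit: the function names are hypothetical, and it assumes that the ranks sharing a node have disjoint affinity masks, so that exactly one rank per node holds CPU 0. `os.sched_getaffinity` is only available on Linux.

```python
import os


def should_generate(affinity):
    """True for the rank whose CPU mask includes CPU 0."""
    return 0 in affinity


def current_affinity():
    # CPUs the calling process may run on (Linux only).
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if should_generate(current_affinity()):
        print("this rank would run the model generator")
    else:
        print("this rank would skip generation")
```

Under that assumption each node elects one generator rank without any communication, so every node ends up with a copy of the input in its local storage.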