Antoni Navarro
03298228e4
nbody: add strong scaling experiment
2021-04-19 17:27:52 +02:00
Antoni Navarro
58294d4467
nbody: add "nodes or sockets" experiment
2021-04-19 17:27:52 +02:00
Antoni Navarro
48a61dc292
nbody: update indexes
2021-04-19 17:27:52 +02:00
Antoni Navarro
5815a9af09
nbody: move "old" experiments to another folder
2021-04-19 17:27:52 +02:00
Antoni Navarro
ea66d7e4e0
nbody: update granularity tests
2021-04-19 17:27:52 +02:00
3e197da8a3
hpcg: update figures and remove old ones
2021-04-19 16:05:10 +02:00
866d4561d3
hpcg: remove old experiments
2021-04-19 16:01:11 +02:00
9a88319153
hpcg: add granularity experiment
2021-04-19 16:00:55 +02:00
a96839d11a
hpcg: merge weak scaling and add size experiment
...
The scaling.nix file defines both the strong and weak scaling
experiments, selected with the "enableStrong" parameter.
2021-04-19 15:57:31 +02:00
a71ae9c2c6
hpcg: avoid mismatching names for gen units
2021-04-16 16:15:16 +02:00
d490ef2694
hpcg: remove unused extrae.xml file
2021-04-16 16:14:48 +02:00
b4e37a15a9
hpcg: refactor ss and gen using a common file
...
- The file gen.nix now provides an experiment for each unit, to reduce
the evaluation time.
- The pipeline is specified in the common.nix file only.
- The input dataset path is no longer symlinked, but is specified in the
"--load" argument.
- The size parameter is renamed from "n" to "sizePerTask".
2021-04-16 11:51:34 +02:00
9bb570af7f
tools: add floatTruncate function
2021-04-16 11:49:37 +02:00
Raúl Peñacoba
4d629fe8f7
hpcg: remove old comments
2021-04-16 09:32:28 +02:00
Raúl Peñacoba
f5c8d0cb88
hpcg: choose a smaller strong scaling problem size
2021-04-16 09:32:28 +02:00
Raúl Peñacoba
cb6577b439
hpcg: add strongscaling
...
HPCG rounds a problem size axis when its value is < 16
2021-04-16 09:32:28 +02:00
Raúl Peñacoba
b60a46b683
hpcg: add weakscaling over some nblocks to check which axis is better
2021-04-16 09:32:28 +02:00
Raúl Peñacoba
1a6075a2b1
hpcg: add first granularity/scalability exps for tampi+isend+oss+task
...
- oss.nix runs valid hpcg layouts whereas slices.nix does not
2021-04-16 09:32:28 +02:00
12ff1fd506
garlicd: send logs to the builder
2021-04-16 09:29:33 +02:00
732b0c0e9c
garlic tool: improve unit status information
2021-04-16 09:29:33 +02:00
64f077c4f6
stages: prepend the stage name to messages
2021-04-16 09:29:33 +02:00
7c94997023
control: add trap for bad exit
2021-04-16 09:29:33 +02:00
fb0dee4b61
exp: move exit1 experiment to slurm
2021-04-16 09:29:33 +02:00
bde54c69c5
sbatch: store queued status
2021-04-16 09:29:33 +02:00
2151e20bd6
exp: add exit1 experiment
...
Tests unit bad exits
2021-04-16 09:29:33 +02:00
886d16bcc6
garlic tool: add jq as dependency
...
So we can parse the experiment configuration in JSON
2021-04-16 09:29:33 +02:00
5c0f179830
stdexp: rename "name" to "clusterName"
2021-04-16 09:29:33 +02:00
422d359b48
script: stop on error by default
2021-04-16 09:29:33 +02:00
60248ab06b
article: remove unused figures
2021-04-16 09:29:33 +02:00
1cb63b464d
osu: adjust figures for publication
2021-04-16 09:29:33 +02:00
821b4f0d15
rplot: patch scales and fontconfig
2021-04-16 09:29:33 +02:00
0cf35decc5
osu: add mtu and eager experiments
2021-04-16 09:29:33 +02:00
26e3a86c78
garlic tool: check the presence of all the units
...
This check prevents a user from removing units between the
execution of the experiment and the fetch.
2021-04-16 09:29:33 +02:00
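The unit presence check from commit 26e3a86c78 could be sketched roughly as follows. This is only an illustration, not the garlic tool's code: the function name `missing_units`, the `out_dir` argument and the one-directory-per-unit layout are all assumptions.

```python
from pathlib import Path


def missing_units(expected_units, out_dir: Path):
    """Return the expected unit names that have no directory in out_dir.

    Hypothetical layout: each unit stores its results in out_dir/<unit name>.
    """
    return [unit for unit in expected_units if not (out_dir / unit).is_dir()]
```

A fetch step would refuse to continue when the returned list is not empty, which covers the case of a unit removed after the experiment ran.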
b96c39e0ba
noise: add srun signal bug to the list
2021-04-16 09:29:33 +02:00
f842f1e01d
slurm: add sigsegv experiment
...
Ensure that we can catch a sigsegv signal before and after the
MPI_Finalize call.
2021-04-16 09:29:33 +02:00
71c06d02da
stages: add baywatch stage to check the exit code
...
This workaround stage prevents srun from returning 0 to the upper stages
when a signal happens after MPI_Finalize. It writes the return code to a
file named .srun.rc.$rank and later checks that the file exists and
contains a 0. When the program is killed, it exits with a non-zero code
and the error is propagated to the baywatch stage, which aborts
immediately without creating the rc file.
2021-04-16 09:29:26 +02:00
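The final check described in commit 71c06d02da (one .srun.rc.$rank file per rank, each containing 0) might look like the sketch below. It is not the stage's actual implementation: the function name `check_rc_files`, the `workdir` argument and the assumption that ranks are numbered from 0 to nranks - 1 are introduced here for illustration only.

```python
from pathlib import Path


def check_rc_files(workdir: Path, nranks: int) -> bool:
    """Return True only if every rank left an rc file containing 0."""
    for rank in range(nranks):
        # File name taken from the commit message: .srun.rc.$rank
        rc_file = workdir / f".srun.rc.{rank}"
        # A missing file means the program died before its exit code
        # could be recorded, so the run is treated as failed.
        if not rc_file.is_file():
            return False
        # Any content other than 0 is a non-zero exit code.
        if rc_file.read_text().strip() != "0":
            return False
    return True
```

Treating a missing file as a failure is what lets the check catch a crash that srun itself reports as a success.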
604cfd90a3
test: add sigsegv after MPI_Finalize test
...
The current srun version used in MN4 returns 0 if the program crashes
after MPI_Finalize, as shown by this test.
2021-04-16 09:28:02 +02:00
07253c3fa0
fwi: update figure index
2021-04-14 17:18:46 +02:00
eab323a13a
fwi: update io figure
2021-04-14 17:18:24 +02:00
8ce2a68cd7
fwi: update strong scaling figure script
2021-04-14 17:16:12 +02:00
99c6196734
fwi: update granularity figure
2021-04-14 17:05:09 +02:00
dd75a840ce
fwi: use enableIO instead of ioFreq
2021-04-12 20:09:17 +02:00
e49e3b087f
fwi: rename big io experiment
2021-04-12 19:49:31 +02:00
59040d9355
fwi: fix inverted resources
2021-04-12 19:31:35 +02:00
6422741cb7
fwi: merge io experiments into one file
...
The enableExtended parameter controls whether the experiment runs with
multiple nodes or only one.
2021-04-12 19:27:45 +02:00
99beac9b23
fwi: generate the model in every node
...
As we are using local storage, we need a copy of the input in every
node. The current method is to run the generator only in the rank that
has CPU 0 assigned in its mask.
2021-04-12 19:01:10 +02:00
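The selection rule from commit 99beac9b23 (only the rank whose CPU mask includes CPU 0 runs the generator) can be illustrated with the sketch below. It is an assumption-laden example, not the experiment's code: the function name `should_generate` is hypothetical, and `os.sched_getaffinity` is only available on Linux.

```python
import os


def should_generate(cpu_mask=None) -> bool:
    """Return True if this process is the one that should run the generator.

    cpu_mask is a set of CPU numbers; when omitted, the affinity mask of
    the calling process is used (Linux only).
    """
    if cpu_mask is None:
        cpu_mask = os.sched_getaffinity(0)  # 0 means the calling process
    # With disjoint masks per rank, exactly one rank per node owns CPU 0.
    return 0 in cpu_mask
```

The rule yields one generator per node only when the ranks of a node are bound to disjoint CPU sets, which is assumed here.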
58dc277d3d
fwi: refactor ss-io with common.nix
...
Also, keep the names short and consistent.
2021-04-12 17:57:46 +02:00
47b326c646
fwi: generate the input at runtime
2021-04-12 17:46:07 +02:00
419e7f95cc
fwi: avoid input generation
...
The ModelGenerator is now included in the fwi-params, so that the input
can be generated at runtime.
2021-04-12 17:43:30 +02:00
b0af9b8608
srun: add postSrun hook
2021-04-12 17:41:59 +02:00