64f077c4f6
stages: prepend the stage name to messages
2021-04-16 09:29:33 +02:00
7c94997023
control: add trap for bad exit
2021-04-16 09:29:33 +02:00
bde54c69c5
sbatch: store queued status
2021-04-16 09:29:33 +02:00
422d359b48
script: stop on error by default
2021-04-16 09:29:33 +02:00
71c06d02da
stages: add baywatch stage to check the exit code
...
This workaround stage prevents srun from returning 0 to the upper stages
when a signal happens after MPI_Finalize. It writes the return code to a
file named .srun.rc.$rank and later checks that the file exists and
contains a 0. When the program is killed, it exits with a non-zero code
and the error is propagated to the baywatch stage, which aborts
immediately without creating the rc file.
2021-04-16 09:29:26 +02:00
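
The check described in the baywatch commit above can be sketched roughly as
follows. This is a minimal illustration and not the actual stage: only the
file name .srun.rc.$rank comes from the commit message, while the function
names run_and_record and check_rc are hypothetical.

```shell
# Hypothetical sketch of the rc-file check. Only the file name
# .srun.rc.<rank> is taken from the commit message.

# Run a command for the given rank. If it fails or is killed, return its
# status immediately, without creating the rc file. Otherwise record 0.
run_and_record() {
    rank="$1"
    shift
    "$@" || return $?
    echo 0 > ".srun.rc.$rank"
}

# Succeed only if the rc file for the rank exists and contains a 0.
check_rc() {
    [ -f ".srun.rc.$1" ] && [ "$(cat ".srun.rc.$1")" = "0" ]
}
```

An upper stage would then call check_rc for each rank once srun returns, and
treat a missing or non-zero file as a failed run even if srun itself
reported 0.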
b0af9b8608
srun: add postSrun hook
2021-04-12 17:41:59 +02:00
87fa3bb336
sbatch: assert types to avoid silent parse errors
2021-03-19 16:37:31 +01:00
051a74b85d
srun: allow commands to run before srun
2021-02-26 17:00:09 +01:00
8a77900201
srun: don't expand variables on install
2021-02-26 16:59:29 +01:00
ebcbf91fbe
exec: allow manual specification of program path
2021-02-23 15:22:18 +01:00
e5561b8735
control: save total execution time
2021-02-08 14:14:08 +01:00
2b9c3da911
Add script stage
2021-01-12 18:19:49 +01:00
aeac1a6068
exec: Force newlines
...
Allow single-line commands like pre="true"
2021-01-11 19:15:37 +01:00
130fe39c8e
exec: Abort on error
...
We need to exit on the first error, as otherwise we cannot track a bad
execution when no exec is done (when post is not empty).
2021-01-11 18:29:30 +01:00
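
The problem addressed by the commit above can be reproduced with a small
sketch. It assumes, for illustration only, that the generated script runs
the program and then the post code as consecutive commands. The variable
names stage_body, without_e and with_e are hypothetical.

```shell
# Hypothetical stand-in for a generated stage script: a failing program
# followed by post code that succeeds.
stage_body='false
true'

# Without -e the script reports the status of its last command (the post
# code), so the failure of the program is lost.
sh -c "$stage_body" && without_e=0 || without_e=$?

# With -e the script stops at the first failing command and reports it.
sh -e -c "$stage_body" && with_e=0 || with_e=$?

echo "without -e: $without_e, with -e: $with_e"
```

Without -e the recorded status is 0 even though the program failed, while
with -e the status is non-zero.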
7d4db6b6de
control: Exit on error
...
This prevents srun from silently returning with an error, without
actually queueing the job of a run.
2020-12-07 16:33:40 +01:00
1bdeca9e7d
unit: Remove dangerous slash from index names
2020-12-03 16:33:48 +01:00
c858f521bf
isolate: add $TMPDIR in the namespace
2020-12-03 13:22:10 +01:00
da4bbf8533
isolate: only load some files from /etc
2020-12-03 12:04:51 +01:00
f87d830218
isolate: preserve TERM
2020-12-02 13:06:55 +01:00
3d352fee19
isolate: allow argument passing
2020-12-02 13:06:35 +01:00
1f841649f8
exec: add support for nixPrefix
2020-12-02 11:57:40 +01:00
a147a396d9
trebuchet: add the experiment as attribute
2020-11-20 15:35:36 +01:00
8bc5656461
tools: recursive getExperiment
...
It allows getExperimentStage to be called from any stage above the
experiment.
2020-11-20 15:34:14 +01:00
d192a59fdc
control: Export the run iteration
2020-11-20 15:32:41 +01:00
734d494d96
stdexp: Allow extra mounts
2020-11-20 15:30:47 +01:00
David Alvarez
0c438d4dac
Setup for test experiment
2020-11-20 13:57:12 +01:00
e8f649327a
exec: Avoid variable expansion at build
...
All bash variables passed in env, pre or post are now expanded at
execution time.
2020-11-20 13:54:45 +01:00
e1e34ddf75
exec: add pre and post code to allow cleanup tasks
2020-11-17 16:09:38 +01:00
641e752bd5
Add a trace message at unit evaluation
2020-11-17 11:12:12 +01:00
317409f6ac
Move index and out inside the user directory
2020-11-03 19:10:00 +01:00
5e2797bcde
Create index files for the experiments
2020-11-03 19:10:00 +01:00
efd7df068e
Print full experiment path
2020-11-03 19:10:00 +01:00
3bd4e61f3f
WIP: Testing with automatic fetching
2020-11-03 19:09:59 +01:00
59346fa97e
control: Add status file
2020-11-03 19:09:59 +01:00
4beb069627
WIP: postprocessing pipeline
...
Now each run is executed in an independent folder
2020-11-03 19:09:59 +01:00
2680dcb66f
Don't nest the unit results
...
The experiment directory now contains symlinks to the units, keeping the
old structure. The unit results are directly placed in the garlic out
directory.
2020-11-03 19:09:58 +01:00
c3659d316d
Add perf stage
2020-11-03 19:09:58 +01:00
80ccd1240a
Less verbose execution
2020-10-14 16:29:22 +02:00
9d8f7d9074
Print the experiment being run
2020-10-14 16:28:27 +02:00
c7d2e2d866
Write the unit config in a file
2020-10-14 16:27:47 +02:00
7a37913b4e
Set the ssh host from the machine config
2020-10-13 14:30:03 +02:00
a38ff31cca
Introduce the runexp stage
2020-10-13 13:00:59 +02:00
6ab448b10a
Fix trebuchet description
2020-10-09 20:28:00 +02:00
4de20d3aa5
Remove old stages and update some
2020-10-09 20:12:52 +02:00
27bc977590
Remove strace from isolate stage
2020-10-09 19:50:28 +02:00
332b738889
Move apps into garlic/apps
2020-10-09 16:42:06 +02:00
a576be8031
WIP stage redesign
2020-10-09 16:42:06 +02:00
654e243735
Include an index in the trebuchet
2020-10-09 16:42:06 +02:00
45afe7d391
Simplify experiment stage
2020-10-09 16:42:06 +02:00
d599b8c52f
New naming convention
2020-10-09 16:42:06 +02:00