Mount the hut nix store for SLURM jobs #68

Merged
rarias merged 0 commits from slurm-shared-nix-store into master 2023-09-21 17:22:31 +02:00
rarias commented 2023-09-20 19:51:15 +02:00 (Migrated from pm.bsc.es)

Until we transition to a global nix store (#42) or fix the overlay problems (#41), this is an intermediate solution that allows us to run parallel jobs without copying derivations to the compute nodes.

The trick relies on the private mount namespace that systemd creates for the slurm daemon, in which /nix/store is replaced by a read-only mount of the hut store exported via NFS.

There are some drawbacks:

  • The local binaries in /run/current-system/sw/bin are not available, as the overlay FS doesn't work. But at least it allows us to run some jobs in the meantime.

  • The nix build/shell/develop commands run as if executed outside the slurm mount namespace, as they contact the daemon for build operations, and the daemon only sees the local store. As a result, nothing they build will appear inside the slurm namespace. The environment must be entered from the hut node first, and then the srun command must be launched with all dependencies in hut.

It seems to be immune to the overlay FS "caching" problem, where an ls of a missing path keeps failing even after the path becomes readable:
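The second drawback implies a fixed order of operations: enter the environment on hut, then launch srun from inside it. A minimal sketch of that workflow as a shell function follows. The name `hut_srun` and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration, and the sketch assumes srun forwards the caller's environment (including PATH) to the job, as the transcript in this description suggests.

```shell
# hut_srun PKG CMD [ARGS...]
# Hypothetical helper, meant to be run on the hut node.
# It enters a nix shell that provides PKG from nixpkgs (so the store paths are
# realised in the hut store) and launches CMD through srun from inside it.
# Set DRY_RUN to any non-empty value to print the command instead of running it.
hut_srun() {
    pkg=$1
    shift
    ${DRY_RUN:+echo} nix shell "nixpkgs#$pkg" --command srun "$@"
}
```

For example, `hut_srun cowsay cowsay hi from owl` would run the equivalent of `nix shell nixpkgs#cowsay` followed by `srun cowsay hi from owl`, both issued from hut.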

hut% nix eval nixpkgs#cowsay.outPath
"/nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0"
hut% ls -d /nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0
ls: cannot access '/nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0': No such file or directory
hut% srun ls -d /nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0
/run/current-system/sw/bin/ls: cannot access '/nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0': No such file or directory
srun: error: owl1: task 0: Exited with exit code 2
hut% nix shell nixpkgs#cowsay
hut% cowsay hi from hut
 _____________
< hi from hut >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
hut% srun cowsay hi from owl
 _____________
< hi from owl >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
hut% srun ls -d /nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0
/nix/store/k3xhwh54bf9z6xxsdz32lhq4h7c5fimj-cowsay-3.7.0
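The check performed by hand in the transcript above can be wrapped in a small helper. This is only a sketch: the function name `check_store_path` and the `LAUNCHER` variable are hypothetical, and the default launcher is assumed to be a plain `srun` with no extra options.

```shell
# check_store_path PATH
# Hypothetical helper: reports whether PATH is visible from inside a job.
# LAUNCHER is the command used to start the job (default: srun); it can be
# overridden, e.g. LAUNCHER=env, to perform the same check locally.
check_store_path() {
    path=$1
    if ${LAUNCHER:-srun} ls -d "$path" >/dev/null 2>&1; then
        echo "visible: $path"
    else
        echo "missing: $path"
        return 1
    fi
}
```

Running it before and after the environment is entered on hut should reproduce the two outcomes shown in the transcript.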
rarias commented 2023-09-20 19:51:16 +02:00 (Migrated from pm.bsc.es)

requested review from @arocanon

rarias commented 2023-09-20 19:51:16 +02:00 (Migrated from pm.bsc.es)

assigned to @rarias

arocanon commented 2023-09-21 11:19:45 +02:00 (Migrated from pm.bsc.es)

I expected to see some config changes for the owl nodes to forward builds to hut. Aren't they needed yet?

rarias commented 2023-09-21 11:29:17 +02:00 (Migrated from pm.bsc.es)

The builds on the owl machines are still done locally. This is required to switch to a new system, as we keep the profile installed in /nix/var/nix/profiles/system, which is loaded by the grub script, since we continue to boot from the disk.

The only way to alter the shared nix store seen by the slurm jobs is to issue the nix build/shell/develop commands from the hut node.

We could add a bind mount at /nix/var/daemon-socket to the slurmd systemd service and connect it to the hut daemon, so that builds can also be done on the compute nodes from within the slurm mount namespace. If this setup proves to work reliably we can try to add this capability later too, but for now it allows me to begin the ovni CI testing with multiple nodes.

arocanon commented 2023-09-21 11:58:33 +02:00 (Migrated from pm.bsc.es)

Good for me, but if we merge this to master, shouldn't we add some doc about this behavior?

rarias commented 2023-09-21 13:52:14 +02:00 (Migrated from pm.bsc.es)

added 1 commit

  • 9ee71114 - Document the hut shared nix store for SLURM

Compare with previous version

rarias commented 2023-09-21 13:53:20 +02:00 (Migrated from pm.bsc.es)

Yeah. I added some documentation under the owl page, but this will be better covered by the introductory guide I'm preparing, where I describe how to use the whole cluster to run jobs, build derivations, etc.

arocanon commented 2023-09-21 17:21:51 +02:00 (Migrated from pm.bsc.es)

Perfect, thank you very much!

arocanon commented 2023-09-21 17:22:01 +02:00 (Migrated from pm.bsc.es)

resolved all threads

arocanon commented 2023-09-21 17:22:19 +02:00 (Migrated from pm.bsc.es)

approved this merge request

arocanon (Migrated from pm.bsc.es) approved these changes 2024-05-29 10:53:28 +02:00
Reference: rarias/jungle#68