Update fox documentation for SLURM and FS

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Rodrigo Arias 2025-09-02 17:21:37 +02:00
parent 6c59bd1f38
commit 09be67c989


@@ -21,17 +21,28 @@ the detailed specifications:

## Access

To access the machine, request a SLURM session from [apex](/apex) using the
`fox` partition. If you need the machine for performance measurements, use an
exclusive reservation:

    apex% salloc -p fox --exclusive

Otherwise, specify the CPUs that you need so other users can also use the node
at the same time:

    apex% salloc -p fox -c 8

Then use `srun` to execute an interactive shell:

    apex% srun --pty $SHELL
    fox%
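
You can also pass a command to `srun` directly instead of starting a shell.
For example, to run a single command (output assuming the node reports its
hostname as `fox`):

    apex% srun hostname
    fox
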
Make sure you get all the CPUs you expect:

    fox% grep Cpus_allowed_list /proc/self/status
    Cpus_allowed_list: 0-191
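
For non-interactive work, the same options can go into a batch script. A
minimal sketch, assuming a hypothetical program `./my_program` and a one hour
time limit:

    #!/bin/sh
    #SBATCH --partition=fox
    #SBATCH --cpus-per-task=8
    #SBATCH --time=01:00:00

    # Runs once the job is scheduled on the fox node
    srun ./my_program

Then submit it from apex with sbatch (assuming the script is saved as
`job.sh`):

    apex% sbatch job.sh
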
Follow [these steps](/access) if you don't have access to apex or fox.

## CUDA

@@ -89,9 +100,5 @@ Then just run `nix develop` from the same directory:

The machine has several file systems available:

- `$HOME`: Mounted via NFS across all nodes. It is slow and has low capacity,
  so don't abuse it.
- `/ceph/home/$USER`: Shared Ceph file system across the jungle nodes. Slow but
  high capacity; it stores three redundant copies of every file.
- `/nvme{0,1}/$USER`: The two local NVMe disks, very fast and with large
  capacity (see the sketch after this list).
- `/tmp`: tmpfs, fast but not backed by a disk. It will be erased on reboot.
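
As an example of combining these file systems, here is a sketch of a run that
stages data on a local NVMe disk and saves the results to Ceph (the
`input.dat` file and `results` directory are hypothetical):

    # Work on the fast local disk
    fox% mkdir -p /nvme0/$USER/run && cd /nvme0/$USER/run
    fox% cp /ceph/home/$USER/input.dat .

    # ... run the computation on the local copy ...

    # Copy the results to Ceph, which keeps redundant copies
    fox% cp -r results /ceph/home/$USER/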