f4fcb7c72c
Set the SLURM_CONF variable
2025-10-01 16:40:16 +02:00
b4ede66387
Enable slurm-exporter service
2025-10-01 16:40:16 +02:00
ec351a157c
Add prometheus-slurm-exporter package
2025-10-01 16:40:16 +02:00
63d63fd39a
Mount the hut nix store for SLURM jobs
2025-10-01 16:40:16 +02:00
beae9d240e
Enable direnv integration
2025-10-01 16:40:16 +02:00
e925b00489
Remove bscpkgs from the registry and nixPath
...
This is done to prevent accidental evaluations where the nixpkgs input
of bscpkgs is still pointing to a different version than the one
specified in the jungle flake. Instead use jungle#bscpkgs.X to get a
package from bscpkgs.
2025-10-01 16:40:16 +02:00
5594e3615d
Add bscpkgs and nixpkgs top level attributes
...
Allows the evaluation of packages of the intermediate overlays.
2025-10-01 16:40:16 +02:00
384c4ee766
Use hut packages as the default package set
...
Allows the user to directly access nixpkgs and bscpkgs from the top
level as `nix build jungle#htop` and `nix build jungle#bsc.ovni`.
2025-10-01 16:40:16 +02:00
87871de141
Don't fetch registry flakes from the net
2025-10-01 16:40:16 +02:00
c5a058f96a
flake.lock: Update
...
Flake lock file updates:
• Updated input 'bscpkgs':
'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=6122fef92701701e1a0622550ac0fc5c2beb5906' (2023-09-07)
→ 'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=3a4062ac04be6263c64a481420d8e768c2521b80' (2023-09-14)
2025-10-01 16:40:16 +02:00
79077477e1
Revert "Update slurm to 23.02.05.1"
...
This reverts commit aaefddc44a9073166ac52b8bd56ac96258d3b053.
2025-10-01 16:40:16 +02:00
f5a6055f21
Open ports in firewall of compute nodes
2025-10-01 16:40:16 +02:00
0333b57851
Update slurm to 23.02.05.1
2025-10-01 16:40:16 +02:00
00068cb11c
Monitor storage nodes via IPMI too
2025-10-01 16:40:16 +02:00
a992b266bb
Enable fstrim service
2025-10-01 16:40:16 +02:00
c26cff7bdb
Serve the nix store from hut
2025-10-01 16:40:16 +02:00
1b5469af13
Add encrypted munge key with agenix
2025-10-01 16:40:16 +02:00
78c883a274
Remove unused large port hole in firewall
2025-10-01 16:40:16 +02:00
3385252f5f
Make exporters listen on localhost only
2025-10-01 16:40:16 +02:00
241b888a7c
Allow only some ports for srun
2025-10-01 16:40:16 +02:00
b7aba3d15c
Block ssfhead from reaching our slurm daemon
2025-10-01 16:40:16 +02:00
e35b51cd00
Power off idle slurm nodes after 1 hour
2025-10-01 16:40:16 +02:00
2e460f49bd
Add IB and IPMI node host names
2025-10-01 16:40:16 +02:00
a13a2caf57
flake.lock: Update
...
Flake lock file updates:
• Updated input 'bscpkgs':
'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=ee24b910a1cb95bd222e253da43238e843816f2f' (2023-09-01)
→ 'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=6122fef92701701e1a0622550ac0fc5c2beb5906' (2023-09-07)
2025-10-01 16:40:16 +02:00
ac3817d99b
Unlock ovni gitlab runners
2025-10-01 16:40:16 +02:00
c1d9b01ed1
flake.lock: Update
...
Flake lock file updates:
• Updated input 'bscpkgs':
'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=18d64c352c10f9ce74aabddeba5a5db02b74ec27' (2023-08-31)
→ 'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=ee24b910a1cb95bd222e253da43238e843816f2f' (2023-09-01)
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/d680ded26da5cf104dd2735a51e88d2d8f487b4d' (2023-08-19)
→ 'github:NixOS/nixpkgs/e56990880811a451abd32515698c712788be5720' (2023-09-02)
2025-10-01 16:40:16 +02:00
e7aa2d3fe3
Add agenix to all nodes
2025-10-01 16:40:16 +02:00
d2860ce437
Add agenix module to ceph
2025-10-01 16:40:16 +02:00
1f199c73f1
Remove old secrets
2025-10-01 16:40:16 +02:00
657a1b328a
Mount /ceph in owl1 and owl2
2025-10-01 16:40:16 +02:00
875e6fe6c7
Warn about the owl2 omnipath device
2025-10-01 16:40:16 +02:00
7abee55da4
Clean owl2 configuration
2025-10-01 16:40:16 +02:00
758ddc71cb
Move the ceph client config to an external module
2025-10-01 16:40:16 +02:00
224bafd20d
Reorganize secrets and ssh keys
...
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2025-10-01 16:40:16 +02:00
8b1fa938ea
Add anavarro user
2025-10-01 16:40:16 +02:00
94b110dc57
Set zsh inc_append_history option
2025-10-01 16:40:16 +02:00
0c5207bd2d
Set zsh shell for rarias
2025-10-01 16:40:16 +02:00
4200e6162d
Enable zsh and fix key bindings
2025-10-01 16:40:16 +02:00
c0ae33fbb5
Keep a log over time with the config commits
2025-10-01 16:40:16 +02:00
1ec4942794
Configure bscpkgs.nixpkgs to follow nixpkgs
2025-10-01 16:40:16 +02:00
ff00c2be8d
Store nixos config in /etc/nixos/config.rev
2025-10-01 16:40:16 +02:00
32d5adf900
Enable binary emulation for other architectures
2025-10-01 16:40:16 +02:00
ad491140f4
Enable watchdog
2025-10-01 16:40:16 +02:00
eac4c60e1a
Enable all osd on boot in lake2
2025-10-01 16:40:16 +02:00
6ada09fe91
Scrape lake2 too
2025-10-01 16:40:16 +02:00
9f71aae1ff
Also enable monitoring in lake2
2025-10-01 16:40:16 +02:00
0db43352ac
Scrape metrics from bay
2025-10-01 16:40:16 +02:00
27ae97c4d7
Add monitoring in the bay node
2025-10-01 16:40:16 +02:00
7bb16f858e
Add fio tool
2025-10-01 16:40:16 +02:00
cb76e7afa3
Add ceph tools in hut too
2025-10-01 16:40:16 +02:00