341 Commits

Author SHA1 Message Date
5f492ee1d7 Enable slurm-exporter service 2023-09-21 21:40:02 +02:00
9071a4de8b Add prometheus-slurm-exporter package 2023-09-21 21:34:18 +02:00
3040a803b2 Mount the hut nix store for SLURM jobs 2023-09-20 19:38:43 +02:00
70a9e855cf Enable direnv integration 2023-09-20 09:32:58 +02:00
aa64e9ef24 Remove bscpkgs from the registry and nixPath
This is done to prevent accidental evaluations where the nixpkgs input
of bscpkgs is still pointing to a different version than the one
specified in the jungle flake. Instead use jungle#bscpkgs.X to get a
package from bscpkgs.
2023-09-15 12:00:33 +02:00
ba2b74fd5a Add bscpkgs and nixpkgs top level attributes
Allows the evaluation of packages of the intermediate overlays.
2023-09-15 12:00:33 +02:00
1ae5d9e25e Use hut packages as the default package set
Allows the user to directly access nixpkgs and bscpkgs from the top
level as `nix build jungle#htop` and `nix build jungle#bsc.ovni`.
2023-09-15 12:00:28 +02:00
ff98ba47c4 Don't fetch registry flakes from the net 2023-09-15 12:00:28 +02:00
599b23ef52 flake.lock: Update
Flake lock file updates:

• Updated input 'bscpkgs':
    'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=6122fef92701701e1a0622550ac0fc5c2beb5906' (2023-09-07)
  → 'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=3a4062ac04be6263c64a481420d8e768c2521b80' (2023-09-14)
2023-09-15 11:50:47 +02:00
8dbee06d1d Revert "Update slurm to 23.02.05.1"
This reverts commit 7bfd786c01c36131cd00b90fc6a9503fd1226578.
2023-09-14 15:46:18 +02:00
d522113cb9 Open ports in firewall of compute nodes 2023-09-14 15:45:43 +02:00
7bfd786c01 Update slurm to 23.02.05.1 2023-09-13 17:44:24 +02:00
5a5f4672cd Monitor storage nodes via IPMI too 2023-09-13 15:57:13 +02:00
2646ad4b70 Enable fstrim service 2023-09-12 16:39:45 +02:00
b120a7ca85 Serve the nix store from hut 2023-09-12 12:19:43 +02:00
2a0254b684 Add encrypted munge key with agenix 2023-09-08 19:05:45 +02:00
e3e6e7662d Remove unused large port hole in firewall 2023-09-08 18:22:48 +02:00
868f825e26 Make exporters listen on localhost only 2023-09-08 18:13:04 +02:00
f231dc81f1 Allow only some ports for srun 2023-09-08 17:51:37 +02:00
a758eef354 Block ssfhead from reaching our slurm daemon 2023-09-08 17:36:28 +02:00
9c9c41fb57 Poweroff idle slurm nodes after 1 hour 2023-09-08 16:49:53 +02:00
1a1708f16f Add IB and IPMI node host names 2023-09-08 13:21:37 +02:00
efe1b7e399 flake.lock: Update
Flake lock file updates:

• Updated input 'bscpkgs':
    'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=ee24b910a1cb95bd222e253da43238e843816f2f' (2023-09-01)
  → 'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=6122fef92701701e1a0622550ac0fc5c2beb5906' (2023-09-07)
2023-09-07 11:13:45 +02:00
eb9876aff6 Unlock ovni gitlab runners 2023-09-05 16:59:45 +02:00
8d31c552f5 flake.lock: Update
Flake lock file updates:

• Updated input 'bscpkgs':
    'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=18d64c352c10f9ce74aabddeba5a5db02b74ec27' (2023-08-31)
  → 'git+https://pm.bsc.es/gitlab/rarias/bscpkgs.git?ref=refs/heads/master&rev=ee24b910a1cb95bd222e253da43238e843816f2f' (2023-09-01)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/d680ded26da5cf104dd2735a51e88d2d8f487b4d' (2023-08-19)
  → 'github:NixOS/nixpkgs/e56990880811a451abd32515698c712788be5720' (2023-09-02)
2023-09-05 15:03:26 +02:00
68f4d54dd1 Add agenix to all nodes 2023-09-04 22:10:43 +02:00
2042d58b72 Add agenix module to ceph 2023-09-04 22:07:07 +02:00
2c8c90e6e4 Remove old secrets 2023-09-04 22:04:32 +02:00
208dcb7dde Mount /ceph in owl1 and owl2 2023-09-04 22:00:36 +02:00
e2f82a6383 Warn about the owl2 omnipath device 2023-09-04 22:00:17 +02:00
d704816de9 Clean owl2 configuration 2023-09-04 21:59:56 +02:00
74ec4eb22a Move the ceph client config to an external module 2023-09-04 21:59:04 +02:00
0a5f9b55f5 Reorganize secrets and ssh keys
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2023-09-04 21:36:31 +02:00
900de39e2f Add anavarro user 2023-09-04 16:00:01 +02:00
1e466d07df Set zsh inc_append_history option 2023-09-03 16:57:53 +02:00
13807c5e8f Set zsh shell for rarias 2023-09-03 16:46:27 +02:00
d8d6d6d421 Enable zsh and fix key bindings 2023-09-03 16:42:04 +02:00
a242ddd39c Keep a log over time with the config commits 2023-09-03 00:02:14 +02:00
a2c5fe1f5e Configure bscpkgs.nixpkgs to follow nixpkgs 2023-09-02 23:37:59 +02:00
2c52ef9ff0 Store nixos config in /etc/nixos/config.rev 2023-09-02 23:37:11 +02:00
acb91695ac Enable binary emulation for other architectures 2023-08-31 17:27:08 +02:00
9d93760e6f Enable watchdog 2023-08-30 16:32:17 +02:00
aad67b9d99 Enable all osd on boot in lake2 2023-08-30 16:32:17 +02:00
e1d406023d Scrape lake2 too 2023-08-29 12:33:26 +02:00
db6bb90af8 Also enable monitoring in lake2 2023-08-29 12:29:41 +02:00
1266c8f04e Scrape metrics from bay 2023-08-29 11:58:00 +02:00
2b7823788c Add monitoring in the bay node 2023-08-29 11:53:32 +02:00
86eacdd3e5 Add fio tool 2023-08-29 11:27:50 +02:00
4fa074f893 Add ceph tools in hut too 2023-08-28 17:58:21 +02:00
a260a1bc1b Switch ceph logs to journal 2023-08-28 17:58:08 +02:00