176 Commits

Author SHA1 Message Date
e7aa2d3fe3 Add agenix to all nodes 2025-10-01 16:40:16 +02:00
d2860ce437 Add agenix module to ceph 2025-10-01 16:40:16 +02:00
1f199c73f1 Remove old secrets 2025-10-01 16:40:16 +02:00
657a1b328a Mount /ceph in owl1 and owl2 2025-10-01 16:40:16 +02:00
875e6fe6c7 Warn about the owl2 omnipath device 2025-10-01 16:40:16 +02:00
7abee55da4 Clean owl2 configuration 2025-10-01 16:40:16 +02:00
758ddc71cb Move the ceph client config to an external module 2025-10-01 16:40:16 +02:00
224bafd20d Reorganize secrets and ssh keys
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2025-10-01 16:40:16 +02:00
8b1fa938ea Add anavarro user 2025-10-01 16:40:16 +02:00
94b110dc57 Set zsh inc_append_history option 2025-10-01 16:40:16 +02:00
0c5207bd2d Set zsh shell for rarias 2025-10-01 16:40:16 +02:00
4200e6162d Enable zsh and fix key bindings 2025-10-01 16:40:16 +02:00
c0ae33fbb5 Keep a log over time with the config commits 2025-10-01 16:40:16 +02:00
ff00c2be8d Store nixos config in /etc/nixos/config.rev 2025-10-01 16:40:16 +02:00
32d5adf900 Enable binary emulation for other architectures 2025-10-01 16:40:16 +02:00
ad491140f4 Enable watchdog 2025-10-01 16:40:16 +02:00
eac4c60e1a Enable all osd on boot in lake2 2025-10-01 16:40:16 +02:00
6ada09fe91 Scrape lake2 too 2025-10-01 16:40:16 +02:00
9f71aae1ff Also enable monitoring in lake2 2025-10-01 16:40:16 +02:00
0db43352ac Scrape metrics from bay 2025-10-01 16:40:16 +02:00
27ae97c4d7 Add monitoring in the bay node 2025-10-01 16:40:16 +02:00
7bb16f858e Add fio tool 2025-10-01 16:40:16 +02:00
cb76e7afa3 Add ceph tools in hut too 2025-10-01 16:40:16 +02:00
9cd45365ec Switch ceph logs to journal 2025-10-01 16:40:16 +02:00
a48ae143cc Move pkgs overlay to overlay.nix 2025-10-01 16:40:16 +02:00
637b48752e Enable ceph osd daemons in lake2 2025-10-01 16:40:16 +02:00
5f2fe97cd4 Add the lake2 hostname to the hosts 2025-10-01 16:40:16 +02:00
a21b95fd8b Use the sda for lake2 2025-10-01 16:40:16 +02:00
d49bf2f802 Remove netboot module 2025-10-01 16:40:16 +02:00
4a40098459 Disable pixiecore in hut for now 2025-10-01 16:40:16 +02:00
c360937d52 Add PXE helper 2025-10-01 16:40:16 +02:00
fa230307d8 Enable netboot again for PXE 2025-10-01 16:40:16 +02:00
1a9f2a72f2 Specify the disk by path 2025-10-01 16:40:16 +02:00
e952e716bf Prepare lake2 config after bootstrap
The disk ID is different under NixOS.
2025-10-01 16:40:16 +02:00
aa6140411a Add lake2 bootstrap config 2025-10-01 16:40:16 +02:00
b5c061be41 Add agenix to PATH in hut 2025-10-01 16:40:16 +02:00
33cc03eb34 Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2025-10-01 16:40:16 +02:00
ac1783c516 Add rarias key for secrets 2025-10-01 16:40:16 +02:00
71000731c0 Add ceph metrics to prometheus 2025-10-01 16:40:16 +02:00
e320e9ced4 Mount the ceph filesystem in hut 2025-10-01 16:40:16 +02:00
479d63f842 Add ceph config in bay 2025-10-01 16:40:16 +02:00
503a63539c Add the bay host name 2025-10-01 16:40:16 +02:00
3adaea0fdd Remove netboot and fixes 2025-10-01 16:40:15 +02:00
54083c60cd Add bay node 2025-10-01 16:40:15 +02:00
49153acfbd Monitor power from other nodes via LAN 2025-10-01 16:40:15 +02:00
04c2974a8e Increase prometheus retention time to one year 2025-10-01 16:40:15 +02:00
8cb7cf087c Don't set all_proxy 2025-10-01 16:40:15 +02:00
5e3470f3bf Allow access to devices for node_exporter 2025-10-01 16:40:15 +02:00
d92e06d7b7 GRUB version no longer needed 2025-10-01 16:40:15 +02:00
a096a386a0 Kill slurmd remaining processes on upgrade 2025-10-01 16:40:15 +02:00