27 Commits

Author SHA1 Message Date
b4ede66387 Enable slurm-exporter service 2025-10-01 16:40:16 +02:00
00068cb11c Monitor storage nodes via IPMI too 2025-10-01 16:40:16 +02:00
c26cff7bdb Serve the nix store from hut 2025-10-01 16:40:16 +02:00
3385252f5f Make exporters listen in localhost only 2025-10-01 16:40:16 +02:00
e35b51cd00 Poweroff idle slurm nodes after 1 hour 2025-10-01 16:40:16 +02:00
ac3817d99b Unlock ovni gitlab runners 2025-10-01 16:40:16 +02:00
e7aa2d3fe3 Add agenix to all nodes 2025-10-01 16:40:16 +02:00
1f199c73f1 Remove old secrets 2025-10-01 16:40:16 +02:00
758ddc71cb Move the ceph client config to an external module 2025-10-01 16:40:16 +02:00
224bafd20d Reorganize secrets and ssh keys
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2025-10-01 16:40:16 +02:00
32d5adf900 Enable binary emulation for other architectures 2025-10-01 16:40:16 +02:00
6ada09fe91 Scrape lake2 too 2025-10-01 16:40:16 +02:00
0db43352ac Scrape metrics from bay 2025-10-01 16:40:16 +02:00
7bb16f858e Add fio tool 2025-10-01 16:40:16 +02:00
cb76e7afa3 Add ceph tools in hut too 2025-10-01 16:40:16 +02:00
4a40098459 Disable pixiecore in hut for now 2025-10-01 16:40:16 +02:00
c360937d52 Add PXE helper 2025-10-01 16:40:16 +02:00
b5c061be41 Add agenix to PATH in hut 2025-10-01 16:40:16 +02:00
33cc03eb34 Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2025-10-01 16:40:16 +02:00
ac1783c516 Add rarias key for secrets 2025-10-01 16:40:16 +02:00
71000731c0 Add ceph metrics to prometheus 2025-10-01 16:40:16 +02:00
e320e9ced4 Mount the ceph filesystem in hut 2025-10-01 16:40:16 +02:00
49153acfbd Monitor power from other nodes via LAN 2025-10-01 16:40:15 +02:00
04c2974a8e Increase prometheus retention time to one year 2025-10-01 16:40:15 +02:00
5e3470f3bf Allow access to devices for node_exporter 2025-10-01 16:40:15 +02:00
6ec7353a27 Add owl and all partition 2025-10-01 16:40:15 +02:00
d679fd6314 Simplify flake and expose host pkgs
The configuration of the machines is now moved to m/
2025-10-01 16:40:15 +02:00