a8c0ce5d06
Make blackbox exporter use the proxy
By default it tried to reach the targets through the default gateway,
but since the power outage of 2023-10-20 the login node has not
re-enabled forwarding, so it is better not to rely on it.
Reviewed-By: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
ca0937859d
Add docker runner too
2025-10-01 16:40:16 +02:00
4d362351cb
Monitor gitlab.bsc.es too
2025-10-01 16:40:16 +02:00
e9b4d87d9f
Monitor PM webpage via blackbox
2025-10-01 16:40:16 +02:00
457e403258
Temporarily disable pm runners
2025-10-01 16:40:16 +02:00
32b9cc17a9
Add runner for gitlab.bsc.es
2025-10-01 16:40:16 +02:00
fbabc06641
Allow anonymous access to grafana
2025-10-01 16:40:16 +02:00
b84066fde5
Enable slurm-exporter service
2025-10-01 16:40:16 +02:00
44667e8e40
Monitor storage nodes via IPMI too
2025-10-01 16:40:16 +02:00
66b5074ff1
Serve the nix store from hut
2025-10-01 16:40:16 +02:00
09ac1d6c13
Make exporters listen on localhost only
2025-10-01 16:40:16 +02:00
4c88f9a783
Power off idle slurm nodes after 1 hour
2025-10-01 16:40:16 +02:00
aa52236a80
Unlock ovni gitlab runners
2025-10-01 16:40:16 +02:00
6850bf3a71
Add agenix to all nodes
2025-10-01 16:40:16 +02:00
da92154d33
Remove old secrets
2025-10-01 16:40:16 +02:00
8cedffe040
Move the ceph client config to an external module
2025-10-01 16:40:16 +02:00
8a027d8b09
Reorganize secrets and ssh keys
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2025-10-01 16:40:16 +02:00
76e6ae2f00
Enable binary emulation for other architectures
2025-10-01 16:40:16 +02:00
042ca9e882
Scrape lake2 too
2025-10-01 16:40:16 +02:00
005a1be48a
Scrape metrics from bay
2025-10-01 16:40:16 +02:00
af29f639e2
Add fio tool
2025-10-01 16:40:16 +02:00
0fe025e8be
Add ceph tools in hut too
2025-10-01 16:40:16 +02:00
81baeee5b1
Disable pixiecore in hut for now
2025-10-01 16:40:16 +02:00
686f750c06
Add PXE helper
2025-10-01 16:40:16 +02:00
3c83996e26
Add agenix to PATH in hut
2025-10-01 16:40:16 +02:00
a4fc3d131a
Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2025-10-01 16:40:16 +02:00
660a8ae163
Add rarias key for secrets
2025-10-01 16:40:16 +02:00
91270b26bb
Add ceph metrics to prometheus
2025-10-01 16:40:16 +02:00
94ce6fedf9
Mount the ceph filesystem in hut
2025-10-01 16:40:16 +02:00
8fcb5a1079
Monitor power from other nodes via LAN
2025-10-01 16:40:15 +02:00
b80656228d
Increase prometheus retention time to one year
2025-10-01 16:40:15 +02:00
ae2007e2fe
Allow access to devices for node_exporter
2025-10-01 16:40:15 +02:00
6ec7353a27
Add owl and all partition
2025-10-01 16:40:15 +02:00
d679fd6314
Simplify flake and expose host pkgs
The configuration of the machines has been moved to m/
2025-10-01 16:40:15 +02:00