bf8f0ac583
Make blackbox exporter use the proxy
...
By default it tried to reach the targets through the default gateway,
but the login node has not re-enabled forwarding since the power cut of
2023-10-20. Using the proxy avoids relying on the gateway.
Reviewed-By: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
33c1da6c40
Add docker runner too
2025-10-01 16:40:16 +02:00
5dc41a86e5
Monitor gitlab.bsc.es too
2025-10-01 16:40:16 +02:00
697c3d884e
Monitor PM webpage via blackbox
2025-10-01 16:40:16 +02:00
5a537c7478
Temporarily disable pm runners
2025-10-01 16:40:16 +02:00
c06b706e49
Add runner for gitlab.bsc.es
2025-10-01 16:40:16 +02:00
270cff123d
Allow anonymous access to grafana
2025-10-01 16:40:16 +02:00
b4ede66387
Enable slurm-exporter service
2025-10-01 16:40:16 +02:00
00068cb11c
Monitor storage nodes via IPMI too
2025-10-01 16:40:16 +02:00
c26cff7bdb
Serve the nix store from hut
2025-10-01 16:40:16 +02:00
3385252f5f
Make exporters listen on localhost only
2025-10-01 16:40:16 +02:00
e35b51cd00
Power off idle slurm nodes after 1 hour
2025-10-01 16:40:16 +02:00
ac3817d99b
Unlock ovni gitlab runners
2025-10-01 16:40:16 +02:00
e7aa2d3fe3
Add agenix to all nodes
2025-10-01 16:40:16 +02:00
1f199c73f1
Remove old secrets
2025-10-01 16:40:16 +02:00
758ddc71cb
Move the ceph client config to an external module
2025-10-01 16:40:16 +02:00
224bafd20d
Reorganize secrets and ssh keys
...
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2025-10-01 16:40:16 +02:00
32d5adf900
Enable binary emulation for other architectures
2025-10-01 16:40:16 +02:00
6ada09fe91
Scrape lake2 too
2025-10-01 16:40:16 +02:00
0db43352ac
Scrape metrics from bay
2025-10-01 16:40:16 +02:00
7bb16f858e
Add fio tool
2025-10-01 16:40:16 +02:00
cb76e7afa3
Add ceph tools in hut too
2025-10-01 16:40:16 +02:00
4a40098459
Disable pixiecore in hut for now
2025-10-01 16:40:16 +02:00
c360937d52
Add PXE helper
2025-10-01 16:40:16 +02:00
b5c061be41
Add agenix to PATH in hut
2025-10-01 16:40:16 +02:00
33cc03eb34
Store ceph secret key in age
...
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2025-10-01 16:40:16 +02:00
ac1783c516
Add rarias key for secrets
2025-10-01 16:40:16 +02:00
71000731c0
Add ceph metrics to prometheus
2025-10-01 16:40:16 +02:00
e320e9ced4
Mount the ceph filesystem in hut
2025-10-01 16:40:16 +02:00
49153acfbd
Monitor power from other nodes via LAN
2025-10-01 16:40:15 +02:00
04c2974a8e
Increase prometheus retention time to one year
2025-10-01 16:40:15 +02:00
5e3470f3bf
Allow access to devices for node_exporter
2025-10-01 16:40:15 +02:00
6ec7353a27
Add owl and all partition
2025-10-01 16:40:15 +02:00
d679fd6314
Simplify flake and expose host pkgs
...
The configuration of the machines is now moved to m/
2025-10-01 16:40:15 +02:00