31 Commits

Author SHA1 Message Date
fc9285f89d Monitor PM webpage via blackbox 2023-10-06 15:17:07 +02:00
fbe238f5b6 Temporarily disable pm runners 2023-10-06 15:17:07 +02:00
9874da566d Add runner for gitlab.bsc.es 2023-10-06 15:17:07 +02:00
ebc5c4d84f Allow anonymous access to grafana 2023-09-22 10:51:30 +02:00
5f492ee1d7 Enable slurm-exporter service 2023-09-21 21:40:02 +02:00
5a5f4672cd Monitor storage nodes via IPMI too 2023-09-13 15:57:13 +02:00
b120a7ca85 Serve the nix store from hut 2023-09-12 12:19:43 +02:00
868f825e26 Make exporters listen in localhost only 2023-09-08 18:13:04 +02:00
9c9c41fb57 Poweroff idle slurm nodes after 1 hour 2023-09-08 16:49:53 +02:00
eb9876aff6 Unlock ovni gitlab runners 2023-09-05 16:59:45 +02:00
68f4d54dd1 Add agenix to all nodes 2023-09-04 22:10:43 +02:00
2c8c90e6e4 Remove old secrets 2023-09-04 22:04:32 +02:00
74ec4eb22a Move the ceph client config to an external module 2023-09-04 21:59:04 +02:00
0a5f9b55f5 Reorganize secrets and ssh keys
The agenix tools needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2023-09-04 21:36:31 +02:00
acb91695ac Enable binary emulation for other architectures 2023-08-31 17:27:08 +02:00
e1d406023d Scrape lake2 too 2023-08-29 12:33:26 +02:00
1266c8f04e Scrape metrics from bay 2023-08-29 11:58:00 +02:00
86eacdd3e5 Add fio tool 2023-08-29 11:27:50 +02:00
4fa074f893 Add ceph tools in hut too 2023-08-28 17:58:21 +02:00
f18f1937ae Disable pixiecore in hut for now 2023-08-25 13:21:00 +02:00
4b78ec9134 Add PXE helper 2023-08-25 12:05:33 +02:00
832866cbfa Add agenix to PATH in hut 2023-08-23 17:42:50 +02:00
9fc393bb6a Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2023-08-23 17:26:44 +02:00
d81d9d58e1 Add rarias key for secrets 2023-08-23 17:15:26 +02:00
d54dcc8d8f Add ceph metrics to prometheus 2023-08-22 16:33:55 +02:00
a5fae4a289 Mount the ceph filesystem in hut 2023-08-22 16:15:46 +02:00
1622b3e7fc Monitor power from other nodes via LAN 2023-08-22 11:28:54 +02:00
3424cac761 Increase prometheus retention time to one year 2023-08-22 11:28:54 +02:00
e497e1b88b Allow access to devices for node_exporter 2023-07-28 13:55:35 +02:00
30c21155af Add owl and all partition 2023-06-16 11:34:00 +02:00
a43016ebee Simplify flake and expose host pkgs
The configuration of the machines is now moved to m/
2023-06-16 11:31:31 +02:00