33 Commits

Author SHA1 Message Date
72658ee5e6 Add docker runner too 2023-10-04 07:55:26 +02:00
cfa3e08e4b Monitor gitlab.bsc.es too 2023-10-03 09:45:13 +02:00
10101c631d Monitor PM webpage via blackbox 2023-10-03 08:58:07 +02:00
4d865d7a7e Temporarily disable pm runners 2023-09-28 14:14:41 +02:00
d9511dab22 Add runner for gitlab.bsc.es 2023-09-28 14:11:30 +02:00
c3ecba513d Allow anonymous access to grafana 2023-09-22 10:50:14 +02:00
4ca4e0fae9 Enable slurm-exporter service 2023-09-21 21:38:34 +02:00
de3a28b7df Monitor storage nodes via IPMI too 2023-09-13 15:57:13 +02:00
826d6263fd Serve the nix store from hut 2023-09-12 12:19:43 +02:00
dd616a7fb1 Make exporters listen in localhost only 2023-09-08 18:13:04 +02:00
bdd03dac60 Poweroff idle slurm nodes after 1 hour 2023-09-08 13:31:23 +02:00
d91c9b7473 Unlock ovni gitlab runners 2023-09-05 16:24:27 +02:00
ae4ad95902 Add agenix to all nodes 2023-09-04 22:09:40 +02:00
8fc87885da Remove old secrets 2023-09-04 22:04:32 +02:00
c13022596a Move the ceph client config to an external module 2023-09-04 21:59:04 +02:00
875622ad0f Reorganize secrets and ssh keys
The agenix tools needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2023-09-04 21:36:31 +02:00
48727d3a88 Enable binary emulation for other architectures 2023-08-31 17:22:36 +02:00
4495cbf380 Scrape lake2 too 2023-08-29 12:33:26 +02:00
c47c190c79 Scrape metrics from bay 2023-08-29 11:58:00 +02:00
042e56b5b2 Add fio tool 2023-08-29 11:27:50 +02:00
a510a41eed Add ceph tools in hut too 2023-08-28 17:58:21 +02:00
300690df4c Disable pixiecore in hut for now 2023-08-25 13:21:00 +02:00
9d15c13a44 Add PXE helper 2023-08-25 12:03:30 +02:00
591a4c774e Add agenix to PATH in hut 2023-08-23 17:42:50 +02:00
e8d5eeb5cf Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2023-08-23 17:18:17 +02:00
2516559fac Add rarias key for secrets 2023-08-23 17:15:26 +02:00
bb8bf86051 Add ceph metrics to prometheus 2023-08-22 16:33:55 +02:00
2416ec7806 Mount the ceph filesystem in hut 2023-08-22 15:57:49 +02:00
199358a5e3 Monitor power from other nodes via LAN 2023-08-17 18:55:40 +02:00
776a582c10 Increase prometheus retention time to one year 2023-07-28 16:19:59 +02:00
b978839406 Allow access to devices for node_exporter 2023-07-28 13:48:30 +02:00
e0ab4e1408 Add owl and all partition 2023-06-16 11:34:00 +02:00
3cb263ea71 Simplify flake and expose host pkgs
The configuration of the machines is now moved to m/
2023-06-14 17:28:00 +02:00