Commit Graph

107 Commits

Author SHA1 Message Date
70321ce237 Scrape metrics from bay 2023-08-29 11:58:00 +02:00
5bd1d67333 Add monitoring in the bay node 2023-08-29 11:53:32 +02:00
fad9df61e1 Add fio tool 2023-08-29 11:27:50 +02:00
d2a80c8c18 Add ceph tools in hut too 2023-08-28 17:58:21 +02:00
599613d139 Switch ceph logs to journal 2023-08-28 17:58:08 +02:00
cb3a7b19f7 Move pkgs overlay to overlay.nix 2023-08-25 18:12:00 +02:00
f5d6bf627b Enable ceph osd daemons in lake2 2023-08-25 14:54:51 +02:00
f1ce815edd Add the lake2 hostname to the hosts 2023-08-25 14:44:35 +02:00
a2075cfd65 Use the sda for lake2 2023-08-25 13:40:10 +02:00
8f1f6f92a8 Remove netboot module 2023-08-25 13:39:01 +02:00
3416416864 Disable pixiecore in hut for now 2023-08-25 13:21:00 +02:00
815888fb07 Add PXE helper 2023-08-25 12:05:33 +02:00
029d9cb1db Enable netboot again for PXE 2023-08-24 19:08:23 +02:00
95fa67ede1 Specify the disk by path 2023-08-24 15:27:37 +02:00
a19347161f Prepare lake2 config after bootstrap
The disk ID is different under NixOS.
2023-08-24 13:54:53 +02:00
58c1cc1f7c Add lake2 bootstrap config 2023-08-24 12:30:46 +02:00
077eece6b9 Add agenix to PATH in hut 2023-08-23 17:42:50 +02:00
b3ef53de51 Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
2023-08-23 17:26:44 +02:00
e0852ee89b Add rarias key for secrets 2023-08-23 17:15:26 +02:00
dfffc0bdce Add ceph metrics to prometheus 2023-08-22 16:33:55 +02:00
8257c245b1 Mount the ceph filesystem in hut 2023-08-22 16:15:46 +02:00
cd5853cf53 Add ceph config in bay 2023-08-22 15:58:48 +02:00
b677b827d4 Add the bay host name 2023-08-22 15:56:09 +02:00
b1d5185cca Remove netboot and fixes 2023-08-22 12:12:15 +02:00
a7e66e2246 Add bay node 2023-08-22 12:12:15 +02:00
f8fb5fa4ff Monitor power from other nodes via LAN 2023-08-22 11:28:54 +02:00
acf9b71f04 Increase prometheus retention time to one year 2023-08-22 11:28:54 +02:00
bf692e6e4e Don't set all_proxy 2023-08-22 11:28:54 +02:00
55d6c17776 Allow access to devices for node_exporter 2023-07-28 13:55:35 +02:00
14b173f67e GRUB version no longer needed 2023-07-27 17:22:20 +02:00
f892d43b47 Kill slurmd remaining processes on upgrade 2023-07-27 14:49:20 +02:00
79adbe76a8 koro: Add vlopez user 2023-07-21 13:00:43 +02:00
66fb848ba8 Add koro node 2023-07-21 13:00:08 +02:00
40b1a8f0df eudy: Add fcsv3 and intermediate versions for testing 2023-07-21 11:27:51 +02:00
a0b9d10b14 eudy: Enable memory overcommit 2023-07-21 11:27:51 +02:00
4c309dea2f eudy: disable all cpu mitigations 2023-07-21 11:27:51 +02:00
7c1fe1455b Enable NTP using the BSC time server 2023-06-30 14:02:15 +02:00
2d4b178895 Add the ssfhead node as gateway 2023-06-30 14:01:35 +02:00
4dd25f2f89 Use our host names first by default 2023-06-23 16:22:18 +02:00
6dcd9d8144 Add DNS tools to resolve hosts 2023-06-23 16:15:45 +02:00
31be81d2b1 Lower perf_event_paranoid to -1 2023-06-23 16:01:27 +02:00
826cfdf43f Set perf paranoid to 0 by default 2023-06-21 16:24:19 +02:00
a1f258c5ce Add perf to packages 2023-06-21 15:41:06 +02:00
1c1d3f3231 Allow srun to specify the cpu binding
The task/affinity plugin needs to be selected.
2023-06-21 13:16:23 +02:00
623d46c03f Move authorized keys to users.nix 2023-06-20 14:08:34 +02:00
518a4d6af3 Add rpenacob user 2023-06-20 12:54:26 +02:00
60077948d6 Add osumb to the system packages 2023-06-16 19:22:41 +02:00
1724535495 Use explicit order in overlays 2023-06-16 18:26:51 +02:00
ab04855382 Add mpich overlay 2023-06-16 18:26:51 +02:00
684d5e41c5 Add coments in slurm config 2023-06-16 18:26:50 +02:00
316ea18e24 Add eudy host key to known hosts 2023-06-16 17:29:48 +02:00
c916157fcc Rename xeon08 to eudy
From Eudyptula, a little penguin.
2023-06-16 17:16:05 +02:00
94320d9256 Add ssh host keys 2023-06-16 12:01:12 +02:00
9f5941c2be Set the name of the slurm cluster to jungle 2023-06-16 12:00:54 +02:00
fba0f7b739 Change owl hostnames 2023-06-16 11:42:39 +02:00
2e95281af5 Add owl and all partition 2023-06-16 11:34:00 +02:00
f4ac9f3186 Simplify flake and expose host pkgs
The configuration of the machines is now moved to m/
2023-06-16 11:31:31 +02:00