|
77cb3c494e
|
Poweroff idle slurm nodes after 1 hour
|
2023-09-08 16:49:53 +02:00 |
|
|
6db5772ac4
|
Add IB and IPMI node host names
|
2023-09-08 13:21:37 +02:00 |
|
|
dca274d020
|
Unlock ovni gitlab runners
|
2023-09-05 16:59:45 +02:00 |
|
|
02f40a8217
|
Add agenix to all nodes
|
2023-09-04 22:10:43 +02:00 |
|
|
77d43b6da9
|
Add agenix module to ceph
|
2023-09-04 22:07:07 +02:00 |
|
|
ab55aac5ff
|
Remove old secrets
|
2023-09-04 22:04:32 +02:00 |
|
|
9b5bfbb7a3
|
Mount /ceph in owl1 and owl2
|
2023-09-04 22:00:36 +02:00 |
|
|
a69a71d1b0
|
Warn about the owl2 omnipath device
|
2023-09-04 22:00:17 +02:00 |
|
|
98374bd303
|
Clean owl2 configuration
|
2023-09-04 21:59:56 +02:00 |
|
|
3b6be8a2fc
|
Move the ceph client config to an external module
|
2023-09-04 21:59:04 +02:00 |
|
|
2bb366b9ac
|
Reorganize secrets and ssh keys
The agenix tool needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
|
2023-09-04 21:36:31 +02:00 |
|
|
2d16709648
|
Add anavarro user
|
2023-09-04 16:00:01 +02:00 |
|
|
9344daa31c
|
Set zsh inc_append_history option
|
2023-09-03 16:57:53 +02:00 |
|
|
80c98041b5
|
Set zsh shell for rarias
|
2023-09-03 16:46:27 +02:00 |
|
|
3418e57907
|
Enable zsh and fix key bindings
|
2023-09-03 16:42:04 +02:00 |
|
|
6848b58e39
|
Keep a log over time with the config commits
|
2023-09-03 00:02:14 +02:00 |
|
|
f9c77b433a
|
Store nixos config in /etc/nixos/config.rev
|
2023-09-02 23:37:11 +02:00 |
|
|
9d487845f6
|
Enable binary emulation for other architectures
|
2023-08-31 17:27:08 +02:00 |
|
|
3c99c2a662
|
Enable watchdog
|
2023-08-30 16:32:17 +02:00 |
|
|
7d09108c9f
|
Enable all osd on boot in lake2
|
2023-08-30 16:32:17 +02:00 |
|
|
0f0a861896
|
Scrape lake2 too
|
2023-08-29 12:33:26 +02:00 |
|
|
beb0d5940e
|
Also enable monitoring in lake2
|
2023-08-29 12:29:41 +02:00 |
|
|
70321ce237
|
Scrape metrics from bay
|
2023-08-29 11:58:00 +02:00 |
|
|
5bd1d67333
|
Add monitoring in the bay node
|
2023-08-29 11:53:32 +02:00 |
|
|
fad9df61e1
|
Add fio tool
|
2023-08-29 11:27:50 +02:00 |
|
|
d2a80c8c18
|
Add ceph tools in hut too
|
2023-08-28 17:58:21 +02:00 |
|
|
599613d139
|
Switch ceph logs to journal
|
2023-08-28 17:58:08 +02:00 |
|
|
cb3a7b19f7
|
Move pkgs overlay to overlay.nix
|
2023-08-25 18:12:00 +02:00 |
|
|
f5d6bf627b
|
Enable ceph osd daemons in lake2
|
2023-08-25 14:54:51 +02:00 |
|
|
f1ce815edd
|
Add the lake2 hostname to the hosts
|
2023-08-25 14:44:35 +02:00 |
|
|
a2075cfd65
|
Use the sda for lake2
|
2023-08-25 13:40:10 +02:00 |
|
|
8f1f6f92a8
|
Remove netboot module
|
2023-08-25 13:39:01 +02:00 |
|
|
3416416864
|
Disable pixiecore in hut for now
|
2023-08-25 13:21:00 +02:00 |
|
|
815888fb07
|
Add PXE helper
|
2023-08-25 12:05:33 +02:00 |
|
|
029d9cb1db
|
Enable netboot again for PXE
|
2023-08-24 19:08:23 +02:00 |
|
|
95fa67ede1
|
Specify the disk by path
|
2023-08-24 15:27:37 +02:00 |
|
|
a19347161f
|
Prepare lake2 config after bootstrap
The disk ID is different under NixOS.
|
2023-08-24 13:54:53 +02:00 |
|
|
58c1cc1f7c
|
Add lake2 bootstrap config
|
2023-08-24 12:30:46 +02:00 |
|
|
077eece6b9
|
Add agenix to PATH in hut
|
2023-08-23 17:42:50 +02:00 |
|
|
b3ef53de51
|
Store ceph secret key in age
This allows a node to mount the ceph FS without any extra ceph
configuration in /etc/ceph.
|
2023-08-23 17:26:44 +02:00 |
|
|
e0852ee89b
|
Add rarias key for secrets
|
2023-08-23 17:15:26 +02:00 |
|
|
dfffc0bdce
|
Add ceph metrics to prometheus
|
2023-08-22 16:33:55 +02:00 |
|
|
8257c245b1
|
Mount the ceph filesystem in hut
|
2023-08-22 16:15:46 +02:00 |
|
|
cd5853cf53
|
Add ceph config in bay
|
2023-08-22 15:58:48 +02:00 |
|
|
b677b827d4
|
Add the bay host name
|
2023-08-22 15:56:09 +02:00 |
|
|
b1d5185cca
|
Remove netboot and fixes
|
2023-08-22 12:12:15 +02:00 |
|
|
a7e66e2246
|
Add bay node
|
2023-08-22 12:12:15 +02:00 |
|
|
f8fb5fa4ff
|
Monitor power from other nodes via LAN
|
2023-08-22 11:28:54 +02:00 |
|
|
acf9b71f04
|
Increase prometheus retention time to one year
|
2023-08-22 11:28:54 +02:00 |
|
|
bf692e6e4e
|
Don't set all_proxy
|
2023-08-22 11:28:54 +02:00 |
|
|
55d6c17776
|
Allow access to devices for node_exporter
|
2023-07-28 13:55:35 +02:00 |
|
|
14b173f67e
|
GRUB version no longer needed
|
2023-07-27 17:22:20 +02:00 |
|
|
f892d43b47
|
Kill slurmd remaining processes on upgrade
|
2023-07-27 14:49:20 +02:00 |
|
|
79adbe76a8
|
koro: Add vlopez user
|
2023-07-21 13:00:43 +02:00 |
|
|
66fb848ba8
|
Add koro node
|
2023-07-21 13:00:08 +02:00 |
|
|
40b1a8f0df
|
eudy: Add fcsv3 and intermediate versions for testing
|
2023-07-21 11:27:51 +02:00 |
|
|
a0b9d10b14
|
eudy: Enable memory overcommit
|
2023-07-21 11:27:51 +02:00 |
|
|
4c309dea2f
|
eudy: disable all cpu mitigations
|
2023-07-21 11:27:51 +02:00 |
|
|
7c1fe1455b
|
Enable NTP using the BSC time server
|
2023-06-30 14:02:15 +02:00 |
|
|
2d4b178895
|
Add the ssfhead node as gateway
|
2023-06-30 14:01:35 +02:00 |
|
|
4dd25f2f89
|
Use our host names first by default
|
2023-06-23 16:22:18 +02:00 |
|
|
6dcd9d8144
|
Add DNS tools to resolve hosts
|
2023-06-23 16:15:45 +02:00 |
|
|
31be81d2b1
|
Lower perf_event_paranoid to -1
|
2023-06-23 16:01:27 +02:00 |
|
|
826cfdf43f
|
Set perf paranoid to 0 by default
|
2023-06-21 16:24:19 +02:00 |
|
|
a1f258c5ce
|
Add perf to packages
|
2023-06-21 15:41:06 +02:00 |
|
|
1c1d3f3231
|
Allow srun to specify the cpu binding
The task/affinity plugin needs to be selected.
|
2023-06-21 13:16:23 +02:00 |
|
|
623d46c03f
|
Move authorized keys to users.nix
|
2023-06-20 14:08:34 +02:00 |
|
|
518a4d6af3
|
Add rpenacob user
|
2023-06-20 12:54:26 +02:00 |
|
|
60077948d6
|
Add osumb to the system packages
|
2023-06-16 19:22:41 +02:00 |
|
|
1724535495
|
Use explicit order in overlays
|
2023-06-16 18:26:51 +02:00 |
|
|
ab04855382
|
Add mpich overlay
|
2023-06-16 18:26:51 +02:00 |
|
|
684d5e41c5
|
Add comments in slurm config
|
2023-06-16 18:26:50 +02:00 |
|
|
316ea18e24
|
Add eudy host key to known hosts
|
2023-06-16 17:29:48 +02:00 |
|
|
c916157fcc
|
Rename xeon08 to eudy
From Eudyptula, a little penguin.
|
2023-06-16 17:16:05 +02:00 |
|
|
94320d9256
|
Add ssh host keys
|
2023-06-16 12:01:12 +02:00 |
|
|
9f5941c2be
|
Set the name of the slurm cluster to jungle
|
2023-06-16 12:00:54 +02:00 |
|
|
fba0f7b739
|
Change owl hostnames
|
2023-06-16 11:42:39 +02:00 |
|
|
2e95281af5
|
Add owl and all partition
|
2023-06-16 11:34:00 +02:00 |
|
|
f4ac9f3186
|
Simplify flake and expose host pkgs
The configuration of the machines is now moved to m/
|
2023-06-16 11:31:31 +02:00 |
|