344 Commits

Author SHA1 Message Date
66fe179a9d
weasel: add custom nix-serve
Proper override for haskell package

madness

Fix nix-serve-ng override
2025-10-06 17:58:28 +02:00
0a282d856d
Add https github to allowed uris 2025-10-06 17:19:42 +02:00
28d925e446
Make hydra shut up 2025-10-06 17:19:42 +02:00
926de4d1d5
Add bscpm and gitlab-internal to allowed-uris 2025-10-06 17:19:42 +02:00
ca854704e8
weasel: enable hydra tcp port in firewall 2025-10-06 17:19:42 +02:00
c5b70bebe0
hydra: set listen host 2025-10-06 17:19:42 +02:00
fe06e38761
Enable hydra on weasel 2025-10-06 17:19:42 +02:00
0f1e9d7ccb
weasel: use tent cache 2025-10-06 17:19:41 +02:00
5a3184f2f7
Add nixfmt-rfc-style to common packages 2025-10-06 17:19:41 +02:00
3a8ed797c7
Add packages to user abonerib 2025-10-06 17:19:41 +02:00
ec79ed4d0e
Add nix-output-monitor to default packages 2025-10-06 17:19:41 +02:00
3ebb00d1c0
Set fish shell for user abonerib 2025-10-06 17:19:41 +02:00
8f3b13ec3f
weasel: create user folders in /var/lib/podman-users
/home is a nfs mount, which does not support extra filesystem arguments
needed to run podman. We need to have a local home.
2025-10-06 17:19:41 +02:00
b7b9160d03
weasel: add podman 2025-10-06 17:19:40 +02:00
00456a86b7
Enable custom sys-devices system feature 2025-10-02 17:54:48 +02:00
e42058f08b Allow access to hut from fox
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-10-02 17:03:21 +02:00
f3bfe89f27 Fetch website from its own git repository
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-02 15:45:21 +02:00
b040bebd1d Add acinca user
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 12:27:43 +02:00
f69629d2da Restart slurmd on failure
A failure to reach the control node can cause slurmd to fail and the
unit remains in the failed state until is manually restarted. Instead,
try to restart the service every 30 seconds, forever:

    owl1% systemctl show slurmd | grep -E 'Restart=|RestartUSec='
    Restart=on-failure
    RestartUSec=30s
    owl1% pgrep slurmd
    5903
    owl1% sudo kill -SEGV 5903
    owl1% pgrep slurmd
    6137

Fixes: rarias/jungle#177
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-30 17:20:39 +02:00
0668f0db74 Lower connect timeout when using hut substituter
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-09-29 18:44:48 +02:00
5fcd57a061 Use hut substituter in all nodes
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-09-29 18:44:38 +02:00
ad1544759f Remove machine access for user csiringo
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-09-29 18:23:24 +02:00
e1c950a530 Mount apex /home via NFS in raccoon
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-26 12:28:53 +02:00
f9632c37f8 Remove extra SSH jump configuration
We now have direct visibility among nodes so we don't need any extra
SSH configuration to reach them.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-26 12:28:51 +02:00
1f0cb4ae76 Add raccoon peer to wireguard
It routes traffic from fox, apex and the compute nodes so that we can
reach the git servers and tent.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-26 12:28:48 +02:00
e98fdb89ab Restrict fox peer to a single IP
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-26 12:28:43 +02:00
6afe05b5fd Use lowercase peer hostnames
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-26 12:28:25 +02:00
7d5aebf882 Share a public folder for documents
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-19 10:59:40 +02:00
4da7780472 Add amd_hsmp module in fox for AMD uProf
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-19 10:54:24 +02:00
d6126501ba Disable NMI watchdog in fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-19 10:54:17 +02:00
e6e4846529 Add AMD uProf module and enable it in fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-09-19 10:54:05 +02:00
ff0fc18d0a Mount home via NFS from apex in fox
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 15:34:02 +02:00
19c7e32678 Allow access to NFS via wireguard subnet
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 15:33:47 +02:00
017c19e7d0 Use 10.106.0.0/24 subnet to avoid collisions
The 106 byte is the code for 'j' (jungle) in ASCII:

	% printf j | od -t d
	0000000         106
	0000001

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:03:13 +02:00
a36eff8749 Revert "Remove pam_slurm_adopt from fox"
This reverts commit 1eac0fcad8211195499bc566e6c70312b31af700.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:03:06 +02:00
df17b11458 Enable fail2ban in fox
Protect fox against ssh bruteforce attacks:

fox% sudo lastb | head
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:25 - 11:25  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:24 - 11:24  (00:00)
root     ssh:notty    200.124.28.102   Mon Sep  1 11:24 - 11:24  (00:00)

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:03:02 +02:00
0dc7b7eb3d Accept connections from apex to fox slurmd
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:03:00 +02:00
dff6eaf587 Accept fox connection to slurm controller
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:02:59 +02:00
4b6b67b587 Add fox machine to SLURM
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:02:57 +02:00
6bbfb0d124 Make apex host specific to each machine
Allows direct contact via the VPN when accessing from fox, but use
Internet when using the rest of the machines.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:02:49 +02:00
46d03d5ca7 Add local host fox in apex
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:02:46 +02:00
e366e6ce87 Enable wireguard in apex
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:02:43 +02:00
e415f70bbb Add wireguard server in fox
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-09-03 12:02:38 +02:00
200c727bbf Use writeShellScript for suspend.sh and resume.sh
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-08-29 12:35:28 +02:00
7413021440 Add firewall rules to slurm server
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-08-29 12:35:26 +02:00
20b4805335 Remove hut from slurm
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-08-29 12:35:24 +02:00
f7dff9deab Only configure apex as slurm server
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-08-29 12:35:22 +02:00
f569933732 Split slurm configuration for client and server
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-08-29 12:35:20 +02:00
ee895d2e4f Move slurm control server to apex
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-08-29 12:35:16 +02:00
5ee8623af2 Fix typo in csiringo ssh key
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-08-27 17:44:20 +02:00