65 Commits

Author SHA1 Message Date
d335d69ba6 Add BSC machines to ssh config
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:51 +01:00
260986b9f2 Delay nix-gc until /home is mounted
Prevents starting the garbage collector before the remote FS are
mounted, in particular /home. Otherwise, all the gcroots which have
symlinks in /home will be considered stale and they will be removed.

See: #79
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-09-20 09:45:30 +02:00
15afbe94bd Add dbautist user with access to hut
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-09-20 09:42:02 +02:00
efd35a9cd1 Set the serial console to ttyS1 in raccoon
Apparently the ttyS0 console doesn't exist but ttyS1 does:

  raccoon% sudo stty -F /dev/ttyS0
  stty: /dev/ttyS0: Input/output error
  raccoon% sudo stty -F /dev/ttyS1
  speed 9600 baud; line = 0;
  -brkint -imaxbel

The dmesg line agrees:

  00:03: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A

The console configuration is then moved from base to xeon to allow
changing it for the raccoon machine.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:56 +02:00
152b71e718 Add 10 min shutdown jitter to avoid spikes
The shutdown timer will fire at slightly different times for the
different nodes, so we slowly decrease the power consumption.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:44 +02:00
d17be714ec Program shutdown for August 2nd for all machines
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:36 +02:00
6e9d33b483 Allow ptrace to any process of the same user
Allows users to attach GDB to their own processes, without requiring
running the program with GDB from the start. It is only available in
compute nodes, the storage nodes continue with the restricted settings.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:09 +02:00
58abaefbc4 Add abonerib user to hut, raccon, owl1 and owl2
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:07 +02:00
5ea7827a8a Grant rpenacob access to owl1 and owl2 nodes
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:05 +02:00
b17e4a13f9 Access private repositories via hut SSH proxy
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:03 +02:00
9c4e60c2c2 Set the default proxy to point to hut
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:35:56 +02:00
a0dab66aa5 Move vlopez user to jungleUsers for koro host
Access to other machines can be easily added into the "hosts" attribute
without the need to replicate the configuration.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:39 +02:00
24ee74d614 Split xeon specific configuration from base
To accomodate the raccoon knights workstation, some of the configuration
pulled by m/common/main.nix has to be removed. To solve it, the xeon
specific parts are placed into m/common/xeon.nix and only the common
configuration is at m/common/base.nix.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:37 +02:00
15b4b28d2c Control user access to each machine
The users.jungleUsers configuration option behaves like the users.users
option, but defines the list attribute `hosts` for each user, which
filters users so that only the user can only access those hosts.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:34 +02:00
7f17fe8874 Move slurm client in a separate module
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-02-13 11:11:17 +01:00
ed887b0412 Use tmpfs in /tmp
The /tmp directory was using the SSD disk which is not erased across
boots. Nix will use /tmp to perform the builds, so we want it to be as
fast as possible. In general, all the machines have enough space to
handle large builds like LLVM.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-28 12:25:50 +01:00
0d9c99a24e BSC packages are no longer in bsc attribute
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-09 13:40:48 +01:00
472f4b0334 Don't log SLURM connection attempts from ssfhead 2023-10-06 15:22:04 +02:00
70a9e855cf Enable direnv integration 2023-09-20 09:32:58 +02:00
aa64e9ef24 Remove bscpkgs from the registry and nixPath
This is done to prevent accidental evaluations where the nixpkgs input
of bscpkgs is still pointing to a different version that the one
specified in the jungle flake. Instead use jungle#bscpkgs.X to get a
package from bscpkgs.
2023-09-15 12:00:33 +02:00
ff98ba47c4 Don't fetch registry flakes from the net 2023-09-15 12:00:28 +02:00
2646ad4b70 Enable fstrim service 2023-09-12 16:39:45 +02:00
2a0254b684 Add encrypted munge key with agenix 2023-09-08 19:05:45 +02:00
e3e6e7662d Remove unused large port hole in firewall 2023-09-08 18:22:48 +02:00
f231dc81f1 Allow only some ports for srun 2023-09-08 17:51:37 +02:00
a758eef354 Block ssfhead from reaching our slurm daemon 2023-09-08 17:36:28 +02:00
9c9c41fb57 Poweroff idle slurm nodes after 1 hour 2023-09-08 16:49:53 +02:00
1a1708f16f Add IB and IPMI node host names 2023-09-08 13:21:37 +02:00
68f4d54dd1 Add agenix to all nodes 2023-09-04 22:10:43 +02:00
0a5f9b55f5 Reorganize secrets and ssh keys
The agenix tools needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2023-09-04 21:36:31 +02:00
900de39e2f Add anavarro user 2023-09-04 16:00:01 +02:00
1e466d07df Set zsh inc_append_history option 2023-09-03 16:57:53 +02:00
13807c5e8f Set zsh shell for rarias 2023-09-03 16:46:27 +02:00
d8d6d6d421 Enable zsh and fix key bindings 2023-09-03 16:42:04 +02:00
a242ddd39c Keep a log over time with the config commits 2023-09-03 00:02:14 +02:00
2c52ef9ff0 Store nixos config in /etc/nixos/config.rev 2023-09-02 23:37:11 +02:00
9d93760e6f Enable watchdog 2023-08-30 16:32:17 +02:00
db6bb90af8 Also enable monitoring in lake2 2023-08-29 12:29:41 +02:00
b4015ded86 Move pkgs overlay to overlay.nix 2023-08-25 18:12:00 +02:00
6c656182f1 Add the lake2 hostname to the hosts 2023-08-25 14:44:35 +02:00
d54dcc8d8f Add ceph metrics to prometheus 2023-08-22 16:33:55 +02:00
d7a4420205 Add the bay host name 2023-08-22 15:56:09 +02:00
f98af9aeef Don't set all_proxy 2023-08-22 11:28:54 +02:00
07411beb49 GRUB version no longer needed 2023-07-27 17:22:20 +02:00
544d5a3d69 Kill slurmd remaining processes on upgrade 2023-07-27 14:49:20 +02:00
45ac6e95e9 Add koro node 2023-07-21 13:00:08 +02:00
d20fa359d9 Enable NTP using the BSC time server 2023-06-30 14:02:15 +02:00
9be15fdad2 Add the ssfhead node as gateway 2023-06-30 14:01:35 +02:00
13e365002c Use our host names first by default 2023-06-23 16:22:18 +02:00
a38072762f Add DNS tools to resolve hosts 2023-06-23 16:15:45 +02:00