Commit Graph

54 Commits

Author SHA1 Message Date
37073f017d Add raccoon workstation with Nvidia GPU
The raccoon workstation has a Nvidia GTX 960 GPU which will be used for
CUDA experiments. The configuration uses the production Nvidia driver
at version 550 which still supports the GPU. The current CUDA 12.2
version is also supported by the driver.

The workstation has Internet access directly from the gateway, but name
resolution via Google DNS servers seems to be blocked, so we use BSC
servers for now.

The NixOS system is installed in a partition alongside the old Debian
system, until we decide that is no longer neccesary to keep both. The
old /home partition is not used as we are using the same UIDs and groups
from the xeon machines, which don't match the ones here.
2024-06-07 10:47:10 +02:00
4077a87021 Split xeon specific configuration from base
To accomodate the raccoon knights workstation, some of the configuration
pulled by m/common/main.nix has to be removed. To solve it, the xeon
specific parts are placed into m/common/xeon.nix and only the common
configuration is at m/common/base.nix.
2024-06-07 10:45:46 +02:00
cc31cb30c3 Control user access to each machine
The users.jungleUsers configuration option behaves like the users.users
option, but defines the list attribute `hosts` for each user, which
filters users so that only the user can only access those hosts.
2024-06-07 10:45:46 +02:00
ff792f5f48 Move slurm client in a separate module
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-02-13 11:11:17 +01:00
82f5d828c2 Use tmpfs in /tmp
The /tmp directory was using the SSD disk which is not erased across
boots. Nix will use /tmp to perform the builds, so we want it to be as
fast as possible. In general, all the machines have enough space to
handle large builds like LLVM.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-28 12:25:50 +01:00
dd341902fc BSC packages are no longer in bsc attribute
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-09 13:40:48 +01:00
67a4905a0a Don't log SLURM connection attempts from ssfhead 2023-10-06 15:22:04 +02:00
bc62e28ca3 Enable direnv integration 2023-09-20 09:32:58 +02:00
653d411b9e Remove bscpkgs from the registry and nixPath
This is done to prevent accidental evaluations where the nixpkgs input
of bscpkgs is still pointing to a different version that the one
specified in the jungle flake. Instead use jungle#bscpkgs.X to get a
package from bscpkgs.
2023-09-15 12:00:33 +02:00
a1e8cfea47 Don't fetch registry flakes from the net 2023-09-15 12:00:28 +02:00
10ca572aec Enable fstrim service 2023-09-12 16:39:45 +02:00
19a451db77 Add encrypted munge key with agenix 2023-09-08 19:05:45 +02:00
ec9be9bb62 Remove unused large port hole in firewall 2023-09-08 18:22:48 +02:00
7050c505b5 Allow only some ports for srun 2023-09-08 17:51:37 +02:00
033a1fe97b Block ssfhead from reaching our slurm daemon 2023-09-08 17:36:28 +02:00
77cb3c494e Poweroff idle slurm nodes after 1 hour 2023-09-08 16:49:53 +02:00
6db5772ac4 Add IB and IPMI node host names 2023-09-08 13:21:37 +02:00
02f40a8217 Add agenix to all nodes 2023-09-04 22:10:43 +02:00
2bb366b9ac Reorganize secrets and ssh keys
The agenix tools needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2023-09-04 21:36:31 +02:00
2d16709648 Add anavarro user 2023-09-04 16:00:01 +02:00
9344daa31c Set zsh inc_append_history option 2023-09-03 16:57:53 +02:00
80c98041b5 Set zsh shell for rarias 2023-09-03 16:46:27 +02:00
3418e57907 Enable zsh and fix key bindings 2023-09-03 16:42:04 +02:00
6848b58e39 Keep a log over time with the config commits 2023-09-03 00:02:14 +02:00
f9c77b433a Store nixos config in /etc/nixos/config.rev 2023-09-02 23:37:11 +02:00
3c99c2a662 Enable watchdog 2023-08-30 16:32:17 +02:00
beb0d5940e Also enable monitoring in lake2 2023-08-29 12:29:41 +02:00
cb3a7b19f7 Move pkgs overlay to overlay.nix 2023-08-25 18:12:00 +02:00
f1ce815edd Add the lake2 hostname to the hosts 2023-08-25 14:44:35 +02:00
dfffc0bdce Add ceph metrics to prometheus 2023-08-22 16:33:55 +02:00
b677b827d4 Add the bay host name 2023-08-22 15:56:09 +02:00
bf692e6e4e Don't set all_proxy 2023-08-22 11:28:54 +02:00
14b173f67e GRUB version no longer needed 2023-07-27 17:22:20 +02:00
f892d43b47 Kill slurmd remaining processes on upgrade 2023-07-27 14:49:20 +02:00
66fb848ba8 Add koro node 2023-07-21 13:00:08 +02:00
7c1fe1455b Enable NTP using the BSC time server 2023-06-30 14:02:15 +02:00
2d4b178895 Add the ssfhead node as gateway 2023-06-30 14:01:35 +02:00
4dd25f2f89 Use our host names first by default 2023-06-23 16:22:18 +02:00
6dcd9d8144 Add DNS tools to resolve hosts 2023-06-23 16:15:45 +02:00
31be81d2b1 Lower perf_event_paranoid to -1 2023-06-23 16:01:27 +02:00
826cfdf43f Set perf paranoid to 0 by default 2023-06-21 16:24:19 +02:00
a1f258c5ce Add perf to packages 2023-06-21 15:41:06 +02:00
1c1d3f3231 Allow srun to specify the cpu binding
The task/affinity plugin needs to be selected.
2023-06-21 13:16:23 +02:00
623d46c03f Move authorized keys to users.nix 2023-06-20 14:08:34 +02:00
518a4d6af3 Add rpenacob user 2023-06-20 12:54:26 +02:00
60077948d6 Add osumb to the system packages 2023-06-16 19:22:41 +02:00
1724535495 Use explicit order in overlays 2023-06-16 18:26:51 +02:00
ab04855382 Add mpich overlay 2023-06-16 18:26:51 +02:00
684d5e41c5 Add coments in slurm config 2023-06-16 18:26:50 +02:00
316ea18e24 Add eudy host key to known hosts 2023-06-16 17:29:48 +02:00