Commit Graph

62 Commits

Author SHA1 Message Date
4bd1648074 Set the serial console to ttyS1 in raccoon
Apparently the ttyS0 console doesn't exist but ttyS1 does:

  raccoon% sudo stty -F /dev/ttyS0
  stty: /dev/ttyS0: Input/output error
  raccoon% sudo stty -F /dev/ttyS1
  speed 9600 baud; line = 0;
  -brkint -imaxbel

The dmesg line agrees:

  00:03: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A

The console configuration is then moved from base to xeon to allow
changing it for the raccoon machine.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:56 +02:00
e15a3867d4 Add 10 min shutdown jitter to avoid spikes
The shutdown timer will fire at slightly different times for the
different nodes, so we slowly decrease the power consumption.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:44 +02:00
d988ef2eff Program shutdown for August 2nd for all machines
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:36 +02:00
fcfc6ac149 Allow ptrace to any process of the same user
Allows users to attach GDB to their own processes, without requiring
running the program with GDB from the start. It is only available in
compute nodes, the storage nodes continue with the restricted settings.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:09 +02:00
6e87130166 Add abonerib user to hut, raccon, owl1 and owl2
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:07 +02:00
06f9e6ac6b Grant rpenacob access to owl1 and owl2 nodes
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:05 +02:00
da07aedce2 Access private repositories via hut SSH proxy
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:03 +02:00
61427a8bf9 Set the default proxy to point to hut
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:35:56 +02:00
4c5e85031b Move vlopez user to jungleUsers for koro host
Access to other machines can be easily added into the "hosts" attribute
without the need to replicate the configuration.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:39 +02:00
72faf8365b Split xeon specific configuration from base
To accomodate the raccoon knights workstation, some of the configuration
pulled by m/common/main.nix has to be removed. To solve it, the xeon
specific parts are placed into m/common/xeon.nix and only the common
configuration is at m/common/base.nix.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:37 +02:00
0e22d6def8 Control user access to each machine
The users.jungleUsers configuration option behaves like the users.users
option, but defines the list attribute `hosts` for each user, which
filters users so that only the user can only access those hosts.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:34 +02:00
ff792f5f48 Move slurm client in a separate module
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-02-13 11:11:17 +01:00
82f5d828c2 Use tmpfs in /tmp
The /tmp directory was using the SSD disk which is not erased across
boots. Nix will use /tmp to perform the builds, so we want it to be as
fast as possible. In general, all the machines have enough space to
handle large builds like LLVM.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-28 12:25:50 +01:00
dd341902fc BSC packages are no longer in bsc attribute
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-09 13:40:48 +01:00
67a4905a0a Don't log SLURM connection attempts from ssfhead 2023-10-06 15:22:04 +02:00
bc62e28ca3 Enable direnv integration 2023-09-20 09:32:58 +02:00
653d411b9e Remove bscpkgs from the registry and nixPath
This is done to prevent accidental evaluations where the nixpkgs input
of bscpkgs is still pointing to a different version that the one
specified in the jungle flake. Instead use jungle#bscpkgs.X to get a
package from bscpkgs.
2023-09-15 12:00:33 +02:00
a1e8cfea47 Don't fetch registry flakes from the net 2023-09-15 12:00:28 +02:00
10ca572aec Enable fstrim service 2023-09-12 16:39:45 +02:00
19a451db77 Add encrypted munge key with agenix 2023-09-08 19:05:45 +02:00
ec9be9bb62 Remove unused large port hole in firewall 2023-09-08 18:22:48 +02:00
7050c505b5 Allow only some ports for srun 2023-09-08 17:51:37 +02:00
033a1fe97b Block ssfhead from reaching our slurm daemon 2023-09-08 17:36:28 +02:00
77cb3c494e Poweroff idle slurm nodes after 1 hour 2023-09-08 16:49:53 +02:00
6db5772ac4 Add IB and IPMI node host names 2023-09-08 13:21:37 +02:00
02f40a8217 Add agenix to all nodes 2023-09-04 22:10:43 +02:00
2bb366b9ac Reorganize secrets and ssh keys
The agenix tools needs to read the secrets from a standalone file, but
we also need the same information for the SSH keys.
2023-09-04 21:36:31 +02:00
2d16709648 Add anavarro user 2023-09-04 16:00:01 +02:00
9344daa31c Set zsh inc_append_history option 2023-09-03 16:57:53 +02:00
80c98041b5 Set zsh shell for rarias 2023-09-03 16:46:27 +02:00
3418e57907 Enable zsh and fix key bindings 2023-09-03 16:42:04 +02:00
6848b58e39 Keep a log over time with the config commits 2023-09-03 00:02:14 +02:00
f9c77b433a Store nixos config in /etc/nixos/config.rev 2023-09-02 23:37:11 +02:00
3c99c2a662 Enable watchdog 2023-08-30 16:32:17 +02:00
beb0d5940e Also enable monitoring in lake2 2023-08-29 12:29:41 +02:00
cb3a7b19f7 Move pkgs overlay to overlay.nix 2023-08-25 18:12:00 +02:00
f1ce815edd Add the lake2 hostname to the hosts 2023-08-25 14:44:35 +02:00
dfffc0bdce Add ceph metrics to prometheus 2023-08-22 16:33:55 +02:00
b677b827d4 Add the bay host name 2023-08-22 15:56:09 +02:00
bf692e6e4e Don't set all_proxy 2023-08-22 11:28:54 +02:00
14b173f67e GRUB version no longer needed 2023-07-27 17:22:20 +02:00
f892d43b47 Kill slurmd remaining processes on upgrade 2023-07-27 14:49:20 +02:00
66fb848ba8 Add koro node 2023-07-21 13:00:08 +02:00
7c1fe1455b Enable NTP using the BSC time server 2023-06-30 14:02:15 +02:00
2d4b178895 Add the ssfhead node as gateway 2023-06-30 14:01:35 +02:00
4dd25f2f89 Use our host names first by default 2023-06-23 16:22:18 +02:00
6dcd9d8144 Add DNS tools to resolve hosts 2023-06-23 16:15:45 +02:00
31be81d2b1 Lower perf_event_paranoid to -1 2023-06-23 16:01:27 +02:00
826cfdf43f Set perf paranoid to 0 by default 2023-06-21 16:24:19 +02:00
a1f258c5ce Add perf to packages 2023-06-21 15:41:06 +02:00