287 Commits

Author SHA1 Message Date
ac700d34a5 Disable automatic August shutdown for Fox
The UPC has different dates for the yearly power cut, and Fox can
recover properly from a power loss, so we don't need to have it turned
off before the power cut. Simply disabling the timer is enough.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-24 11:22:10 +02:00
9b681ab7ce Add cudainfo program to test CUDA
The cudainfo program checks that we can initialize the CUDA RT library
and communicate with the driver. It can be used as standalone program or
built with cudainfo.gpuCheck so it is executed inside the build sandbox
to see if it also works fine. It uses the autoAddDriverRunpath hook to
inject in the runpath the location of the library directory for CUDA
libraries.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-23 11:52:09 +02:00
9ce394bffd Add missing symlink in cuda sandbox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-23 11:51:47 +02:00
8cd7b713ca Enable cuda systemFeature in raccoon and fox
This allows running derivations which depend on cuda runtime without
breaking the sandbox. We only need to add `requiredSystemFeatures = [ "cuda" ];`
to the derivation.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-07-22 17:07:13 +02:00
8eed90d2bd Move shared nvidia settings to a separate module
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-07-22 17:06:45 +02:00
aee54ef39f Replace xeon07 by hut in ssh config
The xeon07 machine has been renamed to hut.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-07-21 18:10:08 +02:00
69f7ab701b Enable automatic Nix GC in raccoon
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-21 17:58:26 +02:00
4c9bcebcdc Select proprietary NVIDIA driver in raccoon
The NVIDIA GTX 960 from 2016 has the Maxwell architecture, and NixOS
suggests using the proprietary driver for older than Turing:

> It is suggested to use the open source kernel modules on Turing or
> later GPUs (RTX series, GTX 16xx), and the closed source modules
> otherwise.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-21 17:58:21 +02:00
86e7c72b9b Enable open source NVidia driver in fox
It is recommended for newer versions.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-18 09:57:38 +02:00
a7dffc33b5 Remove option allowUnfree from fox and raccoon
It is already set to true for all machines.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-18 09:57:21 +02:00
6765dba3e4 Ban another scanner trying to connect via SSH
It is constantly spamming out logs:

apex# journalctl | grep 'Connection closed by 84.88.52.176' | wc -l
2255

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-18 09:51:49 +02:00
0acfb7a8e0 Update weasel IPMI hostname for monitoring
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-18 09:51:21 +02:00
2bb3b2fc4a Remove package ix as it is gone
Fails with: "error: ix has been removed from Nixpkgs, as the ix.io
pastebin has been offline since Dec. 2023".

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-07-16 13:07:06 +02:00
a6698e6a6b Silently ban OpenVAS BSC scanner from apex
It is spamming our logs with refused connection lines:

apex% sudo journalctl -b0 | grep 'refused connection.*SRC=192.168.8.16' | wc -l
13945

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 17:40:41 +02:00
b394c5a8f4 Rotate anavarro password and SSH key
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 17:24:41 +02:00
3d5b845057 Add weasel machine configuration
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 17:24:38 +02:00
9e83565977 Remove extra flush commands on firewall stop
They are not needed as they are already flushed when the firewall
starts or stops.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:45 +02:00
ce2cda1c41 Prevent accidental use of nftables
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:42 +02:00
e6aef2cbd0 Add proxy configuration for internal hosts
Access internal hosts via apex proxy. From the compute nodes we first
open an SSH connection to apex, and then tunnel it through the HTTP
proxy with netcat.

This way we allow reaching internal GitLab repositories without
requiring the user to have credentials in the remote host, while we can
use multiple remotes to provide redundancy.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:36 +02:00
b7603053fa Remove unused blackbox configuration modules
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:30 +02:00
3ca55acfdf Use IPv4 in blackbox probes
Otherwise they simply fail as IPv6 doesn't work.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:26 +02:00
e505a952af Make NFS mount async to improve latency
Don't wait to flush writes, as we don't care about consistency on a
crash:

> This option allows the NFS server to violate the NFS protocol and
> reply to requests before any changes made by that request have been
> committed to stable storage (e.g. disc drive).
>
> Using this option usually improves performance, but at the cost that
> an unclean server restart (i.e. a crash) can cause data to be lost or
> corrupted.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:20 +02:00
3ad9452637 Disable root_squash from NFS
Allows root to read files in the NFS export, so we can directly run
`nixos-rebuild switch` from /home.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:16 +02:00
fdd21d0dd0 Remove SSH proxy to access BSC clusters
We now have direct connection to them.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:13 +02:00
c40871bbfe Add users to apex machine
They need to be able to login to apex to access any other machine from
the SSF rack.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:09 +02:00
e8f5ce735e Remove proxy from hut HTTP probes
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:04 +02:00
4a25056897 Remove proxy configuration from environment
All machines have now direct connection with the outside world.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:18:00 +02:00
89e0c0df28 Add storcli utility to apex
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:17:57 +02:00
1b731a756a Add new configuration for apex
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-15 11:17:43 +02:00
3d97fada6d Add pmartin1 user with access to fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-03 11:16:43 +02:00
d1a2bfc90e Add access to fox for rpenacob user
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 16:58:53 +02:00
44e76ce630 Revert "Only allow Vincent to access fox for now"
This reverts commit efac36b186efe6c3814278ae0a284ae346ff9d83.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 16:58:49 +02:00
adb7e0ef35 Add all terminfo files in environment
Fixes problems with the kitty terminal when opening vim or kakoune.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-07-02 16:02:45 +02:00
b0875816f2 Monitor Fox BMC with ICMP probes too
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:51:22 +02:00
592da155a9 Restrict DAC VPN to fox-ipmi machine only
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:51:19 +02:00
5376613ec4 Monitor fox via VPN
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:51:16 +02:00
74891f0784 Add OpenVPN service to connect to fox BMC
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:51:13 +02:00
d66f9f21dd Add ac.upc.edu as name search server
Allows referring to fox.ac.upc.edu directly as fox.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:51:09 +02:00
e660268661 Disable kptr_restrict in fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:08:42 +02:00
d45b7ea717 Disable NUMA balancing in fox
See: https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#numa-balancing

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:08:02 +02:00
c205fa4e34 Load amd_uncore module in fox
Needed for L3 events in perf.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:07:58 +02:00
5f055388a5 Enable SSH X11 forwarding
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-07-02 15:07:54 +02:00
0bc69789d9 Disable registration in Gitea
Get rid of all the spam accounts they are trying to register.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:18 +02:00
09bc9d9c25 Enable msmtp configuration in tent
Allows gitea to send notifications via email.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:15 +02:00
6b53ab4413 Add GitLab runner with debian docker for PM
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:13 +02:00
4618a149b3 Monitor nix-daemon in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:11 +02:00
448d85ef9d Move nix-daemon exporter to modules
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:09 +02:00
956b99f02a Add p service for pastes
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:07 +02:00
ec2eb8c3ed Enable public-inbox service in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:06 +02:00
09a5bdfbe4 Enable gitea in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-18 15:36:04 +02:00