351 Commits

Author SHA1 Message Date
67c991fc6f Enable jungle robot emails for Grafana in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 13:25:43 +02:00
a7b1334dd7 Add tent key for nix-serve
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 13:20:29 +02:00
f5ac62577e Remove jungle nix cache from tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 13:18:01 +02:00
6bbadc5246 Enable nix cache
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 13:17:26 +02:00
5026f0257e Serve Grafana from subpath
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 12:57:34 +02:00
cdbdef9bb1 Add nginx server in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 12:47:43 +02:00
a5b5765d57 Add monitoring in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-06-12 10:32:31 +02:00
a208cfbc6f Disable nix garbage collector in tent
Reviewed-by: Aleix Boné <abonerib@bsc.es>
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-06-07 17:51:40 +02:00
9d8234024d Rekey secrets with tent keys
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-05 11:09:15 +02:00
a20e8844c6 Add tent host key and admin keys
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-05 11:07:00 +02:00
c89f9d79a0 Create directories in /vault/home for tent users
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-03 19:07:43 +02:00
39a070852f Add software RAID in tent using 3 disks
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-03 18:27:56 +02:00
6f5dacbcd3 Add access to tent to all hut users too
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-03 17:24:40 +02:00
70eecd1e39 Add hut SSH configuration from outside SSF LAN
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-03 17:17:29 +02:00
5f59a22705 Don't use proxy in base preset
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-03 12:52:10 +02:00
3734a9210c Add tent machine from xeon04
We moved the tent machine to the server room in the BSC building and is
now directly connected to the raccoon via NAT.

Fixes: #106
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-02 09:07:32 +02:00
c9b6edb6a9 Create specific SSF rack configuration
Allow xeon machines to optionally inherit SSF configuration such as the
NFS mount point and the network configuration.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-02 12:22:41 +02:00
10693417a3 Only allow Vincent to access fox for now
Needed to run benchmarks without interference.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-10 14:38:02 +02:00
c441d4aad7 Use performance governor in fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-10 14:37:39 +02:00
729b781cdd Add hut as nix cache in fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-10 18:23:20 +02:00
08953f64fb Use extra- for substituters and trusted-public-keys
From the nix manual:

> A configuration setting usually overrides any previous value. However,
> for settings that take a list of items, you can prefix the name of the
> setting by extra- to append to the previous value.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-06-03 17:59:17 +02:00
0c9f31ffe1 Use DHCP for Ethernet in fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-06 15:11:12 +02:00
59d6742e77 Use UPC time servers as others are blocked
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-06 14:44:47 +02:00
075dd928ad Create tracing group and add arocanon in raccoon
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-05-15 12:24:49 +02:00
007418a52c Extend perf support in raccoon
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-05-15 12:21:26 +02:00
87e5fc8af6 Enable nixdebuginfod in raccoon
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-05-06 14:39:48 +02:00
1089dd10b7 Make raccoon use performance governor
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-05-05 10:50:43 +02:00
6f07c93b5a Enable binfmt emulation in raccoon
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-03-21 17:51:41 +01:00
34d55ea815 Disable nix garbage collector in raccoon
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-03-18 16:48:47 +01:00
78d7b522bf Add dbautist user to raccoon machine
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-03-03 13:55:23 +01:00
1b6c948325 Add node exporter monitoring in raccoon
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-02-25 17:11:09 +01:00
8d01909666 Allow X11 forwarding via SSH
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-02-18 16:19:04 +01:00
57a0d58691 Enable linger for user rarias
Allows services to run without a login session.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-10-14 19:12:25 +02:00
f79debb7a1 Only proxy SSH git remotes via hut in xeon
Other machines like raccoon have direct access.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-09-10 15:03:03 +02:00
23d3cc5f18 Add machine map file
Documents the location, board and serial numbers so we can track the
machines if they move around. Some information is unkown.

Using the Nix language to encode the machines location and properties
allows us to later use that information in the configuration of the
machines themselves.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-02 11:12:30 +02:00
e31c80c6c5 Remove fox monitoring via IPMI
We will need to setup an VPN to be able to access fox in its new
location, so for now we simply remove the IPMI monitoring.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-06-02 07:55:11 +02:00
69e1cde614 Monitor fox, gateway and UPC anella via ICMP
Fox should reply once the machine is connected to the UPC network.
Monitoring also the gateway and UPC anella allows us to estimate if the
whole network is down or just fox.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-28 13:03:01 +02:00
3264788343 Update configuration for UPC network
The fox machine will be placed in the UPC network, so we update the
configuration with the new IP and gateway. We won't be able to reach hut
directly so we also remove the host entry and proxy.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 14:17:06 +02:00
97e5e5d04b Disable home via NFS in fox
It won't be accesible anymore as we won't be in the same LAN.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 13:41:36 +02:00
490977cdc1 Rekey all secrets
Fox is no longer able to use munge or ceph, so we remove the key and
rekey them.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 12:30:03 +02:00
0b1feca6ac Rotate fox SSH host key
Prevent decrypting old secrets by reading the git history.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 12:27:57 +02:00
7d9340e8cb Distrust fox SSH key
We no longer will share secrets with fox until we can regain our trust.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 12:00:21 +02:00
653a197bf4 Remove Ceph module from fox
It will no longer be accesible from the UPC.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 11:50:57 +02:00
b386d30380 Remove fox from SLURM
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 11:43:16 +02:00
dd8d3c508b Remove pam_slurm_adopt from fox
We no longer will be able to use SLURM from jungle.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 11:40:07 +02:00
bbf09ab960 Add UPC temperature sensor monitoring
These sensors are part of their air quality measurements, which just
happen to be very close to our server room.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-26 11:24:12 +02:00
3b5781ba63 Add meteocat exporter
Allows us to track ambient temperature changes and estimate the
temperature delta between the server room and exterior temperature.
We should be able to predict when we would need to stop the machines due
to excesive temperature as summer approaches.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-05-23 15:40:09 +02:00
7d10816c98 Add custom nix-daemon exporter
Allows us to see which derivations are being built in realtime. It is a
bit of a hack, but it seems to work. We simply look at the environment
of the child processes of nix-daemon (usually bash) and then look for
the $name variable which should hold the current derivation being
built. Needs root to be able to read the environ file of the different
nix-daemon processes as they are owned by the nixbld* users.

See: https://discourse.nixos.org/t/query-ongoing-builds/23486
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-24 23:51:06 +02:00
1b77b70074 Set keep-outputs to true in all machines
From the documentation of keep-outputs, setting it to true would prevent
the GC from removing build time dependencies:

If true, the garbage collector will keep the outputs of non-garbage
derivations. If false (default), outputs will be deleted unless they are
GC roots themselves (or reachable from other roots).

In general, outputs must be registered as roots separately. However,
even if the output of a derivation is registered as a root, the
collector will still delete store paths that are used only at build time
(e.g., the C compiler, or source tarballs downloaded from the network).
To prevent it from doing so, set this option to true.

See: https://nix.dev/manual/nix/2.24/command-ref/conf-file.html#conf-keep-outputs
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-04-22 16:16:42 +02:00
b5d0b34179 Add raccoon node exporter monitoring
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-22 11:41:43 +02:00