20 Commits

Author SHA1 Message Date
5dbc100738 Add UPC temperature sensor monitoring
These sensors are part of their air quality measurements, which just
happen to be very close to our server room.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:17 +02:00
371b8b6a23 Add meteocat exporter
Allows us to track ambient temperature changes and estimate the
temperature delta between the server room and exterior temperature.
We should be able to predict when we would need to stop the machines due
to excesive temperature as summer approaches.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:17 +02:00
a7aa3b79a1 Reject SSH connections without SLURM allocation
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:17 +02:00
7e5211d049 Fix MPICH build by fetching upstream patches too
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:16 +02:00
cafd7ea682 Add workaround for MPICH 4.2.0
See: https://github.com/pmodels/mpich/issues/6946

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
08e9db0c3e Fix SLURM bug in rank integer sign expansion
See: https://bugs.schedmd.com/show_bug.cgi?id=19324

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
ab4cab97ba Merge pmix outputs for MPICH
MPICH expects headers and libraries to be present in the same directory.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
29cdfba328 Fix warning in slurm exporter using vendorHash
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
59d9d19891 Remove old Ceph package overlay
The Ceph package is now integrated in upstream nixpkgs.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
388a10b666 BSC packages are no longer in bsc attribute
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
38e23068f2 Add prometheus-slurm-exporter package 2025-10-01 16:40:16 +02:00
77276fb6c1 Revert "Update slurm to 23.02.05.1"
This reverts commit aaefddc44a9073166ac52b8bd56ac96258d3b053.
2025-10-01 16:40:16 +02:00
798fa002cc Update slurm to 23.02.05.1 2025-10-01 16:40:16 +02:00
4cea250cf4 Update ceph to 18.2.0 in overlay 2025-10-01 16:40:16 +02:00
3b823ee478 Move pkgs overlay to overlay.nix 2025-10-01 16:40:16 +02:00
9f03799d34 Set mpi to mpich by default in bscpkgs 2025-10-01 16:40:15 +02:00
3a9615fce4 Add missing parameter to extend 2025-10-01 16:40:15 +02:00
45d7b31c0a Use explicit order in overlays 2025-10-01 16:40:15 +02:00
73b33c4d6c Replace mpi inside bsc attribute 2025-10-01 16:40:15 +02:00
bae3c75222 Add mpich overlay 2025-10-01 16:40:15 +02:00