15 Commits

Author SHA1 Message Date
a81aebc788 Remove merged MPICH patch
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:17 +02:00
5dbc100738 Add UPC temperature sensor monitoring
These sensors are part of their air quality measurements, which just
happen to be very close to our server room.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:17 +02:00
371b8b6a23 Add meteocat exporter
Allows us to track ambient temperature changes and estimate the
temperature delta between the server room and exterior temperature.
We should be able to predict when we would need to stop the machines due
to excesive temperature as summer approaches.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:17 +02:00
a7aa3b79a1 Reject SSH connections without SLURM allocation
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:17 +02:00
7e5211d049 Fix MPICH build by fetching upstream patches too
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-10-01 16:40:16 +02:00
cafd7ea682 Add workaround for MPICH 4.2.0
See: https://github.com/pmodels/mpich/issues/6946

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
08e9db0c3e Fix SLURM bug in rank integer sign expansion
See: https://bugs.schedmd.com/show_bug.cgi?id=19324

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
ab4cab97ba Merge pmix outputs for MPICH
MPICH expects headers and libraries to be present in the same directory.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
59d9d19891 Remove old Ceph package overlay
The Ceph package is now integrated in upstream nixpkgs.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
388a10b666 BSC packages are no longer in bsc attribute
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-10-01 16:40:16 +02:00
38e23068f2 Add prometheus-slurm-exporter package 2025-10-01 16:40:16 +02:00
77276fb6c1 Revert "Update slurm to 23.02.05.1"
This reverts commit aaefddc44a9073166ac52b8bd56ac96258d3b053.
2025-10-01 16:40:16 +02:00
798fa002cc Update slurm to 23.02.05.1 2025-10-01 16:40:16 +02:00
4cea250cf4 Update ceph to 18.2.0 in overlay 2025-10-01 16:40:16 +02:00
3b823ee478 Move pkgs overlay to overlay.nix 2025-10-01 16:40:16 +02:00