1473 Commits

Author SHA1 Message Date
a0dab66aa5 Move vlopez user to jungleUsers for koro host
Access to other machines can be easily added into the "hosts" attribute
without the need to replicate the configuration.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:39 +02:00
525cad4117 Add raccoon motd file
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:38 +02:00
24ee74d614 Split xeon specific configuration from base
To accomodate the raccoon knights workstation, some of the configuration
pulled by m/common/main.nix has to be removed. To solve it, the xeon
specific parts are placed into m/common/xeon.nix and only the common
configuration is at m/common/base.nix.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:37 +02:00
15b4b28d2c Control user access to each machine
The users.jungleUsers configuration option behaves like the users.users
option, but defines the list attribute `hosts` for each user, which
filters users so that only the user can only access those hosts.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:34 +02:00
b1ce302e4b Add PostgreSQL DB for performance test results
The database will hold the performance results of the execution of the
benchmarks. We follow the same setup on knights3 for now.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-07-16 12:35:24 +02:00
b8b85f55cd Enable Grafana email alerts
Allows sending Grafana alerts via email too, so we have a reduntant
mechanism in case Slack fails to deliver them.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-05-31 15:57:38 +02:00
1189626a6f Enable mail notification in Gitea
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-05-31 10:56:49 +02:00
dbd95dd7b8 Add msmtp to send notifications via email
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-05-31 10:56:20 +02:00
81b680a7d2 Allow Ceph traffic to lake2 2024-05-02 17:43:48 +02:00
ba60e121df Collect Gitea metrics in Prometheus
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-05-02 17:32:25 +02:00
432e6c8521 Add Gitea service
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-05-02 17:31:51 +02:00
c8160122b3 Add firewall rules for Ceph and monitoring
The firewall was blocking the monitoring traffic from hut and the Ceph
traffic among OSDs. The rules only allow connecting from the specific
host that they are supposed to be coming from.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-04-25 13:25:11 +02:00
3863fc25a5 Add workaround for MPICH 4.2.0
See: https://github.com/pmodels/mpich/issues/6946

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-04-25 13:25:08 +02:00
2b26cd2f46 Fix SLURM bug in rank integer sign expansion
See: https://bugs.schedmd.com/show_bug.cgi?id=19324

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-04-25 13:25:05 +02:00
30f2079f0b Merge pmix outputs for MPICH
MPICH expects headers and libraries to be present in the same directory.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-04-25 13:25:03 +02:00
366436b6d3 Remove nixseparatedebuginfod input
It has been integrated in nixpkgs, so is no longer required.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-04-25 13:24:58 +02:00
9f1cd02144 flake.lock: Update
Flake lock file updates:

• Updated input 'agenix':
    'github:ryantm/agenix/daf42cb35b2dc614d1551e37f96406e4c4a2d3e4' (2023-10-08)
  → 'github:ryantm/agenix/1381a759b205dff7a6818733118d02253340fd5e' (2024-04-02)
• Updated input 'agenix/darwin':
    'github:lnl7/nix-darwin/87b9d090ad39b25b2400029c64825fc2a8868943' (2023-01-09)
  → 'github:lnl7/nix-darwin/4b9b83d5a92e8c1fbfd8eb27eda375908c11ec4d' (2023-11-24)
• Updated input 'agenix/home-manager':
    'github:nix-community/home-manager/32d3e39c491e2f91152c84f8ad8b003420eab0a1' (2023-04-22)
  → 'github:nix-community/home-manager/3bfaacf46133c037bb356193bd2f1765d9dc82c1' (2023-12-20)
• Added input 'agenix/systems':
    'github:nix-systems/default/da67096a3b9bf56a91d16901293e51ba5b49a27e' (2023-04-09)
• Updated input 'bscpkgs':
    'git+https://git.sr.ht/~rodarima/bscpkgs?ref=refs/heads/master&rev=e148de50d68b3eeafc3389b331cf042075971c4b' (2023-11-22)
  → 'git+https://git.sr.ht/~rodarima/bscpkgs?ref=refs/heads/master&rev=de89197a4a7b162db7df9d41c9d07759d87c5709' (2024-04-24)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/e4ad989506ec7d71f7302cc3067abd82730a4beb' (2023-11-19)
  → 'github:NixOS/nixpkgs/6143fc5eeb9c4f00163267708e26191d1e918932' (2024-04-21)
• Updated input 'nixseparatedebuginfod':
    'github:symphorien/nixseparatedebuginfod/232591f5274501b76dbcd83076a57760237fcd64' (2023-11-05)
  → 'github:symphorien/nixseparatedebuginfod/98d79461660f595637fa710d59a654f242b4c3f7' (2024-03-07)
• Removed input 'nixseparatedebuginfod'
• Removed input 'nixseparatedebuginfod/flake-utils'
• Removed input 'nixseparatedebuginfod/flake-utils/systems'
• Removed input 'nixseparatedebuginfod/nixpkgs'

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-04-25 13:24:29 +02:00
de89197a4a Add bigotes package
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-04-24 17:59:24 +02:00
5d3820631a Fix Nanos6 4.0 build
It looks like after upgrading the compiler the build breaks. The patch
simply adds the missing cstdint include, until a new release is made.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-04-23 12:03:49 +02:00
9c8a077828 Enable separatedebuginfo for openmp
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-04-12 12:27:54 +02:00
fce556cb28 Build OmpSs-2 LLVM with commit version information
Allows users to see which commit (or git tag) was used in clang.
Examples for the release and git versions:

% clang --version
clang version 18.0.0 (18.0.0-ompss-2)

% clang --version
clang version 18.0.0 (0a6d6c6)

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-03-14 14:13:14 +01:00
82ccae1315 Use google.com probe instead of bsc.es
The main website of the BSC is failing every day around 3:00 AM for
almost one hour, so it is not a very good target. Instead, google.com is
used which should be more reliable. The same robots.txt path is fetched,
as it is smaller than the main page.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-03-05 16:52:21 +01:00
37ce5ef391 Paraver: Use wrapGApps Hook for nixos
Needed to fix the error on NixOS:

  GLib-GIO-ERROR **: No GSettings schemas are installed on the system

See https://github.com/NixOS/nixpkgs/issues/16285

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-02-15 16:06:29 +01:00
1df80460d2 Add another HTTPS probe for bsc.es
As all other HTTPS probes pass through the opsproxy01.bsc.es proxy, we
cannot detect a problem in our proxy or in the BSC one. Adding another
target like bsc.es that doesn't use the ops proxy allows us to discern
where the problem lies.

Instead of monitoring https://www.bsc.es/ directly, which will trigger
the whole Drupal server and take a whole second, we just fetch robots.txt
so the overhead on the server is minimal (and returns in less than 10 ms).

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-02-13 12:26:56 +01:00
7f17fe8874 Move slurm client in a separate module
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-02-13 11:11:17 +01:00
Raúl Peñacoba
3b21a32d83 Add ovni to OpenMP-V
Allows building OpenMP-V with ovni support, which is neccessary to run
the runtime tests of OpenMP-V in ovni.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2024-01-15 10:20:46 +01:00
5880a6e5f6 Enable public-inbox at jungle.bsc.es/lists
The public-inbox service fetches emails from the sourcehut mailing lists
and displays them on the web. The idea is to reduce the dependency on
external services and add a secondary storage for the mailing lists in
case sourcehut goes down or changes the current free plans.

The service is available in https://jungle.bsc.es/lists/ and is open to
the public. It currently mirrors the bscpkgs and jungle mailing list.

We also edited the CSS to improve the readability and have larger fonts
by default.

The service for public-inbox produced by NixOS is not well configured to
fetch emails from an IMAP mail server, so we also manually edit the
service file to enable the network.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-12-15 11:18:08 +01:00
c4d5135fde Split openmp versions in separate derivations
The openmp derivation provides both libomp and libompv. To avoid
accidentally linking with the wrong library and to avoid the nosv
dependency on libomp, this patch separates each version in a different
derivation.

Also, it adapts the clang wrappers and stdenvs to provide an stdenv per
openmp library where each openmp will be used by default when the
compiler flag "-fopenmp" is used. This eases linking ompv with nixpkgs
libraries, such as blis, that expect openmp to be provided with stdenv.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-12-07 18:01:20 +01:00
3f2b9a766b Update clang with internal bug release
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-12-07 18:01:08 +01:00
ecbb45d6ac Monitor https://pm.bsc.es/gitlab/ too
The GitLab instance is in the /gitlab endpoint and may fail
independently of https://pm.bsc.es/.

Cc: Víctor López <victor.lopez@bsc.es>
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-12-05 09:56:28 +01:00
c564d945d4 Enable nixseparatedebuginfod module
The module is only enabled on Hut and Eudy because we noticed activity
on the debuginfod service even if no debug session was active.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-12-04 11:04:52 +01:00
Raúl Peñacoba
3ed644b88f Add clangNosvOpenmp-ld compiler test
Add a test to verify that "clang -fopenmp=libompv" links correctly with
nOS-V even though it is not placed in the buildInputs.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Tested-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-12-01 16:35:24 +01:00
Raúl Peñacoba
8ceaddfea7 Split OpenMP from Clang in LLVM
As the OpenMP-V implementation requires to be built with nOS-V, we can
split the OpenMP package in a different derivation to prevent rebuilds
of clang. Additionally, as OpenMP-V now can be build alongside the
vanilla OpenMP runtime, we simply build a single openmp derivation with
both runtimes. Only a single build of the clang compiler is now
required.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-12-01 16:31:56 +01:00
2a953d811c Build TAMPI with ovni support
By default we build TAMPI with ovni support, as it will be disabled in
runtime unless explicitly enabled by the TAMPI_INSTRUMENT=ovni
environment variable.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-29 17:49:30 +01:00
fec4ddf6ab Merge TAMPI release and git in the same file
In order to reduce duplicate information we just place the two sources
in the same file.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-29 17:49:22 +01:00
9aaea0da0e Update paraver: 4.10.6 -> 4.11.2
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-29 12:19:27 +01:00
ed887b0412 Use tmpfs in /tmp
The /tmp directory was using the SSD disk which is not erased across
boots. Nix will use /tmp to perform the builds, so we want it to be as
fast as possible. In general, all the machines have enough space to
handle large builds like LLVM.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-28 12:25:50 +01:00
f55b48ec86 Update ompv test with -fopenmp=libompv flag
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
Tested-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:48:09 +01:00
1e52075c18 TAGASPI 2023.11 update and move to public repo
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:52 +01:00
062b1c3c77 Mercurium 2023.11 update
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:42 +01:00
1520eaa64e TAMPI 2023.11 update
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:38 +01:00
54b4448e4b Ovni 2023.11 update
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:27 +01:00
28a7496fbd nOS-V 2023.11 update
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:22 +01:00
b5dae25e7f NODES 2023.11 update
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:18 +01:00
bdc3670ccc Nanos6 2023.11 update
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:09 +01:00
af590d5ace Clang 2023.11 update
Clang nos-v-merge branch has been merged into master.

Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:47:02 +01:00
8a31895e48 Update nixpkgs commit in default.nix
Reviewed-By: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2023-11-24 16:46:54 +01:00
d9ae85ce4b Add a more strict test for OpenMP+nOS-V
In this test we ensure that the worksharing region is running inside a
nOS-V task, so we know that we are not using the vanilla OpenMP by
accident.

We also keep the previous test test/compilers/clang-openmp.nix as-is, so
we can check that the compiler injects the nosv library dependency in
the final binary on its own.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-24 14:52:23 +01:00
20ded0c0df Change the GPI-2 URL to a public repository
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-24 14:49:03 +01:00
fe1d3fbb80 Enable runners for pm.bsc.es/gitlab too
The old runners for the PM gitlab were disabled in configuration in the
last outage, but they remained working until we reboot the node. With
this change we enable the runners for both PM and gitlab.bsc.es.

Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2023-11-24 14:45:23 +01:00