Compare commits

..

27 Commits

Author SHA1 Message Date
82fc3209de Set keep-outputs to true in all machines
From the documentation of keep-outputs, setting it to true would prevent
the GC from removing build time dependencies:

If true, the garbage collector will keep the outputs of non-garbage
derivations. If false (default), outputs will be deleted unless they are
GC roots themselves (or reachable from other roots).

In general, outputs must be registered as roots separately. However,
even if the output of a derivation is registered as a root, the
collector will still delete store paths that are used only at build time
(e.g., the C compiler, or source tarballs downloaded from the network).
To prevent it from doing so, set this option to true.

See: https://nix.dev/manual/nix/2.24/command-ref/conf-file.html#conf-keep-outputs
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2025-04-22 17:27:37 +02:00
abeab18270 Add raccoon node exporter monitoring
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-22 14:50:08 +02:00
1985b58619 Increase data retention to 5 years
Now that we have more space, we can extend the retention time to 5 years
to hold the monitoring metrics. For a year we have:

	# du -sh /var/lib/prometheus2
	13G     /var/lib/prometheus2

So we can expect it to increase to about 65 GiB. In the future we may
want to reduce some adquisition frequency.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-22 14:50:03 +02:00
44bd061823 Don't forward any docker traffic
Access to the 23080 local port will be done by applying the INPUT rules,
which pass through nixos-fw.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-15 14:16:15 +02:00
e8c309f584 Allow traffic from docker to enter port 23080
Before:

  hut% sudo docker run -it --rm alpine /bin/ash -xc 'true | nc -w 3 -v 10.0.40.7 23080'
  + true
  + nc -w 3 -v 10.0.40.7 23080
  nc: 10.0.40.7 (10.0.40.7:23080): Operation timed out

After:

  hut% sudo docker run -it --rm alpine /bin/ash -xc 'true | nc -w 3 -v 10.0.40.7 23080'
  + true
  + nc -w 3 -v 10.0.40.7 23080
  10.0.40.7 (10.0.40.7:23080) open

Fixes: rarias/jungle#94
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-15 14:16:10 +02:00
71ae7fb585 Add bscpm04.bsc.es SSH host and public key
Allows fetching repositories from hut and other machines in jungle
without the need to do any extra configuration.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-15 14:15:45 +02:00
8834d561d2 Add nix cache documentation section
Include usage from NixOS and non-NixOS hosts and a test with curl to
ensure it can be reached.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-04-15 14:08:22 +02:00
29daa3c364 Use hut nix cache in owl1, owl2 and raccoon
For owl1 and owl2 directly connect to hut via LAN with HTTP, but for
raccoon pass via the proxy using jungle.bsc.es with HTTPS. There is no
risk of tampering as packages are signed.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-04-15 14:08:17 +02:00
9c503fbefb Clean all iptables rules on stop
Prevents the "iptables: Chain already exists." error by making sure that
we don't leave any chain on start. The ideal solution is to use
iptables-restore instead, which will do the right job. But this needs to
be changed in NixOS entirely.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-15 14:08:14 +02:00
51b6a8b612 Make nginx listen on all interfaces
Needed for local hosts to contact the nix cache via HTTP directly.
We also allow the incoming traffic on port 80.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-15 14:08:07 +02:00
52213d388d Fix nginx /cache regex
`nix-serve` does not handle duplicates in the path:
```
hut$ curl http://127.0.0.1:5000/nix-cache-info
StoreDir: /nix/store
WantMassQuery: 1
Priority: 30
hut$ curl http://127.0.0.1:5000//nix-cache-info
File not found.
```

This meant that the cache was not accessible via:
`curl https://jungle.bsc.es/cache/nix-cache-info` but
`curl https://jungle.bsc.es/cachenix-cache-info` worked.

Reviewed-by: Rodrigo Arias Mallo <rodrigo.arias@bsc.es>
2025-04-15 14:08:04 +02:00
edf744db8d Add new GitLab runner for gitlab.bsc.es
It uses docker based on alpine and the host nix store, so we can perform
builds but isolate them from the system.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:41:18 +02:00
b82894eaec Remove SLURM partition all
We no longer have homogeneous nodes so it doesn't make much sense to
allocate a mix of them.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:27 +02:00
1c47199891 Add varcila user to hut and fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:25 +02:00
8738bd4eeb Adjust fox slurm config after disabling SMT
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:23 +02:00
7699783aac Add abonerib user to fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:21 +02:00
fee1d4da7e Don't move doc in web output
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:19 +02:00
b77ce7fb56 Add quickstart guide
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:17 +02:00
b4a12625c5 Reject SSH connections without SLURM allocation
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:15 +02:00
302106ea9a Add users to fox
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:13 +02:00
96877de8d9 Add dalvare1 user
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:11 +02:00
8878985be6 Add fox page in jungle website
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:08 +02:00
737578db34 Mount NVME disks in /nvme{0,1}
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:06 +02:00
88555e3f8c Exclude fox from being suspended by slurm
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:04 +02:00
feb2060be7 Use IPMI host names instead of IP addresses
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:15:01 +02:00
00999434c2 Add fox IPMI monitoring
Use agenix to store the credentials safely.

Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:14:59 +02:00
29d58cc62d Add new fox machine
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-04-08 17:14:42 +02:00
15 changed files with 204 additions and 48 deletions

View File

@ -23,6 +23,7 @@
trusted-users = [ "@wheel" ]; trusted-users = [ "@wheel" ];
flake-registry = pkgs.writeText "global-registry.json" flake-registry = pkgs.writeText "global-registry.json"
''{"flakes":[],"version":2}''; ''{"flakes":[],"version":2}'';
keep-outputs = true;
}; };
gc = { gc = {

View File

@ -10,7 +10,7 @@ in
# Connect to intranet git hosts via proxy # Connect to intranet git hosts via proxy
programs.ssh.extraConfig = '' programs.ssh.extraConfig = ''
Host bscpm02.bsc.es bscpm03.bsc.es gitlab-internal.bsc.es alya.gitlab.bsc.es Host bscpm02.bsc.es bscpm03.bsc.es bscpm04.bsc.es gitlab-internal.bsc.es alya.gitlab.bsc.es
User git User git
ProxyCommand nc -X connect -x hut:23080 %h %p ProxyCommand nc -X connect -x hut:23080 %h %p
@ -22,6 +22,7 @@ in
programs.ssh.knownHosts = hostsKeys // { programs.ssh.knownHosts = hostsKeys // {
"gitlab-internal.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF9arsAOSRB06hdy71oTvJHG2Mg8zfebADxpvc37lZo3"; "gitlab-internal.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF9arsAOSRB06hdy71oTvJHG2Mg8zfebADxpvc37lZo3";
"bscpm03.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIM2NuSUPsEhqz1j5b4Gqd+MWFnRqyqY57+xMvBUqHYUS"; "bscpm03.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIM2NuSUPsEhqz1j5b4Gqd+MWFnRqyqY57+xMvBUqHYUS";
"bscpm04.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPx4mC0etyyjYUT2Ztc/bs4ZXSbVMrogs1ZTP924PDgT";
"glogin1.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFsHsZGCrzpd4QDVn5xoDOtrNBkb0ylxKGlyBt6l9qCz"; "glogin1.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFsHsZGCrzpd4QDVn5xoDOtrNBkb0ylxKGlyBt6l9qCz";
"glogin2.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFsHsZGCrzpd4QDVn5xoDOtrNBkb0ylxKGlyBt6l9qCz"; "glogin2.bsc.es".publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFsHsZGCrzpd4QDVn5xoDOtrNBkb0ylxKGlyBt6l9qCz";
}; };

View File

@ -81,7 +81,7 @@
home = "/home/Computational/abonerib"; home = "/home/Computational/abonerib";
description = "Aleix Boné"; description = "Aleix Boné";
group = "Computational"; group = "Computational";
hosts = [ "owl1" "owl2" "hut" "raccoon" ]; hosts = [ "owl1" "owl2" "hut" "raccoon" "fox" ];
hashedPassword = "$6$V1EQWJr474whv7XJ$OfJ0wueM2l.dgiJiiah0Tip9ITcJ7S7qDvtSycsiQ43QBFyP4lU0e0HaXWps85nqB4TypttYR4hNLoz3bz662/"; hashedPassword = "$6$V1EQWJr474whv7XJ$OfJ0wueM2l.dgiJiiah0Tip9ITcJ7S7qDvtSycsiQ43QBFyP4lU0e0HaXWps85nqB4TypttYR4hNLoz3bz662/";
openssh.authorizedKeys.keys = [ openssh.authorizedKeys.keys = [
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIIFiqXqt88VuUfyANkZyLJNiuroIITaGlOOTMhVDKjf abonerib@bsc" "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIIFiqXqt88VuUfyANkZyLJNiuroIITaGlOOTMhVDKjf abonerib@bsc"
@ -126,6 +126,19 @@
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIGEfy6F4rF80r4Cpo2H5xaWqhuUZzUsVsILSKGJzt5jF dalvare1@ssfhead" "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIGEfy6F4rF80r4Cpo2H5xaWqhuUZzUsVsILSKGJzt5jF dalvare1@ssfhead"
]; ];
}; };
varcila = {
uid = 5650;
isNormalUser = true;
home = "/home/Computational/varcila";
description = "Vincent Arcila";
group = "Computational";
hosts = [ "hut" "fox" ];
hashedPassword = "$6$oB0Tcn99DcM4Ch$Vn1A0ulLTn/8B2oFPi9wWl/NOsJzaFAWjqekwcuC9sMC7cgxEVb.Nk5XSzQ2xzYcNe5MLtmzkVYnRS1CqP39Y0";
openssh.authorizedKeys.keys = [
"ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKGt0ESYxekBiHJQowmKpfdouw0hVm3N7tUMtAaeLejK vincent@varch"
];
};
}; };
groups = { groups = {

View File

@ -11,7 +11,7 @@
proxy = { proxy = {
default = "http://hut:23080/"; default = "http://hut:23080/";
noProxy = "127.0.0.1,localhost,internal.domain,10.0.40.40"; noProxy = "127.0.0.1,localhost,internal.domain,10.0.40.40,hut";
# Don't set all_proxy as go complains and breaks the gitlab runner, see: # Don't set all_proxy as go complains and breaks the gitlab runner, see:
# https://github.com/golang/go/issues/16715 # https://github.com/golang/go/issues/16715
allProxy = null; allProxy = null;

View File

@ -56,6 +56,11 @@
iptables -A nixos-fw -p tcp -s 10.0.40.30 --dport 23080 -j nixos-fw-log-refuse iptables -A nixos-fw -p tcp -s 10.0.40.30 --dport 23080 -j nixos-fw-log-refuse
iptables -A nixos-fw -p tcp -s 10.0.40.0/24 --dport 23080 -j nixos-fw-accept iptables -A nixos-fw -p tcp -s 10.0.40.0/24 --dport 23080 -j nixos-fw-accept
''; '';
# Flush all rules and chains on stop so it won't break on start
extraStopCommands = ''
iptables -F
iptables -X
'';
}; };
}; };

View File

@ -22,8 +22,8 @@
"--docker-network-mode host" "--docker-network-mode host"
]; ];
environmentVariables = { environmentVariables = {
https_proxy = "http://localhost:23080"; https_proxy = "http://hut:23080";
http_proxy = "http://localhost:23080"; http_proxy = "http://hut:23080";
}; };
}; };
in { in {
@ -38,14 +38,13 @@
gitlab-bsc-docker = { gitlab-bsc-docker = {
# gitlab.bsc.es still uses the old token mechanism # gitlab.bsc.es still uses the old token mechanism
registrationConfigFile = config.age.secrets.gitlab-bsc-docker.path; registrationConfigFile = config.age.secrets.gitlab-bsc-docker.path;
tagList = [ "docker" "hut" ];
environmentVariables = { environmentVariables = {
https_proxy = "http://localhost:23080"; # We cannot access the hut local interface from docker, so we connect
http_proxy = "http://localhost:23080"; # to hut directly via the ethernet one.
https_proxy = "http://hut:23080";
http_proxy = "http://hut:23080";
}; };
# FIXME
registrationFlags = [
"--docker-network-mode host"
];
executor = "docker"; executor = "docker";
dockerImage = "alpine"; dockerImage = "alpine";
dockerVolumes = [ dockerVolumes = [
@ -53,7 +52,15 @@
"/nix/var/nix/db:/nix/var/nix/db:ro" "/nix/var/nix/db:/nix/var/nix/db:ro"
"/nix/var/nix/daemon-socket:/nix/var/nix/daemon-socket:ro" "/nix/var/nix/daemon-socket:/nix/var/nix/daemon-socket:ro"
]; ];
dockerExtraHosts = [
# Required to pass the proxy via hut
"hut:10.0.40.7"
];
dockerDisableCache = true; dockerDisableCache = true;
registrationFlags = [
# Increase build log length to 64 MiB
"--output-limit 65536"
];
preBuildScript = pkgs.writeScript "setup-container" '' preBuildScript = pkgs.writeScript "setup-container" ''
mkdir -p -m 0755 /nix/var/log/nix/drvs mkdir -p -m 0755 /nix/var/log/nix/drvs
mkdir -p -m 0755 /nix/var/nix/gcroots mkdir -p -m 0755 /nix/var/nix/gcroots
@ -66,32 +73,39 @@
mkdir -p -m 0700 "$HOME/.nix-defexpr" mkdir -p -m 0700 "$HOME/.nix-defexpr"
mkdir -p -m 0700 "$HOME/.ssh" mkdir -p -m 0700 "$HOME/.ssh"
cat > "$HOME/.ssh/config" << EOF cat > "$HOME/.ssh/config" << EOF
Host bscpm03.bsc.es gitlab-internal.bsc.es Host bscpm04.bsc.es gitlab-internal.bsc.es
User git User git
ProxyCommand nc -X connect -x hut:23080 %h %p ProxyCommand nc -X connect -x hut:23080 %h %p
Host amdlogin1.bsc.es armlogin1.bsc.es hualogin1.bsc.es glogin1.bsc.es glogin2.bsc.es fpgalogin1.bsc.es Host amdlogin1.bsc.es armlogin1.bsc.es hualogin1.bsc.es glogin1.bsc.es glogin2.bsc.es fpgalogin1.bsc.es
ProxyCommand nc -X connect -x hut:23080 %h %p ProxyCommand nc -X connect -x hut:23080 %h %p
EOF EOF
cat >> "$HOME/.ssh/known_hosts" << EOF cat >> "$HOME/.ssh/known_hosts" << EOF
bscpm03.bsc.es ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIM2NuSUPsEhqz1j5b4Gqd+MWFnRqyqY57+xMvBUqHYUS bscpm04.bsc.es ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPx4mC0etyyjYUT2Ztc/bs4ZXSbVMrogs1ZTP924PDgT
gitlab-internal.bsc.es ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF9arsAOSRB06hdy71oTvJHG2Mg8zfebADxpvc37lZo3 gitlab-internal.bsc.es ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF9arsAOSRB06hdy71oTvJHG2Mg8zfebADxpvc37lZo3
EOF EOF
. ${pkgs.nix}/etc/profile.d/nix-daemon.sh . ${pkgs.nix}/etc/profile.d/nix-daemon.sh
${pkgs.nix}/bin/nix-channel --add https://nixos.org/channels/nixos-24.11 nixpkgs # Required to load SSL certificate paths
${pkgs.nix}/bin/nix-channel --update nixpkgs . ${pkgs.cacert}/nix-support/setup-hook
${pkgs.nix}/bin/nix-env -i ${lib.concatStringsSep " " (with pkgs; [ nix cacert git openssh netcat curl ])}
''; '';
environmentVariables = { environmentVariables = {
ENV = "/etc/profile"; ENV = "/etc/profile";
USER = "root"; USER = "root";
NIX_REMOTE = "daemon"; NIX_REMOTE = "daemon";
PATH = "/nix/var/nix/profiles/default/bin:/nix/var/nix/profiles/default/sbin:/bin:/sbin:/usr/bin:/usr/sbin"; PATH = "${config.system.path}/bin:/bin:/sbin:/usr/bin:/usr/sbin";
NIX_SSL_CERT_FILE = "/nix/var/nix/profiles/default/etc/ssl/certs/ca-bundle.crt";
}; };
}; };
}; };
}; };
# DOCKER* chains are useless, override at FORWARD and nixos-fw
networking.firewall.extraCommands = ''
# Don't forward any traffic from docker
iptables -I FORWARD 1 -p all -i docker0 -j nixos-fw-log-refuse
# Allow incoming traffic from docker to 23080
iptables -A nixos-fw -p tcp -i docker0 -d hut --dport 23080 -j ACCEPT
'';
#systemd.services.gitlab-runner.serviceConfig.Shell = "${pkgs.bash}/bin/bash"; #systemd.services.gitlab-runner.serviceConfig.Shell = "${pkgs.bash}/bin/bash";
systemd.services.gitlab-runner.serviceConfig.DynamicUser = lib.mkForce false; systemd.services.gitlab-runner.serviceConfig.DynamicUser = lib.mkForce false;
systemd.services.gitlab-runner.serviceConfig.User = "gitlab-runner"; systemd.services.gitlab-runner.serviceConfig.User = "gitlab-runner";

View File

@ -46,7 +46,7 @@
services.prometheus = { services.prometheus = {
enable = true; enable = true;
port = 9001; port = 9001;
retentionTime = "1y"; retentionTime = "5y";
listenAddress = "127.0.0.1"; listenAddress = "127.0.0.1";
}; };
@ -76,7 +76,7 @@
group = "root"; group = "root";
user = "root"; user = "root";
configFile = config.age.secrets.ipmiYml.path; configFile = config.age.secrets.ipmiYml.path;
extraFlags = [ "--log.level=debug" ]; # extraFlags = [ "--log.level=debug" ];
listenAddress = "127.0.0.1"; listenAddress = "127.0.0.1";
}; };
node = { node = {
@ -250,6 +250,14 @@
module = [ "raccoon" ]; module = [ "raccoon" ];
}; };
} }
{
job_name = "raccoon";
static_configs = [
{
targets = [ "127.0.0.1:19002" ]; # Node exporter
}
];
}
{ {
job_name = "ipmi-fox"; job_name = "ipmi-fox";
metrics_path = "/ipmi"; metrics_path = "/ipmi";

View File

@ -12,16 +12,19 @@ let
installPhase = '' installPhase = ''
cp -r public $out cp -r public $out
''; '';
# Don't mess doc/
dontFixup = true;
}; };
in in
{ {
networking.firewall.allowedTCPPorts = [ 80 ];
services.nginx = { services.nginx = {
enable = true; enable = true;
virtualHosts."jungle.bsc.es" = { virtualHosts."jungle.bsc.es" = {
root = "${website}"; root = "${website}";
listen = [ listen = [
{ {
addr = "127.0.0.1"; addr = "0.0.0.0";
port = 80; port = 80;
} }
]; ];
@ -38,7 +41,7 @@ in
proxy_redirect http:// $scheme://; proxy_redirect http:// $scheme://;
} }
location /cache { location /cache {
rewrite ^/cache(.*) /$1 break; rewrite ^/cache/(.*) /$1 break;
proxy_pass http://127.0.0.1:5000; proxy_pass http://127.0.0.1:5000;
proxy_redirect http:// $scheme://; proxy_redirect http:// $scheme://;
} }

View File

@ -0,0 +1,10 @@
{ config, ... }:
{
nix.settings =
# Don't add hut as a cache to itself
assert config.networking.hostName != "hut";
{
substituters = [ "http://hut/cache" ];
trusted-public-keys = [ "jungle.bsc.es:pEc7MlAT0HEwLQYPtpkPLwRsGf80ZI26aj29zMw/HH0=" ];
};
}

View File

@ -27,22 +27,6 @@ let
done done
''; '';
prolog = pkgs.writeScript "prolog.sh" ''
#!/usr/bin/env bash
echo "hello from the prolog"
exit 0
'';
epilog = pkgs.writeScript "epilog.sh" ''
#!/usr/bin/env bash
echo "hello from the epilog"
exit 0
'';
in { in {
systemd.services.slurmd.serviceConfig = { systemd.services.slurmd.serviceConfig = {
# Kill all processes in the control group on stop/restart. This will kill # Kill all processes in the control group on stop/restart. This will kill
@ -59,14 +43,13 @@ in {
clusterName = "jungle"; clusterName = "jungle";
nodeName = [ nodeName = [
"owl[1,2] Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 Feature=owl" "owl[1,2] Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 Feature=owl"
"fox Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 Feature=fox" "fox Sockets=2 CoresPerSocket=96 ThreadsPerCore=1 Feature=fox"
"hut Sockets=2 CoresPerSocket=14 ThreadsPerCore=2" "hut Sockets=2 CoresPerSocket=14 ThreadsPerCore=2"
]; ];
partitionName = [ partitionName = [
"owl Nodes=owl[1-2] Default=YES DefaultTime=01:00:00 MaxTime=INFINITE State=UP" "owl Nodes=owl[1-2] Default=YES DefaultTime=01:00:00 MaxTime=INFINITE State=UP"
"fox Nodes=fox Default=NO DefaultTime=01:00:00 MaxTime=INFINITE State=UP" "fox Nodes=fox Default=NO DefaultTime=01:00:00 MaxTime=INFINITE State=UP"
"all Nodes=owl[1-2],hut Default=NO DefaultTime=01:00:00 MaxTime=INFINITE State=UP"
]; ];
# See slurm.conf(5) for more details about these options. # See slurm.conf(5) for more details about these options.

View File

@ -8,6 +8,7 @@
../module/slurm-client.nix ../module/slurm-client.nix
../module/slurm-firewall.nix ../module/slurm-firewall.nix
../module/debuginfod.nix ../module/debuginfod.nix
../module/hut-substituter.nix
]; ];
# Select the this using the ID to avoid mismatches # Select the this using the ID to avoid mismatches

View File

@ -8,6 +8,7 @@
../module/slurm-client.nix ../module/slurm-client.nix
../module/slurm-firewall.nix ../module/slurm-firewall.nix
../module/debuginfod.nix ../module/debuginfod.nix
../module/hut-substituter.nix
]; ];
# Select the this using the ID to avoid mismatches # Select the this using the ID to avoid mismatches

View File

@ -25,6 +25,11 @@
} ]; } ];
}; };
nix.settings = {
substituters = [ "https://jungle.bsc.es/cache" ];
trusted-public-keys = [ "jungle.bsc.es:pEc7MlAT0HEwLQYPtpkPLwRsGf80ZI26aj29zMw/HH0=" ];
};
# Configure Nvidia driver to use with CUDA # Configure Nvidia driver to use with CUDA
hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.production; hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.production;
hardware.graphics.enable = true; hardware.graphics.enable = true;

View File

@ -1,9 +1,11 @@
age-encryption.org/v1 age-encryption.org/v1
-> ssh-ed25519 HY2yRg 4Xns3jybBuv8flzd+h3DArVBa/AlKjt1J9jAyJsasCE -> ssh-ed25519 HY2yRg WSdjyQPzBJ4JbzQpGeq1AAYpWKoXmLI1ZtmNmM5QOzs
uyVjJxh5i8aGgAgCpPl6zTYeIkf9mIwURof51IKWvwE qGDlDT31DQF1DdHen0+5+52DdsQlabJdA2pOB5O1I6g
-> ssh-ed25519 CAWG4Q T2r6r1tyNgq1XlYXVtLJFfOfUnm6pSVlPwUqC1pkyRo -> ssh-ed25519 CAWG4Q wioWMDxQjN+d4JdIbCwZg0DLQu1OH2mV6gukRprjuAs
9yDoKU0EC34QMUXYnsJvhPCLm6oD9w7NlTi2sheoBqQ 670fE61hidOEh20hHiQAhP0+CjDF0WMBNzgwkGT8Yqg
-> ssh-ed25519 MSF3dg Bh9DekFTq+QMUEAonwcaIAJX4Js1O7cHjDniCD0gtm8 -> ssh-ed25519 MSF3dg DN19uvAEtqq4708P6HpuX9i/o/qAvHX6dj69dCF2H1o
t/Ro0URLeDUWcvb7rlkG2s03PZ+9Rr3N4TIX03tXpVc 4Lu9GnjiFLMeXJ2C7aVPJsCHCQVlhylNWJi896Av92s
--- E5+/D4aK2ihKRR4YC5XOTmUbKgOqBR0Nk0gYvFOzXOI --- 7cKBwOYNOUZ2h3/kAY09aSMASZSxX7hZIT4kvlIiT6w
‰ÀÍyKF~djº˜r%¸Š'ÉÓÖPä&_-lŸ”ö&o¥_ér¯¦r¢ÿß<C3BF>0ï,­U7†nC·Te…÷[fˆ97ü•…šÙ˦“ÈC!D±E<C2B1>Wé*ÐLAôx6¾#–¯ sqôiéËÆäÏŸ“åk ,ùÝ“ ³6—çà•äfQF5=¦bX+‡v e`Ï7ªA~PÎÖѦ7<15>Ì
´ÖA÷)·h³ù=oZ¸ ^´V0ñ/Ü…µr
k¸uœbĶ:R<52>>^gŒõ¼ik_*% <0B>a7ùKGæ<47>ÐÖçâ&­PI¶£n

View File

@ -13,6 +13,115 @@ which is available at `hut` or `xeon07`. It runs the following services:
- Grafana: to plot the data in the web browser. - Grafana: to plot the data in the web browser.
- Slurmctld: to manage the SLURM nodes. - Slurmctld: to manage the SLURM nodes.
- Gitlab runner: to run CI jobs from Gitlab. - Gitlab runner: to run CI jobs from Gitlab.
- Nix binary cache: to serve cached nix builds
This node is prone to interruptions from all the services it runs, so it is not This node is prone to interruptions from all the services it runs, so it is not
a good candidate for low noise executions. a good candidate for low noise executions.
# Binary cache
We provide a binary cache in `hut`, with the aim of avoiding unnecessary
recompilation of packages.
The cache should contain common packages from bscpkgs, but we don't provide
any guarantee that of what will be available in the cache, or for how long.
We recommend following the latest version of the `jungle` flake to avoid cache
misses.
## Usage
### From NixOS
In NixOS, we can add the cache through the `nix.settings` option, which will
enable it for all builds in the system.
```nix
{ ... }: {
nix.settings = {
substituters = [ "https://jungle.bsc.es/cache" ];
trusted-public-keys = [ "jungle.bsc.es:pEc7MlAT0HEwLQYPtpkPLwRsGf80ZI26aj29zMw/HH0=" ];
};
}
```
### Interactively
The cache can also be specified in a per-command basis through the flags
`--substituters` and `--trusted-public-keys`:
```sh
nix build --substituters "https://jungle.bsc.es/cache" --trusted-public-keys "jungle.bsc.es:pEc7MlAT0HEwLQYPtpkPLwRsGf80ZI26aj29zMw/HH0=" <...>
```
Note: you'll have to be a trusted user.
### Nix configuration file (non-nixos)
If using nix outside of NixOS, you'll have to update `/etc/nix/nix.conf`
```
# echo "substituters = https://jungle.bsc.es/cache" >> /etc/nix/nix.conf
# echo "trusted-public-keys = jungle.bsc.es:pEc7MlAT0HEwLQYPtpkPLwRsGf80ZI26aj29zMw/HH0=" >> /etc/nix/nix.conf
```
### Hint in flakes
By adding the configuration below to a `flake.nix`, when someone uses the flake,
`nix` will interactively ask to trust and use the provided binary cache:
```nix
{
nixConfig = {
extra-substituters = [
"https://jungle.bsc.es/cache"
];
extra-trusted-public-keys = [
"jungle.bsc.es:pEc7MlAT0HEwLQYPtpkPLwRsGf80ZI26aj29zMw/HH0="
];
};
outputs = { ... }: {
...
};
}
```
### Querying the cache
Check if the cache is available:
```sh
$ curl https://jungle.bsc.es/cache/nix-cache-info
StoreDir: /nix/store
WantMassQuery: 1
Priority: 30
```
Prevent nix from building locally:
```bash
nix build --max-jobs 0 <...>
```
Check if a package is in cache:
```bash
# Do a raw eval on the <package>.outPath (this should not build the package)
$ nix eval --raw jungle#openmp.outPath
/nix/store/dwnn4dgm1m4184l4xbi0qfrprji9wjmi-openmp-2024.11
# Take the hash (everything from / to - in the basename) and curl <hash>.narinfo
# if it exists in the cache, it will return HTTP 200 and some information
# if not, it will return 404
$ curl https://jungle.bsc.es/cache/dwnn4dgm1m4184l4xbi0qfrprji9wjmi.narinfo
StorePath: /nix/store/dwnn4dgm1m4184l4xbi0qfrprji9wjmi-openmp-2024.11
URL: nar/dwnn4dgm1m4184l4xbi0qfrprji9wjmi-17imkdfqzmnb013d14dx234bx17bnvws8baf3ii1xra5qi2y1wiz.nar
Compression: none
NarHash: sha256:17imkdfqzmnb013d14dx234bx17bnvws8baf3ii1xra5qi2y1wiz
NarSize: 1519328
References: 4gk773fqcsv4fh2rfkhs9bgfih86fdq8-gcc-13.3.0-lib nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36
Deriver: vcn0x8hikc4mvxdkvrdxp61bwa5r7lr6-openmp-2024.11.drv
Sig: jungle.bsc.es:GDTOUEs1jl91wpLbb+gcKsAZjpKdARO9j5IQqb3micBeqzX2M/NDtKvgCS1YyiudOUdcjwa3j+hyzV2njokcCA==
# In oneline:
$ curl "https://jungle.bsc.es/cache/$(nix eval --raw jungle#<package>.outPath | cut -d '/' -f4 | cut -d '-' -f1).narinfo"
```
#### References
- https://nix.dev/guides/recipes/add-binary-cache.html
- https://nixos.wiki/wiki/Binary_Cache