Update to NixOS 24.11, monitor GPFS, add paste service and switch to nginx #82

Closed
rarias wants to merge 0 commits from update-nixos-24.11 into old-master
Owner

Several changes:

  • Switch the main disk to the NVME device in Hut
  • Move the web server to Hut using nginx
  • Add paste service (p)
  • Add script to monitor the GPFS in Grafana
  • Update to NixOS 24.11
  • Fix MPICH, gitlab-runner, hugo theme

The update seems to be working fine in Hut.

rarias added 21 commits 2025-01-15 14:51:31 +01:00
Instead of using multiple tunnels to forward all our services to the VM
that serves jungle.bsc.es, just use nginx to redirect the traffic from
hut. This allows adding custom rules for paths that are not possible
otherwise.
Ensure that all hut users have a paste directory in /ceph/p owned by
themselves. We need to wait for the ceph mount point to create them, so
we use a systemd service that waits for the remote-fs.target.
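The directory setup described above could look roughly like the following shell sketch. The helper name `make_paste_dirs` and the way the user list is passed in are assumptions for illustration; the actual service in this PR is defined in Nix and may differ.

```shell
#!/bin/sh
# Sketch only: make_paste_dirs is a hypothetical helper, not code from this PR.
# Usage: make_paste_dirs ROOT USER...
make_paste_dirs() {
    root=$1
    shift
    for user in "$@"; do
        dir="$root/$user"
        # -p makes the call idempotent, so re-running the service is harmless.
        mkdir -p "$dir" || return 1
        # Only root can hand a directory over to another user; when run
        # unprivileged (e.g. while testing) the ownership step is skipped.
        if [ "$(id -u)" -eq 0 ]; then
            chown "$user" "$dir" || echo "warning: cannot chown $dir to $user" >&2
        fi
    done
}
```

On hut this would presumably be called with `/ceph/p` as the root and the list of hut users, from a unit ordered after `remote-fs.target` so that the Ceph mount exists before any directory is created.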
This was breaking requests due to CSRF check.

See: https://github.com/grafana/grafana/issues/45117#issuecomment-1033842787
It causes the request to go to the website rather than the Gitea
service.
Flake lock file updates:

• Updated input 'agenix':
    'github:ryantm/agenix/de96bd907d5fbc3b14fc33ad37d1b9a3cb15edc6' (2024-07-09)
  → 'github:ryantm/agenix/f6291c5935fdc4e0bef208cfc0dcab7e3f7a1c41' (2024-08-10)
• Updated input 'bscpkgs':
    'git+https://git.sr.ht/~rodarima/bscpkgs?ref=refs/heads/master&rev=de89197a4a7b162db7df9d41c9d07759d87c5709' (2024-04-24)
  → 'git+https://git.sr.ht/~rodarima/bscpkgs?ref=refs/heads/master&rev=6782fc6c5b5a29e84a7f2c2d1064f4bcb1288c0f' (2024-11-29)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/693bc46d169f5af9c992095736e82c3488bf7dbb' (2024-07-14)
  → 'github:NixOS/nixpkgs/9c6b49aeac36e2ed73a8c472f1546f6d9cf1addc' (2025-01-14)
rarias requested review from arocanon 2025-01-15 14:57:30 +01:00
rarias requested review from abonerib 2025-01-15 14:57:30 +01:00
rarias changed title from Update to NixOS 24.05, monitor GPFS, add paste service and switch to nginx to Update to NixOS 24.11, monitor GPFS, add paste service and switch to nginx 2025-01-15 14:58:46 +01:00
rarias force-pushed update-nixos-24.11 from 6995ce4554 to 8c4e4216ba 2025-01-15 16:46:19 +01:00
abonerib requested changes 2025-01-15 18:04:24 +01:00
Dismissed
abonerib left a comment
Collaborator

LGTM

I left a few suggestions, but it's mostly nitpicking.

@ -0,0 +18,4 @@
systemd.services.gpfs-probe = {
description = "Daemon to report GPFS latency via SSH";
path = [ pkgs.openssh pkgs.netcat ];
Collaborator

We could use `writeShellApplication` for `gpfs-probe-script` and put these as `runtimeInputs`.
Author
Owner

Yeah, but I also use the script manually so I want it to be in the FS as-is so I can modify it without waiting for Nix to regenerate it.

abonerib marked this conversation as resolved
@ -0,0 +23,4 @@
wantedBy = [ "default.target" ];
serviceConfig = {
Type = "simple";
ExecStart = "${pkgs.socat}/bin/socat -d2 TCP4-LISTEN:9966,fork EXEC:${gpfs-probe-script}";
Collaborator

`${lib.getExe pkgs.socat}`
Author
Owner

I don't think the binary will change any time soon. And if it does, I prefer to still change the path manually rather than relying on some auto-magic nix crap.

abonerib marked this conversation as resolved
@ -0,0 +2,4 @@
N=500
t=$(timeout 5 ssh bsc015557@glogin2.bsc.es "timeout 3 command time -f %e touch /gpfs/projects/bsc15/bsc015557/gpfs.{1..$N} 2>&1; rm -f /gpfs/projects/bsc15/bsc015557/gpfs.{1..$N}")
Collaborator

`timeout 8 ssh -o ConnectTimeout=5`
Author
Owner

Yeah, ConnectTimeout doesn't work when the home is fucked. The first timeout 5 also cuts the connection if the host is down so we don't need ConnectTimeout at all. I initially used 10 seconds, but I believe this is the prometheus timeout, so 5 seems safer. In any case, >1s already means the FS is broken.

abonerib marked this conversation as resolved
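To illustrate the point about the outer `timeout`: it bounds the whole command no matter which phase hangs, which is why a separate connection timeout adds little here. A minimal sketch, assuming GNU coreutils `timeout`; `probe` is a hypothetical wrapper and `sleep 10` stands in for an ssh session that never returns.

```shell
#!/bin/sh
# Sketch only: probe is a hypothetical wrapper, not the script from this PR.
# Usage: probe SECONDS COMMAND [ARGS...]
probe() {
    secs=$1
    shift
    # The outer timeout kills the command once SECONDS have elapsed,
    # whether it is stuck connecting, logging in, or touching files.
    timeout "$secs" "$@"
}

# A hung "session": killed after 1 second.
rc=0
probe 1 sleep 10 || rc=$?
echo "hung probe exit status: $rc"      # 124 with GNU timeout

# A healthy "session": the command's own exit status is passed through.
rc=0
probe 5 true || rc=$?
echo "healthy probe exit status: $rc"   # 0
```

A caller can treat status 124 the same as a latency above the threshold, since either way the filesystem did not answer in time.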
m/hut/nginx.nix Outdated
@ -0,0 +2,4 @@
let
website = pkgs.stdenv.mkDerivation {
name = "jungle-web";
src = theFlake;
Collaborator

Is there any reason for not using `../../web`?
Author
Owner

I believe it avoids copying the website twice in the store.

abonerib marked this conversation as resolved
abonerib approved these changes 2025-01-16 12:47:03 +01:00
rarias force-pushed update-nixos-24.11 from 8c4e4216ba to fd530493d0 2025-01-16 14:24:26 +01:00
rarias force-pushed update-nixos-24.11 from fd530493d0 to 587caf262e 2025-01-16 16:00:30 +01:00
Author
Owner

Merged in https://jungle.bsc.es/git/rarias/jungle/commit/587caf262e7154c2da58229cfbad00fee4b5503a, not sure why gitea doesn't detect it. Closing.
rarias closed this pull request 2025-01-16 16:21:49 +01:00
rarias deleted branch update-nixos-24.11 2025-01-16 16:22:05 +01:00

Reference: rarias/jungle#82