ea49d762d1
MERGEME: Only expose proxy to docker
2025-02-17 16:12:46 +01:00
ab82757b42
MERGEME: Increase log
2025-02-17 15:18:59 +01:00
58ab131553
MERGEME: load cacert setup hook
2025-02-17 15:04:06 +01:00
e9cc635b8a
MERGEME: Use system packages
2025-02-17 14:46:52 +01:00
bc8b79566b
MERGEME: Add docker,hut tags
2025-02-17 14:34:29 +01:00
3f4e282113
MERGEME: Use docker extra hosts
2025-02-17 14:28:03 +01:00
c9f1712986
MERGEME: Disable debug in ipmi monitoring
2025-02-17 14:15:47 +01:00
6004a3351d
Don't move doc in web output
2025-02-17 14:15:47 +01:00
66d10319f8
Add quickstart guide
2025-02-17 14:15:47 +01:00
93dc0aed33
Reject SSH connections without SLURM allocation
2025-02-17 14:15:47 +01:00
0cc1ae9fa7
Add users to fox
2025-02-17 14:15:47 +01:00
5ac9c70dd7
Add dalvare1 user
2025-02-17 14:15:47 +01:00
5253ebec9e
Add fox page in jungle website
2025-02-17 14:15:47 +01:00
1fe44faa6d
Mount NVME disks in /nvme{0,1}
2025-02-17 14:15:46 +01:00
b1fcd1e128
Exclude fox from being suspended by slurm
2025-02-17 14:15:46 +01:00
edf1f3d239
Use IPMI host names instead of IP addresses
2025-02-17 14:15:46 +01:00
878ee61734
Add fox IPMI monitoring
...
Use agenix to store the credentials safely.
2025-02-17 14:15:46 +01:00
f09556a26f
Add new fox machine
2025-02-17 14:15:46 +01:00
98cac8b086
Add new GitLab runner for gitlab.bsc.es
...
It uses docker based on alpine and the host nix store, so we can perform
builds but isolate them from the system.
2025-02-17 14:15:37 +01:00
587caf262e
Update PM GitLab tokens to new URL
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 15:43:13 +01:00
2730404ca5
Fix MPICH build by fetching upstream patches too
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 15:43:13 +01:00
84db5e6fd6
Fix papermod theme in website for new hugo
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 15:43:13 +01:00
f4f34a3159
flake.lock: Update
...
Flake lock file updates:
• Updated input 'agenix':
'github:ryantm/agenix/de96bd907d5fbc3b14fc33ad37d1b9a3cb15edc6' (2024-07-09)
→ 'github:ryantm/agenix/f6291c5935fdc4e0bef208cfc0dcab7e3f7a1c41' (2024-08-10)
• Updated input 'bscpkgs':
'git+https://git.sr.ht/~rodarima/bscpkgs?ref=refs/heads/master&rev=de89197a4a7b162db7df9d41c9d07759d87c5709' (2024-04-24)
→ 'git+https://git.sr.ht/~rodarima/bscpkgs?ref=refs/heads/master&rev=6782fc6c5b5a29e84a7f2c2d1064f4bcb1288c0f' (2024-11-29)
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/693bc46d169f5af9c992095736e82c3488bf7dbb' (2024-07-14)
→ 'github:NixOS/nixpkgs/9c6b49aeac36e2ed73a8c472f1546f6d9cf1addc' (2025-01-14)
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 15:43:13 +01:00
91b8b4a3c5
Set nixpkgs to track nixos-24.11
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 15:43:13 +01:00
6cad205269
Add script to monitor GPFS
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 15:43:07 +01:00
c57bf76969
Add BSC machines to ssh config
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:51 +01:00
ad4b615211
Collect statistics from logged users
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:48 +01:00
b4518b59cf
Add custom GPFS exporter for MN5
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:46 +01:00
45dc4124a3
Remove exception to fetch task endpoint
...
It causes the request to go to the website rather than the Gitea
service.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:43 +01:00
bdfe9a48fd
Use SSD for boot, then switch to NVME
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:40 +01:00
1b337d31f8
Use NVME as root
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:37 +01:00
717cd5a21e
Keep host header for Grafana requests
...
This was breaking requests due to CSRF check.
See: https://github.com/grafana/grafana/issues/45117#issuecomment-1033842787
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:32 +01:00
def5955614
Ignore logging requests from the gitea runner
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:28 +01:00
0e3c975cb5
Log the client IP not the proxy
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:22 +01:00
93189a575e
Ignore misc directory
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:19 +01:00
36592c44eb
Create paste directories in /ceph/p
...
Ensure that all hut users have a paste directory in /ceph/p owned by
themselves. We need to wait for the ceph mount point to create them, so
we use a systemd service that waits for the remote-fs.target.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:16 +01:00
a34e3752a2
Add paste documentation in jungle website
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:13 +01:00
0d2dea94fb
Add p command to paste files
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:10 +01:00
7f539d7e06
Use nginx to serve website and other services
...
Instead of using multiple tunnels to forward all our services to the VM
that serves jungle.bsc.es, just use nginx to redirect the traffic from
hut. This allows adding custom rules for paths that are not possible
otherwise.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:23:07 +01:00
f8ec090836
Mount the NVME disk in /nvme
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2025-01-16 14:22:58 +01:00
9a9161fc55
Delay nix-gc until /home is mounted
...
Prevents the garbage collector from starting before the remote
filesystems are mounted, in particular /home. Otherwise, all the gcroots
which have symlinks in /home will be considered stale and removed.
See: #79
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-09-20 09:45:30 +02:00
1a0cf96fc4
Add dbautist user with access to hut
...
Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
2024-09-20 09:42:02 +02:00
4bd1648074
Set the serial console to ttyS1 in raccoon
...
Apparently the ttyS0 console doesn't exist but ttyS1 does:
raccoon% sudo stty -F /dev/ttyS0
stty: /dev/ttyS0: Input/output error
raccoon% sudo stty -F /dev/ttyS1
speed 9600 baud; line = 0;
-brkint -imaxbel
The dmesg line agrees:
00:03: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A
The console configuration is then moved from base to xeon to allow
changing it for the raccoon machine.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:56 +02:00
15b114ffd6
Remove setLdLibraryPath and driSupport options
...
They have been removed from NixOS. The "hardware.opengl" group is now
renamed to "hardware.graphics".
See: 98cef4c273
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:53 +02:00
dd6d8c9735
Add documentation section about GRUB chain loading
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:47 +02:00
e15a3867d4
Add 10 min shutdown jitter to avoid spikes
...
The shutdown timer will fire at slightly different times for the
different nodes, so we slowly decrease the power consumption.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:44 +02:00
5cad208de6
Don't mount the nix store in owl nodes
...
Initially we planned to run jobs in those nodes by sharing the same nix
store from hut. However, these nodes are now used to build packages
which are not available in hut. Users also ssh to the nodes, which
doesn't mount the hut store, so it doesn't make much sense to keep
mounting it.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:42 +02:00
c8687f7e45
Emulate other architectures in owl nodes too
...
Allows cross-compilation of packages for RISC-V that are known to try to
run RISC-V programs in the host.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:39 +02:00
d988ef2eff
Program shutdown for August 2nd for all machines
...
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:36 +02:00
b07929eab3
Enable debuginfod daemon in owl nodes
...
WARNING: This will introduce noise, as the daemon wakes up from time to
time to check for new packages.
Reviewed-by: Aleix Boné <abonerib@bsc.es>
2024-09-12 08:36:30 +02:00