jungle

Author	SHA1	Message	Date
Rodrigo Arias Mallo	35a94a9b02	Enable runners for pm.bsc.es/gitlab too The old runners for the PM gitlab were disabled in configuration in the last outage, but they remained working until we rebooted the node. With this change we enable the runners for both PM and gitlab.bsc.es. Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-11-24 14:45:23 +01:00
Rodrigo Arias Mallo	b6bd31e159	Remove complete ceph package from hut Only the ceph-client is needed. Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-11-24 12:58:54 +01:00
Rodrigo Arias Mallo	dd341902fc	BSC packages are no longer in bsc attribute Reviewed-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-11-09 13:40:48 +01:00
Rodrigo Arias Mallo	2953080fb8	Monitor anella instead of gw.bsc.es The target gw.bsc.es doesn't reply to our ICMP probes from hut. However, the anella hop in the tracepath is a good candidate to identify cuts between the login and the provider and between the provider and external hosts like Google or Cloudflare DNS. Reviewed-By: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-10-27 12:46:08 +02:00
Rodrigo Arias Mallo	9871517be2	Add ICMP probes These probes check if we can reach several targets via ICMP, which is not proxied, so they can be used to see if ICMP forwarding is working in the login node. In particular, we test if we can reach the Google (8.8.8.8) and Cloudflare (1.1.1.1) DNS servers, the BSC gateway which responds to ping only from the intranet and the login node (ssfhead). Reviewed-By: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-10-25 17:13:03 +02:00
Rodrigo Arias Mallo	736eacaac5	Enable proxy for Grafana too The alerts need to contact the slack endpoint, so we add the proxy environment variables to the grafana systemd service. Reviewed-By: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-10-25 16:55:56 +02:00
Rodrigo Arias Mallo	0e66aad099	Make blackbox exporter use the proxy By default it was trying to reach the targets using the default gateway, but since the electrical cut of 2023-10-20, the login node has not enabled forwarding again. So better if we don't rely on it. Reviewed-By: Aleix Roca Nonell <aleix.rocanonell@bsc.es>	2023-10-25 16:55:24 +02:00
Rodrigo Arias Mallo	67a4905a0a	Don't log SLURM connection attempts from ssfhead	2023-10-06 15:22:04 +02:00
Rodrigo Arias Mallo	d52d22e0db	Add docker runner too	2023-10-06 15:17:07 +02:00
Rodrigo Arias Mallo	42920c2521	Monitor gitlab.bsc.es too	2023-10-06 15:17:07 +02:00
Rodrigo Arias Mallo	4acd35e036	Monitor PM webpage via blackbox	2023-10-06 15:17:07 +02:00
Rodrigo Arias Mallo	621d20db3a	Temporarily disable pm runners	2023-10-06 15:17:07 +02:00
Rodrigo Arias Mallo	0926f6ec1f	Add runner for gitlab.bsc.es	2023-10-06 15:17:07 +02:00
Rodrigo Arias Mallo	61646cb3bd	Allow anonymous access to grafana	2023-09-22 10:51:30 +02:00
Rodrigo Arias Mallo	c0066c4744	Remove user/group when using DynamicUsers	2023-09-22 10:13:06 +02:00
Rodrigo Arias Mallo	ffd0593f51	Set the SLURM_CONF variable	2023-09-21 22:22:00 +02:00
Rodrigo Arias Mallo	f49ae0773e	Enable slurm-exporter service	2023-09-21 21:40:02 +02:00
Rodrigo Arias Mallo	8de3d2b149	Mount the hut nix store for SLURM jobs	2023-09-20 19:38:43 +02:00
Rodrigo Arias Mallo	bc62e28ca3	Enable direnv integration	2023-09-20 09:32:58 +02:00
Rodrigo Arias Mallo	653d411b9e	Remove bscpkgs from the registry and nixPath This is done to prevent accidental evaluations where the nixpkgs input of bscpkgs is still pointing to a different version than the one specified in the jungle flake. Instead use jungle#bscpkgs.X to get a package from bscpkgs.	2023-09-15 12:00:33 +02:00
Rodrigo Arias Mallo	a1e8cfea47	Don't fetch registry flakes from the net	2023-09-15 12:00:28 +02:00
Rodrigo Arias Mallo	e88805947e	Open ports in firewall of compute nodes	2023-09-14 15:45:43 +02:00
Rodrigo Arias Mallo	d9d249411d	Monitor storage nodes via IPMI too	2023-09-13 15:57:13 +02:00
Rodrigo Arias Mallo	10ca572aec	Enable fstrim service	2023-09-12 16:39:45 +02:00
Rodrigo Arias Mallo	75b0f48715	Serve the nix store from hut	2023-09-12 12:19:43 +02:00
Rodrigo Arias Mallo	19a451db77	Add encrypted munge key with agenix	2023-09-08 19:05:45 +02:00
Rodrigo Arias Mallo	ec9be9bb62	Remove unused large port hole in firewall	2023-09-08 18:22:48 +02:00
Rodrigo Arias Mallo	7ddd1977f3	Make exporters listen in localhost only	2023-09-08 18:13:04 +02:00
Rodrigo Arias Mallo	7050c505b5	Allow only some ports for srun	2023-09-08 17:51:37 +02:00
Rodrigo Arias Mallo	033a1fe97b	Block ssfhead from reaching our slurm daemon	2023-09-08 17:36:28 +02:00
Rodrigo Arias Mallo	77cb3c494e	Poweroff idle slurm nodes after 1 hour	2023-09-08 16:49:53 +02:00
Rodrigo Arias Mallo	6db5772ac4	Add IB and IPMI node host names	2023-09-08 13:21:37 +02:00
Rodrigo Arias Mallo	dca274d020	Unlock ovni gitlab runners	2023-09-05 16:59:45 +02:00
Rodrigo Arias Mallo	02f40a8217	Add agenix to all nodes	2023-09-04 22:10:43 +02:00
Rodrigo Arias Mallo	77d43b6da9	Add agenix module to ceph	2023-09-04 22:07:07 +02:00
Rodrigo Arias Mallo	ab55aac5ff	Remove old secrets	2023-09-04 22:04:32 +02:00
Rodrigo Arias Mallo	9b5bfbb7a3	Mount /ceph in owl1 and owl2	2023-09-04 22:00:36 +02:00
Rodrigo Arias Mallo	a69a71d1b0	Warn about the owl2 omnipath device	2023-09-04 22:00:17 +02:00
Rodrigo Arias Mallo	98374bd303	Clean owl2 configuration	2023-09-04 21:59:56 +02:00
Rodrigo Arias Mallo	3b6be8a2fc	Move the ceph client config to an external module	2023-09-04 21:59:04 +02:00
Rodrigo Arias Mallo	2bb366b9ac	Reorganize secrets and ssh keys The agenix tool needs to read the secrets from a standalone file, but we also need the same information for the SSH keys.	2023-09-04 21:36:31 +02:00
Rodrigo Arias Mallo	2d16709648	Add anavarro user	2023-09-04 16:00:01 +02:00
Rodrigo Arias Mallo	9344daa31c	Set zsh inc_append_history option	2023-09-03 16:57:53 +02:00
Rodrigo Arias Mallo	80c98041b5	Set zsh shell for rarias	2023-09-03 16:46:27 +02:00
Rodrigo Arias Mallo	3418e57907	Enable zsh and fix key bindings	2023-09-03 16:42:04 +02:00
Rodrigo Arias Mallo	6848b58e39	Keep a log over time with the config commits	2023-09-03 00:02:14 +02:00
Rodrigo Arias Mallo	f9c77b433a	Store nixos config in /etc/nixos/config.rev	2023-09-02 23:37:11 +02:00
Rodrigo Arias Mallo	9d487845f6	Enable binary emulation for other architectures	2023-08-31 17:27:08 +02:00
Rodrigo Arias Mallo	3c99c2a662	Enable watchdog	2023-08-30 16:32:17 +02:00
Rodrigo Arias Mallo	7d09108c9f	Enable all osd on boot in lake2	2023-08-30 16:32:17 +02:00


109 Commits