Delay shutdown to August 3rd at 22h and turn on when power is back automatically #152

Manually merged
rarias merged 4 commits from adjust-august-shutdown into master 2025-07-24 11:25:02 +02:00
Owner

Fixes #151

Delay the August shutdown to Sunday, August 3rd at 22:00 and configure all machines to power on as soon as power returns. Fox won't be shut down by the timer; instead it will go down when the power is cut, as it can recover properly from power cuts.

The power-policy service is working as expected:

apex% sudo ipmitool chassis status | grep 'Power Restore Policy'
Power Restore Policy : previous
apex% sudo nixos-rebuild test --flake .
...
apex% systemctl status power-policy
...
Jul 23 15:28:47 apex ipmitool[77904]: Set chassis power restore policy to always-on
apex% sudo ipmitool chassis status | grep 'Power Restore Policy'
Power Restore Policy : always-on

Same for the shutdown timer:

apex% systemctl status august-shutdown.timer
● august-shutdown.timer - Shutdown on August 3rd for maintenance
     Loaded: loaded (/etc/systemd/system/august-shutdown.timer; enabled; preset: ignored)
     Active: active (waiting) since Wed 2025-07-23 13:46:25 CEST; 1h 51min ago
 Invocation: 38476aeeafc84268b344fdc7b685c8cb
    Trigger: Sun 2025-08-03 22:09:31 CEST; 1 week 4 days left
   Triggers: ● systemd-poweroff.service

Jul 23 13:46:25 apex systemd[1]: Started Shutdown on August 3rd for maintenance.
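
For reference, a timer matching the status output above might be declared roughly as follows. This is a minimal sketch, not the code from this PR: the unit name, description and triggered unit are taken from the output, while the `OnCalendar` expression and `RandomizedDelaySec` are assumptions (a randomized delay is one possible explanation for the 22:09:31 trigger time).

```nix
# Sketch only: the calendar expression and delay are assumptions, not the merged code.
{ ... }:
{
  systemd.timers.august-shutdown = {
    description = "Shutdown on August 3rd for maintenance";
    wantedBy = [ "timers.target" ];
    timerConfig = {
      # Fire once on Sunday, August 3rd 2025 at 22:00 local time.
      OnCalendar = "2025-08-03 22:00:00";
      # Assumed: spread the poweroff over a few minutes across machines.
      RandomizedDelaySec = "10min";
      # Activate the stock poweroff unit instead of a same-named service.
      Unit = "systemd-poweroff.service";
    };
  };
}
```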
rarias added 4 commits 2025-07-23 15:39:39 +02:00
The UPC has different dates for the yearly power cut, and Fox can
recover properly from a power loss, so we don't need to have it turned
off before the power cut. Simply disabling the timer is enough.
On all machines, turn the machine back on as soon as power is
restored. We cannot rely on the previous state, as we will shut them
down before the power is cut to prevent damage to the power supply
monitoring circuit.
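
Taken together, the two commit messages above suggest a per-host configuration along these lines. This is a hedged sketch for Fox: `power.policy` is the option reviewed in this PR, and the timer name comes from the status output in the description. Where these settings actually live in the repository is an assumption.

```nix
# Hypothetical host configuration for Fox; file layout is an assumption.
{ ... }:
{
  # Applies to every machine: power back on as soon as mains power returns,
  # regardless of the state the machine was in before the cut.
  power.policy = "always-on";

  # Fox only: skip the scheduled poweroff, since it recovers from power loss.
  systemd.timers.august-shutdown.enable = false;
}
```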
rarias requested review from arocanon 2025-07-23 15:39:44 +02:00
rarias requested review from abonerib 2025-07-23 15:39:44 +02:00
abonerib reviewed 2025-07-23 16:20:11 +02:00
@@ -0,0 +8,4 @@
{
options = {
power.policy = mkOption {
type = lib.types.nullOr (types.enum [ "always-on" "previous" "always-off" ]);
Collaborator

`lib.` is redundant here
rarias marked this conversation as resolved
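
For context, the option and the service under review might fit together as in the sketch below. Only the option type, `ExecStart`, `Type`, `Restart` and `RestartSec` come from the diff excerpts in this PR. The `default`, `description`, `wantedBy` and the `mkIf` guard are assumptions added to make the fragment self-contained.

```nix
# Sketch of a power-policy module; everything not in the diff excerpts is assumed.
{ config, lib, pkgs, ... }:

let
  inherit (lib) mkIf mkOption types;
  cfg = config.power.policy;
in
{
  options = {
    power.policy = mkOption {
      # With `types` in scope, the `lib.` prefix is not needed.
      type = types.nullOr (types.enum [ "always-on" "previous" "always-off" ]);
      # Assumed semantics: null leaves the BMC setting untouched.
      default = null;
      description = "Chassis power restore policy to set through IPMI.";
    };
  };

  # Only create the service when a policy has been chosen.
  config = mkIf (cfg != null) {
    systemd.services.power-policy = {
      description = "Set the chassis power restore policy";
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        ExecStart = "${pkgs.ipmitool}/bin/ipmitool chassis policy ${cfg}";
        Type = "oneshot";
        Restart = "on-failure";
        RestartSec = "5s";
      };
    };
  };
}
```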
abonerib reviewed 2025-07-23 16:22:41 +02:00
@@ -0,0 +22,4 @@
ExecStart = "${pkgs.ipmitool}/bin/ipmitool chassis policy ${cfg}";
Type = "oneshot";
Restart = "on-failure";
RestartSec = "5s";
Collaborator

Should we limit the restart attempts?
Author
Owner

Maybe `StartLimitBurst=10` and `StartLimitIntervalSec=10m`, so it can fail up to 10 times in 10 minutes? It may fail if it collides with the prometheus probe which is also using the IPMI interface, but should work after a few tries. I would choose a long interval so it can take a while to fail.
abonerib marked this conversation as resolved
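
The limits proposed in this thread could be expressed in NixOS roughly as follows. This is a sketch under the assumption that the service is named `power-policy`, as in the `systemctl status` call in the description. It is not necessarily what was merged.

```nix
# Sketch only: start-rate limiting for the power-policy service.
{ ... }:
{
  systemd.services.power-policy = {
    # These map to StartLimitBurst= and StartLimitIntervalSec=, which systemd
    # reads from the [Unit] section rather than [Service].
    startLimitBurst = 10;
    # 10 minutes, expressed in seconds.
    startLimitIntervalSec = 600;
  };
}
```

With `RestartSec = "5s"`, ten quickly failing attempts would likely be used up in about a minute, after which systemd stops retrying. A longer `RestartSec` would spread the attempts over more of the interval.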
arocanon approved these changes 2025-07-23 16:56:28 +02:00
rarias force-pushed adjust-august-shutdown from e3ae73092f to c50d24062f 2025-07-23 17:01:32 +02:00
abonerib approved these changes 2025-07-23 17:23:56 +02:00
rarias force-pushed adjust-august-shutdown from c50d24062f to 8f7787e217 2025-07-24 11:23:06 +02:00
rarias manually merged commit 8f7787e217 into master 2025-07-24 11:25:02 +02:00
Reference: rarias/jungle#152