Transition to a ceph nix store #42
As discussed with Aleix and Vicenç, we would benefit from placing the nix store directly in the ceph filesystem and letting the compute nodes mount it at /nix/store when they boot. This would solve the cache problems of the overlay FS observed in #41 and, at the same time, prepare the path to export the nix store to other nodes (ahem, MN4/5). It also gives the nix store more room and makes it more robust (3 redundant copies).
The nodes can boot directly from the network via PXE, so we don't have to worry about their disk state (they are essentially stateless). However, we must ensure that they don't write into the nix database. We can achieve this by mounting the nix store read-only.
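As a rough way to verify the read-only requirement on a node, the sketch below parses text in the /proc/mounts format and reports whether a mount point carries the ro option. The function name and the sample entry (including its source field and options) are made up for illustration and are not taken from the actual nodes.

```python
from typing import Optional


def is_mounted_read_only(mounts_text: str, mount_point: str) -> Optional[bool]:
    """Return True/False for the last mount found at mount_point, None if absent.

    mounts_text is expected in the /proc/mounts layout:
    <source> <mount point> <fstype> <options> <dump> <pass>
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep overwriting.
            result = "ro" in fields[3].split(",")
    return result


# Made-up sample entry; the real source and options will differ.
SAMPLE = "example-source /nix/store ceph ro,relatime 0 0\n"
print(is_mounted_read_only(SAMPLE, "/nix/store"))
```

On a real node the same function could be fed the contents of /proc/mounts instead of the sample string.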
But we would need to be able to build some packages from inside the compute nodes (especially for debugging purposes), so they must be able to write to the store via the nix daemon on hut. This is probably doable, as we already configured something similar for MN4.
Here is roughly the plan:
assigned to @rarias
The nix store in ceph doesn't really need 3 redundant copies, as the data stored there can be easily recovered, so let's create another pool just for the nix store with just 2 copies.
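To put a number on the saving, here is a back-of-the-envelope sketch of the raw space a replicated pool consumes. It assumes plain replication (raw usage is the logical size times the number of copies) and ignores any other Ceph overhead. The 1 TiB store size is a made-up figure, not a measurement.

```python
def raw_bytes_needed(logical_bytes: int, replicas: int) -> int:
    """Raw space consumed when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return logical_bytes * replicas


TIB = 1024 ** 4
store_size = 1 * TIB  # hypothetical nix store size

with_three = raw_bytes_needed(store_size, 3)
with_two = raw_bytes_needed(store_size, 2)
print(f"3 copies: {with_three // TIB} TiB raw")
print(f"2 copies: {with_two // TIB} TiB raw")
print(f"saved:    {(with_three - with_two) // TIB} TiB raw")
```

Under these assumptions, going from 3 to 2 copies frees one store's worth of raw space.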
Using a remote store seems to allow building:
But not using a shell:
Using this findmnt:
With that mount and the socat hack to allow access to the hut daemon, I'm able to build and enter a shell. So far, so good.
The gcroots are created by the hut daemon, so they are only guaranteed to be respected if they point to a location on the shared filesystem; otherwise they will be destroyed.
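A minimal sketch of a sanity check for this: given the path a gcroot points to, decide whether it lies under a shared filesystem prefix. The /ceph prefix is a hypothetical mount point (the thread does not name it), and the check is purely lexical: it does not resolve symlinks or ".." components.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Hypothetical mount point of the shared filesystem; adjust to the real one.
SHARED_PREFIXES = ("/ceph",)


def gcroot_is_on_shared_fs(
    link_path: str, shared_prefixes: Iterable[str] = SHARED_PREFIXES
) -> bool:
    """Return True if link_path is lexically inside one of shared_prefixes."""
    path = PurePosixPath(link_path)
    if not path.is_absolute():
        return False  # relative paths cannot be judged without a cwd
    for prefix in shared_prefixes:
        root = PurePosixPath(prefix)
        # Compare whole path components, so /cephalopod does not match /ceph.
        if path == root or root in path.parents:
            return True
    return False


print(gcroot_is_on_shared_fs("/ceph/home/user/project/result"))
print(gcroot_is_on_shared_fs("/tmp/result"))
```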
Another problem is how to avoid collisions on /nix/var/nix/profiles/system among nodes.
In order to keep multiple versions of old system profiles per node, I need to adjust the symlinks so they don't clash with each other, see pkgs/os-specific/linux/nixos-rebuild/default.nix.
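One possible naming scheme, sketched here only to illustrate the idea: give each node its own profile directory, so that the generation symlinks (which Nix names <profile>-<N>-link next to the profile) never share a path. The per-node directory and the host names are assumptions for this example, not the layout that the nixos-rebuild change actually uses.

```python
from pathlib import PurePosixPath

PROFILES_DIR = PurePosixPath("/nix/var/nix/profiles")


def node_profile(hostname: str) -> PurePosixPath:
    """Hypothetical per-node location of the system profile symlink."""
    if not hostname or "/" in hostname or hostname in (".", ".."):
        raise ValueError(f"invalid hostname: {hostname!r}")
    return PROFILES_DIR / "per-node" / hostname / "system"


def generation_link(hostname: str, generation: int) -> PurePosixPath:
    """Path of one generation symlink, following the <profile>-<N>-link pattern."""
    profile = node_profile(hostname)
    return profile.with_name(f"{profile.name}-{generation}-link")


# node1 and node2 are placeholder host names.
print(node_profile("node1"))
print(generation_link("node1", 7))
print(generation_link("node2", 7))
```

With a layout like this, two nodes at the same generation number still get distinct symlink paths.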
changed the description
marked the checklist item Ensure that nix build/develop/shell don't modify the store, but is all handled by the nix daemon (which will be on hut). as completed
marked the checklist item Make a tunnel for the nix daemon socket as completed
marked the checklist item Test that we can build packages locally from the node and be submitted to hut for build as completed
We will also need to patch the script that finds the grub entries:
a39526b3ef/nixos/modules/system/boot/loader/grub/install-grub.pl (L601)
We can safely assume that GRUB will be installed from the same host that contains the disk on which it is installed. This only applies to disk installation, not to PXE booting.
If we netboot, all the state is stored in hut, so there is no need to do any nixos-rebuild ... --target-host anymore.
mentioned in merge request !24