jungle/m/common/slurm.nix

{ ... }:
{
  services.slurm = {
    client.enable = true;
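    # client.enable runs the compute daemon (slurmd) on every machine that
    # imports this common module. The controller daemon (slurmctld) is
    # assumed to be enabled separately on "hut" (see controlMachine below).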
controlMachine = "hut";
clusterName = "jungle";
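    # Node definitions use the slurm.conf NodeName syntax: "owl[1,2]" expands
    # to owl1 and owl2, and Sockets/CoresPerSocket/ThreadsPerCore describe the
    # hardware topology that SLURM schedules against.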
    nodeName = [
      "owl[1,2] Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 Feature=owl"
      "hut Sockets=2 CoresPerSocket=14 ThreadsPerCore=2"
    ];
    # See slurm.conf(5) for more details about these options.
    extraConfig = ''
      # Use PMIx for MPI by default. It works fine with MPICH and OpenMPI,
      # but not with Intel MPI. For Intel MPI, use the compatibility shim
      # libpmi.so by setting I_MPI_PMI_LIBRARY=$pmix/lib/libpmi.so while
      # keeping the PMIx plugin in SLURM (--mpi=pmix). See more details here:
      # https://pm.bsc.es/gitlab/rarias/jungle/-/issues/16
      MpiDefault=pmix
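      # Hypothetical usage sketch for an Intel MPI job (the exact libpmi.so
      # path is an assumption; substitute the PMIx package path on this
      # system):
      #   export I_MPI_PMI_LIBRARY=$pmix/lib/libpmi.so
      #   srun --mpi=pmix ./my_mpi_app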
      # When a node reboots, return it to the SLURM queue as soon as it
      # becomes operational again.
      ReturnToService=2
      # Track all processes by using a cgroup.
      ProctrackType=proctrack/cgroup
      # Enable task/affinity to allow jobs to run in a specified subset of
      # the resources. Use the task/cgroup plugin to enable process
      # containment.
      TaskPlugin=task/affinity,task/cgroup
    '';
  };
}