Update NixOS and other changes #72

Closed
rarias wants to merge 0 commits from update-nixos into master
Owner
No description provided.
rarias added 17 commits 2024-07-22 12:19:03 +02:00
Allows users to attach GDB to their own processes, without requiring
running the program with GDB from the start.
Flake lock file updates:

• Updated input 'agenix':
    'github:ryantm/agenix/1381a759b205dff7a6818733118d02253340fd5e' (2024-04-02)
  → 'github:ryantm/agenix/de96bd907d5fbc3b14fc33ad37d1b9a3cb15edc6' (2024-07-09)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/6143fc5eeb9c4f00163267708e26191d1e918932' (2024-04-21)
  → 'github:NixOS/nixpkgs/693bc46d169f5af9c992095736e82c3488bf7dbb' (2024-07-14)
Starting with GitLab 16, there is a new mechanism to authenticate the
runners via authentication tokens, so use it instead.  Older tokens and
runners are also removed, as they are no longer used.

With the new way of managing tokens, both the tags and the locked state
are managed from the GitLab web page.

See: https://docs.gitlab.com/ee/ci/runners/new_creation_workflow.html
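A minimal sketch of what a runner using an authentication token could look like in NixOS, assuming the pinned nixpkgs revision provides the `authenticationTokenConfigFile` option and that the token is stored with agenix. The runner name, secret name, secret path and Docker image are hypothetical and not taken from this PR.

```nix
{ config, ... }:
{
  # Hypothetical secret; agenix decrypts it on the target machine.
  age.secrets.gitlab-runner-token.file = ./secrets/gitlab-runner-token.age;

  services.gitlab-runner = {
    enable = true;
    # "example-docker" is a placeholder runner name.
    services.example-docker = {
      # The file is expected to define CI_SERVER_URL and CI_SERVER_TOKEN.
      authenticationTokenConfigFile =
        config.age.secrets.gitlab-runner-token.path;
      executor = "docker";
      dockerImage = "debian:stable"; # placeholder image
      # No tags or locked state here: with authentication tokens they are
      # managed from the GitLab web page.
    };
  };
}
```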
The current select mechanism also treats memory as a consumable
resource, and by default each node only has 1 MiB. As each job already
requests 1 MiB, it prevents other jobs from running on the same node.

As we are not really concerned with memory usage, we only use the unused
cores in the select criteria.
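As a rough sketch, selecting on cores only could be expressed through the NixOS Slurm module as shown here. It assumes the `select/cons_tres` plugin is in use and that the setting goes in `services.slurm.extraConfig`; the actual placement in this repository may differ.

```nix
{
  services.slurm.extraConfig = ''
    # Only cores are consumable resources, so jobs can share a node as long
    # as there are unused cores, regardless of the memory they request.
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core
  '';
}
```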
Prevents endless jobs from being left running forever, while allowing
users to request a larger time limit.
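One way to get this behaviour in Slurm is a short `DefaultTime` combined with a larger `MaxTime` on the partition. The sketch below is only an illustration: the partition name, node list and both time values are hypothetical, and the commit may set the limit differently.

```nix
{
  services.slurm.partitionName = [
    # Jobs without an explicit --time get DefaultTime; users can still
    # request up to MaxTime. All names and values here are placeholders.
    "example Nodes=node[01-02] Default=YES DefaultTime=01:00:00 MaxTime=2-00:00:00 State=UP"
  ];
}
```

With such a partition, a user who needs more time would pass it explicitly, for example `sbatch --time=12:00:00 job.sh`.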
Prevents filling the journal logs with information messages.
WARNING: This will introduce noise, as the daemon wakes up from time to
time to check for new packages.
Allows cross-compilation of packages for RISC-V that are known to try to
run RISC-V programs on the host.
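A common way to let a host run foreign binaries during a build is binfmt emulation. Whether this commit uses exactly this option is an assumption; the sketch only shows the general technique.

```nix
{
  # Registers an emulator for riscv64 Linux binaries via binfmt_misc, so
  # build steps that execute freshly built RISC-V programs can run on the
  # host. Assumed mechanism, not confirmed by the commit message.
  boot.binfmt.emulatedSystems = [ "riscv64-linux" ];
}
```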
Initially we planned to run jobs in those nodes by sharing the same nix
store from hut. However, these nodes are now used to build packages
which are not available in hut. Users also ssh to the nodes, which
doesn't mount the hut store, so it doesn't make much sense to keep
mounting it.
The shutdown timer will fire at slightly different times for the
different nodes, so we slowly decrease the power consumption.
rarias added 2 commits 2024-07-22 14:07:48 +02:00
They have been removed from NixOS. The "hardware.opengl" group is now
renamed to "hardware.graphics".

See: 98cef4c273
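For reference, the rename looks like this for a machine that enables graphics support. Whether any machine in this repository sets the option is an assumption.

```nix
{
  # Before the rename: hardware.opengl.enable = true;
  hardware.graphics.enable = true;
}
```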
Apparently the ttyS0 console doesn't exist but ttyS1 does:

  raccoon% sudo stty -F /dev/ttyS0
  stty: /dev/ttyS0: Input/output error
  raccoon% sudo stty -F /dev/ttyS1
  speed 9600 baud; line = 0;
  -brkint -imaxbel

The dmesg line agrees:

  00:03: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A

The console configuration is then moved from base to xeon to allow
changing it for the raccoon machine.
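A per-machine console setting could then look like the sketch below for raccoon. It assumes the console is selected through `boot.kernelParams`; the exact parameters used in the repository, including any baud rate, are not shown in this PR.

```nix
{
  # Hypothetical raccoon override: use ttyS1 instead of ttyS0 for the
  # serial console, keeping the local tty as well.
  boot.kernelParams = [ "console=tty1" "console=ttyS1" ];
}
```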
abonerib reviewed 2024-09-10 16:44:52 +02:00
abonerib left a comment
Collaborator

LGTM, I left a couple of comments, but it's mostly nitpicking. I didn't go through all the docker gitlab details, I assumed the configuration has been tested.

@ -0,0 +8,4 @@
timerConfig = {
OnCalendar = "*-08-02 11:00:00";
RandomizedDelaySec = "10min";
Unit = "systemd-poweroff.service";
Collaborator

It would be nice to broadcast a `wall` message some time before shutdown.

Author
Owner

I usually send some emails on the mailing list prior to that day. I think it would be good to send via wall too, feel free to send a patch or PR :-)

rarias marked this conversation as resolved
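A minimal sketch of the suggested `wall` warning, assuming a second timer that fires before the poweroff timer from the diff. The unit name, the warning time and the message text are hypothetical.

```nix
{ pkgs, ... }:
{
  # One-shot service that writes a message to all logged-in terminals.
  systemd.services.shutdown-warning = {
    description = "Warn users about the scheduled poweroff";
    serviceConfig.Type = "oneshot";
    script = ''
      ${pkgs.util-linux}/bin/wall "This node is scheduled to power off at around 11:00 today."
    '';
  };

  # Fires one hour before the poweroff timer (placeholder time).
  systemd.timers.shutdown-warning = {
    wantedBy = [ "timers.target" ];
    timerConfig = {
      OnCalendar = "*-08-02 10:00:00";
      Unit = "shutdown-warning.service";
    };
  };
}
```

Only users logged in at that moment would see the message, so it complements the mailing list announcement rather than replacing it.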
@ -22,0 +16,4 @@
# Allow ptracing (i.e. attach with GDB) any process of the same user, see:
# https://www.kernel.org/doc/Documentation/security/Yama.txt
"kernel.yama.ptrace_scope" = "0";
Collaborator

Perhaps it would be wiser to only do this on the machines where it's needed, since it could be a security concern?

Author
Owner

I think this would be needed in most machines, but it can be disabled in the storage nodes.

rarias marked this conversation as resolved
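If the setting is later disabled on the storage nodes, a per-machine override could look like this. It assumes the base value is set under `boot.kernel.sysctl`, which the diff context does not show.

```nix
{ lib, ... }:
{
  # Yama scope 1 restricts ptrace to descendant processes, so attaching
  # GDB to an unrelated process of the same user is refused again.
  boot.kernel.sysctl."kernel.yama.ptrace_scope" = lib.mkForce "1";
}
```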
@ -86,0 +90,4 @@
# Ignore memory constraints and only use unused cores to share a node with
# other jobs.
SelectTypeParameters=CR_CORE
Collaborator

The documentation uses CR_Core, but it's probably case-insensitive: https://slurm.schedmd.com/cons_tres_share.html

Author
Owner

Good catch!

rarias marked this conversation as resolved
rarias force-pushed update-nixos from 0a8db8bda6 to 4bd1648074 2024-09-12 08:38:46 +02:00 Compare
Author
Owner

Done. I added you as reviewer in the commit trailers too.

rarias requested review from abonerib 2024-09-12 08:43:54 +02:00
rarias self-assigned this 2024-09-12 08:44:02 +02:00
abonerib approved these changes 2024-09-12 09:40:11 +02:00
rarias closed this pull request 2024-09-12 09:52:32 +02:00

Pull request closed


Reference: rarias/jungle#72