Add web post update for 2025 #176
No description provided.
Use `hugo server -Ds web/` to test. Anything else I missed?

Sorry if I have gone overboard with the suggestions.
Maybe we can also mention that now we serve a binary cache, so people don't have to rebuild bscpkgs.
Towards the end, there are some `so`s that could be removed to make the text more fluid.

@ -0,0 +13,4 @@
We have a new [fox machine](/fox), with two AMD Genoa 9684X CPUs and two NVIDIA
RTX4000 GPUs. During the last months we have been doing some tests and it seems
that most of the components work well. We have configured CUDA to use the NVIDIA
GPUs as well as AMD uProf to trace performance and energy counters from the

I feel like a comma after GPUs would make things more clear.
@ -0,0 +20,4 @@
We have upgraded the operating system on the login node to NixOS, which now runs
Linux 6.15.6. During the upgrade, we have detected a problem with the RAID
controller that caused a catastrophic failure that prevented the BIOS from

The second `that` could be changed to `which` to avoid repetition.

@ -0,0 +23,4 @@
controller that caused a catastrophic failure that prevented the BIOS from
starting.

The `/` and `/home` partitions sit on a RAID 5 governed by a RAID hardware

We are still talking about the RAID controller, so splitting the paragraph is a bit confusing (unless we change the section header to problems with the RAID controller).
I rewrote it to make it more clear.
@ -0,0 +27,4 @@
controller, however it was unable to boot properly before handling
the control over to the BIOS. After a long debugging session, we detected that
the flash memory that stores the firmware of the hardware controller was likely
to be the issue, as

~~as~~ since memory (`as` is grammatically correct, but using it here reads as `while`, e.g. `as/while memory cells lose charge they do X`). (https://writinglawtutors.com/dont-use-as-to-mean-because/)

@ -0,0 +29,4 @@
the flash memory that stores the firmware of the hardware controller was likely
to be the issue, as
[memory cells](https://en.wikipedia.org/wiki/Flash_memory#Principles_of_operation)
may lose charge over time and can end up corrupting the content. So we flashed

I would drop the first `So` since it's a crutch.

@ -0,0 +33,4 @@
the latest firmware so the memory cells are charged again with the new bits and
that fixed the problem. Hopefully we will be able to use it for some more years.

The SLURM server has been moved to apex, so now you can allocate your jobs from

The rest of the `so`s are fine, although they are a bit repetitive.

@ -0,0 +36,4 @@
The SLURM server has been moved to apex, so now you can allocate your jobs from
there, including the new fox machine.

### Translated machines to BSC building

Transferred / Migrated
@ -0,0 +38,4 @@
### Translated machines to BSC building

The server room had a temperature issue that affected our machines since the end

had been affecting
@ -0,0 +44,4 @@
Since then, we have moved the cluster to BSC premises, where now rests at a

where it now
e8eb47c9b8 to c1e042be96

I'll wait until we have checked that we are not exposing anything that should not be in the cache. I would also like to test it on non-NixOS machines to see how that would work.
c1e042be96 to c441178910