title | description | date |
---|---|---|
Fox | AMD Genoa 9684X with 2 NVIDIA RTX4000 GPUs | 2025-02-12 |
Picture by Joanne Redwood, CC0.
The fox machine is a large GPU server configured to run heavy workloads. It has two fast AMD CPUs with a large cache and two mid-range NVIDIA GPUs. Here are the detailed specifications:
- 2x AMD GENOA X 9684X DP/UP 96C/192T 2.55G 1,150M 400W SP5 3D V-Cache
- 24x 32GB DDR5-4800 ECC RDIMM (total 768 GiB of RAM)
- 1x 2.5" SSD SATA3 MICRON 5400 MAX 480GB
- 2x 2.5" KIOXIA CM7-R 1.92TB NVMe GEN5 PCIe 5x4
- 2x NVIDIA RTX4000 ADA Gen 20GB GDDR6 PCIe 4.0
Access
To access the machine, request a SLURM session from apex using the fox
partition. If you need the machine for performance measurements, use an
exclusive reservation:
apex% salloc -p fox --exclusive
Otherwise, specify the CPUs that you need so other users can also use the node at the same time:
apex% salloc -p fox -c 8
Then use srun to execute an interactive shell:
apex% srun --pty $SHELL
fox%
Make sure you get all CPUs you expect:
fox% grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list: 0-191
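When you request only some of the CPUs, the allowed list may be split into several ranges, which is harder to check by eye. The following is a minimal sketch, not part of the machine's tooling: the helper name count_cpus is hypothetical, and the script assumes a POSIX sh (run it with sh, since zsh does not split unquoted variables the same way).

```shell
#!/bin/sh
# Count the CPUs in a kernel-style CPU list such as "0-191" or "0-7,96-103".
# count_cpus is a hypothetical helper name used only for this example.
count_cpus() {
    total=0
    old_ifs=$IFS
    IFS=,
    # Unquoted on purpose: split the list on commas.
    for range in $1; do
        case $range in
            *-*)
                first=${range%-*}
                last=${range#*-}
                total=$((total + last - first + 1))
                ;;
            *)
                total=$((total + 1))
                ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# Report the list inherited by this process (Linux only).
if [ -r /proc/self/status ]; then
    allowed=$(awk '/^Cpus_allowed_list:/ { print $2 }' /proc/self/status)
    echo "allowed CPUs: $allowed ($(count_cpus "$allowed") total)"
fi
```

With an exclusive reservation the total should match the 192 hardware threads listed in the specifications; with -c 8 it should be 8.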
Follow these steps if you don't have access to apex or fox.
CUDA
To use CUDA, you can use the following flake.nix placed in a new directory to load all the required dependencies:
{
  inputs.jungle.url = "jungle";
  outputs = { jungle, ... }: {
    devShell.x86_64-linux = let
      pkgs = jungle.nixosConfigurations.fox.pkgs;
    in pkgs.mkShell {
      name = "cuda-env-shell";
      buildInputs = with pkgs; [
        git gitRepo gnupg autoconf curl
        procps gnumake util-linux m4 gperf unzip
        # Cuda packages (more at https://search.nixos.org/packages)
        cudatoolkit linuxPackages.nvidia_x11
        cudaPackages.cuda_cudart.static
        cudaPackages.libcusparse
        libGLU libGL
        xorg.libXi xorg.libXmu freeglut
        xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
        ncurses5 stdenv.cc binutils
      ];
      shellHook = ''
        export CUDA_PATH=${pkgs.cudatoolkit}
        export LD_LIBRARY_PATH=/var/run/opengl-driver/lib
        export SMS=50
      '';
    };
  };
}
Then just run nix develop from the same directory:
% mkdir cuda
% cd cuda
% vim flake.nix
[...]
% nix develop
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
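Seeing nvcc report its version only shows that the compiler is installed. To check that the toolchain can also build a program and reach the GPUs, a small device query can help. The sketch below is an illustration, not part of the machine's documentation: the file name devices.cu is arbitrary, and it assumes that nvcc inside the nix develop shell can find a host compiler and the CUDA runtime. It uses only the CUDA runtime calls cudaGetDeviceCount, cudaGetDeviceProperties and cudaGetErrorString.

```shell
#!/bin/sh
# Write a minimal CUDA program that lists the visible GPUs.
# The file name devices.cu is arbitrary and chosen for this example.
cat > devices.cu <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; i++) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
            return 1;
        }
        std::printf("GPU %d: %s (compute capability %d.%d)\n",
                    i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
EOF

# Build and run only when nvcc is available (for example inside nix develop).
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o devices devices.cu && ./devices
else
    echo "nvcc not found; run this inside the nix develop shell" >&2
fi
```

If the build works, the program should print one line per GPU visible to your session.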
Filesystems
The machine has several file systems available.
/nfs/home
: The /home from apex via NFS, which is also shared with other xeon machines. It has about 2 ms of latency, so not suitable for quick random access.
/nvme{0,1}/$USER
: The two local NVME disks, very fast and large capacity.
/tmp
: tmpfs, fast but not backed by a disk. Will be erased on reboot.
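For jobs that do a lot of small random accesses, it can help to pick a local scratch directory instead of working from the NFS home. The sketch below shows one way to do this under some assumptions: the helper name pick_scratch is hypothetical, the per-user directories under /nvme0 and /nvme1 are assumed to exist already, and falling back to /tmp is a choice made for this example.

```shell
#!/bin/sh
# Print the first candidate directory that exists and is writable.
# pick_scratch is a hypothetical helper name used only for this example.
pick_scratch() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Prefer the local NVMe disks and fall back to /tmp (tmpfs, erased on reboot).
user=${USER:-$(id -un)}
if scratch=$(pick_scratch "/nvme0/$user" "/nvme1/$user" /tmp); then
    echo "using scratch directory: $scratch"
else
    echo "no writable scratch directory found" >&2
fi
```

Remember to copy any results you want to keep back to /nfs/home, since /tmp is erased on reboot.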