title | description | date |
---|---|---|
Fox | AMD Genoa 9684X with 2 NVIDIA RTX4000 GPUs | 2025-02-12 |
Picture by Joanne Redwood, CC0.
The fox machine is a large GPU server configured to run heavy workloads. It has two fast AMD CPUs with a large cache and two reasonable NVIDIA GPUs. Here are the detailed specifications:
- 2x AMD GENOA X 9684X DP/UP 96C/192T 2.55G 1,150M 400W SP5 3D V-Cache
- 24x 32GB DDR5-4800 ECC RDIMM (total 768 GiB of RAM)
- 1x 2.5" SSD SATA3 MICRON 5400 MAX 480GB
- 2x 2.5" KIOXIA CM7-R 1.92TB NVMe GEN5 PCIe 5x4
- 2x NVIDIA RTX4000 ADA Gen 20GB GDDR6 PCIe 4.0
Access
To access the machine, request a SLURM session from apex using the fox
partition. If you need the machine for performance measurements, use an
exclusive reservation:
apex% salloc -p fox --exclusive
Otherwise, specify the CPUs that you need so other users can also use the node at the same time:
apex% salloc -p fox -c 8
Then use srun to execute an interactive shell:
apex% srun --pty $SHELL
fox%
Make sure you get all CPUs you expect:
fox% grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list: 0-191
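When you request only some of the CPUs, the allowed list may contain several ranges instead of a single `0-191`, which is harder to check by eye. The following is a minimal sketch, written for a POSIX `sh` or `bash` shell, that counts the CPUs in such a list. The function name `count_cpus` is hypothetical and not part of the machine's tooling:

```shell
# Count the CPUs in a Linux CPU list such as "0-7,16-23".
# count_cpus is a hypothetical helper, not something installed on fox.
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,                      # split the list on commas
    for item in $list; do
        case $item in
            *-*)               # a range "A-B" contributes B - A + 1 CPUs
                first=${item%-*}
                last=${item#*-}
                total=$((total + last - first + 1))
                ;;
            *)                 # a single CPU number
                total=$((total + 1))
                ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# Report the number of CPUs available to the current session (Linux only).
if [ -r /proc/self/status ]; then
    allowed=$(awk '/^Cpus_allowed_list/ {print $2}' /proc/self/status)
    echo "allowed CPUs: $(count_cpus "$allowed")"
fi
```

With an exclusive reservation the count should match the 192 hardware threads per CPU listed in the specifications; with `-c 8` it should be 8.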
Follow these steps if you don't have access to apex or fox.
CUDA
To use CUDA, place the following flake.nix in a new directory to load all the required dependencies:
{
  inputs.jungle.url = "jungle";
  outputs = { jungle, ... }: {
    devShell.x86_64-linux = let
      pkgs = jungle.nixosConfigurations.fox.pkgs;
    in pkgs.mkShell {
      name = "cuda-env-shell";
      buildInputs = with pkgs; [
        git gitRepo gnupg autoconf curl
        procps gnumake util-linux m4 gperf unzip
        # Cuda packages (more at https://search.nixos.org/packages)
        cudatoolkit linuxPackages.nvidia_x11
        cudaPackages.cuda_cudart.static
        cudaPackages.libcusparse
        libGLU libGL
        xorg.libXi xorg.libXmu freeglut
        xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
        ncurses5 stdenv.cc binutils
      ];
      shellHook = ''
        export CUDA_PATH=${pkgs.cudatoolkit}
        export LD_LIBRARY_PATH=/var/run/opengl-driver/lib
        export SMS=50
      '';
    };
  };
}
Then run nix develop from the same directory:
% mkdir cuda
% cd cuda
% vim flake.nix
[...]
% nix develop
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
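If a build script needs to confirm which toolkit release the shell provides, the `release` field of that output can be extracted. This is a minimal sketch for a POSIX `sh` or `bash` shell; the function name `nvcc_release` is hypothetical, and it assumes the `nvcc -V` output keeps the `release X.Y,` format shown above:

```shell
# Extract the toolkit release (for example "12.4") from `nvcc -V` output
# read on stdin. nvcc_release is a hypothetical helper name.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Only query nvcc when it is actually on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc -V | nvcc_release)"
else
    echo "nvcc not found; run this inside the nix develop shell" >&2
fi
```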
AMD uProf
The AMD uProf performance analysis tool-suite is installed and ready to use.
See the AMD uProf user guide (PDF backup for v5.1) for more details on how to use the tools. To use the GUI, make sure that you connect to fox using X11 forwarding.
Filesystems
The machine has several file systems available.
/nfs/home
: The /home from apex via NFS, which is also shared with other xeon machines. It has about 2 ms of latency, so it is not suitable for quick random access.

/nvme{0,1}/$USER
: The two local NVMe disks, very fast and large capacity.

/tmp
: tmpfs, fast but not backed by a disk. Will be erased on reboot.
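A job script can choose among these locations at run time. The sketch below, for a POSIX `sh` or `bash` shell, prefers a local NVMe directory and falls back to /tmp. The function name `pick_scratch` is hypothetical, and the sketch assumes that your per-user directory under /nvme0 or /nvme1 already exists and is writable:

```shell
# Print the first candidate directory that exists and is writable.
# pick_scratch is a hypothetical helper, not something installed on fox.
pick_scratch() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1    # no usable candidate
}

# Prefer the local NVMe disks, then tmpfs, then the current directory.
scratch=$(pick_scratch "/nvme0/${USER:-}" "/nvme1/${USER:-}" /tmp) || scratch=.
echo "using scratch directory: $scratch"
```

Keep in mind that anything left in /tmp is lost on reboot, so copy results you want to keep back to /nfs/home or the NVMe disks.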