Compare commits

4ed41e0c46 ... update-fox

7 Commits

| Author | SHA1 | Date |
|---|---|---|
| | 5f18335d14 | |
| | 29427b4b00 | |
| | 52abaf4d71 | |
| | cb6ca9bb5f | |
| | 9db8cb5624 | |
| | 70ee83804c | |
| | ac2ca88c0d | |

3  .gitignore  (vendored)

@@ -1 +1,2 @@
-./public
+/public
+.hugo_build.lock

@@ -4,19 +4,19 @@ Welcome to the jungle, a set of machines with no imposed rules that are fully
 controlled and maintained by their users.

 The configuration of all the machines is written in a centralized [git
-repository][config] using the Nix language for NixOS. Changes in the
+repository][jungle] using the Nix language for NixOS. Changes in the
 configuration of the machines are introduced by merge requests and pass a review
 step before being deployed.

-[config]: https://pm.bsc.es/gitlab/rarias/jungle
+[jungle]: https://jungle.bsc.es/git/rarias/jungle

-The machines have access to the large list of packages available in
-[Nixpkgs][nixpkgs] and a custom set of packages named [bscpkgs][bscpkgs],
-specifically tailored to our needs for HPC machines. Users can install their own
-packages and made them system-wide available by opening a merge request.
+All jungle machines have access to the large list of packages (>120k) available
+in [Nixpkgs][nixpkgs] and a custom set of [BSC packages][overlay] specifically
+tailored to our needs for HPC machines. Users can install any packages on their
+own or request them to be made available system-wide for all users.

 [nixpkgs]: https://github.com/NixOS/nixpkgs
-[bscpkgs]: https://pm.bsc.es/gitlab/rarias/bscpkgs
+[overlay]: https://jungle.bsc.es/git/rarias/jungle/src/branch/master/overlay.nix

 We have put a lot of effort into guaranteeing very good reproducibility in
 the configuration of the machines and the software they use.

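For instance, a user can try any package from Nixpkgs without touching the
system configuration. A minimal sketch using the standard Nix flakes CLI, with
`hello` standing in for any package:

    % nix shell nixpkgs#hello
    % hello
    Hello, world!
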
@@ -7,7 +7,7 @@ description: "Request access to the machines"

 To request access to the machines we will need some information:

-1. Which machines you want access to ([hut](/hut), [fox](/fox), owl1, owl2, eudy, koro...)
+1. Which machines you want access to ([apex](/apex), [hut](/hut), [fox](/fox), [owl1, owl2](/owl), eudy, koro...)
 1. Your user name (make sure it matches the one you use for the BSC intranet)
 1. Your real name and surname (for identification purposes)
 1. The salted hash of your login password, generated with `mkpasswd -m sha-512`

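For the last item, a minimal sketch of the `mkpasswd` step (the `$6$...` line
below is a placeholder, not a real hash; send the full line it prints):

    % mkpasswd -m sha-512
    Password:
    $6$<salt>$<hashed-password>
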
22  content/apex/_index.md  (new file)

@@ -0,0 +1,22 @@
+---
+title: "Apex"
+description: "Login node"
+date: 2025-09-05T15:34:59+02:00
+---
+
+![apex.jpg](apex.jpg)
+
+Picture by [Michal Klajban](https://commons.wikimedia.org/wiki/File:Aoraki_-_Mt_Cook,_Aoraki_-_Mount_Cook_National_Park,_New_Zealand.jpg),
+licensed under [CC BY-SA](https://creativecommons.org/licenses/by-sa/4.0/deed.en).
+Cropped and color corrected.
+
+Apex serves as the login node for the rest of the machines in jungle. Connect
+to it via `ssflogin.bsc.es`; only from apex will you be able to reach the
+other machines.
+
+Use `sinfo` to see which nodes are available and `salloc` to request them (see
+the manual for more details). Avoid doing large builds or computations on
+apex; use a different node instead.
+
+The home directory is mounted from a hardware RAID 5 of 4 disks and is shared
+across the other machines.

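A minimal sketch of a session from apex (the partition listing is illustrative;
run `sinfo` yourself for the real state of the nodes):

    apex% sinfo
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    fox          up   infinite      1   idle fox
    apex% salloc -p fox -c 4
    fox%
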
BIN  content/apex/apex.jpg  (new file, 195 KiB)

@@ -22,19 +22,16 @@ the detailed specifications:
 ## Access

 To access the machine, request a SLURM session from [apex](/apex) using the `fox`
-partition. If you need the machine for performance measurements, use an
-exclusive reservation:
+partition and set the time for the reservation (the default is 1 hour). If you
+need the machine for performance measurements, use an exclusive reservation:

-    apex% salloc -p fox --exclusive
+    apex% salloc -p fox -t 02:00:00 --exclusive
+    fox%

 Otherwise, specify the CPUs that you need so other users can also use the node
 at the same time:

-    apex% salloc -p fox -c 8
-
-Then use srun to execute an interactive shell:
-
-    apex% srun --pty $SHELL
-
+    apex% salloc -p fox -t 02:00:00 -c 8
+    fox%

 Make sure you get all CPUs you expect:

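For example, assuming the `-c 8` allocation above, `nproc` should report the
CPUs granted to the session:

    fox% nproc
    8
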
@@ -46,55 +43,38 @@ Follow [these steps](/access) if you don't have access to apex or fox.

 ## CUDA

-To use CUDA, you can use the following `flake.nix` placed in a new directory to
-load all the required dependencies:
+To use CUDA you'll need to load the NVIDIA `nvcc` compiler and some additional
+libraries in the environment. Clone the
+[following
+example](https://jungle.bsc.es/git/rarias/devshell/src/branch/main/cuda) and
+modify the `flake.nix` if needed to add additional packages.

-```nix
-{
-  inputs.jungle.url = "jungle";
+    fox% git clone https://jungle.bsc.es/git/rarias/devshell
+    fox% cd devshell/cuda

-  outputs = { jungle, ... }: {
-    devShell.x86_64-linux = let
-      pkgs = jungle.nixosConfigurations.fox.pkgs;
-    in pkgs.mkShell {
-      name = "cuda-env-shell";
-      buildInputs = with pkgs; [
-        git gitRepo gnupg autoconf curl
-        procps gnumake util-linux m4 gperf unzip
-
-        # Cuda packages (more at https://search.nixos.org/packages)
-        cudatoolkit linuxPackages.nvidia_x11
-        cudaPackages.cuda_cudart.static
-        cudaPackages.libcusparse
-
-        libGLU libGL
-        xorg.libXi xorg.libXmu freeglut
-        xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
-        ncurses5 stdenv.cc binutils
-      ];
-      shellHook = ''
-        export CUDA_PATH=${pkgs.cudatoolkit}
-        export LD_LIBRARY_PATH=/var/run/opengl-driver/lib
-        export SMS=50
-      '';
-    };
-  };
-}
-```
+Then just run `nix develop` from the same directory to spawn a new shell with
+the CUDA environment:

-Then just run `nix develop` from the same directory:
-
-    % mkdir cuda
-    % cd cuda
-    % vim flake.nix
-    [...]
-    % nix develop
-    $ nvcc -V
+    fox% nix develop
+    fox$ nvcc -V
     nvcc: NVIDIA (R) Cuda compiler driver
-    Copyright (c) 2005-2024 NVIDIA Corporation
-    Built on Tue_Feb_27_16:19:38_PST_2024
-    Cuda compilation tools, release 12.4, V12.4.99
-    Build cuda_12.4.r12.4/compiler.33961263_0
+    Copyright (c) 2005-2025 NVIDIA Corporation
+    Built on Fri_Feb_21_20:23:50_PST_2025
+    Cuda compilation tools, release 12.8, V12.8.93
+    Build cuda_12.8.r12.8/compiler.35583870_0
+
+    fox$ make
+    nvcc -ccbin g++ -m64 -Wno-deprecated-gpu-targets -o cudainfo cudainfo.cpp
+
+    fox$ ./cudainfo
+    ./cudainfo Starting...
+
+    CUDA Device Query (Runtime API) version (CUDART static linking)
+
+    Detected 2 CUDA Capable device(s)
 ...

 ## AMD uProf

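As a quick sanity check from inside the development shell, you can also list
the GPUs (a sketch; `nvidia-smi -L` ships with the NVIDIA driver, and the
model names will differ):

    fox$ nvidia-smi -L
    GPU 0: NVIDIA <model> (UUID: GPU-...)
    GPU 1: NVIDIA <model> (UUID: GPU-...)
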
@@ -23,7 +23,7 @@ a good candidate for low noise executions.
 We provide a binary cache in `hut`, with the aim of avoiding unnecessary
 recompilation of packages.

-The cache should contain common packages from bscpkgs, but we don't provide
+The cache should contain common packages from jungle, but we don't provide
 any guarantee of what will be available in the cache, or for how long.
 We recommend following the latest version of the `jungle` flake to avoid cache
 misses.

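To use the cache from your own machine, a sketch of the `nix.conf` settings
(the URL and public key below are placeholders; ask the admins for the real
values):

    extra-substituters = https://<hut-cache-url>
    extra-trusted-public-keys = <cache-public-key>
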