bscpkgs: User guide
ABSTRACT
This repository contains a set of nix packages used at the Barcelona
Supercomputing Center by the Programming Models group.
The current setup uses the xeon07 machine to build packages, which are
then automatically uploaded to MareNostrum4, because the latter lacks
the permissions needed to perform the build safely.
Some preliminary steps must be done manually before you can build and
install packages (derivations in nix jargon).
1. Introduction
To easily connect to xeon07 in one step, set up the SSH configuration
file ~/.ssh/config (OpenSSH 7.3 or later is required) by adding these
lines:
Host cobi
  HostName ssflogin.bsc.es
  User your-username-here
Host xeon07
  ProxyJump cobi
  HostName xeon07
  User your-username-here
You should be able to connect with:
laptop$ ssh xeon07
1.1 Network access
To use nix you need to be able to download sources from the Internet.
The downloads usually require ports 22, 80 and 443 to be open for
outgoing traffic.
Network access in xeon07 is provided through a proxy, configured by
the environment variables "http_proxy" and "https_proxy". Check that
both are set, then fetch a webpage with curl to ensure the proxy is
working:
xeon07$ curl x.com
x
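The variable check can be wrapped in a small helper. This is only a
sketch: the function name check_proxy_vars is hypothetical and not part
of bscpkgs. It reports which of the two proxy variables are unset or
empty and returns a non-zero status if any is missing.

```shell
# Hypothetical helper (not part of bscpkgs): report missing proxy
# variables. Returns 0 only when both http_proxy and https_proxy are
# set and non-empty.
check_proxy_vars() {
    missing=0
    for name in http_proxy https_proxy; do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as `check_proxy_vars && curl x.com` so that curl is only tried
when both variables are present.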
1.2 SSH keys
Package sources are usually downloaded directly from the git server,
so you must be able to access all repositories without a password
prompt.
Most repositories at https://pm.bsc.es/gitlab are readable by
logged-in users, but there are some exceptions (for example the nanos6
repository) where you must be explicitly granted read access.
If you don't have an SSH key at ~/.ssh/*.pub in xeon07, create a new
one without password protection by running:
xeon07$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (~/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in ~/.ssh/id_rsa.
Your public key has been saved in ~/.ssh/id_rsa.pub.
...
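To decide whether a new key is needed at all, the check for an
existing public key can be scripted. The following is a minimal sketch;
the function name has_ssh_pubkey is hypothetical and the default
directory ~/.ssh is taken from the text above.

```shell
# Hypothetical helper: succeed if the given directory (default ~/.ssh)
# contains at least one public key file matching *.pub.
has_ssh_pubkey() {
    dir=${1:-"$HOME/.ssh"}
    for key in "$dir"/*.pub; do
        # If the glob matched nothing, $key holds the literal pattern
        # and the -f test fails.
        if [ -f "$key" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `has_ssh_pubkey || ssh-keygen` only generates a key when
none is found.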
By default it will create the private key at ~/.ssh/id_rsa. Copy the
contents of your public ssh key in ~/.ssh/id_rsa.pub and paste it in
GitLab at:
https://pm.bsc.es/gitlab/profile/keys
Then, configure it for use in the ~/.ssh/config file, adding:
Host bscpm03.bsc.es
  IdentityFile ~/.ssh/id_rsa
Next, verify that the SSH connection to the server works and that you
get a greeting from the GitLab server with your username:
xeon07$ ssh git@bscpm03.bsc.es
PTY allocation request failed on channel 0
Welcome to GitLab, @rarias!
Connection to bscpm03.bsc.es closed.
Verify that you can access the nanos6/nanos6 repository (otherwise you
first need to ask to be granted read access), at:
https://pm.bsc.es/gitlab/nanos6/nanos6
Finally, you should be able to download the nanos6/nanos6 git
repository without any password interaction by running:
xeon07$ git clone git@bscpm03.bsc.es:nanos6/nanos6.git
You will also need to access MareNostrum 4 from the xeon07 node, in
order to submit experiments. Add the following lines as well to the
~/.ssh/config file and set your user name:
Host mn0 mn1 mn2
  User your-mn4-username
  IdentityFile ~/.ssh/id_rsa
Then copy the key to MareNostrum 4 (it will ask you the first time for
your password):
xeon07$ ssh-copy-id -i ~/.ssh/id_rsa.pub mn1
And ensure that you can connect without a password:
xeon07$ ssh mn1
...
login1$
1.3 The bscpkgs repo
Once you have Internet access and have been granted access to the PM
GitLab repositories, you can begin building software with nix. First
ensure that the nix binaries are available from your shell in xeon07:
xeon07$ nix --version
nix (Nix) 2.3.6
Now you are ready to build and install packages with nix. Clone the
bscpkgs repository:
xeon07$ git clone git@bscpm03.bsc.es:rarias/bscpkgs.git
Nix looks for package definitions in a file named "default.nix" in the
current folder, so go to the repo directory:
xeon07$ cd bscpkgs
Now you should be able to build nanos6:
xeon07$ nix-build -A bsc.nanos6
..
/nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
The installation is placed in the nix store (with the path stated in
the last line of the build process), with the "result" symbolic link
pointing to the same location:
xeon07$ readlink result
/nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
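A script that consumes the build output may want to confirm that the
"result" link really points into the nix store before using it. The
following is a sketch only; the function name result_store_path is
hypothetical.

```shell
# Hypothetical helper: print the target of a "result" symlink and
# succeed only if that target lies inside /nix/store.
result_store_path() {
    link=${1:-result}
    # readlink fails if $link is missing or is not a symlink
    target=$(readlink "$link") || return 1
    case $target in
        /nix/store/*)
            echo "$target"
            ;;
        *)
            echo "$link does not point into /nix/store" >&2
            return 1
            ;;
    esac
}
```

Called without arguments from the bscpkgs directory, it inspects the
"result" link created by the last nix-build.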
1.4 Configuration of mn4 (MareNostrum 4)
In order to execute the programs built at xeon07, you first need to
enter the nix environment. To do so, add the following line to the end
of the file ~/.bashrc in mn4:
export PATH=/gpfs/projects/bsc15/nix/bin:$PATH
Then log out and log in again (or source the ~/.bashrc file) and you
will have the `nix-setup` command available. This command executes
a new shell where the /nix store is available. To execute it:
mn4$ nix-setup
Now you will see a new shell, where you can access the nix store:
nix|mn4$ ls /nix
gcroots profiles store var
The last build of nanos6 can also be found in mn4 at the same
location:
/nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
Remember to enter the nix environment by running `nix-setup` when you
need something from the nix store.
You cannot perform any build operations from mn4: to do so use the
xeon07 machine.
2. Basic usage of nix
Nix is a package manager which makes it easy to handle the
reproducibility and configuration of packages and their dependencies.
See more info here:
https://nixos.org/nix/manual/
We will only cover the basic usage of nix for the BSC packages.
2.1 The user environment
All nix packages are stored under the /nix directory. When you need to
"install" some binary from nix, a symlink is added to a folder
included in the $PATH variable. In particular, you should have
something similar added to your $PATH:
xeon07$ echo $PATH | sed 's/:/\n/g' | grep nix
/home/Computational/rarias/.nix-profile/bin
/nix/var/nix/profiles/default/bin
The first one is your custom installation of packages that are stored
in your home directory and the second one is the default installation
which contains the nix tools (which are installed in the /nix
directory as well).
Use `nix search` to look for official packages in the "nixpkgs"
channel (the default repository of packages):
xeon07$ nix search cowsay
warning: using cached results; pass '-u' to update the cache
* cowsay (cowsay)
A program which generates ASCII pictures of a cow with a message
* neo-cowsay (neo-cowsay)
Cowsay reborn, written in Go
* ponysay (ponysay-3.0.3)
Cowsay reimplemention for ponies
* tewisay (tewisay-unstable-2017-04-14)
Cowsay replacement with unicode and partial ansi escape support
When you need a program that is not available in your environment,
you can use nix-env to modify what is currently loaded, much like
when you use "module load ...". For example:
xeon07$ nix-env -iA nixpkgs.cowsay
Notice that you must specify the prefix "nixpkgs." before the package
name. The command will download (if not already found in the nix
store), compile (if necessary) and load the program `cowsay` from the
nixpkgs repository into the environment. You should be able to run it
as:
xeon07$ cowsay "hello world"
 _____________
< hello world >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
You can now inspect the ~/.nix-profile/bin folder, and see that a new
symlink was added to the actual installation of the binary:
xeon07$ file ~/.nix-profile/bin/cowsay
/home/Computational/rarias/.nix-profile/bin/cowsay: symbolic link to
`/nix/store/673gczmhr5b449521srz2n7g1klykz6n-cowsay-3.03+dfsg2/bin/cowsay'
You can list the current packages installed in your environment by
running:
xeon07$ nix-env -q
cowsay-3.03+dfsg2
nix-2.3.6
Notice that this setup only affects your user environment. The change
is permanent: it applies to any new session until you modify the
environment again. It is also immediate: all existing sessions see
the new environment instantly.
You can remove any package from the environment using:
xeon07$ nix-env -e cowsay
See the manual with `nix-env --help` if you want to know more details.
2.2 Building packages
Usually, all official packages are already compiled and distributed
from a cache server so you don't need to rebuild them again. However,
BSC packages are distributed only in source code form as we don't have
any binary cache server yet.
Nix will handle the build process without any user interaction (with a
few exceptions which you shouldn't have to worry about). If any other
user has already built the package, the build process is not needed
and the package is used as is.
In order to build a BSC package go to the `bscpkgs` directory, and
run:
xeon07$ nix-build -A bsc.dummy
Notice the "bsc." prefix for BSC packages. The package will be built
and installed in the /nix directory, and a symlink named "result"
pointing to the installation is placed in the current directory:
xeon07$ find result/
result/
result/bin
result/bin/dummy
The way in which nix handles the packages and dependencies ensures
that the environment of the build process of any package is exactly
the same, so the generated output should be the same if the builds are
deterministic.
You can check the reproducibility of the build by adding the "--check"
flag, which will rebuild the package and compare the checksum of every
file with the ones previously built:
xeon07$ nix-build -A bsc.dummy --check
...
xeon07$ echo $?
0
A return code of zero indicates that the output is bit by bit
identical to the one installed. Some packages include
non-deterministic information in the build process (such as the
current timestamp), which will produce an error. Those packages must
be patched to ensure the output is deterministic.
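The exit status can be turned into a short report. The following is a
minimal sketch, not part of bscpkgs: the function name
check_reproducible is hypothetical, and it simply appends "--check" to
whatever build command it is given.

```shell
# Hypothetical helper: run the given build command with --check
# appended and report the outcome based on its exit status.
check_reproducible() {
    if "$@" --check; then
        echo "reproducible"
    else
        # A non-zero status may also mean that the build itself failed
        echo "NOT reproducible (or the build failed)"
        return 1
    fi
}
```

It would be used as:
xeon07$ check_reproducible nix-build -A bsc.dummy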
Notice that if you "cd" into the "result/" directory you will be in
the /nix directory (as you have followed the symlink), where you don't
have write permission. Therefore, if your program attempts to write to
the current directory it will fail. It is recommended to run your
program from the top directory instead:
xeon07$ result/bin/dummy
Hello world!
Or you can install it in the environment:
xeon07$ nix-env -i ./result
And "cd" into any directory where you want to output some files and
just run it by the name:
xeon07$ cd /tmp
xeon07$ dummy
Hello world!
Finally, you can remove it from the environment if you don't need it:
xeon07$ nix-env -e dummy
If you want to know more details use "nix-build --help" to see the
manual.
2.3 The build process
Each package is built following a programmable configuration
description in the nix language. Builds in nix are performed under
very strict conditions. No access to any file in the file system is
allowed, unless stated in the dependencies, which are in the /nix
store only.
There is no network access in the build process and other restrictions
are enforced so that the build environment is reproducible. See more
details here:
https://nixos.wiki/wiki/Nix#Sandboxing
The top level "default.nix" file of bscpkgs serves as an index of all
BSC packages. You can see the definition for each package, for
example the nbody app:
nbody = callPackage ./bsc/apps/nbody/default.nix {
  stdenv = pkgs.gcc9Stdenv;
  mpi = intel-mpi;
  icc = icc;
  tampi = tampi;
  nanos6 = nanos6-git;
};
The compilation details are specified in the
"bsc/apps/nbody/default.nix" file. You can configure the package by
changing the inputs, for example, what specific implementation of
nanos6 or MPI you want to use. To change the MPI implementation to the
official MPICH package use:
nbody = callPackage ./bsc/apps/nbody/default.nix {
  stdenv = pkgs.gcc9Stdenv;
  mpi = pkgs.mpich; # Notice pkgs prefix for official packages
  icc = icc;
  tampi = tampi;
  nanos6 = nanos6-git;
};
Then you can rebuild the nbody package:
xeon07$ nix-build -A bsc.nbody
...
And verify that the binary is indeed linked to MPICH now:
xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | grep mpi
libmpi.so.12 => /nix/store/dwkkcv78a5bs8smflpx9ppp3klhz3i98-mpich-3.3.2/lib/libmpi.so.12 (0x00007f6be0f07000)
If you modify a package which another package requires as a
dependency, nix will rebuild all required packages to propagate your
changes on demand.
However, if you go back to the original configuration, the package
will still be in the /nix store (unless the garbage collector was
manually run and removed your old build), so you don't need to rebuild
it.
For example if nbody is configured back to use Intel MPI:
nbody = callPackage ./bsc/apps/nbody/default.nix {
  stdenv = pkgs.gcc9Stdenv;
  mpi = intel-mpi;
  icc = icc;
  tampi = tampi;
  nanos6 = nanos6-git;
};
The build process is now not required:
xeon07$ nix-build -A bsc.nbody
/nix/store/rbq7wrjcmg6fzd6yhrlnkfvzcavdbdpc-nbody
xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | grep mpi
libmpifort.so.12 => /nix/store/jvsjvxj2a08340fpdrqbqix9z3mpp3bd-intel-mpi-2019.7.217/lib/libmpifort.so.12 (0x00007f3a00402000)
libmpi.so.12 => /nix/store/jvsjvxj2a08340fpdrqbqix9z3mpp3bd-intel-mpi-2019.7.217/lib/libmpi.so.12 (0x00007f39fed34000)
Take a look at the different package description files in the
bscpkgs repository if you want to understand more details. The nix
pills are also a very good reference:
https://nixos.org/nixos/nix-pills/
2.4 Debugging the build process
It may happen that the build process fails in an unexpected way. Most
problems are related to missing dependencies and can be easily found
by looking at the error messages.
Other build problems are more subtle and require more debugging time.
One way of inspecting a build problem is by adding the breakpointHook
hook to the nativeBuildInputs array in a nix derivation (see
https://nixos.org/nixpkgs/manual/#ssec-setup-hooks for more info),
which will stop the build process and allow a shell to be attached to
the sandbox.
xeon07$ nix-build -A bsc.nbody
...
/nix/store/gvqm2yc9xx4vh3nglgckz8siya66jnkx-stdenv-linux/setup: line
83: fake-missing-command: command not found
build failed in buildPhase with exit code 127
To attach install cntr and run the following command as root:
cntr attach -t command \
cntr-/nix/store/sk2nsj7xfr62cjk6m3725ydfyswqz7n1-nbody
The command must run as the root user, so you can use `sudo -i` to run
it (the -i option is required to load the shell profile which provides
the nix path containing the cntr tool):
xeon07$ sudo -i cntr attach -t command \
cntr-/nix/store/sk2nsj7xfr62cjk6m3725ydfyswqz7n1-nbody
nixbld@localhost:/var/lib/cntr> ls
bin build dev etc nix proc tmp var
Then you can inspect the build environment to see why the build
failed. Source the build/env-vars file to get the same environment
variables (which include the $PATH) of the build process.
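Sourcing the file can be wrapped in a small helper to be defined in
the attached shell. This is a sketch under assumptions: the function
name load_build_env is hypothetical, and the default path
build/env-vars is the one mentioned above, relative to the directory
where the attached shell starts.

```shell
# Hypothetical helper: load the environment variables saved by the
# build process into the current shell.
load_build_env() {
    env_file=${1:-build/env-vars}
    if [ ! -r "$env_file" ]; then
        echo "cannot read $env_file" >&2
        return 1
    fi
    # Source the file so its variables (including PATH) take effect in
    # this shell
    . "$env_file"
}
```

After calling `load_build_env`, commands such as the compiler used by
the build should resolve through the same $PATH as in the failed
build.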
/* vim: set ts=2 sw=2 tw=72 fo=watqc expandtab spell autoindent: */