forked from rarias/jungle
| 
 | |
| 
 | |
|                           bscpkgs: User guide
 | |
| 
 | |
| 
 | |
| ABSTRACT
 | |
| 
 | |
|   This repository contains a set of nix packages used in the Barcelona
 | |
|   Supercomputing Center by the Programming Models group.
 | |
| 
 | |
|   The current setup builds packages on the xeon07 machine and 
 | |
|   uploads them automatically to MareNostrum 4, because the latter 
 | |
|   lacks the permissions needed to perform the build safely.
 | |
| 
 | |
|   Some preliminary steps must be done manually to be able to build and 
 | |
|   install packages (derivations in nix jargon).
 | |
| 
 | |
| 1. Introduction
 | |
| 
 | |
|   To connect to xeon07 in one step, set up the SSH configuration file 
 | |
|   ~/.ssh/config (OpenSSH 7.3 or later is needed for ProxyJump) by 
 | |
|   adding these lines:
 | |
| 
 | |
|     Host cobi
 | |
|           HostName ssflogin.bsc.es
 | |
|           User your-username-here
 | |
| 
 | |
|     Host xeon07
 | |
|           ProxyJump cobi
 | |
|           HostName xeon07
 | |
|           User your-username-here
 | |
| 
 | |
|   You should be able to connect with:
 | |
| 
 | |
|     laptop$ ssh xeon07
 | |
| 
 | |
| 1.1 Network access
 | |
| 
 | |
|   To use nix you need to be able to download sources from the 
 | |
|   Internet. The downloads usually require ports 22, 80 and 443 to be 
 | |
|   open for outgoing traffic.
 | |
| 
 | |
|   Network access in xeon07 is provided through a proxy, configured by 
 | |
|   the environment variables "http_proxy" and "https_proxy". Check that 
 | |
|   they are set, then fetch a webpage with curl to ensure the proxy is 
 | |
|   working:
 | |
| 
 | |
|     xeon07$ curl x.com
 | |
|     x
 | |
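Before trying curl, it can help to confirm that both proxy variables
are actually set. The following is a minimal sketch; the function name
`check_proxy_env` is hypothetical and not part of bscpkgs:

```shell
# Report which proxy variables are missing from the environment.
# Prints one line per unset or empty variable and returns non-zero
# if at least one of them is missing.
check_proxy_env() {
    missing=0
    for var in http_proxy https_proxy; do
        # Indirect lookup of the variable named in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_proxy_env && curl x.com` only attempts the download when
both variables are present.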
| 
 | |
| 1.2 SSH keys
 | |
| 
 | |
|   Package sources are usually downloaded directly from the git server, 
 | |
|   so you must be able to access all repositories without a password 
 | |
|   prompt.
 | |
| 
 | |
|   Most repositories at https://pm.bsc.es/gitlab are open to read for 
 | |
|   logged in users, but there are some exceptions (for example the nanos6 
 | |
|   repository) where you must be explicitly granted read access.
 | |
| 
 | |
|   If you don't have an SSH key at ~/.ssh/*.pub in xeon07, create a new 
 | |
|   one without password protection by running:
 | |
| 
 | |
|     xeon07$ ssh-keygen
 | |
|     Generating public/private rsa key pair.
 | |
|     Enter file in which to save the key (~/.ssh/id_rsa):
 | |
|     Enter passphrase (empty for no passphrase):
 | |
|     Enter same passphrase again:
 | |
|     Your identification has been saved in ~/.ssh/id_rsa.
 | |
|     Your public key has been saved in ~/.ssh/id_rsa.pub.
 | |
|     ...
 | |
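To avoid overwriting an existing key, you can first test whether a
public key is already present. This is only a sketch; the helper name
`has_pubkey` is an assumption made for this example:

```shell
# Return 0 if the given directory (default: ~/.ssh) holds at least
# one *.pub file, and 1 otherwise.
has_pubkey() {
    dir=${1:-"$HOME/.ssh"}
    for key in "$dir"/*.pub; do
        # When the glob matches nothing it stays literal, so check
        # that the candidate really is a file.
        [ -f "$key" ] && return 0
    done
    return 1
}
```

A possible use is `has_pubkey || ssh-keygen`, which only generates a new
key when none was found.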
| 
 | |
|   By default it will create the private key at ~/.ssh/id_rsa. Copy the 
 | |
|   contents of your public ssh key in ~/.ssh/id_rsa.pub and paste it in 
 | |
|   GitLab at:
 | |
| 
 | |
|     https://pm.bsc.es/gitlab/profile/keys
 | |
| 
 | |
|   Then, configure it for use in the ~/.ssh/config file, adding:
 | |
| 
 | |
|     Host bscpm02.bsc.es
 | |
|       IdentityFile ~/.ssh/id_rsa
 | |
| 
 | |
|   Finally verify the SSH connection to the server works and you get a 
 | |
|   greeting from the GitLab server with your username:
 | |
| 
 | |
|     xeon07$ ssh git@bscpm02.bsc.es
 | |
|     PTY allocation request failed on channel 0
 | |
|     Welcome to GitLab, @rarias!
 | |
|     Connection to bscpm02.bsc.es closed.
 | |
| 
 | |
|   Verify that you can access nanos6/nanos6 repository (otherwise you 
 | |
|   first need to ask to be granted read access), at:
 | |
| 
 | |
|     https://pm.bsc.es/gitlab/nanos6/nanos6
 | |
|   
 | |
|   Finally, you should be able to download the nanos6/nanos6 git 
 | |
|   repository without any password interaction by running:
 | |
| 
 | |
|     xeon07$ git clone git@bscpm02.bsc.es:nanos6/nanos6.git
 | |
| 
 | |
|   You will also need to access MareNostrum 4 from the xeon07 node, in 
 | |
|   order to submit experiments. Add the following lines as well to the 
 | |
|   ~/.ssh/config file and set your user name:
 | |
| 
 | |
|     Host mn0 mn1 mn2
 | |
|             User your-mn4-username
 | |
|             IdentityFile ~/.ssh/id_rsa
 | |
| 
 | |
|   Then copy the key to MareNostrum 4 (it will ask you the first time for 
 | |
|   your password):
 | |
| 
 | |
|     xeon07$ ssh-copy-id -i ~/.ssh/id_rsa.pub mn1
 | |
| 
 | |
|   And ensure that you can connect without a password:
 | |
| 
 | |
|     xeon07$ ssh mn1
 | |
|     ...
 | |
|     login1$
 | |
| 
 | |
| 1.3 The bscpkgs repo
 | |
| 
 | |
|   Once you have Internet access and have been granted access to the PM 
 | |
|   GitLab repositories, you can begin building software with nix. First 
 | |
|   ensure that the nix binaries are available from your shell in xeon07:
 | |
| 
 | |
|     xeon07$ nix --version
 | |
|     nix (Nix) 2.3.6
 | |
| 
 | |
|   Now you are ready to build and install packages with nix. Clone the 
 | |
|   bscpkgs repository:
 | |
| 
 | |
|     xeon07$ git clone git@bscpm02.bsc.es:rarias/bscpkgs.git
 | |
| 
 | |
|   Nix looks in the current folder for a file named "default.nix" for 
 | |
|   packages, so go to the repo directory:
 | |
| 
 | |
|     xeon07$ cd bscpkgs
 | |
| 
 | |
|   Now you should be able to build nanos6:
 | |
| 
 | |
|     xeon07$ nix-build -A bsc.nanos6
 | |
|     ..
 | |
|     /nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
 | |
| 
 | |
|   The installation is placed in the nix store (with the path stated in 
 | |
|   the last line of the build process), with the "result" symbolic link 
 | |
|   pointing to the same location:
 | |
| 
 | |
|     xeon07$ readlink result
 | |
|     /nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
 | |
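A store path has the form /nix/store/<hash>-<name>. When a script only
needs the package name, the hash can be stripped with plain shell
parameter expansion. A small sketch, assuming a top-level store path
(not a file inside it); `store_name` is a hypothetical helper:

```shell
# Print the name part of a top-level nix store path, that is, the
# last path component without the leading "<hash>-".
store_name() {
    base=${1##*/}      # drop the directory part (/nix/store/)
    echo "${base#*-}"  # drop everything up to the first dash (the hash)
}
```

For example, `store_name "$(readlink result)"` prints the name of the
package that the "result" symlink points to.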
| 
 | |
| 1.4 Configuration of mn4 (MareNostrum 4)
 | |
| 
 | |
|   In order to execute the programs built at xeon07, you first need to 
 | |
|   enter the nix environment. To do so, add the following line to the 
 | |
|   end of the ~/.bashrc file in mn4:
 | |
| 
 | |
|     export PATH=/gpfs/projects/bsc15/nix/bin:$PATH
 | |
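If you prefer to script this step, the line can be appended only when
it is not already there, so that repeating the setup does not duplicate
it. A minimal sketch; `add_line_once` is a hypothetical helper name:

```shell
# Append a line to a file only if that exact line is not present yet.
add_line_once() {
    file=$1
    line=$2
    touch "$file"   # make sure the file exists before searching it
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

It could be used in mn4 as:

    mn4$ add_line_once ~/.bashrc 'export PATH=/gpfs/projects/bsc15/nix/bin:$PATH'

The single quotes keep $PATH unexpanded, so it is evaluated each time
the ~/.bashrc file is loaded.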
| 
 | |
|   Then log out and log in again (or source the ~/.bashrc file) and you 
 | |
|   will now have the `nix-setup` command available. This command executes 
 | |
|   a new shell where the /nix store is available. To execute it:
 | |
| 
 | |
|     mn4$ nix-setup
 | |
| 
 | |
|   Now you will see a new shell, where you can access the nix store:
 | |
| 
 | |
|     nix|mn4$ ls /nix
 | |
|     gcroots  profiles  store  var
 | |
| 
 | |
|   The last build of nanos6 can also be found in mn4 at the same 
 | |
|   location:
 | |
| 
 | |
|     /nix/store/3i0qkdywm9xjv2cm1ldx9smb552sf6r1-nanos6-2.4-6f10a32
 | |
| 
 | |
|   Remember to enter the nix environment by running `nix-setup` when you 
 | |
|   need something from the nix store.
 | |
| 
 | |
|   You cannot perform any build operations from mn4: to do so use the 
 | |
|   xeon07 machine.
 | |
| 
 | |
| 2. Basic usage of nix
 | |
| 
 | |
|   Nix is a package manager which makes it easy to handle the 
 | |
|   reproducibility and configuration of packages and dependencies. See 
 | |
|   more info here:
 | |
| 
 | |
|     https://nixos.org/nix/manual/
 | |
| 
 | |
|   We will only cover the basic usage of nix for the BSC packages.
 | |
| 
 | |
| 2.1 The user environment
 | |
| 
 | |
|   All nix packages are stored under the /nix directory. When you need to 
 | |
|   "install" some binary from nix, a symlink is added to a folder 
 | |
|   included in the $PATH variable. In particular, you should have 
 | |
|   something similar added to your $PATH:
 | |
| 
 | |
|     xeon07$ echo $PATH | sed 's/:/\n/g' | grep nix
 | |
|     /home/Computational/rarias/.nix-profile/bin
 | |
|     /nix/var/nix/profiles/default/bin
 | |
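The sed command above relies on "\n" being interpreted as a newline in
the replacement text, which not every sed implementation does. A
portable variant can be sketched with tr; `path_entries` is a
hypothetical helper name:

```shell
# Print the entries of a colon-separated list (default: $PATH),
# one per line.
path_entries() {
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}
```

With it, `path_entries | grep nix` lists the nix-related directories
of the current $PATH.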
| 
 | |
|   The first one is your custom installation of packages that are stored 
 | |
|   in your home directory and the second one is the default installation 
 | |
|   which contains the nix tools (which are installed in the /nix 
 | |
|   directory as well).
 | |
| 
 | |
|   Use `nix search` to look for official packages in the "nixpkgs" 
 | |
|   channel (the default repository of packages):
 | |
| 
 | |
|     xeon07$ nix search cowsay
 | |
|     warning: using cached results; pass '-u' to update the cache
 | |
|     * cowsay (cowsay)
 | |
|       A program which generates ASCII pictures of a cow with a message
 | |
| 
 | |
|     * neo-cowsay (neo-cowsay)
 | |
|       Cowsay reborn, written in Go
 | |
| 
 | |
|     * ponysay (ponysay-3.0.3)
 | |
|       Cowsay reimplemention for ponies
 | |
| 
 | |
|     * tewisay (tewisay-unstable-2017-04-14)
 | |
|       Cowsay replacement with unicode and partial ansi escape support
 | |
| 
 | |
|   When you need a program that is not available in your environment, 
 | |
|   much like when you use "module load ...", you can use nix-env to modify 
 | |
|   what is currently loaded. For example:
 | |
| 
 | |
|     xeon07$ nix-env -iA nixpkgs.cowsay
 | |
| 
 | |
|   Notice that the package name must be preceded by the "nixpkgs." 
 | |
|   prefix. The command will download (if not already found in the nix 
 | |
|   store), compile (if necessary) and load the program `cowsay` from the 
 | |
|   nixpkgs repository into the environment. You should be able to run it 
 | |
|   as:
 | |
| 
 | |
|     xeon07$ cowsay "hello world"
 | |
|      _____________
 | |
|     < hello world >
 | |
|      -------------
 | |
|             \   ^__^
 | |
|              \  (oo)\_______
 | |
|                 (__)\       )\/\
 | |
|                     ||----w |
 | |
|                     ||     ||
 | |
| 
 | |
|   You can now inspect the ~/.nix-profile/bin folder, and see that a new 
 | |
|   symlink was added to the actual installation of the binary:
 | |
| 
 | |
|     xeon07$ file ~/.nix-profile/bin/cowsay
 | |
|     /home/Computational/rarias/.nix-profile/bin/cowsay: symbolic link to 
 | |
|     `/nix/store/673gczmhr5b449521srz2n7g1klykz6n-cowsay-3.03+dfsg2/bin/cowsay'
 | |
| 
 | |
|   You can list the current packages installed in your environment by 
 | |
|   running:
 | |
| 
 | |
|     xeon07$ nix-env -q
 | |
|     cowsay-3.03+dfsg2
 | |
|     nix-2.3.6
 | |
| 
 | |
|   Notice that this setup only affects your user environment. The change 
 | |
|   is immediate (all your sessions see the new environment at once) and 
 | |
|   permanent: it applies to any new session until you modify the 
 | |
|   environment again.
 | |
| 
 | |
|   You can remove any package from the environment using:
 | |
| 
 | |
|     xeon07$ nix-env -e cowsay
 | |
| 
 | |
|   See the manual with `nix-env --help` if you want to know more details.
 | |
| 
 | |
| 2.2 Building packages
 | |
| 
 | |
|   Usually, all official packages are already compiled and distributed 
 | |
|   from a cache server so you don't need to rebuild them again. However, 
 | |
|   BSC packages are distributed only in source code form as we don't have 
 | |
|   any binary cache server yet.
 | |
|   
 | |
|   Nix will handle the build process without any user interaction (with a 
 | |
|   few exceptions which you shouldn't have to worry about). If any other 
 | |
|   user has already built the package, the build is skipped and the 
 | |
|   existing package is used as is.
 | |
| 
 | |
|   In order to build a BSC package go to the `bscpkgs` directory, and 
 | |
|   run:
 | |
| 
 | |
|     xeon07$ nix-build -A bsc.dummy
 | |
| 
 | |
|   Notice the "bsc." prefix for BSC packages. The package will be built 
 | |
|   and installed in the /nix directory, then a symlink named "result" 
 | |
|   pointing to it is placed in the current directory:
 | |
| 
 | |
|     xeon07$ find result/
 | |
|     result/
 | |
|     result/bin
 | |
|     result/bin/dummy
 | |
| 
 | |
|   The way in which nix handles the packages and dependencies ensures 
 | |
|   that the environment of the build process of any package is exactly 
 | |
|   the same, so the generated output should be the same if the builds are 
 | |
|   deterministic.
 | |
|   
 | |
|   You can check the reproducibility of the build by adding the "--check" 
 | |
|   flag, which will rebuild the package and compare the checksum of every 
 | |
|   file with the ones previously built:
 | |
| 
 | |
|     xeon07$ nix-build -A bsc.dummy --check
 | |
|     ...
 | |
|     xeon07$ echo $?
 | |
|     0
 | |
| 
 | |
|   A return code of zero indicates that the output is bit by bit 
 | |
|   identical to the one installed. Some packages include 
 | |
|   non-deterministic information in the build process (such as the 
 | |
|   current timestamp), which will produce an error. Those packages must 
 | |
|   be patched to ensure the output is deterministic.
 | |
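To get a feeling for what comparing two build outputs means, the idea
can be sketched outside of nix by checksumming every file of two
directory trees. This is not how nix implements "--check"; it is only
an illustration, and the function names are hypothetical:

```shell
# Print "<sha256>  <relative path>" for every regular file under a
# directory, sorted by path so that two listings can be compared.
tree_checksums() {
    ( cd "$1" && find . -type f | LC_ALL=C sort |
        while IFS= read -r f; do sha256sum "$f"; done )
}

# Return 0 if both trees hold the same files with the same contents.
same_tree() {
    [ "$(tree_checksums "$1")" = "$(tree_checksums "$2")" ]
}
```

For instance, `same_tree build-a build-b` succeeds only when both
directories contain identical files. The sketch ignores permissions
and symlinks, which a real comparison would also take into account.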
| 
 | |
|   Notice that if you "cd" into the "result/" directory you will be 
 | |
|   inside the /nix directory (as you have followed the symlink), where 
 | |
|   you don't have write permission. Therefore, if your program attempts 
 | |
|   to write to the current directory, it will fail. It is recommended to 
 | |
|   run your program from the top directory instead:
 | |
| 
 | |
|     xeon07$ result/bin/dummy
 | |
|     Hello world!
 | |
| 
 | |
|   Or you can install it in the environment:
 | |
| 
 | |
|     xeon07$ nix-env -i ./result
 | |
| 
 | |
|   And "cd" into any directory where you want to output some files and 
 | |
|   just run it by the name:
 | |
| 
 | |
|     xeon07$ cd /tmp
 | |
|     xeon07$ dummy
 | |
|     Hello world!
 | |
| 
 | |
|   Finally, you can remove it from the environment if you don't need it:
 | |
| 
 | |
|     xeon07$ nix-env -e dummy
 | |
| 
 | |
|   If you want to know more details use "nix-build --help" to see the 
 | |
|   manual.
 | |
| 
 | |
| 2.3 The build process
 | |
| 
 | |
|   Each package is built following a programmable configuration 
 | |
|   description in the nix language. Builds in nix are performed under 
 | |
|   very strict conditions. No access to any file in the file system is 
 | |
|   allowed, unless stated in the dependencies, which are in the /nix 
 | |
|   store only.
 | |
| 
 | |
|   There is no network access in the build process and other restrictions 
 | |
|   are enforced so that the build environment is reproducible. See more 
 | |
|   details here:
 | |
| 
 | |
|     https://nixos.wiki/wiki/Nix#Sandboxing
 | |
| 
 | |
|   The top level "default.nix" file of the bscpkgs repository serves as 
 | |
|   an index of all BSC packages. You can see the definition for each 
 | |
|   package, for example the nbody app:
 | |
| 
 | |
|     nbody = callPackage ./bsc/apps/nbody/default.nix {
 | |
|       stdenv = pkgs.gcc9Stdenv;
 | |
|       mpi = intel-mpi;
 | |
|       icc = icc;
 | |
|       tampi = tampi;
 | |
|       nanos6 = nanos6-git;
 | |
|     };
 | |
| 
 | |
|   The compilation details are specified in the 
 | |
|   "bsc/apps/nbody/default.nix" file.  You can configure the package by 
 | |
|   changing the inputs, for example, what specific implementation of 
 | |
|   nanos6 or MPI you want to use. To change the MPI implementation to the 
 | |
|   official MPICH package use:
 | |
| 
 | |
|     nbody = callPackage ./bsc/apps/nbody/default.nix {
 | |
|       stdenv = pkgs.gcc9Stdenv;
 | |
|       mpi = pkgs.mpich; # Notice pkgs prefix for official packages
 | |
|       icc = icc;
 | |
|       tampi = tampi;
 | |
|       nanos6 = nanos6-git;
 | |
|     };
 | |
| 
 | |
|   Then you can rebuild the nbody package:
 | |
| 
 | |
|     xeon07$ nix-build -A bsc.nbody
 | |
|     ...
 | |
| 
 | |
|   And verify that the binary is indeed linked to MPICH now:
 | |
| 
 | |
|     xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | grep mpi
 | |
|         libmpi.so.12 => /nix/store/dwkkcv78a5bs8smflpx9ppp3klhz3i98-mpich-3.3.2/lib/libmpi.so.12 (0x00007f6be0f07000)
 | |
| 
 | |
|   If you modify a package which another package requires as a 
 | |
|   dependency, nix will rebuild all required packages to propagate your 
 | |
|   changes on demand.
 | |
| 
 | |
|   However, if you go back to the original configuration, the package 
 | |
|   will still be in the /nix store (unless the garbage collector was 
 | |
|   manually run and removed your old build), so you don't need to rebuild 
 | |
|   it again.
 | |
| 
 | |
|   For example if nbody is configured back to use Intel MPI:
 | |
| 
 | |
|     nbody = callPackage ./bsc/apps/nbody/default.nix {
 | |
|       stdenv = pkgs.gcc9Stdenv;
 | |
|       mpi = intel-mpi;
 | |
|       icc = icc;
 | |
|       tampi = tampi;
 | |
|       nanos6 = nanos6-git;
 | |
|     };
 | |
| 
 | |
|   The build process is now not required:
 | |
| 
 | |
|     xeon07$ nix-build -A bsc.nbody
 | |
|     /nix/store/rbq7wrjcmg6fzd6yhrlnkfvzcavdbdpc-nbody
 | |
|     xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | grep mpi
 | |
|         libmpifort.so.12 => /nix/store/jvsjvxj2a08340fpdrqbqix9z3mpp3bd-intel-mpi-2019.7.217/lib/libmpifort.so.12 (0x00007f3a00402000)
 | |
|         libmpi.so.12 => /nix/store/jvsjvxj2a08340fpdrqbqix9z3mpp3bd-intel-mpi-2019.7.217/lib/libmpi.so.12 (0x00007f39fed34000)
 | |
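When switching MPI implementations back and forth, it can be handy to
print only the name of the package that provides libmpi.so. A minimal
sketch that filters ldd output read from standard input; `mpi_provider`
is a hypothetical helper and assumes the library lives in the nix
store:

```shell
# Read ldd output on stdin and print the store package name that
# provides libmpi.so (the part after "/nix/store/<hash>-").
mpi_provider() {
    sed -n 's|.*=> /nix/store/[^-]*-\([^/]*\)/.*libmpi\.so.*|\1|p'
}
```

It could be used as:

    xeon07$ ldd result/bin/nbody_mpi.N2.2048.exe | mpi_provider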
| 
 | |
|   Take a look at the different package description files in the 
 | |
|   bscpkgs repository if you want to understand more details. Also 
 | |
|   the nix pills are a very good reference:
 | |
| 
 | |
|     https://nixos.org/nixos/nix-pills/
 | |
| 
 | |
| 2.4 Debugging the build process
 | |
| 
 | |
|   It may happen that the build process fails in an unexpected way. Most 
 | |
|   problems are related to missing dependencies and can be easily found 
 | |
|   by looking at the error messages.
 | |
| 
 | |
|   Other build problems are more subtle and require more debugging time.  
 | |
|   One way of inspecting a build problem is by adding the breakpointHook 
 | |
|   hook to the nativeBuildInputs array in a nix derivation (see
 | |
|   https://nixos.org/nixpkgs/manual/#ssec-setup-hooks for more info), 
 | |
|   which will stop the build process and allow a shell to be attached to 
 | |
|   the sandbox.
 | |
| 
 | |
|     xeon07$ nix-build -A bsc.nbody
 | |
|     ...
 | |
|     /nix/store/gvqm2yc9xx4vh3nglgckz8siya66jnkx-stdenv-linux/setup: line 
 | |
|     83: fake-missing-command: command not found
 | |
|     build failed in buildPhase with exit code 127
 | |
|     To attach install cntr and run the following command as root:
 | |
| 
 | |
|       cntr attach -t command \
 | |
|         cntr-/nix/store/sk2nsj7xfr62cjk6m3725ydfyswqz7n1-nbody
 | |
| 
 | |
|   The command must run as the root user, so you can use `sudo -i` to 
 | |
|   run it (the -i option is required to load the shell profile, which 
 | |
|   provides the nix path containing the cntr tool):
 | |
| 
 | |
|     xeon07$ sudo -i cntr attach -t command \
 | |
|       cntr-/nix/store/sk2nsj7xfr62cjk6m3725ydfyswqz7n1-nbody
 | |
|     nixbld@localhost:/var/lib/cntr> ls
 | |
|     bin  build  dev  etc  nix  proc  tmp  var
 | |
| 
 | |
|   Then you can inspect the build environment to see why the build 
 | |
|   failed. Source the build/env-vars file to get the same environment 
 | |
|   variables (which include the $PATH) of the build process.
 | |
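If the build output was saved to a file, the attach target printed at
the end of the failed build can be recovered from it instead of being
copied by hand. A small sketch, assuming the log has the format shown
above; `attach_target` is a hypothetical helper name:

```shell
# Read a build log on stdin and print the last cntr attach target
# found in it (a word starting with "cntr-/nix/store/").
attach_target() {
    grep -o 'cntr-/nix/store/[^ ]*' | tail -n 1
}
```

For example, with the log saved in a file named build.log (a name
chosen only for this example):

    xeon07$ sudo -i cntr attach -t command "$(attach_target < build.log)"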
| 
 | |
| /* vim: set ts=2 sw=2 tw=72 fo=watqc expandtab spell autoindent: */
 |