Configure MPI to run multi-node benchmarks #16
We should be able to run benchmarks on at least two nodes so we can pass the distributed tests in ovni. We should add automated CI tests that pass the OSU tests (or others) when the configuration is updated.
We will need to configure the MPI network, as it never works out of the box.
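For reference, a rough sketch of what the CI check could boil down to (the `osu_bw` path and the `--mpi` plugin are assumptions at this point):

```sh
#!/bin/sh
# Sketch of the CI check (not a final script): run the OSU bandwidth
# test across two nodes and fail if the launch fails.
set -e
srun -N 2 -n 2 --ntasks-per-node=1 --mpi=pmix ./osu_bw
```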
SLURM is already built with pmix 3:
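This can be checked from any node: `srun --mpi=list` prints the PMI plugin types this SLURM build provides, and `scontrol show config` shows the default:

```sh
# List the PMI plugin types this Slurm build provides
srun --mpi=list
# Show which plugin is used by default
scontrol show config | grep -i MpiDefault
```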
Well, apparently we can only use MPICH with OpenPMIx if it is built with the ch4 device and no PM (process manager).
From https://openpmix.github.io/support/faq/which-environments-include-support-for-pmix :
However, by default MPICH is built with a PM and no ch4 device:
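For reference, a hypothetical configure line for the combination described above; the exact flag names vary between MPICH versions, so treat these as assumptions and check `./configure --help`:

```sh
# Hypothetical MPICH configure: ch4 device, no built-in process manager,
# PMI provided by an external PMIx installation (path is a placeholder).
./configure \
    --with-device=ch4:ofi \
    --with-pm=no \
    --with-pmix=/path/to/pmix
```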
I enabled PMIx by default. MPICH is working too:
I updated MPI to the latest release (2021.9, released in 2023), but it is still failing in different ways. Here is the script I'm using to test:
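Roughly, each of the runs below boils down to something like this sketch (paths are placeholders); the library pointed at by `I_MPI_PMI_LIBRARY` is what changes between the sections below:

```sh
#!/bin/sh
# Sketch of the kind of run being tested (not the exact script).
export I_MPI_PMI_LIBRARY=/path/to/libpmix.so   # or libpmi2.so / libpmi.so
srun -N 2 -n 2 --mpi=pmix ./osu_bw             # --mpi plugin varies too
```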
PMI_LIBRARY set to libpmix.so
With PMI_LIBRARY set to libpmix.so it segfaults:
And it also shows some errors in stderr:
PMI_LIBRARY set to libpmi2.so
Setting it to libpmi2.so causes an authentication problem with munge:
And it freezes:
PMI_LIBRARY set to libpmi.so
libpmi.so also causes the same credential problem:
But it manages to run and works okay:
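As an aside, the munge credential error can be checked independently of MPI with the standard MUNGE round-trip test (the node name is a placeholder):

```sh
# Local round trip
munge -n | unmunge
# Remote round trip against the other node; a failure here would point
# to mismatched munge keys or clock skew between the nodes.
munge -n | ssh node2 unmunge
```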
The pmix library is very old in nixpkgs (3.2.4), while the latest stable release is 4.2.3. Let's try the latest release of pmix and see if Intel MPI/SLURM works with it.
PMIx fails to build when fetched from the git tag, so I'm using the release tarball instead.
Updating PMIx doesn't solve the problem. Intel MPI seems to be unable to work with pmix, but it works with pmi2. The results are in the following table:
With pmi2, the ranks don't see each other in the case of MPICH and OpenMPI. With pmix, Intel MPI segfaults.
Okay, so there is more to it.
There are at least 3 things that must align for it to work:
I was always changing the protocol when selecting the pmi2 library provided by pmix. However, it seems that the libpmi2.so provided by pmix will speak the pmix protocol, so it must be used with --mpi=pmix. Or at least that is what I understand from the long ticket at https://bugs.schedmd.com/show_bug.cgi?id=6418
Let's try loading the pmi2 emulation library from pmix in Intel MPI with --mpi=pmix.
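Concretely, something along these lines (the pmix install prefix is a placeholder):

```sh
# Point Intel MPI at the pmi2 compatibility library shipped by PMIx,
# while srun speaks the pmix protocol, as suggested in the SchedMD ticket.
export I_MPI_PMI_LIBRARY=/path/to/pmix/lib/libpmi2.so
srun -N 2 -n 2 --mpi=pmix ./osu_bw
```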
Dammit, it fails with a glibc mismatch:
Now it loads and gets stuck:
Now it looks like the problem is that libfabric cannot find the psm2 lib:
After fixing the psm2 problem (adding libfabric to LD_LIBRARY_PATH), it gets stuck after configuring the network devices. However, I managed to get it working with the libpmi.so of PMIx:
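As an aside, libfabric ships a small utility that shows which providers it can actually load, which helps confirm whether psm2 is visible once LD_LIBRARY_PATH is adjusted:

```sh
# Show the psm2 provider as libfabric sees it; if nothing is printed,
# the provider (or its psm2 dependency) is not being found.
fi_info -p psm2
```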
This looks promising, but it will require user intervention to set I_MPI_PMI_LIBRARY.
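Putting the working combination together, the per-user setup would look roughly like this (all paths are placeholders):

```sh
# Working combination so far: Intel MPI using libpmi.so from PMIx,
# launched through srun with the pmix plugin.
export I_MPI_PMI_LIBRARY=/path/to/pmix/lib/libpmi.so
# Make sure libfabric (and its psm2 provider) can be found
export LD_LIBRARY_PATH=/path/to/libfabric/lib:$LD_LIBRARY_PATH
srun -N 2 -n 2 --mpi=pmix ./osu_bw
```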
By the way, all tests are running the `osu_bw -b multiple` test, with multiple messages at the same time. With the sequential mode we reach the theoretical limit:

mentioned in merge request !11
Let's go with PMIx then.