Scripts for building FEniCS on the Peregrine cluster at RUG
forked from fenics-gaia-cluster
This collection of scripts will automatically build current releases of
FEniCS with PETSc and all dependencies on the Peregrine HPC cluster of the University of Groningen (RUG), using Lmod modules (see the
Python 3 with virtualenv is required.
This build uses
python 3.6.4 from the Lmod modules. All FEniCS-related Python modules will be installed within a virtual environment.
Virtual environments are particularly helpful if you want to install several builds of FEniCS side by side, e.g., different versions, different packages (OpenMPI vs. MPICH, Intel MKL vs. OpenBLAS), or builds for different processor micro-architectures (see below).
Here, pew is used to manage the virtualenvs. To install
pew follow these steps:
$ module load Python/3.6.4-intel-2018a
$ pip3 install pew --user
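With `--user`, pip places the `pew` executable in the user base's `bin` directory, which on Linux defaults to `$HOME/.local/bin`. If `pew` is not found after installation, that directory is probably missing from `$PATH`. A minimal sketch of a check, assuming the default user base location; `path_contains` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Return 0 if the directory given as $1 is an entry of $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: pip's user base is the Linux default, $HOME/.local.
user_bin="$HOME/.local/bin"

# Prepend the directory only if it is not on PATH yet.
if ! path_contains "$user_bin"; then
    export PATH="$user_bin:$PATH"
fi
```

To make the change permanent, the same lines could go into `~/.bashrc`.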
First clone this repository, for example to the location $HOME/dev:
$ mkdir -p $HOME/dev && cd $HOME/dev
$ git clone firstname.lastname@example.org:dajuno/fenics-peregrine-build.git
The folder intel contains the build scripts using the intel toolchain (compilers, impi, Intel MKL). The folder
foss contains the build scripts using the foss toolchain (gcc, openblas, openmpi). Select the folder according to the toolchain you want to use.
The main file is
build_all.sh. Modify it to set the FEniCS version to be used. The
$TAG variable (can be changed) specifies the directories FEniCS and its dependencies are installed to and the name of the virtual environment.
It is recommended to set
continue_on_key=true for the first build in order to check that each dependency was installed correctly!
The script calls a number of
build_xxx.sh scripts which download and install the corresponding applications. Edit these files to change the version, compile flags, etc.
The call to a build script can be commented out if the corresponding program was already built. This is useful, e.g., if several programs were built correctly before an error occurred in, say,
build_dolfin.sh, because of a network timeout.
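As an alternative to commenting out calls by hand, completed steps can be recorded in stamp files and skipped on the next run. The following is only a sketch of that pattern, assuming bash; `run_step` and `STAMP_DIR` are hypothetical names and are not part of the repository's build_all.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of the repository's build_all.sh.
# Directory holding one "<name>.done" file per completed step.
STAMP_DIR=${STAMP_DIR:-"$PWD/.build_stamps"}

# Usage: run_step NAME COMMAND [ARGS...]
run_step() {
    local name=$1
    shift
    local stamp="$STAMP_DIR/$name.done"
    if [ -e "$stamp" ]; then
        echo "skipping $name (already built)"
        return 0
    fi
    mkdir -p "$STAMP_DIR"
    # On failure no stamp is written, so the step is retried next time.
    "$@" || return $?
    touch "$stamp"
}

# Example call (commented out so this block has no side effects):
# run_step dolfin ./build_dolfin.sh
```

Deleting the stamp directory forces every step to run again.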
Note that if you want to rebuild everything, the
$PREFIX and the
$BUILD_DIR directories should be deleted, and the python virtual environment should be removed with
pew rm $TAG.
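The cleanup before a full rebuild can be wrapped in a small function that refuses to run with empty arguments, since `rm -rf` on an unset variable is dangerous. A minimal sketch, assuming bash; `clean_fenics_build` is a hypothetical helper name, and the virtualenv name is passed explicitly because it depends on how `$TAG` is used in your build:

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove a previous build so the next run starts clean.
# Usage: clean_fenics_build PREFIX BUILD_DIR [VENV_NAME]
clean_fenics_build() {
    local prefix=$1 build_dir=$2 venv=$3
    # Guard against empty arguments before calling rm -rf.
    if [ -z "$prefix" ] || [ -z "$build_dir" ]; then
        echo "usage: clean_fenics_build PREFIX BUILD_DIR [VENV_NAME]" >&2
        return 1
    fi
    rm -rf -- "$prefix" "$build_dir"
    # Remove the virtualenv only if a name was given and pew is installed.
    if [ -n "$venv" ] && command -v pew >/dev/null 2>&1; then
        pew rm "$venv"
    fi
}

# Example call (commented out so this block has no side effects):
# clean_fenics_build "$PREFIX" "$BUILD_DIR" "$TAG"
```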
Make sure the environments are correctly set in
setup_env_fenics.env. Also revise the individual build files.
In order to build FEniCS run
$ ./build_all.sh |& tee -a build.log
on the compute node inside the
Wait for the build to finish. The output of
the build will be stored in
build.log as well as printed on the screen.
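One caveat with the pipeline above: the exit status seen by the shell is that of `tee`, not of build_all.sh, so a failed build can look successful to a calling script. A minimal sketch of a wrapper that keeps both the log and the exit status, assuming bash; `run_logged` is a hypothetical helper name, not part of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, append stdout and stderr to a log
# file while still printing them, and return the command's exit status.
# Usage: run_logged LOG_FILE COMMAND [ARGS...]
run_logged() {
    local log_file=$1
    shift
    "$@" 2>&1 | tee -a "$log_file"
    # PIPESTATUS[0] is the status of the first command in the pipeline,
    # i.e. the build command rather than tee.
    return "${PIPESTATUS[0]}"
}

# Example call (commented out so this block has no side effects):
# run_logged build.log ./build_all.sh || echo "build failed, see build.log"
```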
To activate a FEniCS build you need to source the environment file created by the build script and activate the virtualenv. The corresponding lines are printed after building is completed.
$ source <prefix>/bin/env_fenics_run.sh
$ pew workon fenics-<tag>
Now you can run python/ipython.
python -c "import dolfin" should work. Try running
python poisson.py and
mpirun -n 4 python poisson.py.
Submit jobs to Peregrine (RUG)
To submit jobs to Peregrine, you need to provide a minimal configuration using sbatch, activate the FEniCS environment and export the MPI library, e.g.
#!/bin/bash
#SBATCH --job-name=name_of_job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --time=30:00:00
#SBATCH --mem=4000

# Load compilers and FEniCS environment variables
source $HOME/dev/fenics-2019.1.0.post0-intel2018a/bin/env_fenics_run.sh
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmix.so

# run the script using the specified num. of cores with srun
pew in fenics-2019.1.0.post0-intel2018a srun python3 /home/you_p_number/path_to_your_file/your_file.yaml
- If an error occurs, check that the environment and the virtualenv have been correctly loaded, e.g., with
pip show dolfin,
pip show petsc4py, which should point to the correct versions.
- Check in the
build.log if all dependencies were built correctly, in particular PETSc and DOLFIN. An exhaustive summary of the DOLFIN configuration is printed and should be checked.
- If python crashes with "Illegal instruction" upon
import dolfin, one of the dependencies was probably built on a different architecture. Make sure, e.g., that petsc4py is picked up from the correct location and that pip did not use a cached version when installing it!
- A common error is that the
$PYTHONPATH variable conflicts with the virtual environment, so that the wrong python modules are found. For this reason,
env_fenics_run.sh unsets this variable.
Make sure it is not modified afterwards.
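The environment checks in the list above can be partly automated. The sketch below assumes bash and relies on the `VIRTUAL_ENV` variable that virtualenv tools normally export inside an active environment; `check_fenics_env` is a hypothetical helper name, not part of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper: warn about two common environment problems.
# Returns 0 if no problem was detected, 1 otherwise.
check_fenics_env() {
    local status=0
    # A non-empty PYTHONPATH may shadow modules from the virtualenv.
    if [ -n "${PYTHONPATH:-}" ]; then
        echo "warning: PYTHONPATH is set to '$PYTHONPATH'" >&2
        status=1
    fi
    # Assumption: an active virtualenv exports VIRTUAL_ENV.
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "warning: no virtualenv seems to be active" >&2
        status=1
    fi
    return "$status"
}

# Example call (commented out so this block has no side effects):
# check_fenics_env && pip show dolfin petsc4py
```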
Extras: On working with ParaView and Peregrine
To run ParaView on the Peregrine server:
Run a pvserver on Peregrine, using the SLURM script below, choosing an arbitrary port number, e.g. 11111.
As the job starts to run, check on which node it runs (either with
squeue or by looking up the job output line like
Accepting connection(s): pg-node005:222222).
Open an ssh forward tunnel connecting you directly to the respective node, e.g.
ssh -N -L 11111:pg-node005:222222 email@example.com
This will forward the connection to the indicated node using the local port 11111 (your machine) and the target port 222222 (computing node).
Open ParaView 5.4.1. The version must match the current version on Peregrine. Note that ParaView from the Ubuntu standard repositories does not work with this setup and must be downloaded separately instead.
Choose "Connect" from the top left corner, add a new server with properties client / server, localhost, port 11111 (default values). Connect to that server.
If successful, you can now open files directly from the server and view them in decomposed state.
#!/bin/bash
#SBATCH -p short
#SBATCH --nodes=1
#SBATCH -J ParaView
#SBATCH -n 4
#SBATCH --mem-per-cpu=4000
#SBATCH --time=00-00:30:00
#SBATCH -o paraview_%j

module load ParaView/5.4.1-foss-2018a-mpi
srun -n 4 pvserver --use-offscreen-rendering --disable-xdisplay-test --server-port=222222
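Looking up the node and composing the ssh command can be scripted. The sketch below, assuming bash, reads the "Accepting connection(s)" line from the job output file (named `paraview_<jobid>` by the script above) and prints the matching tunnel command. The function names are hypothetical, and the login address is passed in as an argument because it depends on your account:

```shell
#!/usr/bin/env bash
# Print "node:port" taken from the last "Accepting connection(s)" line
# in the pvserver job output file given as $1.
pvserver_endpoint() {
    sed -n 's/^.*Accepting connection(s): *\([^: ]*:[0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Usage: print_tunnel_command JOB_OUTPUT_FILE LOGIN [LOCAL_PORT]
# LOGIN is your user@host for the cluster (an input, not guessed here).
print_tunnel_command() {
    local log_file=$1 login=$2 local_port=${3:-11111}
    local endpoint
    endpoint=$(pvserver_endpoint "$log_file")
    if [ -z "$endpoint" ]; then
        echo "no pvserver endpoint found in $log_file" >&2
        return 1
    fi
    echo "ssh -N -L ${local_port}:${endpoint} ${login}"
}

# Example call (commented out so this block has no side effects):
# print_tunnel_command paraview_1234567 your_login_address
```

The function only prints the command, so it can be inspected before it is run.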