# FEniCS-Peregrine HPC

Scripts for building FEniCS on the Peregrine cluster at RUG (forked from fenics-gaia-cluster).

This collection of scripts will automatically build current releases of FEniCS with PETSc and all dependencies on the Peregrine HPC cluster of the University of Groningen (RUG), using Lmod modules (see the `env_build.sh` files).
## Prerequisites

### Python 3

Python 3 with virtualenv is required. This build uses Python 3.6.4 from the Lmod modules.

All FEniCS-related Python modules will be installed within a virtual environment. This is particularly helpful if you want to install different builds of FEniCS: different versions, different packages (e.g., OpenMPI vs. MPICH, Intel MKL vs. OpenBLAS), or builds for different processor micro-architectures (see below).

Here, pew is used to manage the virtualenvs. To install pew, follow these steps:

```
$ module load Python/3.6.4-intel-2018a
$ pip3 install pew --user
```
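The build scripts create and use the virtualenv themselves, so you normally only need pew installed. For reference, basic pew usage looks roughly like this; the environment name below is just an example, not one created by this repository:

```bash
# Illustration only: basic pew commands (the build scripts handle this automatically)
$ pew ls                         # list existing virtual environments
$ pew new -p python3 fenics-test # create a new virtualenv (example name)
$ pew workon fenics-test         # enter the virtualenv in a subshell; exit to leave it
$ pew rm fenics-test             # remove the virtualenv again
```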
## Compiling instructions

First clone this repository, for example to the location `$HOME/dev`:

```
$ mkdir -p $HOME/dev && cd $HOME/dev
$ git clone https://git.web.rug.nl/P301191/FEniCS-Peregrine.git
```
## SETUP

The folder `intel` contains the build scripts using the intel toolchain (compilers, impi, Intel MKL). The folder `foss` contains the build scripts using the foss toolchain (gcc, OpenBLAS, OpenMPI). Select the folder according to which toolchain you want to use.
The main file is `build_all.sh`. Modify it to set the FEniCS version to be used. The `$TAG` variable (which can be changed) specifies the directories FEniCS and its dependencies are installed to, as well as the name of the virtual environment. It is recommended to set `continue_on_key=true` for the first build in order to check that each dependency was installed correctly!
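As a rough orientation, the user-configurable part of `build_all.sh` looks something like the sketch below; the exact variable names and default values in your checkout may differ, and the paths shown are examples only:

```bash
# Hypothetical sketch of the configurable settings in build_all.sh;
# check the actual file for the real variable names and defaults.
TAG=2019.1.0.post0-intel2018a        # names the install directories and the virtualenv
PREFIX=$HOME/dev/fenics-$TAG         # installation prefix (example path)
BUILD_DIR=$HOME/build/fenics-$TAG    # scratch directory for the source builds (example path)
continue_on_key=true                 # pause after each dependency so the result can be checked
```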
The script calls a number of `build_xxx.sh` scripts which download and install the corresponding applications. Edit these files to change the version, compile flags, etc. The calls can be commented out if something was already built, e.g., if a number of programs were built correctly until an error occurred in, say, `build_dolfin.sh` because of a network timeout.
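A partial re-run after such a failure might then look like the following; the script names other than `build_dolfin.sh` are hypothetical, so match them against the calls actually listed in `build_all.sh`:

```bash
# Hypothetical excerpt of build_all.sh after a failed run:
# comment out the steps that already succeeded and re-run the rest.
# ./build_petsc.sh        # already built successfully (example name)
# ./build_petsc4py.sh     # already built successfully (example name)
./build_dolfin.sh         # failed last time (e.g., network timeout), retry from here
```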
Note that if you want to rebuild everything, the `$PREFIX` and `$BUILD_DIR` directories should be deleted, as well as the Python virtual environment (with `pew rm $TAG`).
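In shell terms, a full clean-up might look like this, assuming the variables from `build_all.sh` are set in your shell; adjust the virtualenv name to however it was created on your system:

```bash
# Hypothetical full clean-up before rebuilding from scratch
rm -rf "$PREFIX" "$BUILD_DIR"   # remove the installed files and the build directories
pew rm $TAG                     # remove the virtualenv (the name may also be fenics-$TAG)
```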
Make sure the environments are correctly set in `env_build.sh` and `setup_env_fenics.env`. Also revise the individual build files.
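For orientation, an environment file of this kind typically loads the toolchain modules and clears `$PYTHONPATH` (see Troubleshooting below). The excerpt is a hedged sketch, not the actual contents of `env_build.sh` in this repository:

```bash
# Hypothetical sketch of what env_build.sh sets up for the intel toolchain;
# the real file may load additional or different Lmod modules.
module load Python/3.6.4-intel-2018a   # Python module used by this build
unset PYTHONPATH                        # avoid conflicts with the virtual environment
```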
## INSTALL

In order to build FEniCS, run

```
$ ./build_all.sh |& tee -a build.log
```

on a compute node inside the `FEniCS-Peregrine/intel` directory.

**Remark:** There is an outdated folder with the foss compilers. It is currently not supported and expected to be deprecated in the future.

Wait for the build to finish. The output of the build will be stored in `build.log` as well as printed on the screen.
## Running FEniCS

To activate a FEniCS build you need to source the environment file created by the build script and activate the virtualenv. The corresponding lines are printed after the build is completed.

```
$ source <prefix>/bin/env_fenics_run.sh
$ pew workon fenics-<tag>
```

Now you can run python/ipython; `python -c "import dolfin"` should work. Try running `python poisson.py` and `mpirun -n 4 python poisson.py`.

**Remark:** If installed successfully, the FEniCS environment created will be named `fenics-2019.1.0.post0-intel2018a`.
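A quick way to verify both the serial and the MPI-parallel installation, assuming the 2019.1.0 Python API (where `dolfin.MPI.comm_world` exposes the MPI communicator):

```bash
# Sanity check of the activated FEniCS environment (run inside pew workon)
$ python -c "import dolfin; print(dolfin.__version__)"                           # should print 2019.1.0
$ mpirun -n 2 python -c "from dolfin import MPI; print(MPI.comm_world.rank)"     # should print 0 and 1
```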
## Submit jobs to Peregrine (RUG)

To submit jobs to Peregrine, you need to provide a minimal configuration using sbatch, activate the FEniCS environment, and export the MPI library, e.g.

```
#!/bin/bash
#SBATCH --job-name=name_of_job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --time=30:00:00
#SBATCH --mem=4000

# Load compilers and FEniCS environment variables
source $HOME/dev/fenics-2019.1.0.post0-intel2018a/bin/env_fenics_run.sh
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmix.so

# Run the script using the specified number of cores with srun
pew in fenics-2019.1.0.post0-intel2018a srun python3 /home/you_p_number/path_to_your_file/your_file.py
```
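Assuming the script above is saved as, say, `fenics_job.sh` (a hypothetical file name), it is submitted and monitored with the standard SLURM commands:

```bash
$ sbatch fenics_job.sh     # submit the job script above (file name is an example)
$ squeue -u $USER          # check the status of your jobs
$ scancel <jobid>          # cancel a job if needed
```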
## Troubleshooting

- If an error occurs, check that the environment and the virtualenv have been loaded correctly, e.g., with `which python`, `pip show dolfin`, `pip show petsc4py`, which should point to the correct versions.
- Check in `build.log` that all dependencies were built correctly, in particular PETSc and DOLFIN. An exhaustive summary of the DOLFIN configuration is printed and should be checked.
- If Python crashes with "Illegal instruction" upon `import dolfin`, one of the dependencies was probably built on a different architecture. Make sure, e.g., that petsc4py is picked up from the correct location and that pip did not use a cached version when installing it!
- A common error is that the `$PYTHONPATH` variable conflicts with the virtual environment, so that the wrong Python modules are found. To that end, this variable is `unset` in `env_build.sh` and `env_fenics_run.sh`. Make sure it is not modified afterwards.
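The checks from the list above translate into a few shell commands; the comments describe what you should expect to see, not literal output:

```bash
# Sanity checks for the active FEniCS environment
$ which python        # should resolve inside the fenics-<tag> virtualenv, not the system Python
$ pip show dolfin     # should report the FEniCS version you built
$ pip show petsc4py   # should point at the build installed under $PREFIX
$ echo $PYTHONPATH    # should be empty; env_fenics_run.sh unsets it
```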
## Extras: On working with ParaView and Peregrine

To run ParaView on the Peregrine server:

1. Run a pvserver on Peregrine, using the SLURM script below, choosing an arbitrary port number, e.g., 11111.
2. As the job starts to run, check on which node it runs, either with `squeue` or by looking up the job output line like `Accepting connection(s): pg-node005:222222`.
3. Open an ssh forward tunnel connecting you directly to the respective node, e.g. `ssh -N -L 11111:pg-node005:222222 username@peregrine.hpc.rug.nl`. This will forward the connection to the indicated node using the local port 11111 (your machine) and the target port 222222 (compute node).
4. Open ParaView 5.4.1 (the version must match the current version on Peregrine; note that the ParaView from the standard Ubuntu repositories does not work for this, so ParaView must be downloaded instead).
5. Choose "Connect" from the top left corner, add a new server with the properties client/server, localhost, port 11111 (default values). Connect to that server.
6. If successful, you can now open files directly from the server and view them in their decomposed state.
### SLURM script

```
#!/bin/bash
#SBATCH -p short
#SBATCH --nodes=1
#SBATCH -J ParaView
#SBATCH -n 4
#SBATCH --mem-per-cpu=4000
#SBATCH --time=00-00:30:00
#SBATCH -o paraview_%j

module load ParaView/5.4.1-foss-2018a-mpi
srun -n 4 pvserver --use-offscreen-rendering --disable-xdisplay-test --server-port=222222
```