ERF containers on Perlmutter¶
The following is a brief introduction to ERF and containers on the NERSC Perlmutter platform.
For more details, please see NERSC’s detailed containers documentation, which also includes containers tutorials.
Container images can be built on one’s desktop/laptop using a standard container framework such as Docker, Podman (Pod Manager), etc. or directly on a Perlmutter login node using podman-hpc. podman-hpc is a NERSC-developed wrapper that extends the capabilities of Podman for HPC. Containers are run on Perlmutter using either podman-hpc or shifter (also developed at NERSC). NERSC has a good podman-hpc tutorial.
Example ERF containerfile¶
The following is an example ERF containerfile with filename erf_containerfile:
1 FROM nvcr.io/nvidia/cuda:12.2.0-devel-ubuntu22.04
2
3 WORKDIR /app
4
5 RUN apt-get update -y && \
6 DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
7 g++-12 gcc-12 gfortran-12 git libtool make tar autoconf automake wget python3 cmake
8
9 # MPICH to be swapped out later for Cray MPI
10
11 RUN wget https://www.mpich.org/static/downloads/4.2.2/mpich-4.2.2.tar.gz && \
12 tar xzf mpich-4.2.2.tar.gz && cd mpich-4.2.2 && \
13 ./configure CC=/usr/bin/gcc-12 CXX=/usr/bin/g++-12 F77=/usr/bin/gfortran-12 FC=/usr/bin/gfortran-12 && \
14 make -j8 && make install && \
15 cd .. && rm -rf mpich-4.2.2 mpich-4.2.2.tar.gz
16
17 RUN mkdir /app/erf && cd /app/erf && wget https://github.com/erf-model/ERF/archive/refs/tags/24.06.tar.gz && \
18 tar xzf 24.06.tar.gz && cd ERF-24.06/Submodules && \
19 wget https://github.com/AMReX-Codes/amrex/releases/download/24.06/amrex-24.06.tar.gz && \
20 tar xzf amrex-24.06.tar.gz && rmdir AMReX && mv amrex AMReX && cd .. && mkdir MyBuild && cd MyBuild && \
21 cmake \
22 -DCMAKE_C_COMPILER=mpicc \
23 -DCMAKE_CXX_COMPILER=mpicxx \
24 -DCMAKE_Fortran_COMPILER=mpif90 \
25 -DCMAKE_BUILD_TYPE:STRING=Release \
26 -DCMAKE_CUDA_ARCHITECTURES=80 \
27 -DERF_ENABLE_MPI:BOOL=ON \
28 -DERF_ENABLE_CUDA=ON \
29 .. && \
30 make -j8
Line numbers were added for instructional purposes but should not appear in containerfile. The containerfile is available at https://github.com/erf-model/ERF/blob/development/Build/erf_containerfile
Line 1 downloads a container base image from NVIDIA’s container registry that contains the Ubuntu 22.04 operating system and CUDA 12.2.0
Line 3 sets the working directory
Lines 5-7 download the GNU 12 compilers and all the necessary utilities for building ERF in the container
Lines 11-15 download the MPICH source code and build it. At runtime this MPICH will get replaced by Perlmutter’s Cray MPI
Lines 17-30 wget the ERF and AMReX 24.06 tarballs and build with cmake/make
Build the container on Perlmutter using podman-hpc and using the containerfile erf_containerfile with name erf and tag 1.00 (-t <name>:<tag>)
podman-hpc build -t erf:1.00 -f erf_containerfile
In order to use this image in a job or access it from any other login node, one needs to migrate the image onto the $SCRATCH filesystem by issuing the following command:
podman-hpc migrate erf:1.00
podman-hpc images will display the following
user@perlmutter:login12:~> podman-hpc images
REPOSITORY TAG IMAGE ID CREATED SIZE R/O
localhost/erf 1.00 893605c3ee9b 5 hours ago 16.1 GB true
localhost/erf 1.00 893605c3ee9b 5 hours ago 16.1 GB false
Note that localhost will not be needed for podman-hpc commands.
Run container on Perlmutter¶
Submit the following slurm batch script in order to use the image in a job
#!/bin/bash
#SBATCH --account=<proj>
#SBATCH --constraint=gpu
#SBATCH --job-name=erf
#SBATCH --nodes=1
#SBATCH --time=0:05:00
#SBATCH -q regular
srun -N 1 -n 4 -c 32 --ntasks-per-node=4 --gpus-per-node=4 ./device_wrapper \
podman-hpc run --rm --mpi --gpu -v /pscratch/sd/u/user/erf/abl:/run -w /run erf:1.00 \
/app/erf/ERF-24.06/MyBuild/Exec/erf_exec CanonicalTests/ABL/inputs_smagorinsky amrex.use_gpu_aware_mpi=0
device_wrapper script:
#!/bin/bash
# select_cpu_device wrapper script
export CUDA_VISIBLE_DEVICES=$((3-$SLURM_LOCALID))
exec $*
Arguments for podman-hpc run used above
--rmremoves the container after exit--mpienables Cray MPI support (swaps MPICH in the container for Perlmutter’s Cray MPI)--gpuenables NVIDIA GPU support-v /pscratch/sd/u/user/erf/abl:/runmounts/pscratch/sd/u/user/erf/ablon Perlmutter onto/runin the container-w /runmakes the/rundirectory inside the container the working directory, i.e. any output from the ERF run will be written to the/rundirectory in the container, which will appear in the/pscratch/sd/u/user/erf/abldirectory on Perlmutter.erf:1.00container name and tag/app/erf/ERF-24.06/MyBuild/Exec/erf_execERF binary in container
The remaining arguments are the normal ERF command line arguments.
Please issue podman-hpc --help for the help page and podman-hpc run --help for the podman-hpc run help page.
Container image libraries¶
Container image libraries provide a convenient way to store and share images. The best known one is probably Docker Hub. NERSC provides a private registry to its users via registry.nersc.gov.
In order to push the ERF image to Docker Hub, need to modify the tag, login to Docker Hub, and then push the image.
podman-hpc tag erf:1.00 docker.io/docker_hub_username/erf:1.00
podman-hpc login docker.io
podman-hpc push docker.io/docker_hub_username/erf:1.00
Shifter container runtime¶
Shifter is a NERSC-developed tool that provides an alternative method for running containers on Perlmutter. Recall that Shifter cannot be used to build images. NERSC’s containers documentation provides an introduction to shifter including a tutorial.
Use the shifterimg pull command to pull images directly from Docker Hub and it will automatically convert your Docker image into Shifter format.
Need to first login to Docker Hub, then pull. --user nersc_user will restrict usage of the image to only nersc_user.
shifterimg login docker.io
shifterimg -v --user nersc_user pull docker_hub_username/erf:1.00
Submit the following slurm batch script in order to use the image in a job
#!/bin/bash
#SBATCH --account=<proj>
#SBATCH --constraint=gpu
#SBATCH --job-name=erf
#SBATCH --nodes=1
#SBATCH --time=0:05:00
#SBATCH -q regular
#SBATCH --image=docker_hub_username/erf:1.00 # for shifter, not podman-hpc
#SBATCH --module=mpich,gpu # for shifter, not podman-hpc; CPU-only MPI
##SBATCH --module=cuda-mpich # for shifter, not podman-hpc; GPU-Aware MPI
#SBATCH --volume="<$SCRATCH/my_erf_run_dir>:/run" # for shifter, not podman-hpc
srun -N 1 -n 4 -c 32 --ntasks-per-node=4 --gpus-per-node=4 ./device_wrapper \
shifter \
/app/erf/ERF-24.06/MyBuild/Exec/erf_exec CanonicalTests/ABL/inputs_smagorinsky amrex.use_gpu_aware_mpi=0
Common Issues¶
Using
podmanrather thanpodman-hpcon Perlmutter (best to always usepodman-hpc)Before issuing
podman-hpc migrate <name>:<tag>after having issued the command earlier with identical<name>:<tag>, if want to keep the same name, please change the<tag>to one that has not been used previously. If want to use an identical<name>:<tag>used in a previouspodman-hpc migratecommand, please first issuepodman-hpc rmsqi <name>:<tag>to delete the old image. Otherwise could potentially end up with errors such asError: read-only image store assigns the same name to multiple images
and will have resort to various methods to get out of the bad configuration state.
The default containerfile is a file called
Containerfile(case sensitive). When that file is being used, can replace-f erf_containerfilewith a period:podman-hpc build -t erf:1.00 .
Note that for this case, the period is mandatory. Here it does not denote the end of a sentence.