.. _container:

Containerized EC-Earth4
=========================

Get the EC-Earth4 container
---------------------------

.. caution::

    The containerised version of EC-Earth4 is still experimental.
    Not all components are supported, and the model is significantly slower
    when run on multiple compute nodes.

.. note::

    Before running, please read "What works and what does not work" below.

A containerised EC-Earth4 can be found on `GitLab `_.
You may download the container to a location of your choice, but we recommend
creating a directory ``ece4-containers`` one level above ``ecearth4``.
The run scripts will automatically look for a container
``{{experiment.base_dir}}/../ece4-containers/ecec-4.x.x-ompi.sif``
where ``4.x.x`` is the latest released version, currently |release|.

You do not need to compile EC-Earth4 if you use the container.
That is the whole point!
However, you still need to compile OASIS, since it is needed by ``rdy2cpl`` on the first time step.
More on that below.

Running the container
---------------------

The container allows you to run the following configurations:

* Coupled OpenIFS + NEMO + runoff-mapper + XIOS
* Ocean standalone NEMO + XIOS
* Atmosphere standalone OpenIFS + amip reader + XIOS

OpenIFS is built as both single and double precision executables.
NEMO is built with ice-shelf cavities and without TOP.

Before running the container, you need to get the EC-Earth4 source code,
install a conda environment and build OASIS.
This is because ``rdy2cpl`` is needed to generate the ``namcouple`` file,
and ``rdy2cpl`` requires OASIS.
See :ref:`here ` for instructions.

To run the containerised EC-Earth4, set the following in your runscript:

.. code-block:: yaml

    model_config:
      ecec: True

    job:
      launch:
        method: slurm-wrapper-taskset # only one that works for now
      slurm:
        srun:
          # must add --mpi=pmix to args
          args: [--label, --kill-on-bad-exit, --mpi=pmix]

    # multi-threading does not work
    # and while the model CAN run on multiple nodes,
    # it will not do so efficiently
    - when: job.launch.method in ["slurm-wrapper-taskset"]
      base.context:
        job:
          oifs:
            omp_num_threads: 1
            omp_stacksize: "64M"
            use_hyperthreads: false
          groups:
            # On MareNostrum5 - 112 cores per node
            - { nodes: 1, rnfm: 1, xios: 3, nemo: 40, oifs: 68 }

and set the following in your ``user-config.yml`` file

.. code-block:: yaml

    experiment:
      container_path: "/path/to/your/container/ecec-4.1.7-ompi.sif"

if you have not stored the container in the standard path
``{{experiment.base_dir}}/../ece4-containers/`` or you are not using the latest release.

Then start the model as usual with ``se scriptlib/main.yml``.

What works and what does not work
---------------------------------

You can run EC-Earth4 in any available resolution.
The model will produce output etc. as usual.
However, there are currently the following limitations:

* Only a few tagged releases of EC-Earth are available.
* Only ``slurm-wrapper-taskset`` is supported.
* The container has only been tested on Freja and MareNostrum5.
* The containerised EC-Earth4 is much slower than normal when using multiple compute nodes,
  since the interconnect falls back to TCP.
* The container only works in an OpenMPI environment.
* LPJ-Guess and M7 are not tested.
* PISCES and PISM are not supported.

How to build the container
--------------------------

.. note::

    It is generally not possible to build your own container using Apptainer/Singularity
    on HPCs for security reasons.
    Containers should be built on a local machine, e.g. your laptop.

Normally, you should not need to build the container yourself.
If you wish to do so, here is an example of how to do it on Freja.

.. code-block:: shell

    # resolve links in path
    cd "$(readlink -f .)"

    # you can only build containers on compute nodes on Freja
    # so you must start an interactive job
    # Note: This is not necessary on a local laptop!
    interactive -n 10 -t 60 -A

    # go to apptainer dir
    cd ecearth4/scripts/apptainer

    # build
    # This will take quite some time since it needs to build an OS,
    # MPI library, HDF5+netCDF etc as well as the full model
    make ecec-ompi.sif

Under the hood
--------------

EC-Earth runs all executables (``nemo.exe`` etc.) inside the container, not on the host HPC.
We bind some key directories on the host HPC to the container to facilitate the exchange of files,
e.g. for the model to read input data and write output data.
Hence, model output is produced in the container but "mirrored" to the host HPC as we go.
The script ``scripts/runtime/scriptlib/setup-ecec.yml`` binds ``ece-4-inidata`` to the container,
and the run directory is always bound by default.

This also means that EC-Earth is not able to resolve links,
since they point to files on the host and not in the container.
The solution, for now, is to replace all links in the run directory with copies.
A script does this for you before the job starts.

The same script, ``scripts/runtime/scriptlib/setup-ecec.yml``, also replaces the ``job.nemo.run_cmd``
(which is typically ``./nemo.exe``) by
``apptainer exec --pwd /opt/ece-4/run/ {{experiment.container_path}} /opt/ece-4/bin/nemo.exe``.
This means that we use Apptainer to execute ``/opt/ece-4/bin/nemo.exe`` in the container ``ecec-ompi.sif``
and to change directory to ``/opt/ece-4/run``, which we bind to the host run directory.

ECEC is built with OpenMPI v4 on NSC-Freja.
It is thus only compatible with OpenMPI on other HPCs.
Scripts exist to build the container with MPICH, which should be compatible with other vendors,
e.g. IntelMPI, CrayMPI, etc.

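
The two preparation steps described in this section (replacing links by copies and wrapping the run command)
can be illustrated with a minimal shell sketch.
This is not the code in ``setup-ecec.yml``; the function names are hypothetical,
and the sketch assumes that every executable lives in ``/opt/ece-4/bin`` in the same way as ``nemo.exe``.

.. code-block:: shell

    # Illustrative sketch only, NOT the code in setup-ecec.yml.
    # Function names are hypothetical.

    # Replace every symbolic link below a directory by a copy of its target,
    # so that nothing in the run directory points outside the bound paths.
    # $1: directory to process (e.g. the run directory)
    replace_links_with_copies() {
        find "$1" -type l -exec sh -c '
            for link in "$@"; do
                target="$(readlink -f "$link")"
                if [ -e "$target" ]; then
                    rm "$link" && cp -r "$target" "$link"
                else
                    echo "warning: dangling link left as is: $link" >&2
                fi
            done
        ' sh {} +
    }

    # Print the command that runs an executable inside the container.
    # $1: path to the .sif image, $2: executable name, e.g. nemo.exe
    # Assumption: all executables live in /opt/ece-4/bin like nemo.exe does.
    container_run_cmd() {
        printf 'apptainer exec --pwd /opt/ece-4/run/ %s /opt/ece-4/bin/%s\n' "$1" "$2"
    }

Calling ``replace_links_with_copies /path/to/rundir`` leaves regular files in place of the links,
and ``container_run_cmd /path/to/ecec-ompi.sif nemo.exe`` prints the wrapped command quoted above.
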
To be done
----------

* Replace container MPI by host MPI at runtime for better compatibility and performance
* Port container to ECMWF Atos (HPC2020)
* Include EC-Earth4 MPICH build in the container
* Include ``STANDALONE-TOP`` and ``COUPLED-TOP`` builds for NEMO in container
* Build the container for each release

Background for those interested
-------------------------------

Why containers?
~~~~~~~~~~~~~~~

Pretty much all software depends heavily on external software.
A climate model like EC-Earth4 requires compilers (C, C++, Fortran), data formats (GRIB, netCDF, HDF5),
parallelisation (MPI), etc.
Unfortunately, the vendor and version of these libraries often vary from one HPC to another
and also change now and then due to software or hardware upgrades.
The maintainers group supports EC-Earth4 use on a few selected HPCs in Europe,
but this is difficult to maintain across the ever-changing HPC landscape
and also leaves some users on their own if their HPC is not supported.
It would be much easier if the maintainers could release a compiled EC-Earth4 model which would run on any HPC.

Containers exist to solve this very problem.
A typical use case is a Python code which requires a specific Python version
and additional modules of specific versions.
One would build the required conda environment in a container,
which packs the entire software stack into a single image file.
The file can then be uploaded to almost any HPC and run even if the HPC does not have Python or conda installed.

...but what ARE containers (somewhat technical)?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A container is a single file which contains all software necessary to run an application.
It contains its own file system, its own operating system, its own compilers, etc.
To build a conda environment in a container you would first need to choose which OS to start from, e.g.
Ubuntu, Rocky Linux, etc., then install compilers and some basic software, then Mambaforge,
then build the conda environment.
At runtime, the container will "bind" your ``/home`` directory on the HPC to the container
so that files in ``/home`` are also visible in the container file system.
You can choose to bind other directories as well (this is done in ``scriptlib/setup-ecec.yml``).
A container can see some devices on the system, e.g. network cards, but may require extra work to see GPUs.

Software developers often use Docker to build and run containers, which is widely available and supported.
However, Docker requires root privileges in the container, which HPCs will not grant users.
Singularity and Apptainer are designed for HPC use, do not require root access, and also support SLURM.
A container built for Singularity/Apptainer is called a "Singularity Image File" or ``.sif``.

How do we containerise EC-Earth4?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The containerisation was done by building a stack of containers, i.e. ``.sif`` files.
We first build the OS, then the MPI libraries, then HDF5 and netCDF,
then the conda environment with ScriptEngine, then EC-Earth.
It is built in seven steps:

* ``base.sif`` - OS + some basic libraries
* ``pmix.sif`` - PMIx for process management
* ``openmpi.sif`` - MPI library
* ``hdf5-ompi.sif`` - HDF5 (parallel)
* ``netcdf-ompi.sif`` - netCDF for C, C++ and Fortran (parallel)
* ``conda-se-ompi.sif`` - Python environment with ScriptEngine etc
* ``ecec-ompi.sif`` - EC-Earth4 coupled + ocean-only

This builds a container with EC-Earth4 to be used on an HPC in an OpenMPI environment.
For other environments, e.g. IntelMPI or CrayMPI, you need a container with MPICH,
which has the following structure:

* ``base.sif`` - same as before
* ``mpich.sif`` - MPI library
* ``hdf5-mpich.sif`` - HDF5 (parallel)
* ``netcdf-mpich.sif`` - netCDF for C, C++ and Fortran (parallel)
* ``conda-se-mpich.sif`` - Python environment with ScriptEngine etc
* ``ecec-mpich.sif`` - EC-Earth4 coupled + ocean-only

Note that data, e.g. ``ece-4-inidata`` and ``cmip7-data``, are NOT included in the container
(it would be a very large file).
The input data will be bound to the container file system at runtime, see ``scriptlib/setup-ecec.yml``.

The container includes several builds of EC-Earth4:
ocean-only (NEMO+XIOS), atmosphere-only (OpenIFS+XIOS+amipfr), and coupled.

The EC-Earth4 container was built on Freja at NSC using Apptainer, but works with both Apptainer and Singularity.

MPI strategy
~~~~~~~~~~~~

To run an MPI application such as NEMO or OpenIFS in a container on an HPC we need two installations of MPI.
The MPI in the container assigns ranks to the different processes for the executables,
while the MPI on the host handles communication.
The container can only run if the host and container MPI are compatible,
which can be achieved in one of two ways:

1. "Hybrid method": Build the same MPI in the container as is installed on the host
2. "Bind-mount method": Bind the host MPI library to the container at runtime

See also the Apptainer documentation for more details.

The EC-Earth4 container currently relies on the "hybrid method",
i.e. the host MPI must be OpenMPI and compatible with the OpenMPI in the container.
The "bind-mount method" is work in progress.
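
Since the hybrid method depends on the host MPI, it can be useful to inspect the host MPI before submitting a job.
The following is a minimal sketch, not part of the EC-Earth4 scripts; the function names are hypothetical.
It assumes that ``mpirun --version`` prints a first line of the form ``mpirun (Open MPI) 4.1.5``,
and it treats a matching major version as a rough indication of compatibility, which is an assumption and not a guarantee.

.. code-block:: shell

    # Illustrative sketch only; function names are hypothetical.

    # Print the Open MPI major version found in the output of `mpirun --version`.
    # Prints nothing and returns 1 if the text does not look like Open MPI.
    # $1: full text printed by `mpirun --version`
    openmpi_major() {
        first_line="$(printf '%s\n' "$1" | head -n 1)"
        case "$first_line" in
            *"(Open MPI)"*) ;;
            *) return 1 ;;
        esac
        # The version is the last space-separated field, e.g. "4.1.5"
        version="${first_line##* }"
        printf '%s\n' "${version%%.*}"
    }

    # Succeed only if the host MPI is Open MPI with the expected major version.
    # $1: full text printed by `mpirun --version`, $2: expected major version
    check_host_mpi() {
        major="$(openmpi_major "$1")" || {
            echo "host MPI does not look like Open MPI" >&2
            return 1
        }
        [ "$major" = "$2" ] || {
            echo "host Open MPI major version is $major, expected $2" >&2
            return 1
        }
    }

On a login node this could be called as ``check_host_mpi "$(mpirun --version)" 4``,
where ``4`` reflects that the container is built with OpenMPI v4.
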