Additional information
Known issues
Bit-reproducibility:
Note
EC-Earth will generally not be bit-for-bit reproducible when the number of MPI tasks is varied. If a user wishes to run an experiment which is bit-identical to a previous experiment the number of MPI tasks should not be changed.
As a general rule, climate models do not yield bit-identical results when an experiment is repeated after changing
hardware, e.g. moving to a new HPC or following a hardware upgrade
compilers, e.g. switching from GNU to Intel compilers
MPI implementation, e.g. switching from OpenMPI to IntelMPI
versions of libraries such as netCDF, BLAS or Intel MKL
However, it is not expected that such changes should result in significant changes in the model climate.
It has been noted that EC-Earth v4.1.0 and above may not produce bit-reproducible results when the following parameters change
Number of MPI tasks for OpenIFS (in coupled simulations)
Frequency of FullPos calls in OpenIFS (sample_rate in config_oifs.yml)
although this is dependent on the hardware and software used.
When the number of MPI tasks is changed, the order in which OASIS computes global sums can change and break bit-reproducibility.
The bfb option in OASIS can overcome this issue, but at a large and often unreasonable computational cost.
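The effect of summation order can be illustrated with a minimal sketch in plain Python. It does not use OASIS or MPI: the function `chunked_sum` and the sample values are hypothetical and only imitate a global sum that is split across a varying number of tasks.

```python
import math


def chunked_sum(values, n_tasks):
    """Sum `values` as if split into contiguous chunks over `n_tasks` tasks.

    Each "task" adds up its own chunk, then the partial sums are added in
    task order. This only imitates a distributed global sum; it is not how
    OASIS is implemented.
    """
    chunk = -(-len(values) // n_tasks)  # ceiling division
    total = 0.0
    for start in range(0, len(values), chunk):
        partial = 0.0
        for v in values[start:start + chunk]:
            partial += v
        total += partial
    return total


# Hypothetical field values with very different magnitudes.
values = [1e16, 1.0, -1e16, 1.0]

one_task = chunked_sum(values, 1)
two_tasks = chunked_sum(values, 2)
exact = math.fsum(values)  # correctly rounded, independent of ordering

print(one_task, two_tasks, exact)
```

The two decompositions give different results for the same data because floating-point addition is not associative. An order-independent summation such as `math.fsum` avoids this at the price of extra work, which is similar in spirit to the trade-off of the bfb option.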
Changing the sampling frequency in OpenIFS, i.e. how often OpenIFS sends model fields to XIOS for writing, can break bit-reproducibility. This seems to be related to the math libraries (Intel MKL, OpenBLAS etc). Tests at HPC2020 (ECMWF) found that bit-reproducible results for different sample_rate are possible, though not guaranteed, as long as one of the following applies:
Have Intel CPUs and Intel MKL with export MKL_CBWR=AUTO,STRICT
Have AMD CPUs (HPC2020, NSC), and Intel MKL 19.0.5 or older, and export MKL_DEBUG_CPU=5 and export MKL_CBWR=AUTO,STRICT
Have any CPUs with OpenBLAS but build for SandyBridge architecture.
EC-Earth 4.1.0 is bit-reproducible with changing sample_rate on LUMI without any of the above settings, but setting export LIBSCI_ARCH_OVERRIDE=ivybridge (or sandybridge) is required for bit-identical results when the number of MPI processes is changed.
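The CPU and math-library cases listed above can be captured in a small lookup, sketched here in Python. The function names `reproducibility_hints` and `as_exports` are hypothetical and not part of EC-Earth. The sketch encodes only the combinations reported above and makes no claim that they guarantee bit-reproducibility.

```python
def reproducibility_hints(cpu_vendor, math_library, mkl_version=None):
    """Return (env_vars, note) for a CPU / math-library combination.

    Encodes only the cases listed in this section; anything else returns
    no settings. `mkl_version` is a tuple such as (19, 0, 5).
    """
    cpu = cpu_vendor.lower()
    lib = math_library.lower()
    if lib == "mkl" and cpu == "intel":
        return {"MKL_CBWR": "AUTO,STRICT"}, ""
    if lib == "mkl" and cpu == "amd":
        if mkl_version is not None and tuple(mkl_version) <= (19, 0, 5):
            return {"MKL_DEBUG_CPU": "5", "MKL_CBWR": "AUTO,STRICT"}, ""
        return {}, "requires Intel MKL 19.0.5 or older"
    if lib == "openblas":
        return {}, "build OpenBLAS for the SandyBridge architecture"
    return {}, "no known setting for this combination"


def as_exports(env_vars):
    """Render the variables as shell export lines."""
    return [f"export {name}={value}" for name, value in env_vars.items()]


env, note = reproducibility_hints("AMD", "MKL", mkl_version=(19, 0, 5))
print("\n".join(as_exports(env)) or note)
```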
Use of experimental GRIB codes
Warning
Care must be taken when assigning GRIB codes to new variables. Reusing “experimental” GRIB codes without a coordinated strategy can lead to hidden conflicts and model crashes.
In OpenIFS, many GRIB codes—especially in the default table 128—are marked as “experimental products” and appear to be available. However, a significant subset of these codes is already used internally or by auxiliary components, even if not immediately visible to users.
Problem description
A conflict was encountered when introducing LPJG CO2 fluxes in parallel with bare soil albedo using the same GRIB codes:
118, 119, 120
The codes 118–120 (table 128) were assumed to be free experimental slots and were used for:
fco2nat
fco2ant
fco2npp
At the same time, bare soil albedo fields also used GRIB codes in the range 117–120 and were appended to the ICMGG initial condition file by:
scripts/utils/update_icmgg.sh
This resulted in duplicate GRIB codes in the same file, leading to model aborts during initialization due to inconsistent GRIB definitions.
Etienne's attempt to relocate the LPJG fields to nearby codes (e.g. 114–116) also failed, as these codes were already in use internally by OpenIFS, despite being labeled as experimental.
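A duplicate check of this kind can be sketched in a few lines of Python. The function `find_duplicate_codes` is hypothetical, the code lists are illustrative values only, and the sketch assumes each code is expected to occur once in the file. The list of codes actually present in a file would have to be obtained separately, for instance with the ecCodes tool `grib_ls`.

```python
from collections import Counter


def find_duplicate_codes(param_ids):
    """Return {code: count} for every GRIB code that occurs more than once."""
    counts = Counter(param_ids)
    return {code: n for code, n in sorted(counts.items()) if n > 1}


# Illustrative values only: codes of fields already present in the file ...
existing = [117, 118, 119, 120]
# ... and codes of the fields about to be appended.
appended = [118, 119, 120]

duplicates = find_duplicate_codes(existing + appended)
if duplicates:
    print("Duplicate GRIB codes:", duplicates)
```

Running such a check before appending fields would surface the clash ahead of model initialization. It cannot detect codes that OpenIFS uses internally without writing them to the file.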
Resolution
The issue was resolved by reassigning LPJG CO2 fluxes to GRIB codes in the range 210064–210080 in table 216, which are explicitly reserved:
co2of → 210067
co2nbf → 210068
co2apf → 210069
co2fire → 210080
tcco2 → 210064
This avoids overlap with codes used by bare soil albedo and other internal OpenIFS components.
Lessons learned
“Experimental” GRIB codes in table 128 are not guaranteed to be unused.
Some GRIB codes may be implicitly reserved or used internally in OpenIFS (e.g. in physics or coupling code), even if not documented externally.
Conflicts may only appear at runtime (e.g. during initialization or GRIB consistency checks), making them difficult to diagnose.
Recommendations
Avoid using GRIB codes from the default table 128 unless their usage is fully verified.
Prefer using dedicated tables (e.g. table 216) where code ranges are more explicitly managed.
Cross-check GRIB codes against:
OpenIFS source code (e.g. yom_grib_codes.F90)
ECMWF parameter database: https://codes.ecmwf.int/grib/param-db/
Maintain a shared registry of GRIB codes used within EC-Earth to prevent overlaps between components.
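One possible shape for such a registry is sketched below in Python. The class `GribCodeRegistry` and the component label `"lpjg"` are hypothetical and do not exist in EC-Earth. The code assignments are the ones listed under Resolution above.

```python
class GribCodeRegistry:
    """Minimal in-memory registry mapping GRIB codes to their owners."""

    def __init__(self):
        self._owners = {}  # code -> (variable, component)

    def register(self, code, variable, component):
        """Claim `code` for `variable`; refuse if it is already taken."""
        if code in self._owners:
            taken_var, taken_comp = self._owners[code]
            raise ValueError(
                f"GRIB code {code} already used by {taken_var} ({taken_comp})"
            )
        self._owners[code] = (variable, component)

    def owner(self, code):
        """Return (variable, component) for `code`, or None if unclaimed."""
        return self._owners.get(code)


registry = GribCodeRegistry()
# Assignments taken from the Resolution section; "lpjg" is a made-up label.
for name, code in [
    ("co2of", 210067),
    ("co2nbf", 210068),
    ("co2apf", 210069),
    ("co2fire", 210080),
    ("tcco2", 210064),
]:
    registry.register(code, name, "lpjg")
```

A second attempt to register one of these codes raises `ValueError`, so an overlap is reported when the registry is built instead of at model runtime. In practice the registry would more likely be a shared file under version control than an in-memory object.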
Differences in OpenIFS 48r1
A new cycle of the OpenIFS atmosphere model (48r1) is introduced in EC-Earth v4.1.0. Here follows a list of changes to pay attention to:
Initial data, ece-4-data-base, has changed as 48r1 needs different files than older versions. Make sure you have the latest data.
Snow now has multiple layers and thus several snow fields are 3D.
Many necessary libraries, e.g. ecCodes, are downloaded to sources/oifs-48r1/external and compiled with the model. Compile times may thus be longer.