Online monitoring of experiments
The ScriptEngine runtime environment allows for online monitoring of experiments. The monitoring system can collect data from ongoing model runs and compute metrics that allow for a quick overview and assessment of running experiments. The computed metrics are stored as NetCDF files in the run directory and are available for manual evaluation. In addition, the system can produce plots from the monitoring metrics and either store them locally on disk or upload them automatically and continuously to the EC-Earth Redmine Portal. The monitoring system is controlled by configuration parameters in the ScriptEngine runtime scripts.
Basic monitoring configuration
The monitoring system is activated by setting experiment.monitoring.activate to true in the runtime scripts.
This setting is typically found in the experiment configuration script.
base.context:
    experiment:
        monitoring:
            activate: true
Monitoring can be activated for experiments that are already running, but the monitoring metrics will only cover those experiment legs for which monitoring was switched on at the start of the leg. Switching monitoring off and back on again later during the experiment is not supported.
The script that controls online monitoring is found in runtime/se/scriptlib/monitoring.yml.
Monitoring metrics
If the monitoring system is activated, it will run at the end of each leg and
produce monitoring data in {{experiment.run_dir}}/monitor.
For most monitoring metrics, the data will be stored in NetCDF files.
The data in these files are complemented by detailed metadata attributes and
should therefore be usable with most evaluation tools.
If presentation of the monitoring metrics at the Redmine Portal is not
configured (see next section), the monitoring system will produce a Markdown
page including plots in the {{experiment.run_dir}}/monitor/markdown directory.
This can be used for manual inspection or for manual uploading to a website that
supports Markdown.
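For manual evaluation, the NetCDF files in the monitoring directory first have to be located. The following is a minimal Python sketch of that step, assuming only that the files carry the usual .nc extension. The function name list_monitoring_files and the example path are hypothetical and not part of the runtime environment.

```python
from pathlib import Path


def list_monitoring_files(run_dir):
    """Return the NetCDF files below <run_dir>/monitor, sorted by path.

    Assumption: monitoring files use the '.nc' extension.
    """
    monitor_dir = Path(run_dir) / "monitor"
    if not monitor_dir.is_dir():
        # Monitoring is not activated, or no leg has finished yet
        return []
    return sorted(p for p in monitor_dir.rglob("*.nc") if p.is_file())


if __name__ == "__main__":
    # Hypothetical run directory; use the value of experiment.run_dir instead
    for path in list_monitoring_files("/path/to/rundir"):
        print(path)
```

The returned paths can then be opened with any NetCDF-capable tool to inspect the metrics and their metadata attributes.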
List of monitoring metrics
General and technical metrics
Experiment id, description and notes
Number of simulated years
Disk usage
Simulation speed [Simulated years per day]
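Simulation speed in simulated years per day is the ratio of simulated time to the wall-clock time spent on it. The sketch below illustrates that ratio only. How the monitoring system itself measures the two quantities is not described here, and the function name and the 365-day year are assumptions made for the example.

```python
SECONDS_PER_DAY = 86400.0


def simulated_years_per_day(simulated_days, wallclock_seconds, days_per_year=365.0):
    """Return the simulation speed in simulated years per wall-clock day.

    Assumption: a fixed year length (default 365 days) is good enough
    for a quick estimate.
    """
    if wallclock_seconds <= 0:
        raise ValueError("wallclock_seconds must be positive")
    simulated_years = simulated_days / days_per_year
    wallclock_days = wallclock_seconds / SECONDS_PER_DAY
    return simulated_years / wallclock_days


if __name__ == "__main__":
    # One simulated year computed in six wall-clock hours
    print(simulated_years_per_day(365, 6 * 3600))
```

For example, a one-year leg that completes in six hours of wall-clock time corresponds to four simulated years per day.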
Atmosphere metrics
Near-surface air temperature
Sea-level pressure
Precipitation - evaporation
Precipitation
Evaporation including sublimation and transpiration
Net TOA
Net SFC
Net TOA-SFC
Surface net solar radiation
Surface net thermal radiation
Top net solar radiation
Top net thermal radiation
Total cloud cover
Ocean metrics
Sea surface temperature
Sea surface salinity
Sea surface height
Ocean temperature (3D)
Ocean salinity (3D)
Sea-ice metrics
Sea-ice area/fraction March/September northern hemisphere
Sea-ice area/fraction September/March southern hemisphere
Presentation on Redmine Portal
Besides storing the monitoring metrics locally on disk, the system can create plots and upload them to the EC-Earth Redmine Portal. In this way, the status of the ongoing experiment is published immediately and can be shared continuously among EC-Earth users.
In order to access the EC-Earth Redmine Portal for the automatic publication of results, the monitoring scripts need to authenticate to Redmine. For this, a Redmine API key is needed. The API key can be created by clicking on the “My account” link in the upper-right corner of any Redmine Portal page:
and then creating the API key by clicking on the link in the menu on the left of the page:
Once the Redmine API key is created (click “Show” in order to see and copy it),
the key string needs to be copied to {{experiment.monitoring.redmine_api_key}},
usually defined in the user configuration script of the runtime environment:
base.context:
    experiment:
        monitoring:
            redmine_api_key: 57d49f05da3e606b4000ce4895597ac7a52197bc
Note
The key in the above example is just a random string, not a real Redmine API key!
Warning
Although the Redmine API key is not the same as the user’s Redmine Portal password, it is still important to keep it secret! In particular, make sure to not share any files that contain the key! Be careful when committing files to Git and keep appropriate permissions!
Any person who learns the API key of a user can access Redmine using that user’s identity. If you suspect that your Redmine API key has been shared, reset the key immediately in the Redmine Portal!
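One way to follow the advice on file permissions is to check that the script holding the key is accessible to its owner only. The following Python sketch does this for POSIX file systems. The function name is_private and the file name in the usage example are hypothetical, and the check says nothing about copies of the key in Git history or backups.

```python
import os
import stat


def is_private(path):
    """Return True if group and others have no permissions on the file.

    Assumption: a POSIX file system with classic owner/group/other mode bits.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Hypothetical name of the user configuration script
    config_file = "user-config.yml"
    if os.path.exists(config_file) and not is_private(config_file):
        print(f"{config_file} is accessible by other users; consider chmod 600")
```

A file with mode 600 passes the check, whereas the common default of 644 does not.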
If a valid Redmine API key has been configured (and experiment.monitoring.activate
is true), then the monitoring system creates an issue under the “EC-Earth
experiments” project at the EC-Earth Redmine Portal at the end of the first leg.
The title of the issue will be constructed as
{{experiment.id}}: {{experiment.description}} and the status will initially be ongoing.
This is an example:
The issue will be updated automatically after each leg, and data will be added to the time series and other plots. As further examples, here are the sea-level pressure time series after 30 model years:
and the map of average sea-level pressure:
For the September sea-ice concentration, an animated map is produced:
Because the monitoring of ongoing experiments creates regular Redmine issues, it is possible to interact on the page just as for any other issue, e.g. engaging in discussions about the experiment by adding comments.
Warning
Because the monitoring system will add information to the EC-Earth Redmine Portal automatically, some care has to be taken. It is advised to switch on the Redmine feature of the monitoring system only with the intention to share the results.
It is possible to switch on the Redmine upload of monitoring results later during the experiment run. As long as monitoring (without Redmine) has been active before, data will be generated and uploaded at the end of the next leg.
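The interplay of the two configuration parameters described on this page can be summarised in a small Python sketch. It is an illustration only: the actual logic lives in the ScriptEngine monitoring script, the function name presentation_target is hypothetical, and the sketch treats any non-empty key as configured without checking that it is valid.

```python
def presentation_target(config):
    """Return where monitoring results would be presented.

    'none'     : monitoring is not activated
    'redmine'  : monitoring is activated and an API key is configured
    'markdown' : monitoring is activated without an API key (local page)

    Assumption: config is a nested dict mirroring the YAML context.
    """
    monitoring = config.get("experiment", {}).get("monitoring", {})
    if not monitoring.get("activate", False):
        return "none"
    if monitoring.get("redmine_api_key"):
        return "redmine"
    return "markdown"


if __name__ == "__main__":
    context = {"experiment": {"monitoring": {"activate": True}}}
    print(presentation_target(context))
```

With only activate set to true, the result is the local Markdown page; adding a key switches the presentation to the Redmine Portal.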