Getting familiar with the XCES Platform
The XCES platform is an instance of the Free Evaluation System Framework (Freva), designed by the Freva team and adapted for the scientists of the ClimXtreme project by Module D. Below you will find relevant information on how to get started with the system.
The system can be used through the shell, the website, or its Python module (e.g. for Jupyter). We will give an overview of the capabilities of XCES on all platforms and of the steps needed to set up a working environment in the shell, which allows us to share data with the rest of the ClimXtreme community or to test our plugins during development.
Setting up a working environment
Step 1: DKRZ account & project group
To access XCES you will need a DKRZ account. If you do not have one yet, create one through DKRZ's LUV.
In order to use the system you need to be a member of group bm1159. To join, go to https://luv.dkrz.de (click join existing project) and join bm1159. Please briefly mention which module and project you are in.
Note: group bm1159 is used for anything related to XCES/Freva, while bb1152 is for all Module A, B, C activities unrelated to XCES.
Note2: in the transition to Levante the former group bmx825 with the old Miklipers disappeared along with the MiKlip machine, so only bm1159 is valid from now on. All data and outputs have been relocated.
Step 2: structure of the folders @ bm1159
At DKRZ you have a home/, a scratch/ and a project work/ directory (general info):
- home/: its location depends on your username (<username> == {a,b,g,k,m,u} + 6 digits): /home/{a,b,g,k,m,u,zmaw}/{a,b,g,k,m,u,zmaw}######. For example, for user k204229 it is /home/k/k204229.
- scratch/: like home/, it also depends on the username: /scratch/{a,b,g,k,m,u,zmaw}/{a,b,g,k,m,u,zmaw}######. For example, /scratch/k/k204229.
- work/: the workspaces of each project. For all Freva-ClimXtreme users it is /work/bm1159/XCES/xces-work/<username>/.
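The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name `xces_directories` is hypothetical, and the pattern assumes the regular "one letter plus six digits" user names described above (the `zmaw` prefix is not handled).

```python
import re
from pathlib import PurePosixPath

# Assumption: regular DKRZ user names are one of the letters a, b, g, k, m, u
# followed by six digits, as described in the list above.
USERNAME_PATTERN = re.compile(r"[abgkmu]\d{6}")


def xces_directories(username: str) -> dict:
    """Return the expected home, scratch and XCES work paths for a user name."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError(f"unexpected user name format: {username!r}")
    prefix = username[0]  # home/ and scratch/ are grouped by the first letter
    return {
        "home": PurePosixPath("/home") / prefix / username,
        "scratch": PurePosixPath("/scratch") / prefix / username,
        "work": PurePosixPath("/work/bm1159/XCES/xces-work") / username,
    }


for name, path in xces_directories("k204229").items():
    print(f"{name:8s} {path}")
```

The paths are only computed, never created, so the sketch can be run on any machine.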
XCES will create certain working directories under /work/bm1159/XCES/xces-work/ for you if you run a plugin job via the web, although it will create neither the symlinks nor all of the folders described below. If you never intend to work directly on Levante, this might be enough.
However, for better use of the storage we recommend logging in via the shell and loading the xces/freva module on Levante:
$ ssh <username>@levante.dkrz.de
$ module load clint xces
Loading the module automatically creates the whole folder structure below for you (shown here for $USER=k204229), with some symlinks that keep the structure cleaner and lighter:
$ ls -lrth /work/bm1159/XCES/xces-work/$USER
total 8.0K
drwxr-sr-x.  7 k204229 bm1159 4.0K Jul 14 10:27 evaluation_system
drwxr-sr-x. 11 k204229 bm1159 4.0K Jul 14 10:30 MYWORK
- Use evaluation_system for anything directly related to XCES, i.e. data to be indexed into XCES, job outputs (data files, plots, configurations), and/or folders where your plugin(s) are being developed/tested. It contains the following folders:
$ cd /work/bm1159/XCES/xces-work/k204229/evaluation_system
$ ls -lhF
total 20K
drwxr-sr-x.  6 k204229 bm1159 4.0K Jun 22 19:01 CMOR4LINK/ <--- place the data you want to index in XCES here
drwxr-sr-x.  3 k204229 bm1159 4.0K Nov 20  2020 config/ <--- the configuration files of job runs will be stored here
drwxr-sr-x. 27 k204229 bm1159 4.0K Jul 13 15:48 output/ <--- the output folders of the job runs will be stored here
drwxr-sr-x.  6 k204229 bm1159 4.0K Dec  8  2021 plots/ <--- the plots of the job runs can be stored here
drwxr-sr-x.  5 k204229 bm1159 4.0K May 18 09:00 plugins/ <--- folder where to place your plugin (tool) repositories for dev/testing
lrwxrwxrwx.  1 k204229 bm1159   29 Jan 25 17:15 cache -> /scratch/k/k204229/xces-cache/ <--- place where the temporary files are placed
The data you index into Freva will be placed under /work/bm1159/XCES/xces-work/data/crawl_my_data/user-$USER, which should be a symbolic link to /work/bm1159/XCES/xces-work/$USER/evaluation_system/CMOR4LINK/.
As you can see, the cache folder (for the temporary calculations of your plugin runs) is a symlink to your scratch directory. To create it by hand (replace k with the first letter of your username):
$ mkdir -p /scratch/k/$USER/xces-cache
$ ln -s /scratch/k/$USER/xces-cache /work/bm1159/XCES/xces-work/$USER/evaluation_system/cache
Note: the scratch folder provides temporary storage for processing large data. To prevent the cache outputs from taking up a significant chunk of project space (bear in mind that on the MiKlip machine the amount of temporary files far exceeded 35TB!), files that have not been accessed (i.e. created or modified) for 14 days are deleted automatically.
- Use MYWORK for the rest of your project work, i.e. other data and/or analyses.
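Because of the 14-day clean-up on scratch described in the note above, it can be handy to see which cache files are at risk before they disappear. The following is a minimal sketch, not an official tool: the helper name `files_older_than` is hypothetical, and it assumes that the modification time is a reasonable proxy for the criterion DKRZ actually applies.

```python
import time
from pathlib import Path


def files_older_than(directory, days=14, now=None):
    """Return all files below `directory` last modified more than `days` days ago.

    Assumption: the modification time (st_mtime) stands in for whatever
    timestamp the real clean-up job on scratch uses.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds per day
    return sorted(
        path
        for path in Path(directory).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


# Example call (the path follows the cache symlink target shown above):
# for path in files_older_than("/scratch/k/k204229/xces-cache"):
#     print(path)
```

Anything you still need from that list should be copied to your work directory.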
Step 3: make sure that the folders have the right group permission
If you want to plug in your own plugins and, more generally, make sure that XCES can see all your folders without problems, please make sure that your workspace belongs to the bm1159 Linux group:
$ cd /work/bm1159/XCES/xces-work/
$ chgrp -R bm1159 <username>
If that is the case, the website should show the following when you click the Plug-my-plugin button in the Plugins section: [screenshot]
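To verify the group ownership from Step 3 before trying the website, a short check along the following lines may help. It is a sketch under assumptions: the function name `paths_not_in_group` is hypothetical, and it only inspects the group ID of each entry without changing anything.

```python
import os


def paths_not_in_group(root, gid):
    """List `root` and every entry below it whose group ID differs from `gid`."""
    root = str(root)
    candidates = [root]
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    # lstat inspects a symlink itself rather than its target.
    return [path for path in candidates if os.lstat(path).st_gid != gid]


# Example on Levante (grp is part of the standard library on Unix):
# import grp
# gid = grp.getgrnam("bm1159").gr_gid
# for path in paths_not_in_group("/work/bm1159/XCES/xces-work/k204229", gid):
#     print(path)
```

An empty result means that every entry already belongs to the group; otherwise the chgrp command above fixes the listed paths.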
Access through the shell
To use the XCES system from the shell, connect via ssh (replace <username> with your DKRZ user name):
$ ssh -X <username>@levante.dkrz.de
The -X option enables X11 forwarding, so that graphical windows opened on Levante (e.g. to display images) are shown on your local machine.
Start setting up the environment by loading the proper module (a list of common module commands can be found here):
$ module load clint xces
This activates the system for your current session. You might notice some other modules have been loaded.
Note: unlike with the old version, the new version of Freva works with bash, csh, zsh and fish.
DEVELOPERS note: loading the module also sets other required environment variables, for example so that python resolves to /home/b/b380001/freva/bin/python (Python 3.10.5), and makes conda and mamba available. These package and environment managers can be very handy when, for example, you need to install specific libraries or compilers for tools you are developing (not a must, but recommended).
Working with Freva
Freva is an all-in-one framework. Its main features are listed by freva --help:
$ freva --help
usage: freva [-h] [-V] {esgf,user-data,plugin,history,databrowser} ...

Free EVAluation system framework (freva)

positional arguments:
  {esgf,user-data,plugin,history,databrowser}
                        Available sub-commands:
    esgf                Search/Download ESGF the data catalogue.
    user-data           Update users project data
    plugin              Apply data analysis plugin.
    history             Read the plugin application history.
    databrowser         Find data in the system.

options:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit

To get help for the individual sub-commands use:
        freva <sub-command> --help
This is the main tool for the evaluation system:
- Usage: freva COMMAND [OPTIONS]
- To get help for the individual commands: freva COMMAND --help
freva Option | Description | Example |
---|---|---|
plugin | apply some analysis tool | freva plugin pca input=myfile.nc outputdir=/tmp variable=ta |
history | browse your history | freva history or freva history --plugin movieplotter |
databrowser | Search CMIP5, MIKLIP or your data with the fast auto-completion browser. If you are in bash, try hitting tab twice whenever you want to input a new attribute or value for the search, you'll see all possible values listed. | freva databrowser project=baseline1 variable=tas time_frequency=mon ensemble=r[1-5]i1p1 experiment=*196[0-5] |
user-data (formerly crawl_my_data) | Put your own data into the database, for when you have your own small data collection. It also allows you to clean your indexed datasets or CMORize your data on the fly | freva user-data index crawl_dir /work/bm1159/XCES/xces-work/data/crawl_my_data/user-<username>/some/data |
esgf | Contact ESGF to query for dataset files and retrieve the wget script used to download them | freva esgf --download-script /tmp/download.wget variable=tas time_frequency=mon ensemble=r1i1p1 experiment=decadal1965 |
All commands have a -h or --help flag to display the command's help.
Basic commands for freva:
Task | Command |
---|---|
To get the main help | freva --help |
To list the tools | freva plugin -l |
To get the tool documentation | freva plugin <sometool> --doc |
To see the history | freva history |
To see the history help | freva history --help |
Piping Freva to a simple workflow
Let's assume that we want to compute a Principal Component Analysis (PCA) of surface temperature (tas) over a series of ensemble members for a certain year (1960). We can use the PCA plugin and iterate over all ensemble members of the experiment started in 1960, computing the PCA of that variable for each of them. In bash this would be:
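#!/bin/bash
# Loop over every file the databrowser finds; the pattern is quoted so that
# the shell does not try to expand the * itself.
for file in $(freva databrowser project=baseline1 variable=tas time_frequency=mon experiment='*1960'); do
    # Run the PCA plugin on each file and name the result after the input file.
    freva plugin pca input="$file" outputdir=/tmp variable=tas pcafile="pca_$(basename "$file")"
done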
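The same workflow can also be driven from Python by calling the freva command line through `subprocess`. This is a sketch, not the Freva Python API: the helper names `pca_command` and `run_pca_over_search` are hypothetical, and the plugin options are simply those used in the bash loop above.

```python
import subprocess
from pathlib import PurePosixPath


def pca_command(input_file, outputdir="/tmp", variable="tas"):
    """Build the `freva plugin pca` call for one input file as an argument list."""
    name = PurePosixPath(input_file).name  # same role as `basename` in bash
    return [
        "freva", "plugin", "pca",
        f"input={input_file}",
        f"outputdir={outputdir}",
        f"variable={variable}",
        f"pcafile=pca_{name}",
    ]


def run_pca_over_search(search_args):
    """Run the PCA plugin on every file returned by `freva databrowser`."""
    listing = subprocess.run(
        ["freva", "databrowser", *search_args],
        capture_output=True, text=True, check=True,
    )
    for input_file in listing.stdout.splitlines():
        if input_file.strip():
            subprocess.run(pca_command(input_file.strip()), check=True)


# Example call (requires `module load clint xces` so that freva is on the PATH):
# run_pca_over_search(["project=baseline1", "variable=tas",
#                      "time_frequency=mon", "experiment=*1960"])
```

Passing the arguments as a list avoids the shell entirely, so the `*` in the experiment pattern reaches freva unchanged.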
In addition to the existing MIKLIP data, Module D will integrate further data sets. So far, hourly and daily data of the regional reanalysis COSMO-REA6 have been added, as well as daily data of the gridded data set HYRAS and selected station data of the DWD station network. Further information about the added data, where to find them, which parameters are integrated and what will be integrated next can be found here.
Access through the web
We may log into the system via https://www.xces.dkrz.de/. As with the shell version, we will also need to use our DKRZ account login.
Access through python module (pypi, jupyter)
To be able to use freva within Jupyter you must log on to the DKRZ Levante computing resources and load the xces module. After that you have to install your Jupyter kernel, for example:
$ ssh <username>@levante.dkrz.de
$ module load clint xces
$ jupyter-kernel-install python --name xces --display-name XCES
Then, when you launch e.g. a new notebook in your own JupyterHub session, the XCES kernel will be ready to use.
Incidentally, you can also install a Jupyter kernel based on freva's R environment like this:
$ ssh <username>@levante.dkrz.de
$ module load clint xces
$ jupyter-kernel-install r --name xces-r --display-name XCES-R
BUT be aware that there is currently NO API to call freva functionalities within R, so it will only work as an additional R kernel, nothing more.
Using XCES/Freva R/Python environment to locally install your own packages
XCES/Freva ships with its own R and Python environments, which already include a basic set of libraries. These environments can be used out of the box. To check the R/Python versions:
$ module load clint xces
$ which R;R --version
/home/b/b380001/freva/envs/gnu-r/bin/R
R version 4.2.2 (2022-10-31) -- "Innocent and Trusting"
...
$ which python;python --version
/home/b/b380001/freva/bin/python
Python 3.10.5
These environments may lack libraries you require. In that case you can add your own libraries on top of the existing environment, without having to create a completely new one.
1. Installing Python packages with pip
First load xces and its own python environment:
$ module load clint xces
You can check whether your needed package is already pre-installed with:
$ python -c "import your_package"
If you see an error such as ModuleNotFoundError: No module named 'your_package', the package is missing and you may want to install it locally. If the command prints nothing, the package is already there.
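If you would rather check for a package without actually importing it (imports of large libraries can be slow), the standard library offers `importlib.util.find_spec`. A small sketch follows; the helper name `is_installed` is ours, and it is meant for top-level package names only.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Check a few names; replace them with the packages you need.
for name in ("json", "your_package"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```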
To install the Python package, type:
$ pip install your_package --user
The package will then be downloaded, built (including its dependencies) and installed under $HOME/.local/lib/python3.X/site-packages.
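To see the exact directory that `pip install --user` targets for the interpreter you are running, you can ask the standard `site` module. This is only a quick diagnostic sketch, and the variable names are illustrative.

```python
import site
import sys

# Directory used for per-user installations by this interpreter.
user_site = site.getusersitepackages()
print("user site-packages:", user_site)

# Packages installed there can only be imported if the directory is on sys.path.
print("on sys.path:", user_site in sys.path)
```

If the second line prints False, the directory may not exist yet (before the first user install) or user site-packages may be disabled for this interpreter.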
NOTE: conda/mamba does not offer anything similar, so if the package you want to add can only be installed via conda, you will need to create your own environment.
2. Installing R packages locally
Say you want to install e.g. the MBC (Multivariate Bias Correction) package from CRAN. Simply follow these steps:
$ module load clint xces
$ R
> install.packages("MBC")
Warning in install.packages("MBC") :
  'lib = "/home/b/b380001/freva/envs/gnu-r/lib/R/library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel) yes   <---
Would you like to create a personal library
‘/home/k/$USER/R/x86_64-conda-linux-gnu-library/4.2’
to install packages into? (yes/No/cancel) yes   <---
--- Please select a CRAN mirror for use in this session ---
...
The package is now installed locally. Your packages are installed in your own ~/R/x86_64-conda-linux-gnu-library/4.2 (or the corresponding directory for whatever R version is currently installed).
More information, e.g. on how to build packages locally, can be found here (some of the text above was adapted from there).
Creating your own conda/mamba environment with the freva library
If, on the other hand, you want to use freva as a Python library alongside others (e.g. in a conda environment), you can:
- install the package via PyPI:
$ pip install freva
or via conda/mamba:
$ mamba install -c conda-forge freva
- add the environment config of XCES in your script, e.g.:
import os
# Point freva to the XCES configuration before importing it.
os.environ["EVALUATION_SYSTEM_CONFIG_FILE"] = "/work/bm1159/XCES/freva/evaluation_system.conf"
os.environ["EVALUATION_SYSTEM_CONFIG_DIR"] = "/work/bm1159/XCES/freva"
import freva
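Since these paths only exist on the DKRZ file system, a script that may also run elsewhere can check for the configuration file first. The following is a minimal sketch; the helper name `configure_xces` is hypothetical, and it only sets the two environment variables shown above.

```python
import os
from pathlib import Path


def configure_xces(config_dir="/work/bm1159/XCES/freva"):
    """Set the XCES environment variables; return True if the config file exists."""
    config_dir = Path(config_dir)
    config_file = config_dir / "evaluation_system.conf"
    os.environ["EVALUATION_SYSTEM_CONFIG_FILE"] = str(config_file)
    os.environ["EVALUATION_SYSTEM_CONFIG_DIR"] = str(config_dir)
    return config_file.is_file()


if configure_xces():
    print("XCES configuration found; freva can be imported now.")
else:
    print("XCES configuration not found; are you on Levante?")
```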
Additional info
General usage information on Freva can be found here, including some Frequently Asked Questions.
The HELP page of XCES has a lot of important links, as well, including a little tour around the webpage!
If you have questions, please contact your Module coordinator (research related), Deborah Niermann at Deborah.Niermann@dwd.de (data related), or Etor Lucio at lucio-eceiza@dkrz.de (software related).