====== Getting familiar with the XCES Platform ======

The XCES platform is an instance of the Free Evaluation System Framework (Freva), designed by the Freva team and adapted for the scientists of the ClimXtreme project by Module D. Below you will find the information you need to get started with the system. The system can be used through the [[basic_users_guide_xces_new#Access through the shell|shell]], the [[basic_users_guide_xces_new#Access through the web|website]], or [[basic_users_guide_xces_new#access_through_python_module_pypi_jupyter|its Python module (e.g. for Jupyter)]]. We will give an overview of the capabilities of XCES on all three interfaces, and of the steps needed to set up a working environment in the shell that allows us [[https://freva-clint.github.io/freva/FrevaCli.html#add-new-data-to-the-databrowser|to share data]] with the rest of the ClimXtreme community or to test our [[list_of_plugins|plugins]] while they are still in development.

===== Setting up a working environment =====

==== Step 1: DKRZ account & project group ====

To access XCES you need a DKRZ account. If you do not have one yet, create one [[https://luv.dkrz.de/projects/newuser/|through DKRZ's LUV]]. To use the system you must also be a member of the group ''bm1159'': go to https://luv.dkrz.de, click //join existing project// and join ''bm1159''. Please **briefly mention which module and project you are in.**\\ \\ 
**Note**: group ''bm1159'' is used for anything related to XCES/Freva, while ''bb1152'' is for all Module A, B, C activities not related to XCES.

**Note2**: in the transition to Levante, the former group ''bmx825'' of the old Miklipers disappeared along with the MiKlip machine, so only ''bm1159'' is valid from now on. All data and outputs have been relocated.
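Once the membership request has been approved, you can confirm on Levante that ''bm1159'' is active for your account by listing your groups (the shell command ''id -Gn'' does this). The same check as a minimal Python sketch, using only the standard library; the helper names ''group_names'' and ''in_group'' are illustrative and not part of Freva:

```python
import grp
import os


def group_names():
    """Return the names of all groups the current process belongs to."""
    names = set()
    # Supplementary groups plus the effective primary group.
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A numeric gid without a name entry: nothing to report.
            continue
    return names


def in_group(name="bm1159"):
    """True if the current account is a member of the given group."""
    return name in group_names()


if __name__ == "__main__":
    print("bm1159 active:", in_group("bm1159"))
```

A new membership may only show up after you log out and in again, because group lists are read at login.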
==== Step 2: structure of the folders @ bm1159 ====

At DKRZ you have a ''home/'', a ''scratch/'' and a project ''work/'' directory ([[https://docs.dkrz.de/doc/levante/file-systems.html?highlight=home%20scratch| general info]]):
  * ''home/'': the path depends on your username (''{a,b,g,k,m,u}'' + 6 digits) and follows the pattern ''/home/{a,b,g,k,m,u,zmaw}/{a,b,g,k,m,u,zmaw}%%######%%''. For user ''k204229'', for example, it is ''/home/k/k204229''.
  * ''scratch/'': like ''home/'', it depends on the username (''/scratch/{a,b,g,k,m,u,zmaw}/{a,b,g,k,m,u,zmaw}%%######%%''), for example ''/scratch/k/k204229''.
  * ''work/'': the workspaces of each project. For all Freva-ClimXtreme users it is ''/work/bm1159/XCES/xces-work//''.
\\ \\ 
XCES creates certain working directories under ''/work/bm1159/XCES/xces-work/'' for you when you run a plugin job via the web, but it creates neither the symlinks nor all of the folders described below. If you never intend to work directly on Levante, this might be enough. However, to make better use of the storage we recommend that you __**[[login_shell|log in via shell]]**__ and load the ''xces/freva'' module on Levante:

  $ ssh @levante.dkrz.de
  $ module load clint xces
\\ 
Loading the module automatically creates the whole folder structure for you, with some symlinks that keep the structure cleaner and lighter, e.g. (for ''$USER=k204229''):

  $ ls -lrth /work/bm1159/XCES/xces-work/$USER
  total 8.0K
  drwxr-sr-x.  7 k204229 bm1159 4.0K Jul 14 10:27 evaluation_system
  drwxr-sr-x. 11 k204229 bm1159 4.0K Jul 14 10:30 MYWORK

  * Use ''evaluation_system'' for anything directly related to XCES, i.e. data to be indexed into XCES, job outputs (data files, plots, configurations), and/or folders where your plugin(s) are being developed/tested. It contains the following folders:

  $ cd /work/bm1159/XCES/xces-work/k204229/evaluation_system
  $ lh
  total 20K
  drwxr-sr-x.  6 k204229 bm1159 4.0K Jun 22 19:01 CMOR4LINK/   <--- place the data you want to index in XCES here
  drwxr-sr-x.  3 k204229 bm1159 4.0K Nov 20  2020 config/      <--- the configuration files of job runs will be stored here
  drwxr-sr-x. 27 k204229 bm1159 4.0K Jul 13 15:48 output/      <--- the output folders of the job runs will be stored here
  drwxr-sr-x.  6 k204229 bm1159 4.0K Dec  8  2021 plots/       <--- the plots of the job runs can be stored here
  drwxr-sr-x.  5 k204229 bm1159 4.0K May 18 09:00 plugins/     <--- folder where to place your plugin (tool) repositories for dev/testing
  lrwxrwxrwx.  1 k204229 bm1159   29 Jan 25 17:15 cache -> /scratch/k/k204229/xces-cache/   <--- place where the temporary files are placed
\\ 
The data you index into Freva will be placed under ''/work/bm1159/XCES/xces-work/data/crawl_my_data/user-$USER'', which should be a symbolic link to ''/work/bm1159/XCES/xces-work/$USER/evaluation_system/CMOR4LINK/''.\\ \\ 
As you can see, the ''cache'' folder (the folder for temporary calculations of your plugin runs) is a symlink to your ''scratch'' directory. To create it (replace ''k'' with the first letter of your own username):

  $ mkdir -p /scratch/k/$USER/xces-cache
  $ ln -s /scratch/k/$USER/xces-cache /work/bm1159/XCES/xces-work/$USER/evaluation_system/cache

**Note:** the ''scratch'' folder provides temporary storage for processing large data. To prevent the cache outputs from taking up a significant chunk of project space (bear in mind that on the MiKlip machine the temporary files far exceeded 35TB!), old data is deleted automatically: a file is removed once it has not been accessed (i.e. created or modified) for 14 days.

  * Use ''MYWORK'' for the rest of your project work, i.e. other data and/or analyses.
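The ''cache'' symlink setup from this step can also be scripted. The following is a minimal sketch, assuming the "letter + 6 digits" username scheme and the paths named on this page; the function names ''user_paths'' and ''link_cache'' are hypothetical, and the demo runs in a temporary directory so nothing on Levante is touched:

```python
import os
import tempfile
from pathlib import Path

# Assumption: workspace root as named on this page.
XCES_WORK = Path("/work/bm1159/XCES/xces-work")


def user_paths(user, scratch_root=Path("/scratch"), work_root=XCES_WORK):
    """Return (cache target on scratch, cache symlink location) for a user."""
    # Scratch paths follow /scratch/<first letter>/<username>.
    cache_target = Path(scratch_root) / user[0] / user / "xces-cache"
    cache_link = Path(work_root) / user / "evaluation_system" / "cache"
    return cache_target, cache_link


def link_cache(cache_target, cache_link):
    """Mimic `mkdir -p` followed by `ln -s`; return True if a link was created."""
    cache_target.mkdir(parents=True, exist_ok=True)
    cache_link.parent.mkdir(parents=True, exist_ok=True)
    if cache_link.is_symlink() or cache_link.exists():
        return False  # leave an existing cache entry untouched
    cache_link.symlink_to(cache_target, target_is_directory=True)
    return True


if __name__ == "__main__":
    # Dry run against throw-away roots instead of /scratch and /work.
    with tempfile.TemporaryDirectory() as tmp:
        target, link = user_paths("k204229", Path(tmp) / "scratch", Path(tmp) / "work")
        created = link_cache(target, link)
        print(created, link, "->", os.readlink(link))
```

Calling ''link_cache(*user_paths(os.environ["USER"]))'' would apply the same steps to the real directories. The sketch does not handle ''zmaw'' style usernames.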
==== Step 3: make sure that the folders have the right group permission ====

If you want to plug in your own plugins and, in general, make sure that XCES can see all the folders you need, please **make sure your workspace uses the ''bm1159'' Linux group.**

  $ cd /work/bm1159/XCES/xces-work/
  $ chgrp -R bm1159

If that is the case, you should see the following on the website when you use the ''Plug-my-plugin'' button in the [[https://www.xces.dkrz.de/plugins/|Plugins section]]:

{{:public:plug_my_plugin_new.png?nolink|}}
\\ 

===== Access through the shell =====

To use the XCES system you may use [[login_shell|ssh]] (you may copy the following line as is into your shell):

  $ ssh -X @levante.dkrz.de

The ''-X'' flag enables X11 forwarding, so that graphical output from Levante (e.g. images) can be displayed on your local machine. Start setting up the environment by loading the proper module (a list of common ''module'' commands can be found [[module_commands|here]]):

  $ module load clint xces

This **activates** the system for your current session. You might notice that some other modules have been loaded as well.
\\ \\ 
**Note**: unlike the old version, the new version of Freva works with bash, csh, zsh and fish.

**DEVELOPERS note**: Freva also sets up other parts of the environment, for example making ''python'' resolve to ''/home/b/b380001/freva/bin/python'' (version ''3.10.5'') and making [[https://docs.conda.io/en/latest/|conda]] and [[https://github.com/mamba-org/mamba|mamba]] available. These package and environment management systems can be very handy when, for example, you need to install specific libraries or compilers [[https://freva-clint.github.io/freva/developers_guide.html|for tools you are developing]] (not a must, but recommended).

==== Working with Freva ====

Freva is an all-in-one framework. Its main features are listed by ''freva %%--%%help'':

  $ freva --help
  usage: freva [-h] [-V] {esgf,user-data,plugin,history,databrowser} ...
  Free EVAluation system framework (freva)
  positional arguments:
    {esgf,user-data,plugin,history,databrowser}
                          Available sub-commands:
      esgf                Search/Download ESGF the data catalogue.
      user-data           Update users project data
      plugin              Apply data analysis plugin.
      history             Read the plugin application history.
      databrowser         Find data in the system.
  options:
    -h, --help            show this help message and exit
    -V, --version         show program's version number and exit
  To get help for the individual sub-commands use:
      freva --help

This is the main tool for the evaluation system:
  * Usage: ''$ freva COMMAND [OPTIONS]''
  * To get help for the individual commands: ''$ freva COMMAND %%--%%help''
\\ \\ 
^ freva Option ^ Description ^ Example ^
| [[https://freva-clint.github.io/freva/FrevaCli.html#running-data-analysis-plugins-the-freva-plugin-command|plugin]] | Apply an analysis tool. | ''freva plugin pca input=myfile.nc outputdir=/tmp variable=ta'' |
| [[https://freva-clint.github.io/freva/FrevaCli.html#inspecting-previous-analysis-jobs-the-freva-history-command|history]] | Browse your history. | ''freva history'' or ''freva history %%--%%plugin movieplotter'' |
| [[https://freva-clint.github.io/freva/FrevaCli.html#searching-for-data-the-freva-databrowser-command|databrowser]] | Search CMIP5, MIKLIP or your own data with the fast auto-completion browser. If you are in bash, try hitting tab twice whenever you want to enter a new attribute or value for the search; all possible values will be listed. | ''freva databrowser project=baseline1 variable=tas time_frequency=mon ensemble=r[1-5]i1p1 experiment=*196[0-5]'' |
| [[https://freva-clint.github.io/freva/FrevaCli.html#managing-your-own-datasets-the-freva-user-data-command|user-data]] (former ''crawl_my_data'') | Add your own data to the database. It also allows you to clean your indexed datasets or CMORize your data on the fly. | ''freva user-data index crawl_dir /work/bm1159/XCES/xces-work/data/crawl_my_data/user-/some/data'' |
| [[https://freva-clint.github.io/freva/FrevaCli.html#searching-for-esgf-data-the-freva-esgf-command|esgf]] | Contact ESGF to query for dataset files and retrieve the wget script used to download them. | ''freva esgf %%--%%download-script /tmp/download.wget variable=tas time_frequency=mon ensemble=r1i1p1 experiment=decadal1965'' |

All commands have a ''-h'' or ''%%--%%help'' flag to display the command's help.\\ \\ \\ 
Basic commands for freva:\\ 
| To get the main help | ''freva %%--%%help'' |
| To list the tools | ''freva plugin -l'' |
| To get the tool documentation | ''freva plugin %%--%%doc'' |
| To see the history | ''freva history'' |
| To see the history help | ''freva history %%--%%help'' |

==== Piping Freva to a simple workflow ====

Let's assume that we want to compute a Principal Component Analysis of surface temperature (''tas'') over a series of ensemble members for a certain year (''1960''). We can use the [[https://www.xces.dkrz.de/plugins/pca/detail/|PCA plugin]] and run it once for every ensemble member of the experiment started in 1960. In bash this would be:

  #!/bin/bash
  for file in $(freva databrowser project=baseline1 variable=tas time_frequency=mon 'experiment=*1960'); do
      freva plugin pca input="$file" outputdir=/tmp variable=tas pcafile="pca_$(basename "$file")"
  done

In addition to the existing MIKLIP data, Module D will integrate further data sets. So far, hourly and daily data of the regional reanalysis COSMO-REA6 have been added, as well as daily data of the gridded data set HYRAS and selected station data of the DWD station network.
Further information about the added data, where to find them, which parameters are integrated and what will be integrated next can be found [[datasets_at_xces|here]].\\ \\ 

===== Access through the web =====

We may log into the system via [[https://www.xces.dkrz.de/]]. As with the shell version, we need to use our DKRZ account login.

===== Access through python module (pypi, jupyter) =====

To be able to use freva within [[https://climxtreme.uni-bonn.de/doku.php?id=connec2jupyterhub|jupyter]] you must log on to the DKRZ Levante computing resources and load the xces module. After that you have to install your Jupyter kernel. For example:

  % ssh @levante.dkrz.de
  $ module load clint xces
  $ jupyter-kernel-install python --name xces --display-name XCES

Then, when you launch e.g. a new notebook in your own JupyterHub session, you will see that kernel ready to be used:

{{ :public:jupyter_kernels.jpg?nolink&800 |}}

Incidentally, you can also install a Jupyter kernel based on freva's R environment like this:

  % ssh @levante.dkrz.de
  $ module load clint xces
  $ jupyter-kernel-install r --name xces-r --display-name XCES-R

**Be aware** that there is currently **no** API to call freva functionality from within R, so this will only work as an additional R kernel, nothing more.

===== Using XCES/Freva R/Python environment to locally install your own packages =====

XCES/Freva ships with its own R and Python environments, which already include a basic set of libraries. These environments can be used out of the box. To check the R/Python version, run:

  $ module load clint xces
  $ which R;R --version
  /home/b/b380001/freva/envs/gnu-r/bin/R
  R version 4.2.2 (2022-10-31) -- "Innocent and Trusting"
  ...
  [k204229@levante2 ~] $ which python;python --version
  /home/b/b380001/freva/bin/python
  Python 3.10.5

If these environments lack a library you need, you can add it on top of the existing environment without having to create a completely new one.

==== 1. Installing Python packages with pip ====

First load xces and its own Python environment:

  $ module load clint xces

You can check whether the package you need is already installed with:

  $ python -c "import your_package"

If you see an error such as ''ImportError: No module named '' you may want to install it locally. If the command prints nothing, the package is already there. To install the Python package, type:

  $ pip install your_package --user

The package will then be downloaded and built, together with its dependencies, and installed under ''$HOME/.local/lib/python3.X/site-packages''.

**NOTE**: conda/mamba offers nothing comparable, so if the package you want to add can only be installed via conda you will need to create your own environment.

==== 2. Installing R packages locally ====

Say you want to install e.g. the ''MBC'' [[https://cran.r-project.org/web/packages/MBC/index.html|Multivariate Bias Correction package from CRAN]]. Simply follow these steps:

  $ module load clint xces
  $ R
  > install.packages("MBC")
  Warning in install.packages("MBC") :
    'lib = "/home/b/b380001/freva/envs/gnu-r/lib/R/library"' is not writable
  Would you like to use a personal library instead? (yes/No/cancel) yes <----
  Would you like to create a personal library
  ‘/home/k/$USER/R/x86_64-conda-linux-gnu-library/4.2’
  to install packages into? (yes/No/cancel) yes <---
  --- Please select a CRAN mirror for use in this session ---
  ...

You successfully installed the package locally.
You can see that your packages will be installed in your own ''~/R/x86_64-conda-linux-gnu-library/4.2'' (or whichever R version is currently installed). More information, e.g. on how to build packages locally, can be found [[https://nesi.github.io/hpc_training/lessons/maui-and-mahuika/installing-packages-locally|here]]; some of the text above was adapted from there.

===== Creating your own conda/mamba environment with the freva library =====

If, on the other hand, you want to use freva as a Python library alongside others (e.g. in a conda environment), you can:
  - install the package via PyPI (''pip install freva'') or via conda/mamba (''mamba install -c conda-forge freva'')
  - add the environment config of XCES in your script, e.g.:

  import os
  # environment configuration of XCES, set before freva is imported
  os.environ["EVALUATION_SYSTEM_CONFIG_FILE"] = "/work/bm1159/XCES/freva/evaluation_system.conf"
  os.environ["EVALUATION_SYSTEM_CONFIG_DIR"] = "/work/bm1159/XCES/freva"
  import freva

===== Additional info =====

General usage information on Freva can be found [[https://freva-clint.github.io/freva/index.html|here]], including [[https://freva-clint.github.io/freva/FAQ.html|some Frequently Asked Questions.]] The [[https://www.xces.dkrz.de/plugins/about|HELP]] page of XCES also has a lot of important links, including a little tour around the webpage! \\ \\ \\ 
If you have questions, please contact your Module coordinator (research related), Deborah Niermann at Deborah.Niermann@dwd.de (data related), or Etor Lucio at lucio-eceiza@dkrz.de (software related).\\