The CESM team has developed a post-processing diagnostics tool using Python. This page describes how to use it on chemistry output on the NCAR HPC, Cheyenne. At least a full year of data is needed, with two months before and two months after that year. For example, to process 2004 you need output from November 1, 2003 to March 1, 2005. More information is at: https://github.com/NCAR/CESM_postprocessing/wiki, and the Quick Start Guide is here: https://github.com/NCAR/CESM_postprocessing/wiki/cheyenne-and-geyser-quick-start-guide
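The "two months either side" requirement can be sketched as a small helper. This is purely illustrative (the function name and code are not part of the CESM tool); it just computes the data window needed for a given analysis year:

```python
from datetime import date

def required_data_window(year):
    """Return the (start, end) dates of model output needed to
    post-process a given analysis year: the tool wants two months
    of data before and two months after the full year.
    Illustrative helper only, not part of CESM_postprocessing."""
    start = date(year - 1, 11, 1)  # two months before Jan 1 of the analysis year
    end = date(year + 1, 3, 1)     # two months after Dec 31 of the analysis year
    return start, end

start, end = required_data_window(2004)
print(start, end)  # 2003-11-01 2005-03-01
```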
This automated diagnostic tool is useful for understanding overarching features of the simulation by comparing with a default set of observations and/or a previous simulation. It can also be used to produce timeseries (see Section 5 below) for long simulations, reducing the space taken by output files by compressing the data; this is highly recommended by the CAM-chem team. Also note: only keep output that you really need for your science!
> cd <case_dir>
> cesm_pp_activate    (opens the virtual environment)
[(NPL) ] > create_postprocess --caseroot <case_dir>
[(NPL) ] > deactivate    (closes the virtual environment)
> cd /glade/scratch/<user>/post_processing/
> cesm_pp_activate    (opens the virtual environment)
[(NPL) ] > create_postprocess --caseroot /glade/scratch/<user>/post_processing/<model-run>
[(NPL) ] > deactivate    (closes the virtual environment)
If you get the SUCCESS! notification, the <model-run> folder has been created in the post_processing location and analysis code has been added. Note: the <model-run> has to be the same name as your run folder.
Within the 'postprocess' directory (that was created in step #2), edit the scripts.
> ls *xml
Either use pp_config (like xmlchange) or edit the following files directly in an editor.
env_postprocess.xml
If post-processing is occurring somewhere other than in <case_dir>, set the location of the model data:
> ./pp_config --set DOUT_S_ROOT=<full archive path of model run output to be analyzed>
Example:
> ./pp_config --set DOUT_S_ROOT=/gpfs/fs1/scratch/<user>/archive/<model-run>
Note: do not add slashes to the end of the path.
Tell the diagnostics what kind of grid to expect. For example, for the 0.9x1.25 degree resolution:
> ./pp_config --set ATM_GRID=0.9x1.25
> ./pp_config --set ICE_NX=288
> ./pp_config --set ICE_NY=192
> ./pp_config --set LND_GRID=0.9x1.25
Other changes:
<entry id="GENERATE_TIMESERIES" value="FALSE" />
Set this to "TRUE" if you want to generate timeseries (for longer runs).
env_diags_atm.xml
Set up to compare with another model run.
<entry id="ATMDIAG_MODEL_VS_OBS" value="False" />
<entry id="ATMDIAG_MODEL_VS_MODEL" value="True" />
<entry id="ATMDIAG_CLEANUP_FILES" value="True" />
Test dataset (the run you want to analyze)
<entry id="ATMDIAG_test_compute_climo" value="True" />
<entry id="ATMDIAG_test_compute_zonalAvg" value="True" />
Control dataset (the run you want to compare with)
<entry id="ATMDIAG_cntl_casename" value="<cntr_case_name>" />
<entry id="ATMDIAG_cntl_path_history" value="<path-to-comparison-output-on-archive>" />
<entry id="ATMDIAG_cntl_compute_climo" value="True" />
<entry id="ATMDIAG_cntl_compute_zonalAvg" value="True" />
Time period of analysis for the test and control cases: minimum 1 year, and you need output for 2 months either side of the full year to be analyzed.
<entry id="ATMDIAG_test_first_yr" value="2014" />
<entry id="ATMDIAG_test_nyrs" value="1" />
<entry id="ATMDIAG_cntl_first_yr" value="2014" />
<entry id="ATMDIAG_cntl_nyrs" value="1" />
Other diagnostic variables to set
<entry id="ATMDIAG_strip_off_vars" value="False" />
<entry id="ATMDIAG_netcdf_format" value="netcdfLarge" />
Diagnostic sets
<entry id="ATMDIAG_all_chem_sets" value="False" />
Then set the chem sets to True manually, except for chem set #6 (this one takes a long time).
Note 1: Chemistry diagnostic set 2 (Cset2) will only be calculated when performing a model-model comparison.
Note 2: To ensure all seasons are calculated, make sure...
In the atm_averages and atm_diagnostics files, make sure the #PBS account flag is set:
#PBS -A <account_number>
> qsub atm_averages
This calculates the climatological values for the test and control cases (~40 mins for 5 years); check the log files in the logs folder.
Find climo files in: $DOUT_S_ROOT/atm/proc/climo/$ATMDIAG_test_casename/ and: $DOUT_S_ROOT/atm/proc/climo/$ATMDIAG_cntl_casename/
> qsub atm_diagnostics
If the instructions above are followed, this step calculates model-versus-model values from the climo data created in step 4a) and creates diagnostic output (~10 mins for 5 years); check the log files in the logs folder.
Find diagnostic files in: $DOUT_S_ROOT/atm/proc/diag/$ATMDIAG_test_casename-$ATMDIAG_cntl_casename
To visualize the output, open index.html in a web browser.
Timeseries production with this tool can be completed for selected or all output streams (e.g. atm, ocn). The user can specify whether to write out one large timeseries for the entire run (for example a full 100 years) or instead produce files containing smaller time-chunks (e.g. of 10 or 20 years). A time-chunk length (even 100 years) always has to be defined.
env_postprocess.xml
To generate any timeseries:
<entry id="GENERATE_TIMESERIES" value="TRUE" />
Then, determine whether to allow partial time-chunks to be produced, or only full timeseries chunks.
<entry id="TIMESERIES_COMPLETECHUNK" value="FALSE" />
If this is set to "TRUE", years that are not contained in a complete time-chunk will not be processed. Setting this to "FALSE" therefore makes sure all your output gets processed.
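How chunking interacts with TIMESERIES_COMPLETECHUNK can be sketched in a few lines of Python. This is an illustrative model of the behavior described above, not the tool's actual code, and the function name is invented:

```python
def split_into_chunks(first_year, last_year, chunk_len, complete_chunks_only):
    """Group run years into timeseries chunks of chunk_len years.
    With complete_chunks_only=True, a trailing partial chunk is dropped,
    mirroring TIMESERIES_COMPLETECHUNK=TRUE (those years go unprocessed).
    Illustrative sketch only, not CESM_postprocessing code."""
    years = list(range(first_year, last_year + 1))
    chunks = [years[i:i + chunk_len] for i in range(0, len(years), chunk_len)]
    if complete_chunks_only and chunks and len(chunks[-1]) < chunk_len:
        chunks.pop()  # the trailing partial chunk is not produced
    return chunks

# A 25-year run (2000-2024) in 10-year chunks:
print(split_into_chunks(2000, 2024, 10, complete_chunks_only=False))
# three chunks; the last holds only the 5 years 2020-2024
print(split_into_chunks(2000, 2024, 10, complete_chunks_only=True))
# two chunks; 2020-2024 would not be processed
```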
The default setting is that all timeseries are generated. To allow user control over this, set:
<entry id="TIMESERIES_GENERATE_ALL" value="FALSE" />
env_timeseries.xml
Turn on the output streams you wish to process. For example, to process the atmosphere output stream:
<comp_archive_spec name="cam">
  <rootdir>atm</rootdir>
  <multi_instance>True</multi_instance>
To not process this output stream, you have to set:
<multi_instance>False</multi_instance>
Next, you have to define the length of each timeseries chunk for all the output streams you wish to process. For example, processing the atmosphere monthly output in 10-year time-chunks looks like:
<file_extension suffix=".h0.[0-9]">
  <subdir>hist</subdir>
  <tseries_create>TRUE</tseries_create>
  <tseries_output_format>netcdf4c</tseries_output_format>
  <tseries_tper>month_1</tseries_tper>
  <tseries_filecat_tper>years</tseries_filecat_tper>
  <tseries_filecat_n>10</tseries_filecat_n>
</file_extension>
> qsub timeseries
As for the other scripts, you need to make sure your project number is correctly set (#PBS -A <account_number>). If you have a lot of output, you may have to increase your PE layout with more cores to finish. Usually, atm monthly data are processed very fast. Daily and sub-daily output can take a bit longer depending on the variables you are using.