Infrastructure Knowledge Base

This is the landing page for the Infrastructure team's knowledge base. The goal is to provide one centralized location for procedures, information, templates, troubleshooting, and FAQs. For now, we will try to sort information by overall subject.

Misc Short Topics

Content here is not extensive enough to have a dedicated topical page.

UCAR Hosted Physical Machines

Right now we have one physical machine hosted by UCAR in the Mesa Lab Data Center. See ticket HELP-36848 for more information on the setup process. This ticket can be re-opened to address fixes or permission changes on the device. Device passwords, user accounts, and SSH keys are managed by the JEDI Infra team.

Device power administration is done via the Morocco web tool. The tool can only be accessed from the UCAR VPN, and you must be granted permission for each machine. Use the "Sites" tab and navigate by datacenter → cabinet → pod to find the machine's control dashboard.

Device: JCSDA-PRECISION-TOWER-7875-U1
- OS: Ubuntu 24.04
- SSH Address: 192.168.19.115 (must be on UCAR VPN)
- Physical location: MLDC - cabinet S5 - POD 2

AKA the ML (or Mesa Lab) Computer

As of fall '25, we are using this machine to run an experimental Near-Real-Time (NRT) system called "obsbench". The system has three main components: ingest, model experiments, and post-processing. This section briefly documents those components and how they come together for our product. We use a shared obsbench user to create and run the NRT experiments. Files and configuration are stored in obsbench@ucar-precision-7875-tower:~/nrtdev. To set up the correct environment, run source ~/nrtdev/setup.sh, which provides a venv with cylc installed.

Ingest - observations and model ingest

Observation ingest
- Obs: gdas_prepbufr_ingest, ssec_amv_ingest, and gdas_amsua_ingest.yaml
- Configuration location: /home/obsbench/nrtdev/configs/ingest-observations.yaml
- Current workflow engine: cylc
- Runs continuously on a one-day delay, in 6-hour cycles

NGCM Backgrounds (F32 and F64)
- Files: F32 and F64 Analysis
- Configuration location(s): /home/obsbench/nrtdev/configs/ngcm-backgroundF32.yaml and /home/obsbench/nrtdev/configs/ngcm-backgroundF64.yaml
- Current workflow engine: cylc
- Runs continuously on a 7-day delay, once daily, for a 1-day cycle

Re-ingest of prepbufr
- Obs: gdas_prepbufr_ingest
- Configuration location: /home/obsbench/nrtdev/configs/reingestPrepbufr.yaml
- Current workflow engine: cylc
- Runs continuously on a two-day delay, in 6-hour cycles
- Note: this was added to pick up the updated 48h prepbufr files, which contain much more data than the NRT prepbufr files
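
The delays listed above can be made concrete with a little date arithmetic. The following is a minimal sketch, not part of the obsbench configuration: delayed_cycle is a hypothetical helper, and it assumes that "delay" means the newest cycle boundary (in UTC) that is at least that many hours old. It relies on GNU date, which Ubuntu provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the obsbench configs): print the newest
# cycle point that is at least <delay_hours> old, rounded down to a
# <cycle_hours> boundary in UTC.
# Usage: delayed_cycle <delay_hours> <cycle_hours> [now_epoch]
delayed_cycle() {
  local delay_hours=$1 cycle_hours=$2
  # Default to the current time; an explicit epoch makes the result repeatable.
  local now=${3:-$(date -u +%s)}
  local step=$(( cycle_hours * 3600 ))
  local target=$(( now - delay_hours * 3600 ))
  # Round down to the nearest cycle boundary.
  local aligned=$(( target - target % step ))
  date -u -d "@${aligned}" +%Y%m%dT%H00Z
}
```

Under that assumption, delayed_cycle 24 6 corresponds to the observation ingest, delayed_cycle 48 6 to the prepbufr re-ingest, and delayed_cycle 168 24 to the NGCM backgrounds.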

HofX/NGCM - Two experiments for F32 and F64

- Resolutions: low and medium
- Observations: radiosonde, aircraft, satwinds, amsua metop-b, amsua metop-c
- Configuration location(s): /home/obsbench/nrtdev/configs/ngcm-obsbenchF32.yaml and /home/obsbench/nrtdev/configs/ngcm-obsbenchF64.yaml
- Current workflow engine: cylc
- Runs continuously, once daily, on an 8-day delay (plus a one-hour delay for F32 and a two-hour delay for F64, due to memory constraints) for a 7-day forecast

To check the running experiments:

View experiments via cylc tui

ssh -Y <user.name>@precision-tower-7875.nrit.ucar.edu
sudo -iu obsbench     # Talk to Ashley, Evan, or Travis if you need to be added to the user group
source ~/nrtdev/setup.sh
cylc tui              # Use the arrow keys to open the experiments and navigate
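
For a quick, non-interactive check, cylc scan lists the workflows that are currently running. The wrapper below is a small sketch rather than an existing script: list_workflows is a hypothetical name, and the guard simply assumes that a missing cylc command means the environment has not been set up yet.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: list running workflows, with a hint if the
# environment from ~/nrtdev/setup.sh has not been loaded.
list_workflows() {
  if ! command -v cylc >/dev/null 2>&1; then
    echo "cylc not found; run 'source ~/nrtdev/setup.sh' first" >&2
    return 1
  fi
  # Prints one line per running workflow.
  cylc scan
}
```

Run it as the obsbench user after sourcing the setup script, the same as for cylc tui.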

Monitoring is set up under the obsbench user to check the currently running workflows for task failures. If a task fails, a Slack message is sent to the mvp channel. The monitoring script runs via cron after the daily experiments complete. Monitoring scripts and logs can be found at /home/obsbench/nrtdev/monitoring.
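
To illustrate the notification step, here is a rough sketch of how a list of failed tasks could be turned into a Slack message. It is not the script in /home/obsbench/nrtdev/monitoring. The function name notify_failures, the SLACK_WEBHOOK_URL variable, and the message wording are all assumptions, and how the failed tasks are collected is left out. It posts to a Slack incoming webhook, which accepts a JSON body with a "text" field.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual monitoring script.
# Reads failed task IDs (one per line) on stdin and reports them to Slack.
# SLACK_WEBHOOK_URL is an assumed environment variable holding an incoming
# webhook URL; when it is unset, the payload is printed instead of sent.
# Assumes workflow names and task IDs contain no quotes or backslashes.
notify_failures() {
  local workflow=$1 tasks payload
  tasks=$(paste -sd, -)   # join the stdin lines with commas
  # Nothing failed: stay quiet.
  if [ -z "$tasks" ]; then
    return 0
  fi
  payload=$(printf '{"text": "%s: failed tasks: %s"}' "$workflow" "$tasks")
  if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
    curl -sS -X POST -H 'Content-Type: application/json' \
      --data "$payload" "$SLACK_WEBHOOK_URL"
  else
    echo "$payload"
  fi
}
```

Leaving SLACK_WEBHOOK_URL unset gives a dry run, which is handy for checking the message before wiring anything into cron.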
