Versions Compared

Comment: adding recent issues

...

  • Make sure your code is up to date.
  • Try deleting your old venv and starting with fresh installs of solo, r2d2, ewok, and simobs.
  • Rebuild jedi-bundle using the scripts available in jedi-tools' build_skylab.sh.
  • Make sure your environment is set up correctly. Protip: use jedi-tools' setup.sh. We keep the HPC setup scripts up to date with the most recent release of spack-stack.
  • Did you restart the ecflow server?

Previous Issues

Updating CMakeLists.txt to use TAG

Instead of building with the "BRANCH" keyword, which typically points to develop, you can pin a project to a specific Git commit hash with the "TAG" keyword. See the following example.

Code Block
ecbuild_bundle( PROJECT oops GIT "https://github.com/jcsda-internal/oops.git" TAG <git commit hash> )
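If several bundle entries need pinning, the edit can be scripted. The following is a minimal Python sketch, not part of jedi-tools: the helper name pin_bundle_line is hypothetical, and it assumes each ecbuild_bundle call sits on a single line in the same layout as the example above.

```python
import re


def pin_bundle_line(line, project, commit):
    """Swap 'BRANCH <name>' for 'TAG <commit>' on one ecbuild_bundle line.

    Hypothetical helper: assumes the whole ecbuild_bundle(...) call is on a
    single line, as in the example above. Lines for other projects are
    returned unchanged.
    """
    # Accept abbreviated or full lowercase hex hashes only.
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit):
        raise ValueError(f"not a git commit hash: {commit!r}")
    # Only touch the line that declares the requested project.
    if not re.search(rf"ecbuild_bundle\(\s*PROJECT\s+{re.escape(project)}\s", line):
        return line
    # Replace the first BRANCH keyword and its argument.
    return re.sub(r"\bBRANCH\s+[^\s)]+", f"TAG {commit}", line, count=1)
```

Applied to each line of CMakeLists.txt, this rewrites only the entry for the named project. Any hash passed in should come from the commit you actually want to build, for example one copied from the repository's commit history.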

PR's CI test is stuck in the queue

Inside the pull request, the CI test shows the message:

Code Block
Queued — Waiting to run this check …

Cause: the job exited as soon as the container was invoked, without emitting any useful logs. Frustratingly, GitHub has no mechanism for setting a job timeout here, so if the runner dies without updating the check run, the status stays at "waiting" forever (GitHub accepts this state, however unfriendly it is to users and developers). You don't need to worry about leaving hanging check runs: our runner backends have useful timeouts of their own, and if they get disconnected from GitHub they clean up their resources even if they can't report back.

Solution: Retrigger CI

r2d2.error.RegistrationNotFound.RegistrationNotFound

The following error was given by R2D2:

Code Block
Traceback (most recent call last):
 File "/work2/noaa/jcsda/smaticka/data_repos/feedback_files/c3762d_8dayAprMay_24HforeC_eval/r2d2_experiment_fetch.py", line 9, in <module>
  for search_result in R2D2Data.search(item='feedback', experiment=experiment):
 File "/work2/noaa/jcsda/smaticka/jedi_ioda_10apr_gnu/jedi-bundle/r2d2/src/r2d2/r2d2_data.py", line 711, in search
  r2d2_data.validate_search_kwargs(kwargs)
 File "/work2/noaa/jcsda/smaticka/jedi_ioda_10apr_gnu/jedi-bundle/r2d2/src/r2d2/r2d2_item.py", line 258, in validate_search_kwargs
  R2D2Item.process_kwargs(kwargs)
 File "/work2/noaa/jcsda/smaticka/jedi_ioda_10apr_gnu/jedi-bundle/r2d2/src/r2d2/r2d2_item.py", line 377, in process_kwargs
  R2D2Index.process_index_item_kwarg(kwargs, item)
 File "/work2/noaa/jcsda/smaticka/jedi_ioda_10apr_gnu/jedi-bundle/r2d2/src/r2d2/r2d2_index.py", line 171, in process_index_item_kwarg
  raise err.RegistrationNotFound(item, kwargs[item])
r2d2.error.RegistrationNotFound.RegistrationNotFound: 
c3762d is not registered in experiment yet!
You must manually register this Name using R2D2Index.register() method.

Cause: the experiment "c3762d" was not found because it had been deleted by the R2D2 scrubber based on its "lifetime".

Solution: rerun the original experiment and update the expid in your script. If a longer lifetime is required, see R2D2's tutorial document on updating the lifetime.
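A script that fetches from R2D2 can turn this traceback into a clearer message. The sketch below is an illustration only: search_or_explain is a hypothetical wrapper, and it matches the exception by its class name (taken from the traceback above) so that it does not depend on a particular r2d2 import path.

```python
def search_or_explain(search_fn, **kwargs):
    """Run search_fn(**kwargs) and return its results as a list.

    Hypothetical wrapper: if the call raises an exception whose class is
    named RegistrationNotFound, re-raise it as a RuntimeError that points
    at the likely cause (a scrubbed experiment). Other errors pass through.
    """
    try:
        return list(search_fn(**kwargs))
    except Exception as exc:
        # Match on the class name so this sketch needs no r2d2 import.
        if type(exc).__name__ != "RegistrationNotFound":
            raise
        expid = kwargs.get("experiment", "<unknown>")
        raise RuntimeError(
            f"Experiment {expid!r} is not registered in R2D2. It may have "
            "been removed by the scrubber; rerun the original experiment "
            "and use the new expid."
        ) from exc
```

With the call from the traceback, usage would look like search_or_explain(R2D2Data.search, item='feedback', experiment=experiment).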

sbatch: error: Invalid account or account/partition

The following message was received when submitting a skylab experiment on Orion: 

Code Block
sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified

Cause: the user did not belong to the groups required to run experiments.

Solution: email the HPC's point of contact (POC) and ask to be granted access to the jcsda groups.
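Before emailing the POC, it can help to confirm which groups your account is actually in. This is a minimal sketch using only the Python standard library on a Unix system. The group name "jcsda" in the example is an assumption, so substitute the names your HPC uses. It checks Unix group membership only, not Slurm account associations, which are managed separately.

```python
import grp
import os


def current_group_names():
    """Return the names of the Unix groups the current process belongs to."""
    names = set()
    for gid in os.getgroups():
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Some group IDs have no name entry; skip them.
            continue
    return names


def missing_groups(required, current):
    """Return the required group names that are absent from current, sorted."""
    return sorted(set(required) - set(current))


if __name__ == "__main__":
    required = ["jcsda"]  # assumed group name, adjust for your HPC
    missing = missing_groups(required, current_group_names())
    if missing:
        print("Not a member of:", ", ".join(missing))
    else:
        print("All required groups present.")
```

If a group is reported missing, include that name in the request to the POC.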

skylab.jcsda.org or experiments.jcsda.org is not responding

...