Yannick opened the meeting announcing that due to the recent interruptions of the Thursday schedules (code sprints, JEDI release, Academy) we will catch up on current activity by doing a general roundtable update. Next week is another Academy so we will not hold this meeting. We will start up again in two weeks and get back on the regular schedule (of alternating roundtables and special topic discussions) with a presentation by Wojciech on developing JEDI in his IDE (Integrated Development Environment, such as Visual C++ or Xcode).

JEDI1 (Infrastructure)

Mark M gave the following summary.


JEDI 1 Summary

We have not had a meeting for the past few weeks due to the (two) JEDI Academies and the Thanksgiving holiday so this update is not comprehensive.   We'll focus on a few items of most interest to the group.

Academies: Many in the JEDI team, including the JEDI 1 team, have been occupied with carrying out our two virtual academies, one on Nov 16-20 and another next week focusing on the UKMO and USAF.

JEDI Stack vetting procedure: Several members of the team have expressed a need for a vetting procedure when new software packages are proposed to be added to the JEDI stack.  Adding dependencies to JEDI can limit portability and increase the time needed to maintain HPC modules, local (laptop/workstation) modules, and containers.  For this reason, before a build script can be added to the JEDI stack, it must be approved by the JEDI 1 team, often in consultation with the broader JEDI team.  Mark M is working on updating the JEDI-stack documentation on GitHub to outline this procedure; expect a pull request probably by tomorrow and feel free to review it if interested.

Public repo synchronization:  Since our last JEDI bi-weekly meeting, we have implemented a webhook that automatically synchronizes the public and internal JCSDA repositories.  Currently it's only implemented for the develop branches of each repo but it can easily be extended to master if we decide that is warranted.  Here is how it works: whenever a pull request is merged into the develop branch of JCSDA-internal, the webhook pushes those changes to the develop branch on the corresponding JCSDA repo.  It's working well for over 90% of PRs now but there is still an issue with git-lfs.  If the PR includes a file that is tracked by git-lfs, then the webhook fails and the push has to be done manually.  The reason for this failure is that git-lfs is not working in the serverless AWS Lambda function that constitutes the back end of the webhook.  Mark M is working on debugging this.
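The trigger logic described above can be sketched roughly as follows. This is only an illustration of the decision the webhook has to make, not the actual Lambda code: the function name plan_sync and the shape of the returned dictionary are hypothetical, and the sketch assumes the standard GitHub pull_request event payload fields (action, pull_request.merged, pull_request.base.ref, repository.name, repository.owner.login). It does not run git and does not address the git-lfs problem.

```python
from typing import Optional

INTERNAL_ORG = "JCSDA-internal"  # internal organization named in the notes
PUBLIC_ORG = "JCSDA"             # public organization named in the notes
SYNCED_BRANCHES = {"develop"}    # could be extended to include "master"


def plan_sync(event: dict) -> Optional[dict]:
    """Return a push plan for a merged PR, or None if nothing should be synced.

    `event` is assumed to be the parsed JSON body of a GitHub
    pull_request webhook delivery.
    """
    pr = event.get("pull_request", {})
    repo = event.get("repository", {})

    # A PR that was merged arrives as action == "closed" with merged == True.
    if event.get("action") != "closed" or not pr.get("merged"):
        return None

    owner = repo.get("owner", {}).get("login")
    branch = pr.get("base", {}).get("ref")
    if owner != INTERNAL_ORG or branch not in SYNCED_BRANCHES:
        return None

    name = repo["name"]
    return {
        "source": f"https://github.com/{INTERNAL_ORG}/{name}.git",
        "target": f"https://github.com/{PUBLIC_ORG}/{name}.git",
        "branch": branch,
    }


if __name__ == "__main__":
    example = {
        "action": "closed",
        "pull_request": {"merged": True, "base": {"ref": "develop"}},
        "repository": {"name": "oops", "owner": {"login": INTERNAL_ORG}},
    }
    print(plan_sync(example))
```

Keeping the decision separate from the push itself would make it possible to test the branch filter without network access, and extending the synchronization to master would be a one-line change to SYNCED_BRANCHES.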

Singularity: In version 3.6 and beyond (the current release is 3.6.4), Singularity changed the verification signature for containers.  In preparation for the Academy, Singularity was upgraded on the Academy nodes and the signature on the jedi-gnu-openmpi-dev.sif development container was updated.  So, if you are using an older version of Singularity and you try to verify the container with the "singularity verify" command, it will likely fail.  Mark M is working on updating the other containers (and the Vagrantfile) and on tagging and archiving all of the Singularity, Docker, and Charliecloud containers with the release tag.  Mark M is also working on creating HPC application and development containers to accompany the release.  One immediate use case is NASA's Pleiades system - Mark M is working with several Pleiades admins and users to get multi-node Singularity containers working there as well.  Contact him for further details if you are interested in this.

HPC modules:  In the next few weeks the JEDI 1 team will be updating HPC modules (some are already updated) to include CGAL and to replace the previous bufrlib with a version obtained directly from NCEP.  Mark M started this on the single-node AWS platform but is running into problems with the intel compiler complaining about a license.  He's debugging it now.

MPAS CI: Maryam has been working with the MPAS team on implementing CI testing for MPAS.  The work is largely done (working for gnu and clang) and there is an active PR.  There was an issue with the intel container but we believe this is close to being solved.

CI Pipelines: Maryam has also been working on cost estimates and upgrades for the CI Pipelines.  The pipeline is currently only implemented for oops as a base repo, with ioda, ufo, and soca as the downstream repos.  She is now working to extend the pipelines to fv3-jedi but to do so she has to sort out some issues, including how to use a tagged version of crtm with the jedi-build package (this is needed to get the soca pipeline to work properly as well).

CodeCov: Maryam also looked into CodeCov pricing for internal (private) repos.  We have to sort out the details but it is likely that only 5-10 JEDI core team members will have access to the detailed CodeCov logs for PRs in the future.  Stay tuned for further updates.

Steve H has continued to work on ioda-engines integration (see JEDI 2 update).

Mark O has started looking into setting up a systematic approach to assessing and monitoring JEDI performance through profiling.  He recommends VTune as a useful profiling tool for identifying bottlenecks and making cross-platform comparisons.  He has also been maintaining the NRT websites and is currently sorting out some issues with the GEOS site.

Claude and Rick have been making progress on ewok.  It's now running on Orion with fv3-jedi.  Rick has also been looking more into how we might set up virtual environments for JEDI python dependencies.

Chris S asked for the timeline for EWOK since NCAR is interested in this tool. The bottom line estimate is first quarter of AOP 2021 (Apr-Jun, 2021). The current status of EWOK is that a prototype version is running forecasts on Orion using the FV3 model. EWOK currently uses ECFLOW and other flow platforms such as CYLC are in the works. EWOK is undergoing a lot of development activity now, and the documentation is incomplete. EWOK is checked into github (https://github.com/JCSDA-internal/ewok) so it is available for perusal, but the advice is to wait a couple weeks (for the activity to quiet down a bit) before diving in and studying it. Yannick added that EWOK will be presented in January during a Thursday special topics discussion.

JEDI2 (Observations)

Ryan gave the following summary.

JEDI 2 Summary

During the past three weeks, we had the JEDI academy and the Thanksgiving holidays. Because of this, JEDI2 had only one meeting. Notes are available at [1]. This meeting featured two presentations from EMC.

The first presentation [2], given by Ron Mclaren, discussed the state of the bufr-parsing code that he and Rahul have been writing inside the ioda-converters repository.

To remind today's audience, this parser is configurable via a YAML file. It reads in BUFR (generally radiance data for now) and writes out an IODA-engines ObsGroup, which can then be either used directly in memory or stored to disk for later use.  It wraps around the latest version of the nceplibs-bufr library that EMC has released on GitHub. We have a PR for nceplibs-bufr in the jedi stack, and I do have a module for this on Hera.

There was much discussion during and after the presentation. Ben Ruston noted that an end user or developer still needs to write a large block of BUFR mnemonics into the YAML file so that the parser understands the structure of the BUFR that Ron wants to parse. He suggested several utilities to parse the bufr header to automatically extract this information, and further suggested that Ron examine the eccodes project. The group also noted that Jamie Bresch has been working on parsing a different set of BUFR inputs, and she submitted a pull request to ioda-converters on Monday (JCSDA-internal/ioda-converters#368 - [3]).

The second presentation was given by Emily Liu [4], and it described EMC's transition from operational GDAS to UFO for radiance-based instruments. She showed a schematic of the steps needed in the processing flow, along with timelines to the end of the AOP. She highlighted several areas where we still need work. For example, if we move away from GSI ncdiags we need to adjust our ioda-converters and have them append additional information regarding scan positions and instrument channels. We also need to further expand JEDI's QC flag capabilities. Meetings to address these will be scheduled over the next two weeks.

We did not have much time to discuss group issues. These are listed in the meeting notes [1] and will be handled separately. Wojciech is working on expansions to the Parameters classes. Neill is revising his observation-error PR that had unfortunate interactions with ROPP and downstream repositories. Steve is working on the NCEPLIBS-bufr PR. Both he and I are looking at improvements to how we handle unit test files. Praveen is preparing for the EMC-internal jedi academy.

One further note, the ioda-engines repository is migrating to ioda. Downstream impact should be minimal, and an epic describing the steps of this migration is viewable on ZenHub [5].


Links:

[1] - https://docs.google.com/document/d/1m40qrodQnBmo5LHN1FOnyNzokJ09HMMsYX_nxJG_uYU/edit?usp=sharing
[2] - https://drive.google.com/file/d/1fN8UvjiiNldhnf3oj2LGoQXilVHgTw-4/view?usp=sharing
[3] - https://github.com/JCSDA-internal/ioda-converters/pull/368
[4] - https://drive.google.com/file/d/1_xL-K8DduuVl08Tn9eI0lPNi728YR-7Y/view?usp=sharing
[5] - https://github.com/JCSDA-internal/ioda-engines/issues/157


JEDI3 (Models)

Dan noted that there was no JEDI3 meeting since he last reported due to the recent release and academy work. He added that the JEDI3 team has been providing support for the OBS project team, along with getting caught up with reviewing model related pull requests.

JEDI4 (DA Methodology)

Anna presented the following summary.

JEDI 4 Summary

In the last three weeks:
  • At the Nov 16 JEDI4 meeting, Sergey led a discussion on implementing static covariances inside the local volume solver.
  • BJ has run month-long cycling experiments with multivariate B using the new configuration; the results are improved relative to experiments with univariate B.
  • JJ is working on low resolution EDA, and is looking into tuning obs errors.
  • Marek is looking into partitioning in atlas.
  • Olly is working on scientific documentation on wind transforms. He is also working on building transi with JEDI; transi now builds and the tests run with ufo-bundle.
  • Sarah is working on putting MPI into NRL static covariances.
  • Sergey is working on Halo distribution.
  • Travis started benchmark runs for Var/EnVar/LETKF.
  • Benjamin has several PRs in saber and model repositories, planned to be merged next week:
    • adding a yaml option for level-dependent lengthscales, and an option to specify different lengthscales for different variables in yaml. This affects all models that use that yaml option.
    • switching to using bump_interpolator instead of type_obsop (allows for simpler interface in the models)
    • adding an option to use atlas for mesh generation. This will be introduced as an option in addition to stripack meshes.
  • Clementine is working on a block Lanczos minimizer.
  • Fabio added an option to oops minimizers to read forecast sensitivity, to be used in FSOI.


JEDI5 (Support, Training)

Yannick reported that the first ever virtual Academy, two weeks ago, was successful. Next week will be the second virtual Academy with the UKMO and USAF. Yannick added that the recent code sprints, JEDI release and academies have interfered with the core team's "pace" (speed of reviews, code development, etc.) and things should return to normal after next week's academy is completed.

Updates from the Group

At this point the updates from the core team on the JEDI AOP categories were finished and the floor was opened up to updates or questions from the group.

JJ asked about being able to pass QC results from one outer loop to the next outer loop. Essentially, once a location is filtered out, it gets marked as a missing value and cannot be added back into the assimilation downstream (in a subsequent outer loop). A use case that JJ mentioned is to not assimilate cloudy locations in the first outer loop, but rather introduce those in subsequent outer loops. Some discussion ensued and it was noted that the inability to pass QC results from one outer loop to the next was intentional. Operational centers have determined that it's not worth the computing resources to restore observations to the assimilation once they are rejected. However, JJ's use case may be useful for research purposes so this topic warrants further discussion.
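The difference between the two behaviors discussed above can be illustrated with a toy sketch. This is not JEDI or UFO code: the function names, the MISSING marker, and the list-based data layout are all hypothetical, chosen only to contrast "rejected values are overwritten and stay out" with "the rejection mask is re-evaluated in each outer loop".

```python
MISSING = None  # hypothetical stand-in for a missing-value marker


def assimilated_permanent(obs, cloudy, n_loops):
    """Rejected values are overwritten with MISSING and never come back."""
    values = list(obs)
    per_loop = []
    for loop in range(n_loops):
        if loop == 0:
            # The cloud filter runs once and overwrites the rejected values.
            values = [MISSING if c else v for v, c in zip(values, cloudy)]
        per_loop.append([v for v in values if v is not MISSING])
    return per_loop


def assimilated_per_loop(obs, cloudy, n_loops):
    """Original values are kept; the rejection mask is rebuilt each loop."""
    per_loop = []
    for loop in range(n_loops):
        # Reject cloudy locations only in the first outer loop.
        reject = cloudy if loop == 0 else [False] * len(obs)
        per_loop.append([v for v, r in zip(obs, reject) if not r])
    return per_loop


if __name__ == "__main__":
    obs = [280.1, 275.4, 290.0]
    cloudy = [False, True, False]
    print(assimilated_permanent(obs, cloudy, 2))
    # [[280.1, 290.0], [280.1, 290.0]]
    print(assimilated_per_loop(obs, cloudy, 2))
    # [[280.1, 290.0], [280.1, 275.4, 290.0]]
```

In the second variant the cloudy observation re-enters in the second outer loop, which is JJ's use case. The cost is that the original values and a per-loop mask must both be retained.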

Mark M announced that work is underway to get our JEDI containers running on NASA's Pleiades supercomputing system, for both single-node and multi-node workflows. Anyone interested in testing this should contact Mark.

Yannick closed the meeting at this point since there were no more updates or questions. We will not meet next Thursday (12/10/20) due to the Academy, so the next meeting is in two weeks (12/17/20), when we will hear from Wojciech on developing JEDI in his IDE.
