Most of today's meeting was concerned with the reorganization of the ObsSpace classes in oops, ufo, and ioda.

Xin started the discussion by pointing out multiple places in ufo where code is duplicated, both in terms of the file structure and the code itself.  He then illustrated several examples of conditional execution based on chains of if/else if statements; these could be cleaned up substantially, and optimized, with a more object-oriented approach.

For this reason, there is an effort at JCSDA (led by Xin, Steve, and Yannick) to reorganize the ObsSpace data structure in order to:

  • Reduce duplicated subroutines
  • Simplify the APIs
  • Re-design the ObsSpace data structure


Then Steve shared similar concerns and efforts, focusing in particular on the reading and writing of data in ioda.

The next steps are to design new class structures for ufo and ioda, working in parallel to ensure that the new ObsSpace tasks complement the new IODA reader/writer tasks.

Much of the work with the IODA reader/writer tasks involves filtering the data to only read in what is needed.  Yannick emphasized that much of this can be done with a good database tool like ODB, which can query and filter the data.  There is ongoing work to determine what the best approach here might be (see April 12 meeting notes).

Steve's presentation was followed by a discussion on where best to implement polymorphism for the IODA reader/writer tasks, and in particular whether to place it in the C++ layer or the Fortran layer.  Fortran 2003 includes enhanced features for polymorphism that are currently being exploited to good effect by LFRic at the Met Office, though compiler support can be a challenge.  However, Chris expressed some concern about the limitations of polymorphism in Fortran 2003 relative to C++.  It was suggested that it might be better to handle the polymorphism in C++, leaving the Fortran code more targeted to specific tasks.  If this is indeed the approach we follow, then it would be beneficial to move some of the current Fortran code into the C++ layer to minimize code duplication.  This might be a good objective for later this summer.

The short-term goal of this effort to reorganize ObsSpace is to clean up the code, but in the longer term we wish to assess which approach would be optimal, considering new options such as ODB.  Chris cautioned against relying too much on complex inheritance hierarchies because they can make code fragile.  Yannick suggested that much of this could potentially be avoided by taking a database approach such as ODB.

Several concerns were raised during the discussion:

  • Only read into memory what will be used by the forward operator.
  • Some observation types require a single latitude and longitude value, while other types require multiple latitude and longitude values.  Because of this, the latitude and longitude may not belong in a base class.
  • How do we want to handle observation variables, such as temperature, that are common across many (if not all) observation types?  For example, do these variables warrant their own class?

It was emphasized that we are at the beginning of the design process and that what was shown are preliminary design ideas.  All of the concerns raised are excellent questions that speak to the details of the design, and they will be considered as we move forward to refine it.

Then Mark gave a brief overview of the Git Flow workflow, including a request that JEDI developers adopt it.  The basic philosophy is nicely encapsulated in the following diagram developed and distributed by Vincent Driessen - we encourage all JEDI developers to download this and have it handy as a reference:

One question that came up is how repository forking fits into this Git Flow strategy.  There is currently no need for developers to fork JEDI repositories.  Instead, developers are encouraged to create their own feature branches and to merge them into the develop branch via pull requests.  However, if a developer wishes to fork a repository, that is acceptable.  Chris mentioned that a fork can be useful because you then have push permission to all of its branches.

As the user base for JEDI grows, we may find a greater use for forking.  To manage code development from multiple users, it may be beneficial for each Center to work with their own fork and then to merge them as needed.  Another instance where forking could be useful is when we make the releases public.  Then anyone who does not have write permission to the repositories can fork and work with their own copies.

Another comment had to do with the "git flow feature publish" command.  As with any git flow command, it is not necessary to use this - the same functionality can be achieved with basic git.  The publish command is just shorthand for telling git to create a new branch on GitHub with this name and track the remote version.

It was also mentioned that on many HPC systems, users do not have permission to install the git-flow application.  Again, that is not a major problem, since git flow just provides shorthand access to git's basic functionality.  So you can still follow the git flow paradigm with basic git.  Just remember to use these naming conventions for your branches:

  • feature/mybranch
  • release/mybranch
  • bugfix/mybranch
  • hotfix/mybranch

And delete these branches after they have been merged by means of a pull request.
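For example, the full feature-branch lifecycle can be followed with basic git alone.  The sketch below simulates it end to end in a throwaway local repository, with a local bare repository standing in for GitHub ("origin"); the branch name mybranch and the identity settings are, of course, placeholders:

```shell
set -e
tmp=$(mktemp -d)                           # throwaway sandbox
git init -q --bare "$tmp/origin.git"       # local stand-in for the GitHub remote
git clone -q "$tmp/origin.git" "$tmp/work"
cd "$tmp/work"
git config user.email "jedi@example.com"   # placeholder identity for the sandbox
git config user.name "JEDI Developer"
git commit -q --allow-empty -m "initial commit"
git checkout -q -b develop
git push -q -u origin develop

# equivalent of: git flow feature start mybranch
git checkout -q -b feature/mybranch develop
git commit -q --allow-empty -m "work on the feature"

# equivalent of: git flow feature publish mybranch
git push -q -u origin feature/mybranch

# (the pull request would be opened and merged on GitHub; here we
#  fast-forward develop locally to stand in for the merged PR)
git checkout -q develop
git merge -q --ff-only feature/mybranch

# after the merge, delete the branch locally and on the remote
git branch -d feature/mybranch
git push -q origin --delete feature/mybranch
```

The sequence maps one-to-one onto the git flow feature commands, so developers without the git-flow tool lose nothing but the shorthand.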
