This webpage documents the project to rewrite the HWRF automation scripts in Python. At the time this project started, there were a total of four different scripting systems within NCEP and DTC alone, each independently implementing the HWRF using a combination of ksh, Perl, awk, sed, grep and other languages. The systems had insufficient fault tolerance and error detection, and were a major drain on manpower. Management and developers came to the decision that the scripts needed to be replaced.
We document here both the ongoing work to add functionality to the Python-based HWRF, hereafter called the pyHWRF, and also provide high-level documentation of the existing functionality and design of the system. Low-level documentation is generated automatically from the Python docstrings within the source code itself. This website only attempts to provide a high-level description of the modules, packages and scripts, and of how to use and modify them. Detailed descriptions of every single function are available in the docstrings.
This is the original project proposal, given to management in the NCEP Environmental Modeling Center (EMC), NCEP Central Operations (NCO) and the Developmental Testbed Center (DTC) on September 19, 2013. It was at that meeting that management decided on Python for the rewrite (the alternatives being Ruby or ksh93). Their decision was primarily due to prior existing knowledge of Python throughout the atmosphere and ocean modeling community, especially within NOAA.
HWRF Script Rewrite and Unification
September 19, 2013
Timothy Brown and Samuel Trahan
Overview

The HWRF scripting system in its present state is over-complicated, has insufficient fault tolerance and is a major drain on manpower; it is sorely in need of replacement. HWRF scripts exist in three different forms at present: an operational system run by NCO, a similar system run by EMC, and a less capable but more portable system run by everyone else and maintained by DTC. The system used by NCO and EMC has insufficient fault tolerance, causing occasional operational failures and frequent failures in development parallels on less reliable machines. Furthermore, the system is over-complicated: more than 38,000 lines for the base workflow, and tens of thousands more for graphics and automation. That makes any development or debugging a major task rather than the simple effort it should be. Continuing to use the current system as it expands and grows increasingly complex will be infeasible, and will result in an ever-increasing drain on the resources of EMC, DTC and NCO.
One of the reasons for the issues in the present system is the choice of ksh88 as the primary language. That language lacks basic functionality such as message passing between jobs and internal state beyond simple strings, numbers and limited one-dimensional arrays, and it provides no failure information beyond 8-bit integer exit codes. The language has no built-in support for basic operations such as date manipulation, string processing, binary I/O, numerical computation and the like, which are part of the standard library of most non-shell languages. All of this forces the inefficient use of numerous subprocesses, temporary files and extraneous tiny Fortran programs, which slow the scripting system and drastically increase its complexity.
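For example, the forecast-cycle date arithmetic that forces ksh88 scripts to shell out to the `date` command is a few lines of standard-library Python. This is a minimal sketch (the `next_cycle` helper is illustrative, not part of any existing HWRF code), assuming the usual YYYYMMDDHH cycle string format:

```python
from datetime import datetime, timedelta

def next_cycle(cycle, hours=6):
    """Advance an HWRF cycle string (YYYYMMDDHH) by the given number of hours."""
    t = datetime.strptime(cycle, '%Y%m%d%H')
    return (t + timedelta(hours=hours)).strftime('%Y%m%d%H')

print(next_cycle('2012102806'))      # 2012102812
print(next_cycle('2012103118', 12))  # rolls over the month: 2012110106
```

No subprocesses, no temporary files, and month/year rollover is handled by the standard library rather than by hand.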
A second cause of the problems is that everyone else is working on a different set of scripts than EMC. The DTC scripting system is simpler, and incorporates some of the fault-tolerance features that EMC's scripts should have, but lacks our real-time capabilities. Having to maintain two separate sets of scripts wastes an enormous amount of development time in both EMC and DTC. NCO has to make many custom changes, controlled by numerous checks of the $PARAFLAG variable, simply to change data paths and other features that should be much easier to modify than they presently are. This wastes development resources our organizations could otherwise be putting towards improving forecast skill or reliability, or towards our other duties.
To solve these problems, the EMC and DTC HWRF groups would like to rewrite the present ksh-based HWRF systems to create a single Python-based system that is simpler, easier to debug and develop, more fault tolerant, easier to reconfigure, and less error-prone. The choice of Python 2.6.6 has come from a discussion between EMC, NCO and DTC, and will be explained later in this document. We believe this rewrite would be of great benefit for all three of our organizations as well as for our external collaborators whose contributions have been invaluable.
However, before we begin the rewrite, we need confirmation from NCO that Python 2.6.6, presently installed on WCOSS compute nodes, is acceptable for an operational system. We are not asking that anyone approve our entire implementation in advance; we just want to be assured that the use of Python 2.6.6, in and of itself, will not be the reason the implementation will be refused. This is because it is infeasible to maintain the Python-based system as a third system from now until March 2014, and it will also be infeasible to rewrite an entirely new ksh-based system at the last second if stakeholders request that all Python be removed.
EMC and DTC have approved this plan. We just need confirmation from NCO.
Language

First: a bit of history. The current EMC and DTC scripts are written in Korn shell. EMC uses the 88 variant, while DTC uses the 93 variant. The DTC scripts are more functionally oriented and rely on features only present in the newer 93 variant. The reason for the use of ksh88 at EMC is that it was mandated in the original implementation in 2007 due to limitations of NCEP's contract with IBM. Those limitations no longer exist: Python is used even by the operating system's own scripts, and is installed on WCOSS compute nodes.
The DTC and EMC have agreed that script unification is of vital importance. It will streamline the R2O and O2R transition processes by providing all researchers with the same scripts and framework used in operations. This unification will reduce the time and money spent by both the DTC and EMC annually by eliminating the duplication of work and providing a more robust framework that will be less error-prone. We believe it will also reduce NCO's expenditure of resources, since the new system will be simpler, easier to debug, more fault-tolerant and easier to configure (i.e., less SPA time to implement it!).
Upon consulting all the stakeholders, including NCO, it was decided that Python 2.6 provides the best path forward for the HWRF system, as this scripting language and version is currently installed on WCOSS, Jet, Zeus and Yellowstone. The new scripts for driving the HWRF system will be called pyHWRF. In the future we suggest a migration to Python 2.7, since it is the long-term support release of Python 2.x and has many upgrades, fixes and forward-compatibility features. The choice of Python 2.6 instead of 2.7 is merely due to Red Hat Linux being several years behind on Python updates.
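Since the agreed baseline is Python 2.6, scripts can guard against accidentally running under an older interpreter using the standard-library `sys.version_info` tuple. A minimal sketch (the `require_python` helper and its message are illustrative, not actual pyHWRF code):

```python
import sys

def require_python(minimum=(2, 6)):
    """Abort with a clear message if the interpreter is older than `minimum`."""
    if sys.version_info[:2] < minimum:
        sys.exit('pyHWRF requires Python %d.%d or newer' % minimum)

require_python()  # call once at script startup
```

Failing fast with a readable message is far easier to diagnose than a syntax error deep inside a module written for a newer interpreter.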
Timeline

The ideal timeline is to have pyHWRF ready for the 2014 operational implementation. This is currently slated for December 2013. In order to reach this goal, the DTC will contribute 0.5 FTE (Timothy Brown) with a matching contribution by EMC (Sam Trahan and Zhan Zhang). The following milestones have been agreed upon:
The post-processing phase is an ideal candidate to start the migration to pyHWRF, as it is a small, self-contained component of the current system that needs to be modified to enable EMC's method of running the tracker (continuously waiting for WRF output, versus a single instance at the end of the WRF forecast).
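The "continuously waiting for WRF output" approach amounts to polling the disk for each output file as the forecast runs. The following is a minimal sketch, not the actual pyHWRF implementation; real code would also verify that the forecast job is still alive, and the size-stability check is only a heuristic for "the model has finished writing this file":

```python
import os
import time

def wait_for_output(path, timeout=3600, interval=30):
    """Poll until `path` exists and has stopped growing, or time out.

    Returns True if the file appeared and its size was stable across two
    consecutive polls; returns False if the timeout expires first.
    """
    deadline = time.time() + timeout
    last_size = -1
    while time.time() < deadline:
        if os.path.exists(path):
            size = os.path.getsize(path)
            if size == last_size:  # unchanged since last poll: assume complete
                return True
            last_size = size
        time.sleep(interval)
    return False
```

A post-processing driver would call this once per expected forecast hour, launching the tracker on each file as it arrives instead of waiting for the entire forecast to finish.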
Tools

Python provides a wealth of tools to interact with the operating system and perform string and date manipulation. pyHWRF will build upon the core Python modules and leverage already-existing community modules to provide I/O, plotting and numerical routines.
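As an illustration of the string-manipulation point: parsing a fixed-format storm record, which in ksh would be a grep | awk | sed pipeline spawning several processes, is a single in-process regular expression in Python. The input line below is illustrative only, loosely modeled on a tcvitals-style record:

```python
import re

# One in-process regex instead of a multi-process shell pipeline.
line = 'NHC  18L SANDY     20121028 0600 255N 0766W'
match = re.match(r'\S+\s+(\d{2}[A-Z])\s+(\S+)\s+(\d{8})\s+(\d{4})', line)
storm_id, name, ymd, hm = match.groups()
print('%s %s %s %s' % (storm_id, name, ymd, hm))  # 18L SANDY 20121028 0600
```

The matched fields stay in memory as native Python strings, ready for further processing without temporary files.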
Modules

Abstraction Layer

pyHWRF will follow the object-oriented paradigm, providing polymorphism and a message abstraction layer. Key components in the implementation will be:
All development and maintenance will occur within the Subversion revision control system.
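The object-oriented abstraction described above can be sketched as a common task interface that every workflow component implements. This is a hypothetical illustration of the design pattern, not the actual pyHWRF class hierarchy; all class and method names here are invented for the example:

```python
class Task(object):
    """Base class: every workflow component exposes the same interface."""
    def __init__(self, name):
        self.name = name
        self.completed = False

    def run(self):
        raise NotImplementedError('subclasses implement run()')

class WPSTask(Task):
    def run(self):
        # a real task would run geogrid, ungrib and metgrid here
        self.completed = True

class TrackerTask(Task):
    def run(self):
        # a real task would run the vortex tracker here
        self.completed = True

# Polymorphism: the driver runs every task through the same interface,
# without knowing which concrete component each one is.
workflow = [WPSTask('wps'), TrackerTask('tracker')]
for task in workflow:
    task.run()
```

Because the driver sees only the `Task` interface, fault-tolerance logic (retries, state checks, logging) can be added once in the base class rather than duplicated in every component script.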
Notes

PyHWRF telecon, Friday June 20, 2014. Attendees: Vijay, Sam, Tim, Ligia, Christina.

Today is the last day of Sam's visit to NOAA-ESRL-GSD-DTC. We have made a lot of progress on the PyHWRF scripts. Most parts of the end-to-end system are working now, including the atmospheric and ocean initializations, GSI, the coupled forecast model, post-processing, and tracking. A few parts still need some debugging and testing, and there are a few missing parts. Some of the missing parts are needed for operations, others for the community release. Jet reservations for the real-time parallels start on Tuesday 7/24, and Sam would like to have the system running for WP by then. Sam will set it up to work with HSS originally, and Tim will follow up with making it work with Rocoto.

Missing
Needed for public release
Test, test, test
Python Post-Processing Presentation from EMC to NCO (April 4, 2014)
This presentation describes the Python post-processing and delivery systems that are expected to be in the operational HWRF in May 2014. It was given to the NCEP Central Operations developers who would be implementing HWRF in operations, as well as to the Environmental Modeling Center researchers who would be supporting them. The purpose was to describe the product delivery system, the implementation of the HWRF components, and how they link to one another.
Due to the rapid development of the 2014 HWRF system over the past weeks, this project is presently advancing in three branches. Soon, all three will be merged into the HWRF trunk. For now, relocation work is ongoing in the following branch, which runs the 2013 HWRF system:
The production 2014 system is in the "H214" branch, which only runs pyHWRF post-processing. This branch will be deleted once the WRF and UPP updates for 2014 are merged to the trunk:
The pyH214 branch runs the 2014 system, and has additional experimental code added in to run most of the initialization and the forecast in Python. It will soon be merged to the pyHWRF branch, and the pyH214 branch will be deleted:
The documentation of the pyHWRF is split into several pages due to the large size of the available documentation. New readers are directed to the Technical Overview which provides an overview of the entire system, and links to additional documentation.
FIXME: add the missing pages.
hwrf

The hwrf Python package and Python itself provide a platform-independent environment. This section is a placeholder for additional future documentation of the project structure.
In order to complete the rewrite and obtain a system that will produce results equivalent to the 2014 HWRF implementation, the following needs to be achieved.
The high-level system

The flow is defined within ush/hwrf_expt.py, while the system configuration is defined within parm/hwrf.conf.
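As a sketch of how such a configuration file can be consumed, Python's standard configparser module (named ConfigParser in Python 2.6) reads INI-style files like hwrf.conf. The [dirs] section and option names below are hypothetical placeholders, not the actual contents of parm/hwrf.conf:

```python
from configparser import ConfigParser  # module is named ConfigParser in Python 2.6

# Hypothetical fragment standing in for parm/hwrf.conf; in real use one
# would call conf.read('parm/hwrf.conf') instead of read_string().
conf = ConfigParser()
conf.read_string("""
[dirs]
workdir = /pan2/projects/dtc-hurr/work
fixdir = /pan2/projects/dtc-hurr/fix
""")
print(conf.get('dirs', 'workdir'))  # /pan2/projects/dtc-hurr/work
```

Keeping paths and options in an INI file means experiments can be reconfigured without editing the workflow code itself.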
To run an experiment for a storm, you need to modify the initial kick script pyhwrf_driver.py and then execute it for the storm:

fe2$ cd kick
fe2$ $EDITOR pyhwrf_driver.py
fe2$ ./pyhwrf_driver.py 2012102806 18L HISTORY
The items to edit within pyhwrf_driver.py are:

| Variable | Value (example) |
|---|---|
| diskgroup | 'dtc-hurr' |
| rungroup | 'dtc-hurr' |
| projdir | '/pan2/projects/dtc-hurr' |

Note that the values have to be quoted.
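Concretely, these are ordinary quoted Python assignments near the top of the script. A sketch using the example values from the table above (the surrounding driver code is omitted):

```python
# Site-specific settings edited at the top of pyhwrf_driver.py.
# Note the quotes: these are Python string literals, not bare shell words.
diskgroup = 'dtc-hurr'
rungroup = 'dtc-hurr'
projdir = '/pan2/projects/dtc-hurr'
```

Leaving off the quotes would make Python interpret the values as variable names and raise a NameError, which is why the note above matters.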
| Completed | Priority | Locked | CreatedDate | CompletedDate | Assignee | Name |
|---|---|---|---|---|---|---|
| T | M | T | 1397664764674 | 1397664764674 | strahan | post (upp, tracker) |
| F | M | F | 1397664764674 | | strahan | track analysis |
| T | M | T | 1397664764674 | 1397664764674 | tpbrown | wps (geogrid, ungrib, metgrid) |
| F | M | F | 1397664764674 | | strahan | wrf (main forecast) |
| F | M | F | 1397664764674 | | strahan | wrf analysis |
| F | M | F | 1397664764674 | | strahan | wrf ghost |
| F | M | F | 1397664764674 | | tpbrown | vortex relocation |
| F | M | F | 1397664764674 | | tpbrown | mpi pom init |
| F | M | F | 1397664764674 | | tpbrown | GSI |