Tom Auligne
Don Stark
Xin Zhang
Zaizhong Ma
Beau Paisley
Hans Huang
WRF PLUS is based on the 2004 version of WRF. It was completed by 2005 using TAMC, an automatic adjoint generation tool. A cloud scheme was later added by hand.
The most recent WRF PLUS developments date back four years.
The adjoint is generated automatically by TAMC (Tangent linear and Adjoint Model Compiler). The code to be analyzed is submitted to the company, and the adjoint code is returned within about 24 hours. The process may fail and require adjustments to the input code. WRF PLUS needed several fixes before adjoint generation fully succeeded.
There are three versions of the WRF PLUS code:
1. The original version generated by TAMC, which was parallelized and optimized by Tom Henderson and John Michalakes.
2. A version optimized to reduce disk I/O between the PLUS and NL components (25% reduction in cost). It keeps a six-hour window of time steps in memory, limited to the primary variables (u, v, t, q, p). This increases PLUS memory usage by 10%.
3. Wei Wang's version. It removes the optimizations introduced in version 1 and instead saves the intermediate RK integration steps to eliminate the need for recomputation. These modifications also increase memory usage.
Version 3 is the fastest, but it uses about twice as much memory. Above 64 processors, versions 2 and 3 perform similarly, which means that version 2 scales a bit better for larger problems.
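The memory versus recomputation trade-off between the versions can be illustrated with a toy example. The following is a minimal Python sketch, not WRF PLUS code: the scalar model `step`, its adjoint `ad_step`, the time step `DT`, and the checkpoint interval `every` are all hypothetical stand-ins. It compares an adjoint sweep that keeps every forward state in memory with one that keeps only periodic checkpoints and recomputes the states in between.

```python
DT = 0.01  # time step of the toy model (assumption, unrelated to WRF)

def step(x):
    """One nonlinear step: forward Euler on dx/dt = -x**2 (toy stand-in)."""
    return x + DT * (-x * x)

def ad_step(x, a):
    """Adjoint of the step linearized about x; d(step)/dx = 1 - 2*DT*x."""
    return a * (1.0 - 2.0 * DT * x)

def adjoint_store_all(x0, n, a):
    """Keep every forward state x_0..x_{n-1} in memory; recompute nothing."""
    traj = [x0]
    for _ in range(n - 1):
        traj.append(step(traj[-1]))
    for x in reversed(traj):
        a = ad_step(x, a)
    return a, len(traj), 0  # adjoint result, states held, recomputed steps

def adjoint_checkpointed(x0, n, a, every):
    """Keep one state per `every` steps; rebuild each segment on the way back."""
    checkpoints = []
    x = x0
    for k in range(n):  # forward sweep, storing only the checkpoints
        if k % every == 0:
            checkpoints.append((k, x))
        x = step(x)
    peak, recomputed = 0, 0
    for k0, xk in reversed(checkpoints):  # backward sweep, segment by segment
        seg = [xk]
        for _ in range(min(every, n - k0) - 1):
            seg.append(step(seg[-1]))
        recomputed += len(seg) - 1
        peak = max(peak, len(checkpoints) + len(seg))
        for x in reversed(seg):
            a = ad_step(x, a)
    return a, peak, recomputed

if __name__ == "__main__":
    print(adjoint_store_all(1.0, 100, 1.0))
    print(adjoint_checkpointed(1.0, 100, 1.0, 10))
```

With `n = 100` and `every = 10`, the checkpointed variant holds at most 20 states instead of 100 but recomputes 90 forward steps, and both variants return the same adjoint value.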
Version 3 has been fixed by hand by Ma and now passes bit-for-bit testing (TL and ADJ tests) on Bluefire and in serial mode.
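As an illustration of what TL and ADJ tests typically check, here is a minimal Python sketch on a toy model. It is an assumption-laden example, not the WRF PLUS test code: the Lorenz-63 equations with forward Euler stepping, the constants, and all function names are chosen only for illustration. The TL test compares a finite difference of the nonlinear model with the tangent linear output; the ADJ test checks the dot-product identity <M dx, M dx> = <dx, M^T M dx>.

```python
SIGMA, RHO, BETA, DT = 10.0, 28.0, 8.0 / 3.0, 0.01  # toy constants (assumption)

def nl_step(s):
    """Nonlinear model: one forward Euler step of Lorenz-63."""
    x, y, z = s
    return (x + DT * SIGMA * (y - x),
            y + DT * (x * (RHO - z) - y),
            z + DT * (x * y - BETA * z))

def tl_step(s, d):
    """Tangent linear of nl_step about state s, applied to perturbation d."""
    x, y, z = s
    dx, dy, dz = d
    return (dx + DT * SIGMA * (dy - dx),
            dy + DT * (dx * (RHO - z) - x * dz - dy),
            dz + DT * (dx * y + x * dy - BETA * dz))

def ad_step(s, a):
    """Adjoint (transpose) of tl_step about state s, applied to a."""
    x, y, z = s
    ax, ay, az = a
    return (ax + DT * (-SIGMA * ax + (RHO - z) * ay + y * az),
            ay + DT * (SIGMA * ax - ay + x * az),
            az + DT * (-x * ay - BETA * az))

def nl_run(s0, n):
    """Run n nonlinear steps and keep the whole trajectory in memory."""
    traj = [s0]
    for _ in range(n):
        traj.append(nl_step(traj[-1]))
    return traj

def tl_run(traj, d):
    for s in traj[:-1]:            # forward in time along the stored trajectory
        d = tl_step(s, d)
    return d

def ad_run(traj, a):
    for s in reversed(traj[:-1]):  # backward in time along the same trajectory
        a = ad_step(s, a)
    return a

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def adjoint_test(s0, d0, n):
    """Return both sides of <M d, M d> = <d, M^T M d>."""
    traj = nl_run(s0, n)
    d_n = tl_run(traj, d0)
    return dot(d_n, d_n), dot(d0, ad_run(traj, d_n))

def tl_test(s0, d0, n, eps):
    """Return |NL(s + eps*d) - NL(s)| / |eps * TL(d)|; should approach 1."""
    base = nl_run(s0, n)
    pert = nl_run(tuple(s + eps * d for s, d in zip(s0, d0)), n)[-1]
    diff = tuple(p - b for p, b in zip(pert, base[-1]))
    lin = tl_run(base, d0)
    return dot(diff, diff) ** 0.5 / (eps * dot(lin, lin) ** 0.5)

if __name__ == "__main__":
    print(adjoint_test((1.0, 1.0, 1.0), (0.1, -0.2, 0.3), 50))
    print(tl_test((1.0, 1.0, 1.0), (0.1, -0.2, 0.3), 50, 1e-7))
```

In this sketch the two sides of the adjoint identity agree to rounding error rather than bit for bit, and the TL ratio is close to 1 for a small `eps`.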
Parallelizing: MPI only (no OpenMP)
Profiling: through the Trace package
Debugging: TotalView tool