What's the Point?

Writing readable, maintainable code should be a priority in scientific computing. If a body of source code is difficult to understand, it is also difficult to:

The costs associated with these difficulties can be substantial, and can easily exceed the cost of computing power needed to run a piece of code, even on an HPC project. Therefore, source code should not be treated as just a program to be fed into a machine. Source code is also a document used to communicate with other human beings.

Because CAM is a community model, the code is shared between many developers and many users all over the world. When writing code for inclusion in CAM, keep in mind that your changes can and will be read by many other people, even after you stop working on that piece of code yourself! Therefore it is important to write code that is clear enough for them to follow.

Top Ten List

Here are some of the most important things to consider when writing code.

1. Document your code.

Names

Names are the single most important form of documentation. Try to use each variable or routine for only one purpose, and pick a name that matches that purpose.

Try to use names that are consistent with other CAM code, especially for short names. For instance, "t" usually refers to temperature, not time. In the CAM physics and chemistry, "i" is usually used as a column index, while "k" is usually a level index.
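As a minimal sketch of these naming conventions, the routine below uses "t" for temperature, "i" for the column index, and "k" for the level index. The module and routine names are hypothetical, and the kind from iso_fortran_env is only a stand-in so that the example compiles on its own; in CAM you would take r8 from shr_kind_mod.

```fortran
module naming_sketch
  ! Stand-in for "use shr_kind_mod, only: r8 => shr_kind_r8" (assumption,
  ! made so that this sketch compiles outside of CAM).
  use, intrinsic :: iso_fortran_env, only: r8 => real64
  implicit none
  private
  public :: column_mean_temperature

contains

  ! Unweighted mean over levels of the temperature in each column.
  subroutine column_mean_temperature(ncol, pver, t, t_mean)
    integer,  intent(in)  :: ncol           ! number of columns
    integer,  intent(in)  :: pver           ! number of vertical levels
    real(r8), intent(in)  :: t(ncol, pver)  ! temperature [K]
    real(r8), intent(out) :: t_mean(ncol)   ! level-mean temperature [K]

    integer :: i  ! column index
    integer :: k  ! level index

    t_mean = 0._r8
    do k = 1, pver
       do i = 1, ncol
          t_mean(i) = t_mean(i) + t(i, k)
       end do
    end do
    t_mean = t_mean / real(pver, r8)
  end subroutine column_mean_temperature

end module naming_sketch
```

A reader who knows the usual meaning of "t", "i", and "k" can follow the loop without consulting the declarations.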

For each variable that has some physical meaning, include a comment specifying what it is and what the units are (if any). If you have multiple, similar variables (e.g. multiple reference pressures), explain the difference between them in the comment, and give the variables names that help you to remember the difference.
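The following sketch shows one way to document two similar reference pressures. All names here are hypothetical and chosen for illustration; the kind from iso_fortran_env is a stand-in for r8 from shr_kind_mod.

```fortran
module reference_pressure_sketch
  ! Stand-in for "use shr_kind_mod, only: r8 => shr_kind_r8" (assumption).
  use, intrinsic :: iso_fortran_env, only: r8 => real64
  implicit none
  private
  public :: p_ref_theta, p_std_sea_level, pressure_ratio_theta

  ! Reference pressure used in the definition of potential temperature [Pa]
  real(r8), parameter :: p_ref_theta = 100000._r8
  ! Standard atmosphere sea-level pressure [Pa]; not interchangeable with
  ! p_ref_theta, even though the two values are close.
  real(r8), parameter :: p_std_sea_level = 101325._r8

contains

  ! Ratio of pressure to the potential temperature reference pressure
  ! (dimensionless).
  elemental function pressure_ratio_theta(p) result(ratio)
    real(r8), intent(in) :: p  ! pressure [Pa]
    real(r8) :: ratio
    ratio = p / p_ref_theta
  end function pressure_ratio_theta

end module reference_pressure_sketch
```

Each name says which reference pressure is meant, and each comment states the units, so the two constants are hard to confuse.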

If you are using a number that's a constant, give it a name. If it's already present in CAM's physconst, you should use the physconst value instead of specifying your own. Here is an example:

Unclear, with unnamed constants:

a = 3.14159265 * r**2
rho = p / (287.*t)

Better, with named constants and explicit kinds:

use shr_kind_mod, only: r8 => shr_kind_r8

real(r8), parameter :: pi = 3.14159265_r8
! gas constant for dry air [J/kg/K]
real(r8), parameter :: rair = 287._r8

area = pi * radius**2
rho = p / (rair*t)

Best, using the values already defined in physconst:

! physconst parameters use SI units.
use physconst, only: pi, rair

area = pi * radius**2
rho = p / (rair*t)

Notice how we specified "r8" in the second example above. All real numbers in CAM must have an explicit "kind", which is how precision is specified in Fortran. For double precision values, use shr_kind_r8, and for single precision, use shr_kind_r4. (These are usually abbreviated as just "r8" and "r4".)
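To illustrate why the kind suffix matters, the sketch below compares the value 0.1 written as a single precision literal and as a double precision literal. The module and function names are hypothetical, and the kinds from iso_fortran_env stand in for shr_kind_r4 and shr_kind_r8 so that the example compiles on its own.

```fortran
module kind_sketch
  ! Stand-ins for shr_kind_r4 and shr_kind_r8 from shr_kind_mod (assumption).
  use, intrinsic :: iso_fortran_env, only: r4 => real32, r8 => real64
  implicit none
  private
  public :: tenth_from_r4, tenth_from_r8, literal_kind_error

  ! 0.1 rounded to single precision first, then converted to double.
  ! A literal with no suffix, such as 0.1, is a default real and on most
  ! compilers behaves the same way.
  real(r8), parameter :: tenth_from_r4 = real(0.1_r4, r8)

  ! 0.1 rounded directly to double precision.
  real(r8), parameter :: tenth_from_r8 = 0.1_r8

contains

  ! Absolute difference between the two representations of 0.1.
  pure function literal_kind_error() result(err)
    real(r8) :: err
    err = abs(tenth_from_r4 - tenth_from_r8)
  end function literal_kind_error

end module kind_sketch
```

The difference is small but far larger than double precision rounding error, which is why a missing "_r8" can quietly degrade the accuracy of a calculation.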

Organization

Clear organization can also be a form of documentation. It is much easier to understand code when related actions are close together, and unrelated actions are kept in separate routines.

One method for organizing your code is to write routines in a hierarchical manner. First write an outline that specifies, in broad terms, the tasks a routine should perform. You can do this either in comments or in a separate document. Then fill in the routine, writing the code for each task. If any tasks seem to be particularly complex, try extracting those tasks into new subroutines or functions.
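A small sketch of this outline-first approach follows. The routine names are hypothetical, the local rair is a stand-in for the value in physconst, and the kind from iso_fortran_env is a stand-in for r8 from shr_kind_mod.

```fortran
module outline_sketch
  ! Stand-in for "use shr_kind_mod, only: r8 => shr_kind_r8" (assumption).
  use, intrinsic :: iso_fortran_env, only: r8 => real64
  implicit none
  private
  public :: layer_air_mass

  ! Gas constant for dry air [J/kg/K]. In CAM, use rair from physconst
  ! instead of defining it locally.
  real(r8), parameter :: rair = 287._r8

contains

  ! Mass of air per unit area in one layer [kg/m^2].
  ! Outline:
  !   1. Compute the air density from pressure and temperature.
  !   2. Multiply the density by the layer thickness.
  pure function layer_air_mass(p, t, dz) result(mass)
    real(r8), intent(in) :: p   ! pressure [Pa]
    real(r8), intent(in) :: t   ! temperature [K]
    real(r8), intent(in) :: dz  ! layer thickness [m]
    real(r8) :: mass
    real(r8) :: rho             ! air density [kg/m^3]

    ! 1. Compute the air density from pressure and temperature.
    rho = air_density(p, t)

    ! 2. Multiply the density by the layer thickness.
    mass = rho * dz
  end function layer_air_mass

  ! Density of dry air from the ideal gas law [kg/m^3].
  ! Extracted into its own function so the caller reads like its outline.
  pure function air_density(p, t) result(rho)
    real(r8), intent(in) :: p  ! pressure [Pa]
    real(r8), intent(in) :: t  ! temperature [K]
    real(r8) :: rho
    rho = p / (rair*t)
  end function air_density

end module outline_sketch
```

The outline in the header comment maps one-to-one onto the numbered steps in the body, and the density calculation lives in its own function.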

Comments

Furthermore, there are some practices to avoid:

2. Avoid outdated or non-standard Fortran language features.

3. Avoid duplicating code; instead, try to reuse code in multiple places.

4. Avoid certain practices that are error prone.

5. Try to organize code into many routines, each with a specific purpose and few arguments.

6. Use an editor with features that help you write Fortran source code.

7. When modifying an existing module, follow the conventions it uses.

8. Be careful with floating point arithmetic.

9. Make use of attributes that limit the ways that data can be used.

10. Collaborate.
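As one possible reading of items 8 and 9, the sketch below compares floating point values with a tolerance rather than with "==", and uses "private", "parameter", "intent(in)", and "pure" to restrict how data can be used. The names and the default tolerance are hypothetical, and the kind from iso_fortran_env is a stand-in for r8 from shr_kind_mod.

```fortran
module safe_compare_sketch
  ! Stand-in for "use shr_kind_mod, only: r8 => shr_kind_r8" (assumption).
  use, intrinsic :: iso_fortran_env, only: r8 => real64
  implicit none
  private                 ! nothing is visible outside unless listed below
  public :: nearly_equal

  ! Default relative tolerance (hypothetical value, for illustration only).
  ! The "parameter" attribute guarantees it can never be modified.
  real(r8), parameter :: default_rel_tol = 1.e-12_r8

contains

  ! True if a and b agree to within a relative tolerance.
  ! "pure" and "intent(in)" promise callers that the arguments are unchanged.
  pure function nearly_equal(a, b, rel_tol) result(is_close)
    real(r8), intent(in) :: a, b
    real(r8), intent(in), optional :: rel_tol  ! relative tolerance
    logical :: is_close
    real(r8) :: tol

    tol = default_rel_tol
    if (present(rel_tol)) tol = rel_tol
    is_close = abs(a - b) <= tol * max(abs(a), abs(b))
  end function nearly_equal

end module safe_compare_sketch
```

For example, nearly_equal(0.1_r8 + 0.2_r8, 0.3_r8) is true, whereas an exact comparison of the same two values with "==" is false in IEEE double precision.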

CLM Coding Conventions

The CLM Coding Conventions contain some generally useful advice, and may be of interest to CAM coders looking for more advice (or examples), or developers working on both models. There are a few differences worth emphasizing.

Arguably, CAM has had more problems with misleading or out-of-date comments than with a lack of comments. Therefore we do not require comments in all of the same situations as CLM does. However, we still encourage developers to write code that is as clear as possible in these cases. For instance, when you use several "if" blocks, you don't necessarily need to comment the "end if" statement to specify which "if" it matches. However, it's preferable to avoid very long or deeply nested conditionals, so that it's clear which "end if" matches a given "if" anyway.

Similarly, CLM requires lower bounds to be specified for array arguments, and this is related to the way that threading is handled. In most CAM modules, all arrays have a lower bound of 1, and threading is handled at higher levels of the physics. Since in most cases the lower bounds of arrays are all 1, we do not require "1" to be specified.
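The difference between the two conventions can be sketched as follows. The function names and the "begc" argument are hypothetical, and the kind from iso_fortran_env is a stand-in for r8 from shr_kind_mod. An assumed-shape dummy argument declared as "x(:)" always has a lower bound of 1 inside the routine, whatever the bounds of the actual argument, while "x(begc:)" sets the lower bound explicitly.

```fortran
module lower_bound_sketch
  ! Stand-in for "use shr_kind_mod, only: r8 => shr_kind_r8" (assumption).
  use, intrinsic :: iso_fortran_env, only: r8 => real64
  implicit none
  private
  public :: lower_bound_default, lower_bound_explicit

contains

  ! No lower bound is given, so indexing inside the routine starts at 1.
  pure function lower_bound_default(x) result(lb)
    real(r8), intent(in) :: x(:)
    integer :: lb
    lb = lbound(x, 1)
  end function lower_bound_default

  ! The lower bound is given explicitly, so indexing starts at begc.
  pure function lower_bound_explicit(begc, x) result(lb)
    integer,  intent(in) :: begc       ! first index of x
    real(r8), intent(in) :: x(begc:)
    integer :: lb
    lb = lbound(x, 1)
  end function lower_bound_explicit

end module lower_bound_sketch
```

Passing an array declared as "a(5:9)" to lower_bound_default returns 1, while lower_bound_explicit(5, a) returns 5.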

CAM does not require a specific number of spaces for indentation; if editing an existing module, simply follow the style used in that module. However, tabs should be avoided, because CAM developers use many different editors, which treat tabs in very different ways. Furthermore, tabs make it difficult to determine the column number in those rare cases where it matters in Fortran.