Information on the CAM decomposition, for ESMF folks to verify:
Location of files to describe decomposition:
/fis/cgd/home/erik/CAM_decomps
(NOTE: This archive was moved to MSS under /ERIK/archive/CAM_decomps.c060612.tar.gz on June/12/2006)
Subdirectories exist for each grid. These directories contain the decomposition descriptions for each case, and softlinks to the input files that contain the CAM and CLM input grids.
What the Physics Grid Create would look like:
integer, parameter :: numdims = 2
type(ESMF_LOGICAL) :: periodic(numdims)= (/ESMF_TRUE, ESMF_FALSE/)
character(len=64) :: dimnames(numdims) = (/"longitude","latitude "/)
character(len=64) :: dimunits(numdims) = (/"degrees","degrees"/)
!
! pcols is the max number of columns in a chunk:
! For straight latitude decomposition, pcols=plat
! Otherwise, pcols = 16
! On vector platforms pcols = large value optimized for vectorization loops
!
coord1(1) = 0.0
do lon = 2, plon
coord1(lon) = (londeg(lon,1) + londeg(lon-1,1))*0.5
end do
coord1(plon+1) = 360.0
coord2(1) = -90.0
do lat = 2, plat
coord2(lat) = (latdeg(lat) + latdeg(lat-1))*0.5
end do
coord2(plat+1) = 90.0
MyCount = 0
n = 0
do c = begchunk, endchunk ! Loop over chunks in this processor
MyCount = get_ncols_p(c) + Mycount
do i = 1, get_ncols_p(c) ! Number of columns in a chunk
n = n + 1
myIndices(n,1) = get_lat_p(c,i)
myIndices(n,2) = get_lon_p(c,i)
end do
end do
!
! NOTE: Optionally a DELayout could be created with DE's for each chunk, with chunks
! begchunk to endchunk on virtual DE's that are on the same processor.
! Create and distribute ESMF grid
phys_grid = ESMF_GridCreateHorzLatLon( &
coord1, coord2, &
horzstagger=ESMF_GRID_HORZ_STAGGER_A, &
dimnames=dimnames,dimunits=dimunits, &
name="Physics Grid", rc=rc)
call ESMF_GridDistribute(phys_grid, delayout=layout, &
myCount=myCount, &
myIndices=myIndices, rc=rc)
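The coord1/coord2 construction above can be checked outside the model. The following is a minimal Python sketch, not CAM code: the helper name cell_edges is made up for illustration, and the centers are the T5 values listed further down this page. It reproduces the same rule as the Fortran loops: a fixed first edge, midpoints between neighbouring centers, and a fixed last edge.

```python
def cell_edges(centers, first_edge, last_edge):
    """Edges for one axis: fixed end points, midpoints of neighbouring centers in between."""
    edges = [first_edge]
    for left, right in zip(centers, centers[1:]):
        edges.append(0.5 * (left + right))
    edges.append(last_edge)
    return edges


# T5 grid from this page: 16 longitudes and 8 Gaussian latitudes (cell centers)
lon_centers = [22.5 * i for i in range(16)]
lat_centers = [-73.7992136285632, -52.8129431899943, -31.704091745008,
               -10.5698823125761, 10.5698823125761, 31.704091745008,
               52.8129431899943, 73.7992136285632]

coord1 = cell_edges(lon_centers, 0.0, 360.0)   # plon + 1 = 17 values
coord2 = cell_edges(lat_centers, -90.0, 90.0)  # plat + 1 = 9 values

print(coord1[:3], coord1[-2:])
print(coord2)
```

Note that with this rule the first longitude edge sits at 0.0, which is also the first longitude center, so the first cell runs from 0 to 11.25 and the last from 326.25 to 360.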
What are the different physics chunking options that will be used?
- Option == -1: Latitude decomposition
- Option == 0: split local longitude/latitude blocks into chunks, while attempting to create load-balanced chunks
- Option == 1: load balance chunks and assignment, attempting to also minimize communication costs
- Option == 2: split local longitude/latitude blocks into chunks, assigning columns using block ordering
- Option == 3: split individual longitude/latitude blocks into chunks, assigning columns using block ordering (default)
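As a toy illustration of what the loop over begchunk..endchunk collects, here is a Python sketch for the latitude decomposition (Option -1). It assumes each chunk is simply one full latitude row, which is a simplification for illustration. The function names latitude_chunks and flatten_chunks are hypothetical stand-ins for the phys_grid query routines (get_ncols_p, get_lat_p, get_lon_p), not real CAM interfaces.

```python
def latitude_chunks(plat, plon):
    """Hypothetical chunk table: chunk c holds every column of latitude c (1-based)."""
    return {c: [(c, lon) for lon in range(1, plon + 1)] for c in range(1, plat + 1)}


def flatten_chunks(chunks, begchunk, endchunk):
    """Mirror the loop over this task's chunks: count columns and list (lat, lon) pairs."""
    my_indices = []
    for c in range(begchunk, endchunk + 1):
        my_indices.extend(chunks[c])
    return len(my_indices), my_indices


# T5-sized example: 8 latitudes x 16 longitudes, this task owns chunks 3 and 4
chunks = latitude_chunks(plat=8, plon=16)
my_count, my_indices = flatten_chunks(chunks, begchunk=3, endchunk=4)
print(my_count, my_indices[0], my_indices[-1])
```

For the other options the chunks are no longer whole latitude rows, but the flattening step stays the same: only the contents of the chunk table change.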
What about the Land model decomposition?
The Land model clumped decomposition turns the land points into a 1D vector and thus has a different decomposition than the physics chunked decomposition.
The land grid:
use clmtype, only: clm3, gridcell_type
integer, parameter :: numdims = 2
type(ESMF_LOGICAL) :: periodic(numdims)= (/ESMF_TRUE, ESMF_FALSE/)
character(len=64) :: dimnames(numdims) = (/"longitude","latitude "/)
character(len=64) :: dimunits(numdims) = (/"degrees","degrees"/)
type(gridcell_type), pointer :: gptr ! pointer to gridcell derived subtype
coord1(1) = 0.0
do lon = 2, lsmlon
coord1(lon) = (longxy(lon,1) + longxy(lon-1,1))*0.5
end do
coord1(lsmlon+1) = 360.0
coord2(1) = -90.0
do lat = 2, lsmlat
coord2(lat) = (latixy(lat) + latixy(lat-1))*0.5
end do
coord2(lsmlat+1) = 90.0
call get_proc_bounds( begg, endg, begl, endl, begc, endc, &
begp, endp)
gptr => clm3%g
!
! For CLM NOT all of the grid points will be filled (only roughly 1/3rd)
!
MyCount = endg - begg + 1
n = 0
do g = begg, endg ! Loop over grid points used on this processor, get their global indices
n = n + 1
myIndices(n,1) = gptr%ixy(g)
myIndices(n,2) = gptr%jxy(g)
end do
lnd_grid = ESMF_GridCreateHorzLatLon( coord1, coord2, &
horzstagger=ESMF_GRID_HORZ_STAGGER_A, &
dimnames=dimnames,dimunits=dimunits, &
name="CLM Land model Grid", rc=rc)
call ESMF_GridDistribute(lnd_grid, delayout=layout, &
myCount=myCount, &
myIndices=myIndices, rc=rc)
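To make the "land points as a 1D vector" idea concrete, here is a small Python sketch. It is not how CLM actually builds its clumps: the row-by-row scan order and the even split in proc_bounds are assumptions made only for illustration, and both function names are hypothetical. What it does show is the shape of the result, a list of global (ixy, jxy) pairs for the land cells only, of which each task owns the slice begg..endg.

```python
def land_gridcells(mask):
    """List 1-based (ixy, jxy) indices of land cells, scanning latitude rows in order."""
    cells = []
    for j, row in enumerate(mask, start=1):
        for i, is_land in enumerate(row, start=1):
            if is_land:
                cells.append((i, j))
    return cells


def proc_bounds(ncells, ntasks, task):
    """Hypothetical even split of the 1-D vector; returns 1-based inclusive (begg, endg)."""
    base, extra = divmod(ncells, ntasks)
    begg = task * base + min(task, extra) + 1
    endg = begg + base + (1 if task < extra else 0) - 1
    return begg, endg


# Tiny 3-latitude x 4-longitude mask, 1 = land (only some of the points are filled)
mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]
cells = land_gridcells(mask)
begg, endg = proc_bounds(len(cells), ntasks=2, task=1)
my_count = endg - begg + 1
my_indices = cells[begg - 1:endg]
print(cells, (begg, endg), my_count, my_indices)
```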
What about the redistribute between land and atmosphere?
This redistribution goes in both directions. The atmosphere-to-land direction is shown here:
call ESMF_BundleRedistStore(atmBundle, lndBundle, parentVM, a2l_route, &
routeOptions=ESMF_ROUTE_OPTION_SYNC+ESMF_ROUTE_OPTION_PACK_PET, rc=rc)
call ESMF_BundleRedist( atmBundle, lndBundle, a2l_route, rc=rc)
What are the grids and decompositions that will be used for the performance evaluation?
According to the evaluation plan, 2 different grids and a total of 8 different grid/decomposition combinations will be evaluated:
- 64x128.noomp 16,32,64 tasks
- 64x128 8 tasks, 8 threads
- 128x256.noomp 32, 64, 128 tasks
- 128x256 16 tasks, 8 threads
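For a rough feel of the problem sizes, the following Python sketch works out the number of physics columns per task and per thread for each of the 8 cases. It assumes that ".noomp" means one thread per task and that all columns are spread evenly, which real load balancing will only approximate.

```python
GRIDS = {"64x128": (64, 128), "128x256": (128, 256)}

# (grid, MPI tasks, threads per task); ".noomp" cases are taken to be 1 thread per task
CASES = [
    ("64x128", 16, 1), ("64x128", 32, 1), ("64x128", 64, 1), ("64x128", 8, 8),
    ("128x256", 32, 1), ("128x256", 64, 1), ("128x256", 128, 1), ("128x256", 16, 8),
]


def columns_per_task(grid, tasks, threads):
    """Return (total columns, columns per task, columns per thread) for an even split."""
    nlat, nlon = GRIDS[grid]
    columns = nlat * nlon
    return columns, columns / tasks, columns / (tasks * threads)


for grid, tasks, threads in CASES:
    columns, per_task, per_thread = columns_per_task(grid, tasks, threads)
    print(f"{grid:8s} tasks={tasks:4d} threads={threads} "
          f"columns={columns:6d} per task={per_task:7.1f} per thread={per_thread:7.1f}")
```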
What are the grids that will be used?
T5 8x16 Grid:
lat = -73.7992136285632, -52.8129431899943, -31.704091745008,
-10.5698823125761, 10.5698823125761, 31.704091745008, 52.8129431899943,
73.7992136285632 ;
lon = 0, 22.5, 45, 67.5, 90, 112.5, 135, 157.5, 180, 202.5, 225, 247.5, 270,
292.5, 315, 337.5 ;
T21 32x64 Grid:
lat = -85.7605871204438, -80.26877907225, -74.7445403686358,
-69.2129761693708, -63.6786355610969, -58.1429540492033,
-52.6065260343453, -47.0696420596877, -41.5324612466561,
-35.9950784112716, -30.4575539611521, -24.9199286299486,
-19.3822313464344, -13.8444837343849, -8.30670285651881,
-2.76890300773601, 2.76890300773601, 8.30670285651881, 13.8444837343849,
19.3822313464344, 24.9199286299486, 30.4575539611521, 35.9950784112716,
41.5324612466561, 47.0696420596877, 52.6065260343453, 58.1429540492033,
63.6786355610969, 69.2129761693708, 74.7445403686358, 80.26877907225,
85.7605871204438 ;
lon = 0, 5.625, 11.25, 16.875, 22.5, 28.125, 33.75, 39.375, 45, 50.625,
56.25, 61.875, 67.5, 73.125, 78.75, 84.375, 90, 95.625, 101.25, 106.875,
112.5, 118.125, 123.75, 129.375, 135, 140.625, 146.25, 151.875, 157.5,
163.125, 168.75, 174.375, 180, 185.625, 191.25, 196.875, 202.5, 208.125,
213.75, 219.375, 225, 230.625, 236.25, 241.875, 247.5, 253.125, 258.75,
264.375, 270, 275.625, 281.25, 286.875, 292.5, 298.125, 303.75, 309.375,
315, 320.625, 326.25, 331.875, 337.5, 343.125, 348.75, 354.375 ;
T42 64x128 Grid:
lat = -87.8637988392326, -85.0965269883174, -82.3129129478863,
-79.5256065726594, -76.7368996803683, -73.9475151539897,
-71.1577520115873, -68.3677561083132, -65.5776070108278,
-62.7873517989631, -59.9970201084913, -57.2066315276432,
-54.4161995260862, -51.6257336749383, -48.8352409662506,
-46.0447266311017, -43.2541946653509, -40.463648178115,
-37.6730896290453, -34.8825209937735, -32.091943881744,
-29.3013596217627, -26.510769325211, -23.7201739335347,
-20.9295742544895, -18.1389709902394, -15.3483647594915,
-12.5577561152307, -9.76714555919557, -6.97653355394864,
-4.18592053318915, -1.3953069108195, 1.3953069108195, 4.18592053318915,
...., 87.8637988392326 ;
lon = 0, 2.8125, 5.625, 8.4375, ...., 357.1875 ;
T85 128x256 Grid:
lat = -88.9277353522959, -87.5387052130272, -86.1414721015279,
-84.7423855907142, -83.3425960440704, -81.9424662991732,
-80.5421464346171, -79.1417096486217, -77.7411958655138,
-76.3406287023715, -74.9400230196494, -73.5393886337675,
-72.1387322891624, -70.7380587725176, -69.3373715749609,
-67.9366733025785, -66.5359659401756, -65.1352510260352,
-63.7345297708429, -62.3338031405324, -60.9330719152074,
-59.5323367318266, -58.1315981156439, -56.7308565037137,
-55.3301122627028, -53.9293657025561, -52.5286170870997,
-51.1278666423533, -49.7271145631097, -48.3263610181882,
-46.9256061546646, -45.5248501013023, -44.1240929713558,
-42.723334864877, -41.3225758706231, -39.9218160676465,
-38.5210555266244, -37.1202943109788, -35.719532477824,
-34.3187700787707, -32.918007160614, -31.5172437659226,
-30.1164799335463, -28.7157156990552, -27.3149510951204,
-25.9141861518467, -24.5134208970629, -23.1126553565776,
-21.7118895544042, -20.3111235129604, -18.9103572532454,
-17.5095907949986, -16.1088241568413, -14.7080573564048,
-13.3072904104462, -11.9065233349538, -10.5057561452436,
-9.10498885604852, -7.70422148160049, -6.3034540357076,
-4.90268653182654, -3.5019189831313, -2.10115140257898,
-0.700383802973324, 0.700383802973324, 2.10115140257898, 3.5019189831313,
....,
88.9277353522959 ;
lon = 0, 1.40625, 2.8125, 4.21875, .... 358.59375 ;
FV 10x15 Grid (19x24)
lat = -90, -80, -70, -60, -50, -40, -30, -20, -10, 0, 10, 20, 30, 40, 50,
60, 70, 80, 90 ;
lon = 0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210,
225, 240, 255, 270, 285, 300, 315, 330, 345 ;
FV 4x5 Grid (46x72)
lat = -90, -86, -82, .... 90 ;
lon = 0, 5, 10, ...., 355 ;
FV 2x2.5 Grid (91x144 points)
lat = -90, -88, -86, -84, ...., 90 ;
lon = 0, 2.5, 5, 7.5, ...., 357.5 ;
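The elided values in the lists above do not need to be typed in by hand. As a sketch under the assumption that the T-grids use standard Gaussian latitudes (roots of the Legendre polynomial of degree nlat, converted with arcsin) and that the FV grids are evenly spaced pole to pole, the Python below regenerates them with the standard library only. The function names are made up for illustration.

```python
import math


def gaussian_latitudes(nlat):
    """Gaussian latitudes in degrees, south to north, via Newton iteration on P_nlat."""
    lats = []
    for k in range(1, nlat + 1):
        # Standard first guess for the k-th root of the Legendre polynomial
        x = math.cos(math.pi * (k - 0.25) / (nlat + 0.5))
        for _ in range(100):
            # Three-term recurrence: ends with p = P_nlat(x), p_prev = P_{nlat-1}(x)
            p_prev, p = 1.0, x
            for n in range(2, nlat + 1):
                p_prev, p = p, ((2 * n - 1) * x * p - (n - 1) * p_prev) / n
            # Derivative: P'_n(x) = n (x P_n - P_{n-1}) / (x^2 - 1)
            dp = nlat * (x * p - p_prev) / (x * x - 1.0)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        lats.append(math.degrees(math.asin(x)))
    return sorted(lats)


def regular_longitudes(nlon):
    """Evenly spaced longitudes starting at 0 degrees."""
    return [i * 360.0 / nlon for i in range(nlon)]


def fv_latitudes(nlat):
    """Evenly spaced latitudes that include both poles."""
    return [-90.0 + j * 180.0 / (nlat - 1) for j in range(nlat)]


print(gaussian_latitudes(8))      # compare with the T5 list
print(regular_longitudes(16))
print(fv_latitudes(19))           # compare with the FV 10x15 list
```

Comparing the output against the fully listed T5 and T21 values is a quick way to check the sketch before relying on it for T42 and T85.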
Write physics chunked decomposition:
Code that does this is checked in on: cam3_2_38_brnch_wrtdecomps
subroutine cam_writedecomp()
use pmgrid, only: plat, plon
use commap, only: londeg, latdeg
use phys_grid, only: get_ncols_p, get_lat_p, get_lon_p, phys_grid_getopts
use ppgrid, only: pcols, begchunk, endchunk
use spmd_utils, only: iam, npes
use units, only: getunit, freeunit
use shr_kind_mod, only: r8 => shr_kind_r8
use shr_sys_mod, only: shr_sys_flush
integer, parameter :: numdims = 2
real(r8) :: coord1(plon+1)
real(r8) :: coord2(plat+1)
integer :: MyCount
integer :: MyIndices(pcols*(endchunk-begchunk+1),numdims)
character(len=256) :: filename
integer :: n, n1, c, j, lat, lon, i ! Indices
integer :: ncols
integer :: unit
integer :: phys_load
!
! pcols is the max number of columns in a chunk:
! For straight latitude decomposition, pcols=plat
! Otherwise, pcols = 16
! On vector platforms pcols = large value optimized for vectorization loops
! Coord1, and coord2 are the coordinate vertices that would be entered on GridCreate
!
coord1(1) = 0.0
do lon = 2, plon
coord1(lon) = (londeg(lon,1) + londeg(lon-1,1))*0.5
end do
coord1(plon+1) = 360.0
coord2(1) = -90.0
do lat = 2, plat
coord2(lat) = (latdeg(lat) + latdeg(lat-1))*0.5
end do
coord2(plat+1) = 90.0
MyCount = 0
n = 0
call phys_grid_getopts( phys_loadbalance_out=phys_load )
unit = getunit( iu=10+iam)
write(filename,1010) plat, plon, iam, phys_load
1010 format( 'CAM_decomp:',i3.3,'x',i3.3, 'task-', i2.2, 'load-',i1.1,'.txt')
write(6,*) 'Write out decomposition on file: ', filename
call shr_sys_flush(6)
open(unit=unit,file=filename,status='REPLACE',form='FORMATTED',action='WRITE')
do c = begchunk, endchunk ! Loop over chunks in this processor
ncols = get_ncols_p(c)
MyCount = ncols + Mycount
end do
write(unit,fmt='(a,i6)') 'Total number of grid points on this PE = ', MyCount
do c = begchunk, endchunk ! Loop over chunks in this processor
ncols = get_ncols_p(c)
n1 = n + 1
write(unit,fmt='(a, 4i6)') 'Chunk id, #columns-in-chunk, start-index, end-index', &
c, ncols, n1, n1+ncols-1
do i = 1, ncols ! Number of columns in a chunk
n = n + 1
myIndices(n,1) = get_lat_p(c,i)
myIndices(n,2) = get_lon_p(c,i)
end do
write(unit,fmt='(a)') 'Latitude: global indices then coordinate in degrees '
write(unit,fmt='(20i6)') (myIndices(j,1), j = n1,n)
write(unit,fmt='(20f6.1)') (latdeg(myIndices(j,1)), j = n1,n)
write(unit,fmt='(a)') 'Longitude: global indices then coordinate in degrees '
write(unit,fmt='(20i6)') (myIndices(j,2), j = n1,n)
write(unit,fmt='(20f6.1)') (londeg(myIndices(j,2),myIndices(j,1)), j = n1,n)
end do
close(unit)
call freeunit(unit)
write(6,*) 'Done writing decomp'
call shr_sys_flush(6)
end subroutine cam_writedecomp
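For anyone who would rather read the per-task files than call the decomposition routines, here is a Python sketch of a reader for the layout that cam_writedecomp writes. It is inferred only from the write statements above, so treat it as an assumption to verify against a real file. It also assumes the fixed-width integer fields never run together, so plain whitespace splitting is enough. The function names and the sample text are made up for illustration.

```python
def _read_values(lines, count, convert):
    """Collect `count` whitespace-separated values, spanning wrapped lines if needed."""
    values = []
    while len(values) < count:
        values.extend(convert(token) for token in next(lines).split())
    return values


def _expect_label(lines, prefix):
    """Skip blank lines, then check that the next line starts with `prefix`."""
    for line in lines:
        if line.strip():
            if not line.startswith(prefix):
                raise ValueError(f"expected {prefix!r}, got {line!r}")
            return
    raise ValueError(f"missing {prefix!r} line")


def parse_cam_decomp(text):
    """Return (total_points, chunks) parsed from one per-task decomposition file."""
    lines = iter(text.splitlines())
    total = int(next(lines).split("=")[1])
    chunks = []
    for line in lines:
        if not line.startswith("Chunk id"):
            continue
        chunk_id, ncols, start, end = (int(tok) for tok in line.split()[-4:])
        _expect_label(lines, "Latitude:")
        lat_index = _read_values(lines, ncols, int)
        lat_deg = _read_values(lines, ncols, float)
        _expect_label(lines, "Longitude:")
        lon_index = _read_values(lines, ncols, int)
        lon_deg = _read_values(lines, ncols, float)
        chunks.append({"id": chunk_id, "ncols": ncols, "start": start, "end": end,
                       "lat_index": lat_index, "lat_deg": lat_deg,
                       "lon_index": lon_index, "lon_deg": lon_deg})
    return total, chunks


# Made-up sample in the same layout: two chunks of two columns each
sample = "\n".join([
    "Total number of grid points on this PE =      4",
    "Chunk id, #columns-in-chunk, start-index, end-index     1     2     1     2",
    "Latitude: global indices then coordinate in degrees ",
    "     1     1",
    " -73.8 -73.8",
    "Longitude: global indices then coordinate in degrees ",
    "     1     2",
    "   0.0  22.5",
    "Chunk id, #columns-in-chunk, start-index, end-index     2     2     3     4",
    "Latitude: global indices then coordinate in degrees ",
    "     2     2",
    " -52.8 -52.8",
    "Longitude: global indices then coordinate in degrees ",
    "     1     2",
    "   0.0  22.5",
])
total, chunks = parse_cam_decomp(sample)
print(total, [c["id"] for c in chunks], chunks[1]["lat_deg"])
```

A useful consistency check on real files is that the column counts of all chunks add up to the total on the first line.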
Land decomp:
subroutine clm_camWriteDecomp()
use clmtype , only : clm3, gridcell_type
use clm_varpar , only : lsmlon, lsmlat
use clm_varsur , only : latixy, longxy, numlon
use shr_kind_mod , only : r8 => shr_kind_r8
use spmd_utils , only : iam, npes
use units , only : getunit, freeunit
use phys_grid , only : phys_grid_getopts
use shr_sys_mod , only : shr_sys_flush
use decompMod , only : get_proc_bounds
integer, parameter :: numdims = 2
type(gridcell_type), pointer :: gptr ! pointer to gridcell derived subtype
real(r8) :: coord1(lsmlon+1,lsmlat+1) ! Longitude
real(r8) :: coord2(lsmlon+1,lsmlat+1) ! Latitude
integer :: lat, lon
integer :: begg, endg, begl, endl, begc, endc, begp, endp
integer, allocatable :: myIndices(:,:)
integer :: MyCount, n, g, j, n1, n2, stride, nn
character(len=256) :: filename
integer :: unit
integer :: phys_load
! coord1 holds longitude edges (0 to 360), coord2 holds latitude edges (-90 to 90)
coord1(1,:) = 0.0
coord2(:,1) = -90.0
do lat = 2, lsmlat
do lon = 2, numlon(lat)
coord1(lon,lat) = (longxy(lon,lat) + longxy(lon-1,lat))*0.5
coord2(lon,lat) = (latixy(lon,lat) + latixy(lon,lat-1))*0.5
end do
coord1(numlon(lat)+1,lat) = 360.0
end do
coord2(:,lsmlat+1) = 90.0
call get_proc_bounds( begg, endg, begl, endl, begc, endc, begp, endp)
gptr => clm3%g
!
! Open file to write out to...
!
call phys_grid_getopts( phys_loadbalance_out=phys_load )
unit = getunit( iu=40+iam)
write(filename,1010) lsmlat, lsmlon, iam, phys_load
1010 format( 'CLM_decomp:',i3.3,'x',i3.3, 'task-', i2.2, 'load-',i1.1,'.txt')
write(6,*) 'Write out decomposition on file: ', filename
call shr_sys_flush(6)
open(unit=unit,file=filename,status='REPLACE',form='FORMATTED',action='WRITE')
!
! For CLM NOT all of the grid points will be filled (only roughly 1/3rd)
!
MyCount = endg - begg + 1
write(unit,fmt='(a,i6)') 'Total number of grid points on this PE = ', MyCount
allocate( myIndices(MyCount,numdims) )
n = 0
do g = begg, endg ! Loop over grid points used on this processor, get their global indices
n = n + 1
myIndices(n,1) = gptr%ixy(g) ! Longitude
myIndices(n,2) = gptr%jxy(g) ! Latitude
end do
!
! Write to file:
!
stride = 14
n2 = 0
do nn = 1, n, stride
n1 = n2+1
n2 = n1 + stride-1
if ( n2 > n ) n2 = n
write(unit,fmt='(a,i6,i6)') 'Write grid points between: n1,n2: ', n1, n2
write(unit,fmt='(a)') 'Longitude: global indices then coordinate in degrees '
write(unit,fmt='(20i7)') (myIndices(j,1), j = n1,n2)
write(unit,fmt='(20f7.1)') (longxy(myIndices(j,1),myIndices(j,2)), j = n1,n2)
write(unit,fmt='(a)') 'Latitude: global indices then coordinate in degrees '
write(unit,fmt='(20i7)') (myIndices(j,2), j = n1,n2)
write(unit,fmt='(20f7.1)') (latixy(myIndices(j,1),myIndices(j,2)), j = n1,n2)
end do
call shr_sys_flush(6)
deallocate( myIndices )
close(unit)
call freeunit(unit)
write(6,*) 'Done writing decomp'
end subroutine clm_camWriteDecomp
Decomposition information in CODE
There is a directory called "cam3_2_46_brnch_mct" that has all of the CAM and CLM code in question. I soft-linked the main program to the top level, so you can see what the top level driver looks like. The CAM decomposition is figured out in the file
"models/atm/cam/src/physics/cam1/phys_grid.F90"
The land model decomposition is figured out in the file
"models/lnd/clm2/src/main/decompMod.F90"
Files that contain MCT data structures all have an "MCT_" prefix in their names.
So
"models/lnd/clm2/src/main/MCT_lnd_comp.F90"
figures out how to map the CLM internal description of its decomposition into MCT, for example.
"models/atm/cam/src/control/MCT_atm_comp.F90"
does the same for the atmosphere model.
Each of the MCT_ layer files above has functions that define an MCT Global Seg Map for its component, but they use phys_grid or decompMod methods to do so. In your case you could either read the decomposition files or try to use the decomposition methods directly. The problem with the latter is that you are likely to need a lot of the other code that those two files depend on.
Looking at the use statements in phys_grid:
- use shr_kind_mod ------ Kind parameters
- use ppgrid ------------ Physics size parameters
- use pmgrid ------------ Dynamics size parameters
- use abortutils -------- Abort method
- use spmd_utils -------- SPMD description (would need this, and its two components below)
- use spmd_phys
- use spmddyn
- (The above is about 2k lines)
- use psect ------------- Spectral data sizes
- use dycore ------------ Short description of which dycore is running
- use rgrid ------------- Reduced grid stuff (could probably hard-code this)
- use commap ------------ Latitude and longitude (could read this from a file)
- use mod_com ----------- Pilgrim stuff. Is this needed? It probably brings a lot of other code with it
- use dyn_grid ---------- Pretty short description of the dynamics decomp
- use mpishorthand ------ Some MPI wrappers (short)