Problems with cpl_map_mod

Part of CCSM on BlueGene Project.
One of the first obstacles in running cpl6 at high processor counts and resolutions was the amount of memory needed to initialize a mapping.
The old algorithm read all the weights on node 0 and then scattered them, which required too much memory on node 0.
The new algorithm (from Tony) reads the weights in pieces and broadcasts each piece to every node. Each node then picks out what it needs.
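
The idea behind the new algorithm can be illustrated with a small simulation. This is only a sketch in Python, not the cpl6 Fortran code: the names (read_in_chunks, distribute, owned_rows) are hypothetical, the broadcast is simulated with a loop instead of MPI, and it assumes each node needs the weights whose destination row it owns.

```python
# Illustrative simulation only; not the cpl6 code. All names are hypothetical.
from typing import Iterator, List, Set, Tuple

Weight = Tuple[int, int, float]  # (destination row, source column, weight)


def read_in_chunks(weights: List[Weight], chunk_size: int) -> Iterator[List[Weight]]:
    """Stand-in for reading the mapping file piece by piece."""
    for start in range(0, len(weights), chunk_size):
        yield weights[start:start + chunk_size]


def distribute(weights: List[Weight], owned_rows: List[Set[int]], chunk_size: int):
    """Send each chunk to every node; each node keeps only the rows it owns."""
    local: List[List[Weight]] = [[] for _ in owned_rows]
    peak_buffer = 0  # largest number of weights held in the read buffer at once
    for chunk in read_in_chunks(weights, chunk_size):
        peak_buffer = max(peak_buffer, len(chunk))
        for node, rows in enumerate(owned_rows):  # simulated broadcast
            local[node].extend(w for w in chunk if w[0] in rows)
    return local, peak_buffer


# Toy mapping: 8 destination rows, 2 weights per row, split across 2 nodes.
weights = [(row, col, 0.5) for row in range(8) for col in (row, (row + 1) % 8)]
owned_rows = [{0, 1, 2, 3}, {4, 5, 6, 7}]
local, peak_buffer = distribute(weights, owned_rows, chunk_size=4)
```

In this sketch the read buffer never holds more than chunk_size weights, whereas a read-everything-then-scatter approach would hold the whole list on one node at once.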

Question: did cpl_map_mod with the new cpl_map_read but the old cpl_map_npFix3R work correctly on fewer than 32 processors on BlueGene and other platforms? (Possibly not, because of the temporary npMapNone patch.)