[ecco-support] Cost function calculation problems when changing number of cores in ECCOv4-r4
Dan Jones
dcjones.work at gmail.com
Mon Dec 28 10:41:05 EST 2020
Hello,
I am attempting to carry out some scaling tests on an HPC platform using
ECCOv4-r4. As part of the scaling exercises, I change the number of cores
using the procedure below.
For example, to use 360 cores, I set the following parameters in SIZE.h
before compiling:
sNx = 15
sNy = 15
nPx = 360
nPy = 1
Based on my experience with ECCOv4-r3, I believe this should give me an
executable that uses 360 cores. At run-time, I uncomment the lines
following #15x15 nprocs=360 and comment out the lines for the nprocs=96
case. I also have a configuration that uses nprocs=192. In all three cases,
the code compiles and runs.
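A minimal sketch of the tile arithmetic behind this setup may help when checking a decomposition. It assumes the global llc90 grid is 90 x 1170 points in the MITgcm layout (13 faces of 90 x 90) and one tile per process (nSx = nSy = 1). Neither value is stated above, and the function name tile_counts is only illustrative.

```python
def tile_counts(sNx, sNy, nprocs, Nx=90, Ny=1170):
    """Return (total_tiles, blank_tiles) for a given tile size and core count.

    Nx and Ny are ASSUMED global grid dimensions (llc90 layout); they are
    not taken from the message. One tile per process is also assumed.
    """
    if Nx % sNx or Ny % sNy:
        raise ValueError("tile size must divide the global grid evenly")
    total_tiles = (Nx // sNx) * (Ny // sNy)
    blank_tiles = total_tiles - nprocs  # tiles that no process would compute
    if blank_tiles < 0:
        raise ValueError("more processes than tiles")
    return total_tiles, blank_tiles


# The 15x15 case from SIZE.h above, with nPx * nPy = 360 processes.
total, blank = tile_counts(sNx=15, sNy=15, nprocs=360)
print(f"total tiles: {total}, tiles left without a process: {blank}")
```

Under these assumptions, 15x15 tiles cover the grid with more tiles than there are processes, so the run-time settings for each core count would have to account for the difference.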
However, I noticed that the cost function changes dramatically based on the
number of cores, and it sometimes returns NaN. In both forward and adjoint
mode, I get the following values for "fc" in the file named
costfunction0129:
96 cores, fc = 6733184.16
192 cores, fc = NaN
360 cores, fc = 5883940.61
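To put a number on the differences listed above, here is a short sketch that compares each run with the 96-core run and flags NaN. The fc values are the ones reported here; the choice of the 96-core run as the reference and the function name compare_to_reference are assumptions made for illustration.

```python
import math

# fc values reported above, keyed by number of cores.
fc_by_cores = {96: 6733184.16, 192: float("nan"), 360: 5883940.61}


def compare_to_reference(fc_values, ref_cores=96):
    """Map each core count to its relative difference from the reference run.

    A NaN cost is reported as None rather than as a number.
    """
    ref = fc_values[ref_cores]
    report = {}
    for cores, fc in fc_values.items():
        report[cores] = None if math.isnan(fc) else abs(fc - ref) / abs(ref)
    return report


report = compare_to_reference(fc_by_cores)
for cores in sorted(report):
    rel = report[cores]
    print(cores, "NaN" if rel is None else f"{rel:.2%}")
```

The 360-core value differs from the 96-core value by roughly 13%, which is far larger than the round-off differences one would expect from a changed summation order alone.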
What am I missing in terms of calculating the cost function and changing
the number of cores? Shouldn't fc be the same regardless of the number of
cores used?
Best regards,
Dan
--------------------------------------------------------------
Dr Dan Jones / British Antarctic Survey
danjonesocean.com <http://www.danjonesocean.com> / @DanJonesOcean
--------------------------------------------------------------