<div dir="ltr"><div>Hi Hong,<br><br>Ah okay. Thanks for your quick reply. What would you recommend that I do in terms of increasing the number of cores? Is it possible for me to get the files needed for the 192 and 360 core setup?<br><br>Best,<br>Dan<br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Dec 28, 2020 at 5:00 PM <<a href="mailto:ecco-support-request@mit.edu">ecco-support-request@mit.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Send ecco-support mailing list submissions to<br>
<a href="mailto:ecco-support@mit.edu" target="_blank">ecco-support@mit.edu</a><br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit<br>
<a href="http://mailman.mit.edu/mailman/listinfo/ecco-support" rel="noreferrer" target="_blank">http://mailman.mit.edu/mailman/listinfo/ecco-support</a><br>
or, via email, send a message with subject or body 'help' to<br>
<a href="mailto:ecco-support-request@mit.edu" target="_blank">ecco-support-request@mit.edu</a><br>
<br>
You can reach the person managing the list at<br>
<a href="mailto:ecco-support-owner@mit.edu" target="_blank">ecco-support-owner@mit.edu</a><br>
<br>
When replying, please edit your Subject line so it is more specific<br>
than "Re: Contents of ecco-support digest..."<br>
<br>
<br>
Today's Topics:<br>
<br>
1. Cost function calculation problems when changing number of<br>
cores in ECCOv4-r4 (Dan Jones)<br>
2. Re: [EXTERNAL] Cost function calculation problems when<br>
changing number of cores in ECCOv4-r4 (Zhang, Hong (US 398K))<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Mon, 28 Dec 2020 15:41:05 +0000<br>
From: Dan Jones <<a href="mailto:dcjones.work@gmail.com" target="_blank">dcjones.work@gmail.com</a>><br>
Subject: [ecco-support] Cost function calculation problems when<br>
changing number of cores in ECCOv4-r4<br>
To: <a href="mailto:ecco-support@mit.edu" target="_blank">ecco-support@mit.edu</a><br>
Message-ID:<br>
&lt;CAPj3iHSYkUMu=BuRJSDZtHec-VMa-qciWXJh34VUhvQ=36Ah3w@mail.gmail.com&gt;<br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
Hello,<br>
<br>
I am attempting to carry out some scaling tests on an HPC platform using<br>
ECCOv4-r4. As part of the scaling exercises, I change the number of cores<br>
using the procedure below.<br>
<br>
If I want to use 360 cores, I use the following procedure. Before<br>
compiling, I change the following parameters in SIZE.h:<br>
<br>
sNx = 15<br>
sNy = 15<br>
nPx = 360<br>
nPy = 1<br>
<br>
Based on my experience with ECCOv4-r3, I believe this should give me an<br>
executable that uses 360 cores. At run-time, I uncomment the lines<br>
following #15x15 nprocs=360 and comment out the lines for the nprocs=96<br>
case. I also have a configuration that uses nprocs=192. In all three cases,<br>
the code compiles and runs.<br>
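<br>
As a sanity check on the decomposition described above, here is a minimal Python sketch. It is not part of ECCO or MITgcm. It assumes the LLC90 grid (13 faces of 90 x 90 points) and one tile per process, neither of which is stated in this message, and the function names are hypothetical.<br>

```python
# Assumption: ECCOv4 uses the LLC90 grid, i.e. 13 faces of 90 x 90 points.
FACES = 13
FACE_SIZE = 90


def tile_count(snx, sny):
    """Total number of sNx-by-sNy tiles needed to cover the whole grid."""
    if FACE_SIZE % snx or FACE_SIZE % sny:
        raise ValueError("tile size must divide the face size evenly")
    return FACES * (FACE_SIZE // snx) * (FACE_SIZE // sny)


def blank_tiles(snx, sny, npx, npy=1):
    """Tiles not assigned to any process, assuming one tile per process."""
    return tile_count(snx, sny) - npx * npy


# Tile sizes and process counts for the 96-core and 360-core setups.
for snx, sny, npx in [(30, 30, 96), (15, 15, 360)]:
    print(snx, sny, npx, tile_count(snx, sny), blank_tiles(snx, sny, npx))
```

Under these assumptions, the number of blank (unassigned) tiles for each setup is the figure the run-time tile list would need to account for.<br>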
<br>
However, I noticed that the cost function changes dramatically based on the<br>
number of cores, and it sometimes returns NaN. In both forward and adjoint<br>
mode, I get the following values for "fc" in the file named<br>
costfunction0129:<br>
<br>
96 cores, fc = 6733184.16<br>
192 cores, fc = NaN<br>
360 cores, fc = 5883940.61<br>
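<br>
To put a number on the discrepancy, here is a short Python sketch that uses only the three values listed above. The helper name is hypothetical, and taking the 96-core run as the reference is an arbitrary choice.<br>

```python
import math

# fc values as reported above, keyed by number of cores.
fc_by_cores = {96: 6733184.16, 192: float("nan"), 360: 5883940.61}


def relative_difference(value, reference):
    """Relative difference of value with respect to reference."""
    return abs(value - reference) / abs(reference)


reference = fc_by_cores[96]
for cores, fc in sorted(fc_by_cores.items()):
    if math.isnan(fc):
        print(f"{cores} cores: fc is NaN")
    else:
        diff = relative_difference(fc, reference)
        print(f"{cores} cores: relative difference {diff:.4f}")
```

For these values, the 360-core result differs from the 96-core result by about 12.6%.<br>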
<br>
What am I missing in terms of calculating the cost function and changing<br>
the number of cores? Shouldn't fc be the same regardless of the number of<br>
cores used?<br>
<br>
Best regards,<br>
Dan<br>
<br>
--------------------------------------------------------------<br>
Dr Dan Jones / British Antarctic Survey<br>
<a href="http://danjonesocean.com" rel="noreferrer" target="_blank">danjonesocean.com</a> <<a href="http://www.danjonesocean.com" rel="noreferrer" target="_blank">http://www.danjonesocean.com</a>> / @DanJonesOcean<br>
--------------------------------------------------------------<br>
<br>
------------------------------<br>
<br>
Message: 2<br>
Date: Mon, 28 Dec 2020 15:49:30 +0000<br>
From: "Zhang, Hong (US 398K)" <<a href="mailto:hong.zhang@jpl.nasa.gov" target="_blank">hong.zhang@jpl.nasa.gov</a>><br>
Subject: Re: [ecco-support] [EXTERNAL] Cost function calculation<br>
problems when changing number of cores in ECCOv4-r4<br>
To: "ECCO support list, wider membership" <<a href="mailto:ecco-support@mit.edu" target="_blank">ecco-support@mit.edu</a>><br>
Message-ID: <<a href="mailto:7F566E4A-1051-4C84-94D1-14C4DDB51A7F@jpl.nasa.gov" target="_blank">7F566E4A-1051-4C84-94D1-14C4DDB51A7F@jpl.nasa.gov</a>><br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
<br>
<br>
Hi Dan,<br>
"fc" will change because the "profilesfiles" (the "*.nc" files) are prescribed on a 30x30 grid;<br>
as for the NaN, could it also be related to the profiles, or to some other issue?<br>
<br>
cheers<br>
Hong<br>
<br>
<br>
<br>
<br>
------------------------------<br>
<br>
<br>
End of ecco-support Digest, Vol 59, Issue 1<br>
*******************************************<br>
</blockquote></div></div>