[ecco-support] Cost function calculation problems when changing number of cores in ECCOv4-r4

Dan Jones dcjones.work at gmail.com
Mon Dec 28 17:08:37 EST 2020


Hi Hong,

Ah okay. Thanks for your quick reply. What would you recommend that I do in
terms of increasing the number of cores? Is it possible for me to get the
files needed for the 192 and 360 core setup?
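
For reference, here is a rough sketch of the tile arithmetic as I understand it. It assumes the LLC90 grid (13 faces of 90 x 90 points), one tile per core, and that the 96-core case uses 30x30 tiles; none of that is stated in this thread, and the function name is just illustrative. The difference between the total tile count and the core count would then be the number of blank (all-land) tiles left out of the run.

```python
# Assumption: ECCOv4 uses the LLC90 grid, i.e. 13 faces of 90 x 90 points.
FACE_SIZE = 90
NUM_FACES = 13


def tile_counts(sNx, sNy, nprocs):
    """Return (total_tiles, blank_tiles) for an sNx x sNy tiling.

    Assumes one tile per core (nSx = nSy = 1), so any tiles beyond
    nprocs must be blank tiles that are excluded from the run.
    """
    if FACE_SIZE % sNx or FACE_SIZE % sNy:
        raise ValueError("tile size must divide the face size evenly")
    total = NUM_FACES * (FACE_SIZE // sNx) * (FACE_SIZE // sNy)
    blank = total - nprocs
    if blank < 0:
        raise ValueError("more cores requested than tiles available")
    return total, blank


if __name__ == "__main__":
    # (sNx, sNy, nprocs); the 30x30 / 96-core pairing is an assumption.
    for sNx, sNy, nprocs in [(30, 30, 96), (15, 15, 360)]:
        total, blank = tile_counts(sNx, sNy, nprocs)
        print(f"{sNx}x{sNy}: {total} tiles, {nprocs} cores, {blank} blank")
```

Under these assumptions, 30x30 tiles give 117 tiles with 21 blank, and 15x15 tiles give 468 tiles with 108 blank. Any input file prescribed per tile (such as the profile files) would need to match whichever tiling is in use.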

Best,
Dan


On Mon, Dec 28, 2020 at 5:00 PM <ecco-support-request at mit.edu> wrote:

> Today's Topics:
>
>    1. Cost function calculation problems when changing  number of
>       cores in ECCOv4-r4 (Dan Jones)
>    2. Re: [EXTERNAL] Cost function calculation problems when
>       changing  number of cores in ECCOv4-r4 (Zhang, Hong (US 398K))
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 28 Dec 2020 15:41:05 +0000
> From: Dan Jones <dcjones.work at gmail.com>
> Subject: [ecco-support] Cost function calculation problems when
>         changing        number of cores in ECCOv4-r4
> To: ecco-support at mit.edu
> Message-ID:
>         <CAPj3iHSYkUMu=BuRJSDZtHec-VMa-qciWXJh34VUhvQ=
> 36Ah3w at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hello,
>
> I am attempting to carry out some scaling tests on an HPC platform using
> ECCOv4-r4. As part of the scaling exercises, I change the number of cores
> using the procedure below.
>
> If I want to use 360 cores, I use the following procedure. Before
> compiling, I change the following parameters in SIZE.h:
>
> sNx = 15
> sNy = 15
> nPx = 360
> nPy = 1
>
> Based on my experience with ECCOv4-r3, I believe this should give me an
> executable that uses 360 cores. At run-time, I uncomment the lines
> following #15x15 nprocs=360 and comment out the lines for the nprocs=96
> case. I also have a configuration that uses nprocs=192. In all three cases,
> the code compiles and runs.
>
> However, I noticed that the cost function changes dramatically based on the
> number of cores, and it sometimes returns NaN. In both forward and adjoint
> mode, I get the following values for "fc" in the file named
> costfunction0129:
>
> 96 cores, fc = 6733184.16
> 192 cores, fc = NaN
> 360 cores, fc = 5883940.61
>
> What am I missing in terms of calculating the cost function and changing
> the number of cores? Shouldn't fc be the same regardless of the number of
> cores used?
>
> Best regards,
> Dan
>
> --------------------------------------------------------------
> Dr Dan Jones / British Antarctic Survey
> danjonesocean.com <http://www.danjonesocean.com> / @DanJonesOcean
> --------------------------------------------------------------
>
> ------------------------------
>
> Message: 2
> Date: Mon, 28 Dec 2020 15:49:30 +0000
> From: "Zhang, Hong (US 398K)" <hong.zhang at jpl.nasa.gov>
> Subject: Re: [ecco-support] [EXTERNAL] Cost function calculation
>         problems when changing  number of cores in ECCOv4-r4
> To: "ECCO support list, wider membership" <ecco-support at mit.edu>
> Message-ID: <7F566E4A-1051-4C84-94D1-14C4DDB51A7F at jpl.nasa.gov>
> Content-Type: text/plain; charset="utf-8"
>
>
>
>
> Hi Dan,
> "fc" will change because the "profilesfiles" (the "*.nc" files) are
> prescribed on a 30x30 grid.
> As for the NaN, could it also be related to the profiles, or to some
> other issue?
>
> cheers
> Hong
>
>
>
>