[StarCluster] load balanced nodes accepting jobs before ready

Justin Riley jtriley at MIT.EDU
Thu Apr 24 12:09:01 EDT 2014


Hey Stewart,

You can fix this by setting disable_queue=True in your config, which
disables the default SGE plugin. Then define the SGE plugin explicitly
in your config, add it to your plugins list, and put pkginstaller (and
any other plugins that need to run before the node gets added) *before*
SGE in the list. This ensures all other plugins finish running before
the node gets added to SGE. See the following doc for more details on
setting disable_queue and defining the SGE plugin in your config:

http://star.mit.edu/cluster/docs/latest/plugins/sge.html#advanced-options
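
Here is a minimal sketch of that layout. The cluster template name
"mycluster" and the package list are placeholders I made up, other
required cluster settings are omitted, and the setup_class paths are
what I'd expect from the plugin docs, so double-check them against the
page above:

```ini
[cluster mycluster]
# Turn off the built-in SGE setup so the plugin order below decides
# when the node joins the queue.
disable_queue = True
# Plugins run in the order listed: pkginstaller first, sge last.
plugins = pkginstaller, sge

[plugin pkginstaller]
setup_class = starcluster.plugins.pkginstaller.PackageInstaller
# Placeholder packages; list whatever your jobs actually depend on.
packages = htop, git

[plugin sge]
setup_class = starcluster.plugins.sge.SGEPlugin
```

With this ordering a newly added node should only be added to SGE
after pkginstaller has finished on it.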

~Justin

On Mon, Apr 14, 2014 at 06:02:20PM +0000, Stewart, Andrew wrote:
>    pkginstaller was called during add_node, but the node was added to the
>    host list and its queue enabled before pkginstaller had a chance to finish
>    installing dependencies.  So it looks like a race condition.  I did bump
>    pkginstaller to the front of the plugins line (ahead of IPCluster) but I
>    haven’t yet bothered to test whether that helps the situation any.  The
>    most certain way to handle it would be to just disable the queue until
>    provisioning is complete.
>    I actually think the simpler solution would be to bypass pkginstaller and
>    just share managed packages with compute nodes via NFS.  Why reinstall the
>    same package N times?
>    --
>    Andrew Stewart
>    Office of Research Information Services (ORIS),
>    Office of the Chief Information Officer (OCIO), 
>    Smithsonian Institution
>    202-505-3633
>    From: Rajat Banerjee <rajatb at post.harvard.edu>
>    Date: Monday, April 14, 2014 at 10:49 AM
>    To: Andrew Stewart <stewarta at si.edu>
>    Cc: "starcluster at mit.edu" <starcluster at mit.edu>
>    Subject: Re: [StarCluster] load balanced nodes accepting jobs before ready
>    Hi,
>    Does that mean that the pkginstaller plugin doesn't get called during
>    add_node ? before the host is added to the SGE host list?
>    Raj

