[StarCluster] load balanced nodes accepting jobs before ready

Stewart, Andrew StewartA at si.edu
Sat Apr 12 20:28:16 EDT 2014


When loadbalancer adds nodes, it makes them available to the scheduler before they’re fully provisioned.  I’m using the pkginstaller plugin to install required libraries across the cluster, but if a job lands on a newly added node before pkginstaller has finished, that job fails because the library is not yet installed.

So I need to either:

  1.  Force loadbalancer to wait until all provisioning is complete before making the node available for job scheduling, or
  2.  Bypass pkginstaller altogether by having the master node share its libraries with the rest of the cluster over NFS.
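
As a stopgap for the first option, each job could wait for a marker file before doing any real work. The sketch below is only an illustration under assumptions: the function name wait_for_marker and the path /tmp/provisioning.done are hypothetical, and something (for example a final provisioning step that runs after pkginstaller) would have to create that file. It does not stop the scheduler from dispatching jobs to the node; it only delays the job until the node reports ready.

```python
import os
import time


def wait_for_marker(path, timeout=600.0, interval=5.0):
    """Poll until `path` exists.

    Returns True if the marker appeared, False if `timeout` seconds
    passed first. `path` is a hypothetical "provisioning finished"
    marker; nothing in StarCluster is assumed to create it.
    """
    deadline = time.time() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)


# Possible use at the top of a job script (hypothetical marker path):
#
#     if not wait_for_marker("/tmp/provisioning.done"):
#         raise SystemExit("node was not provisioned in time")
```

A job that exits non-zero on timeout at least fails with a clear message, rather than failing later on a missing library.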

Any suggestions?

--
Andrew Stewart
Office of Research Information Services (ORIS),
Office of the Chief Information Officer (OCIO),
Smithsonian Institution
202-505-3633