[StarCluster] load balanced nodes accepting jobs before ready
Stewart, Andrew
StewartA at si.edu
Sat Apr 12 20:28:16 EDT 2014
When the loadbalancer adds nodes, it makes them available to the scheduler before they’re fully provisioned. I’m using the pkginstaller plugin to install required libraries across the cluster, but if a job lands on a newly added node before pkginstaller has finished, that job fails because the library is not yet installed.
So I need to either:
1. Force the loadbalancer to wait until all provisioning is complete before making the node available for job scheduling, or
2. Bypass pkginstaller altogether by having the master node share its libraries with the rest of the cluster over NFS.
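Until one of those is in place, a possible stopgap is for each job to guard itself: poll for the library at startup and only proceed once it is present. The following is a minimal sketch of that idea, not part of StarCluster. The function name `wait_until_ready`, its parameters, and the example library path in the note below are all hypothetical, and it assumes the library's presence can be detected by a simple check such as a file existing.

```python
import time


def wait_until_ready(is_ready, timeout=600.0, interval=10.0):
    """Poll is_ready() until it returns True or timeout seconds elapse.

    is_ready is any zero-argument callable supplied by the job (hypothetical:
    for example, a check that a library file exists on the node).
    Returns True once the check passes, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Check first so a node that is already provisioned is not delayed.
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A job wrapper might call it as `wait_until_ready(lambda: os.path.exists("/usr/lib/libexample.so"))` (with `import os`, and with the path replaced by whatever pkginstaller actually installs) and exit non-zero on `False` so the scheduler can retry the job. This only masks the race from the job's side; it does not stop the scheduler from dispatching to a half-provisioned node.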
Any suggestions?
--
Andrew Stewart
Office of Research Information Services (ORIS),
Office of the Chief Information Officer (OCIO),
Smithsonian Institution
202-505-3633