[StarCluster] load balanced nodes accepting jobs before ready

Stewart, Andrew StewartA at si.edu
Mon Apr 14 13:32:07 EDT 2014


Thanks Mich. On the advice of someone else, I did something similar:

1) set initial_state on all.q to disabled
2) manually enable each node after its manual provisioning check passes
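
In case it helps anyone else, the two steps above could be sketched roughly as follows. This is only a sketch under assumptions: the function names and the DRY_RUN switch are made up for illustration, node001 is a placeholder host name, and I am assuming that "qconf -mattr queue initial_state ..." is accepted by your SGE/OGS version (editing the queue with "qconf -mq all.q" should be equivalent).

```shell
#!/bin/sh
# Sketch only: gate new queue instances behind a manual provisioning check.
# DRY_RUN and the function names are illustrative, not part of StarCluster or SGE.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (one-time setup): new instances of the given queue start out disabled.
disable_new_instances() {
    run qconf -mattr queue initial_state disabled "$1"
}

# Step 2: once the provisioning check has passed, enable the queue
# instance on a single host (queue@host).
enable_node() {
    run qmod -e "$1@$2"
}

# Example:
#   disable_new_instances all.q
#   enable_node all.q node001
```

Setting DRY_RUN=1 before calling the functions prints the commands without touching the cluster, which is a cheap way to check them first.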

On Apr 14, 2014, at 7:19 AM, "François-Michel L'Heureux" <fmlheureux at datacratic.com> wrote:

Hi Stewart!

I ran into a similar issue. I use a complex value to counter that situation. In steps:

  1.  When creating a cluster, I add the complex value in OGS.
  2.  Whenever I run a job, I require that complex value. (See flag "-l" in qrsh/qsub)
  3.  The last thing I do when I initialize a node is to add that complex value/resource to that node.

Hence, if a job is queued, it cannot run on an uninitialized node because the complex value is missing. The downside is that you have to alter all your qsub/qrsh commands to request that parameter, otherwise they will bypass it. (It might be possible to set it as a default requirement, I haven't looked.)
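
A rough sketch of the three steps, for illustration only. The complex name "provisioned", its shortcut "prov", the function names and the DRY_RUN switch are all assumptions made up for this example, and the complex definition line (BOOL, requestable, not consumable) is one plausible choice rather than the exact one used here.

```shell
#!/bin/sh
# Sketch only: gate jobs on a boolean complex that a node gets once it is ready.
# "provisioned"/"prov", DRY_RUN and the function names are illustrative.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Complex definition line.
# Columns: name shortcut type relop requestable consumable default urgency
complex_line() {
    echo "provisioned prov BOOL == YES NO 0 0"
}

# Step 1 (run once, at cluster creation): append the complex to the
# current complex list and load the result back into the scheduler.
register_complex() {
    tmp=$(mktemp) || return 1
    qconf -sc > "$tmp" &&
        complex_line >> "$tmp" &&
        qconf -Mc "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Step 3 (last step of node initialization): attach the complex to the host.
mark_node_ready() {
    run qconf -mattr exechost complex_values provisioned=TRUE "$1"
}

# Step 2 (every submission): require the complex, so the job can only be
# dispatched to hosts that have been marked ready.
submit_gated() {
    run qsub -l provisioned=TRUE "$@"
}

# Example:
#   register_complex
#   mark_node_ready node001
#   submit_gated job.sh
```

On the default-requirement question: SGE reads default submit options from sge_request files (cluster-wide under $SGE_ROOT/$SGE_CELL/common/, or per user in ~/.sge_request), so putting "-l provisioned=TRUE" there may avoid editing every qsub/qrsh call. That part is untested here.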

Good luck
Mich