[StarCluster] StarCluster LoadBalancer
Sergio Mafra
sergiohmafra at gmail.com
Sat Feb 9 10:47:43 EST 2013
Hi fellows,
I have a cluster of 5 nodes (cc1.4xlarge) running two jobs, each one with
40 nodes.
I'm trying to use the load balancer to kill the cluster after the jobs are done.
One strange thing: the jobs are running in the queue, as you
can see here:
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@master                   BIP   0/8/16         0.42     linux-x64
      2 0.55500 serra85    sgeadmin     r     02/09/2013 11:52:17     8
---------------------------------------------------------------------------------
all.q@node001                  BIP   0/8/1          -NA-     linux-x64     auo
      2 0.55500 serra85    sgeadmin     r     02/09/2013 11:52:17     8
---------------------------------------------------------------------------------
all.q@node002                  BIP   0/8/1          -NA-     linux-x64     auo
      2 0.55500 serra85    sgeadmin     r     02/09/2013 11:52:17     8
---------------------------------------------------------------------------------
all.q@node003                  BIP   0/8/1          -NA-     linux-x64     auo
      2 0.55500 serra85    sgeadmin     r     02/09/2013 11:52:17     8
---------------------------------------------------------------------------------
all.q@node004                  BIP   0/8/1          -NA-     linux-x64     auo
      2 0.55500 serra85    sgeadmin     r     02/09/2013 11:52:17     8
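
For readers who want to check a listing like the one above programmatically, here is a minimal Python sketch. It is not part of StarCluster or Grid Engine: the helper name parse_qstat_f and the shortened sample text are illustrative assumptions, and the sample writes queue names in the queue@host form. It pulls out, per queue instance, the slots in use, the load average, and the state flags, then lists the instances whose state contains "u" (which, as far as I know, Grid Engine uses when the execution daemon on that host cannot be contacted).

```python
import re

# Matches a queue-instance line such as
# "all.q@node001 BIP 0/8/1 -NA- linux-x64 auo"; the states column is optional.
QUEUE_RE = re.compile(
    r"^(?P<queue>\S+@\S+)\s+(?P<qtype>\S+)\s+"
    r"(?P<resv>\d+)/(?P<used>\d+)/(?P<total>\d+)\s+"
    r"(?P<load>\S+)\s+(?P<arch>\S+)(?:\s+(?P<states>\S+))?\s*$"
)


def parse_qstat_f(text):
    """Return one dict per queue instance found in the text (hypothetical helper)."""
    queues = []
    for line in text.splitlines():
        m = QUEUE_RE.match(line.strip())
        if m:  # job lines, the header, and separators do not match
            queues.append({
                "queue": m.group("queue"),
                "used": int(m.group("used")),
                "load": m.group("load"),
                "states": m.group("states") or "",
            })
    return queues


# Shortened sample in the same layout as the listing above.
SAMPLE = """\
all.q@master BIP 0/8/16 0.42 linux-x64
      2 0.55500 serra85 sgeadmin r 02/09/2013 11:52:17 8
all.q@node001 BIP 0/8/1 -NA- linux-x64 auo
      2 0.55500 serra85 sgeadmin r 02/09/2013 11:52:17 8
"""

queues = parse_qstat_f(SAMPLE)
unreachable = [q["queue"] for q in queues if "u" in q["states"]]
print(unreachable)
```

On the full listing this would flag node001 through node004, the same instances that report -NA- as their load average.
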
If I issue this command in StarCluster:
$ starcluster loadbalance newave -n 1
this is what I get:
>>> Loading full job history
Execution hosts: 5
Queued jobs: 0
Avg job duration: 0 secs
Avg job wait time: 0 secs
Last cluster modification time: 2013-02-09 15:32:21
>>> Not adding nodes: already at or above maximum (5)
>>> Looking for nodes to remove...
>>> No nodes can be removed at this time
>>> Sleeping...(looping again in 60 secs)
It seems that the load balancer didn't get the right Avg job duration and could
kill the cluster wrongly, even though there are jobs running.
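
One possible reading of the "0 secs" figure, stated here as an assumption and not as StarCluster's actual implementation: if the average is computed only over jobs that have already finished, a cluster whose only jobs are still running would report 0. A minimal Python sketch of that behaviour (the function avg_job_duration and the job records are hypothetical):

```python
from datetime import datetime


def avg_job_duration(jobs):
    """Mean duration in seconds over finished jobs; 0 when none have finished."""
    durations = [
        (job["end"] - job["start"]).total_seconds()
        for job in jobs
        if job.get("end") is not None  # running jobs have no end time yet
    ]
    if not durations:
        return 0
    return sum(durations) / len(durations)


# A job that started at the time shown in the queue listing and is still running.
running = [{"start": datetime(2013, 2, 9, 11, 52, 17), "end": None}]
print(avg_job_duration(running))
```

Under that assumption, a zero average would only mean that no job has completed yet, and it would say nothing about whether jobs are currently running.
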
All the best,
Sergio