[Starcluster] Load Balancer Problems

Amaro Taylor amaro.taylor at resgroupinc.com
Fri Jul 30 16:40:18 EDT 2010


Hey,

So I was testing out the load balancer today and it doesn't appear to be
working. Here is the output I was getting, along with the output from the
job on StarCluster.

ssh.py:248 - ERROR - command source /etc/profile && qacct -j -b 201007301725 failed with status 1
>>> Oldest job is from None. # queued jobs = 0. # hosts = 2.
>>> Avg job duration = 0 sec, Avg wait time = 0 sec.
>>> Cluster change was made less than 180 seconds ago (2010-07-30 20:24:13.398974).
>>> Not changing cluster size until cluster stabilizes.
>>> Sleeping, looping again in 60 seconds.


It says 0 queued jobs, but that's not accurate.
This is what qstat says on the master node:

#########################################################################
      1 0.55500 Bone_Estim sgeadmin     qw    07/30/2010 20:26:20     1 7-1000:1
sgeadmin@domU-12-31-39-01-5D-67:~/jacobian-parallel/test/bone$ qstat -q all.q -f -u "*"
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5C-97.c BIP   0/1/1          0.52     lx24-x86
      1 0.55500 Bone_Estim sgeadmin     r     07/30/2010 20:29:03     1 6
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5D-67.c BIP   0/1/1          1.22     lx24-x86
      1 0.55500 Bone_Estim sgeadmin     r     07/30/2010 20:28:33     1 5

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
      1 0.55500 Bone_Estim sgeadmin     qw    07/30/2010 20:26:20     1 7-1000:1
sgeadmin@domU-12-31-39-01-5D-67:~/jacobian-parallel/test/bone$ qstat -q all.q -f -u "*"
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5C-97.c BIP   0/1/1          0.63     lx24-x86
      1 0.55500 Bone_Estim sgeadmin     r     07/30/2010 20:31:03     1 8
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5D-67.c BIP   0/1/1          1.38     lx24-x86
      1 0.55500 Bone_Estim sgeadmin     r     07/30/2010 20:28:33     1 5
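
To double-check the job counts by hand, here is a minimal Python sketch that
tallies job lines by state from qstat -f output like the above. This is not
how the load balancer itself reads the queue; the function name
count_job_states, the JOB_LINE pattern, and the assumption that the state is
the fifth whitespace-separated column of a job line are all my own guesses
based on the output shown here.

```python
import re
from collections import Counter

# Assumed layout of a job line in `qstat -f` output:
#   job-ID  priority  name  user  state  submit/start date  ...
JOB_LINE = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+\d{2}/\d{2}/\d{4}"
)


def count_job_states(qstat_text):
    """Count job lines per state (e.g. 'r', 'qw') in qstat -f output."""
    counts = Counter()
    for line in qstat_text.splitlines():
        match = JOB_LINE.match(line)
        if match:
            # Group 5 is the state column under the layout assumed above.
            counts[match.group(5)] += 1
    return counts


# Sample lines copied from the qstat output in this message.
SAMPLE = """\
all.q@domU-12-31-39-01-5C-97.c BIP   0/1/1          0.52     lx24-x86
      1 0.55500 Bone_Estim sgeadmin     r     07/30/2010 20:29:03     1 6
all.q@domU-12-31-39-01-5D-67.c BIP   0/1/1          1.22     lx24-x86
      1 0.55500 Bone_Estim sgeadmin     r     07/30/2010 20:28:33     1 5
      1 0.55500 Bone_Estim sgeadmin     qw    07/30/2010 20:26:20     1 7-1000:1
"""

if __name__ == "__main__":
    print(dict(count_job_states(SAMPLE)))
```

On the sample, this counts two running lines and one pending line, so by this
reading the queue is not empty. Note that the single qw line stands for a
whole range of array tasks (7-1000), which the sketch counts as one entry.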

Any suggestions?



Best,
Amaro Taylor
RES Group, Inc.
1 Broadway • Cambridge, MA 02142 • U.S.A.
Tel: 310 880-1906 (Direct) • Fax: 617-812-8042 • Email: amaro.taylor at resgroupinc.com
