[StarCluster] load balancer stopped working?

David Koppstein david.koppstein at gmail.com
Mon Jun 8 11:10:37 EDT 2015


Hi,

I noticed that my load balancer stopped working -- specifically, it has
stopped deleting unnecessary nodes. It had been running fine for about three
weeks.

I have a small t2.micro instance load-balancing a cluster of m3.xlarge nodes.
The cluster is running Ubuntu 14.04 using the shared 14.04 AMI ami-38b99850.

The loadbalancer process is still running (started with nohup CMD &, where
CMD is the loadbalancer command below):

```
ubuntu@ip-10-0-0-20:~$ ps -ef | grep load
ubuntu   11784 11730  0 15:04 pts/1    00:00:00 grep --color=auto load
ubuntu   19493     1  0 Apr26 ?        01:25:03 /opt/venv/python2_venv/bin/python /opt/venv/python2_venv/bin/starcluster -c /home/ubuntu/.starcluster/config loadbalance -n 1 -m 20 -w 300 dragon-1.3.0
```
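
A live PID does not rule out a hung process, so one further check is how recently the process last wrote any output. Below is a minimal sketch, assuming the nohup'd command wrote to nohup.out in the directory it was launched from (nohup's default when stdout is a terminal). The function names and the 600-second threshold are hypothetical, and `stat -c` assumes GNU coreutils as on Ubuntu:

```shell
#!/bin/sh
# Print the number of seconds since a file was last modified.
# Returns non-zero if the file cannot be stat'ed.
log_age_seconds() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1" 2>/dev/null) || return 1
    echo $((now - mtime))
}

# Report whether a log file has been written to recently.
# $1 = log file (assumed here to be nohup.out)
# $2 = maximum acceptable age in seconds (hypothetical threshold)
check_log_fresh() {
    age=$(log_age_seconds "$1") || { echo "missing"; return 1; }
    if [ "$age" -le "$2" ]; then
        echo "fresh"
    else
        echo "stale"
    fi
}
```

Running `check_log_fresh nohup.out 600` from the launch directory would print "stale" if nothing has been written for more than ten minutes, which would point to a hung process rather than a decision not to remove nodes.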

The queue has been empty for several days.

```
dkoppstein@master:/dkoppstein/150521SG_v1.9_round2$ qstat -u "*"
dkoppstein@master:/dkoppstein/150521SG_v1.9_round2$
```

However, about 8 nodes have been running over the weekend and are not being
terminated despite -n 1. If anyone has any guesses as to why the loadbalancer
might stop working, please let me know so I can prevent this from happening
in the future.

Thanks,
David