See comments inline:<br><br><div class="gmail_quote">On Fri, Mar 25, 2011 at 11:40 AM, Kyeong Soo (Joseph) Kim <span dir="ltr">&lt;<a href="mailto:kyeongsoo.kim@gmail.com">kyeongsoo.kim@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


For instance, the implementation of load<br>

balancing would be much simpler and better and, if needed, it can<br>

completely terminate the whole instances.<br>

<br>

As for my own experience with 25-node clusters, I found out that the<br>

load balancer did not terminate the master node, even though it<br>

finished all assigned jobs; the master node is a single point of<br>

contact and had to wait for all those jobs running in other nodes to<br>

finish.<br><br></blockquote><div><br></div><div>There is a variable in starcluster/balancers/sge/__init__.py</div><div>called:</div><div>#This would allow the master to be killed when the queue empties. UNTESTED.</div><div>


allow_master_kill = False </div><div><br></div><div>That would kill the master once the job queue is empty. You can set it to True and test it if you&#39;d like.</div><div><br></div><div>This carries some risk: once the master is killed, the cluster is no longer accessible, and your results may be lost (unless you were smart enough to put them on EBS). I kept it semi-hidden because of these risks. Since you&#39;re obviously interested, give it a try. I used it for a little while, and it was able to terminate the master node when the jobs were finished. The cluster tags, groups, etc. still exist afterwards, but they won&#39;t incur any charges. At some later date you&#39;d still have to call &#39;starcluster stop &lt;cluster_tag&gt;&#39;.</div>
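<div><br></div><div>To make the behaviour concrete, here is a minimal sketch of the kind of check that flag gates. It is not the actual StarCluster code: the function name and its arguments are made up for illustration, and it assumes that an empty queue means no jobs are queued or running.</div>
<pre>
```python
# Hypothetical sketch only; this is NOT StarCluster's balancer code.
allow_master_kill = False  # same default as the flag quoted above


def should_kill_master(queued_jobs, running_jobs,
                       allow_master_kill=allow_master_kill):
    """Return True only if the flag is set and no jobs are queued or running."""
    if not allow_master_kill:
        # Default behaviour: the master is never terminated by the balancer.
        return False
    return queued_jobs == 0 and running_jobs == 0
```
</pre>
<div>With the default of False the check never fires, which is why the master stays up even after its own jobs finish.</div>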


<div><br></div><div><br></div><div>Best,</div><div>Rajat</div><div><br></div></div>