[StarCluster] [Star cluster] error tolerance design when adding nodes
Jin Yu
yujin2004 at gmail.com
Sun Jul 20 15:08:00 EDT 2014
Hello,
For example, I have found that it is not uncommon for one or two
instances to be unreachable after adding 50 instances to the cluster.
The progress bar gets stuck while waiting for SSH, and I have to restart
the problematic instances manually.
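To make the symptom concrete, here is a rough sketch of the kind of check
I end up doing by hand to find the stuck nodes. It is not StarCluster
code and does not use any StarCluster API; it only uses Python's standard
socket module, and the function names, the port and the timeout are my
own assumptions for illustration.

```python
import socket


def ssh_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that something accepts connections on the port;
    it does not attempt an actual SSH login.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def find_unreachable(hosts, port=22, timeout=5.0):
    """Return the hosts (hypothetical node addresses) that fail the check."""
    return [h for h in hosts if not ssh_port_open(h, port, timeout)]
```

Calling find_unreachable() with the addresses of the newly added nodes
gives the list of instances I would then restart manually.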
I have not yet gone through the StarCluster code. Does StarCluster
already have some error-tolerance design for these situations?
Thanks!
Jin