[StarCluster] Starcluster debug file

Paolo Di Tommaso Paolo.DiTommaso at crg.eu
Wed Dec 21 09:02:49 EST 2011


I ran some benchmarks a few weeks ago on large cluster deployments with StarCluster.

I reported the results on this mailing list at the time, but it may be worth sharing them again.

It turned out that StarCluster takes about 15 minutes for every 50 nodes launched (micro instances). Something like:

 - 50 nodes: boot time ~ 15 minutes
 - 100 nodes: ~ 30 minutes
 - 200 nodes: ~ 1 hour

This linear progression makes me think that StarCluster uses a serial mechanism to start the instances, but that is only speculation and I cannot say more.
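To make the linear trend concrete, here is a minimal Python sketch that works out the per-node cost from the three figures above and extrapolates from it. The function name `estimate_minutes` is only illustrative, and the extrapolation assumes the trend keeps holding beyond 200 nodes, which I have not measured.

```python
# Launch times reported above: (number of nodes, approximate minutes to boot).
reported = [(50, 15), (100, 30), (200, 60)]

# Minutes per node for each measurement; a constant value means linear scaling.
per_node = [minutes / nodes for nodes, minutes in reported]


def estimate_minutes(nodes, minutes_per_node=per_node[0]):
    """Extrapolate boot time, assuming the linear trend continues (unverified)."""
    return nodes * minutes_per_node


if __name__ == "__main__":
    for (nodes, _), rate in zip(reported, per_node):
        print(f"{nodes} nodes: {rate * 60:.0f} seconds per node")
    print(f"estimated boot time for 400 nodes: {estimate_minutes(400):.0f} minutes")
```

The ratio comes out the same for all three cluster sizes (about 18 seconds per node), which is what you would expect if each node adds a fixed amount of time rather than being set up concurrently.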

Anyway, solving this problem would be a huge improvement for StarCluster.


Cheers,
Paolo





On Dec 21, 2011, at 12:25 PM, Sumita Sinha wrote:

Hello Justin,

I restarted a 200-node cluster, and this time the "Installing Sun Grid Engine" step took a very long time.

>>> Installing Sun Grid Engine
197/199 |///////////////////////////////////////////////////////////////| 98%

I tailed the ~/.starcluster/logs/debug.log file and saw the following line repeated:

2011-12-21 11:09:06,152 PID: 17885 threadpool.py:136 - DEBUG - unfinished_tasks = 2

I waited for almost 50 minutes and then had to terminate the cluster.
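For anyone who wants to measure such a stall rather than watch the tail output, the following is a rough Python sketch. It assumes that every relevant line in debug.log has the same layout as the one quoted above (timestamp, then "unfinished_tasks = N"); the helper name `stalled_minutes` is hypothetical and is not part of StarCluster.

```python
import re
from datetime import datetime

# Matches lines shaped like the quoted debug line; all other lines are ignored.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ "
    r".*unfinished_tasks = (?P<n>\d+)"
)


def stalled_minutes(lines):
    """Return how many minutes the most recent unfinished_tasks value has stayed unchanged."""
    samples = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            samples.append((ts, int(match.group("n"))))
    if not samples:
        return 0.0
    last_ts, last_n = samples[-1]
    start = last_ts
    # Walk backwards until the counter differs from its latest value.
    for ts, n in reversed(samples):
        if n != last_n:
            break
        start = ts
    return (last_ts - start).total_seconds() / 60
```

Passing it the lines of ~/.starcluster/logs/debug.log (for example via `open(path).readlines()`) gives the length of the current plateau, which makes it easier to tell a slow step from a hung one.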



--
Regards
Sumita Sinha


