<div dir="ltr">Hi Hyokun,<br><div class="gmail_quote"><div dir="ltr"><div><br></div><div>I'm a user of MPICH2 and OGE.</div><div><br></div><div>It seems that your parallel environment is using the $fill_up allocation rule instead of $round_robin. If so, try changing it to $round_robin with <span style="background-color:rgb(250,250,250);color:rgb(62,67,73);font-family:Consolas,'andale mono','lucida console',monospace;font-size:14px;line-height:17px">$ qconf -mp orte</span></div>
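<div>If you would rather not edit the parallel environment interactively, something like the following might work. It is only a rough sketch: it assumes the PE is named "orte" as above, that you run it on the master node with SGE admin rights, and <code>set_round_robin</code> and the <code>/tmp/orte.pe</code> path are hypothetical names I made up for illustration.</div>

```shell
# Rewrite the allocation_rule line of a saved PE definition to $round_robin.
# set_round_robin is a hypothetical helper name, not part of SGE.
set_round_robin() {
    pe_file="$1"
    tmp="${pe_file}.tmp"
    # Single quotes keep $round_robin literal (no shell expansion).
    sed 's/^allocation_rule[[:space:]].*/allocation_rule    $round_robin/' "$pe_file" > "$tmp" &&
        mv "$tmp" "$pe_file"
}

# Assumed usage on the master node (file path is an example):
#   qconf -sp orte > /tmp/orte.pe          # dump the current "orte" PE
#   set_round_robin /tmp/orte.pe           # change the allocation rule in the dump
#   qconf -Mp /tmp/orte.pe                 # load the modified PE back into SGE
#   qconf -sp orte | grep allocation_rule  # check the result
```

<div>Doing it through a file instead of the editor that <code>qconf -mp</code> opens makes the change easy to repeat after the cluster is restarted from scratch.</div>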
<div>You can learn more here: <a href="http://star.mit.edu/cluster/docs/latest/plugins/sge.html#using-the-plugin" target="_blank">http://star.mit.edu/cluster/docs/latest/plugins/sge.html#using-the-plugin</a></div><div><br>
</div><div>Let me know if this helps.</div>
<div><br></div><div>All best.</div><div><br></div><div>Sergio</div></div><div class="gmail_extra"><br><br><div class="gmail_quote"><div><div class="h5">On Mon, Aug 19, 2013 at 1:53 AM, Hyokun Yun <span dir="ltr">&lt;<a href="mailto:yun3@purdue.edu" target="_blank">yun3@purdue.edu</a>&gt;</span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div dir="ltr"><div>Dear StarCluster users,</div><div><br></div><div><br></div><div>I am experiencing a problem using the MPICH2 plugin with SGE.</div>
<div><br></div><div>I am using the following image: ami-52a0c53b which uses Ubuntu 12.04</div>
<div><br></div><div>When I use the mpich2 plugin, mpich2 and SGE do not seem to be tightly integrated: when I submit my script with qsub, I get the following error messages.</div><div><br></div><div>error: executing task of job 1 failed: execution daemon on host "node001" didn't accept task</div>
<div>error: executing task of job 1 failed: execution daemon on host "node002" didn't accept task</div><div>error: executing task of job 1 failed: execution daemon on host "node003" didn't accept task</div>
<div>error: executing task of job 1 failed: execution daemon on host "nodef004" didn't accept task</div><div><br></div><div>It runs fine when I simply execute 'mpirun' myself, instead of relying on SGE.</div>
<div>The same script also runs fine when I use OpenMPI instead of MPICH2. That's why I suspect it is an MPICH2 and SGE integration issue.</div><div><br></div><div>The problem is that I need multi-thread support, which is disabled by default in OpenMPI. I also prefer to use MPICH2 instead of OpenMPI.</div>
<div><br></div><div>I was able to reproduce the problem after restarting the cluster from scratch. Would any of you please take a look at the problem by trying the same image with the MPICH2 plugin?</div><div><br></div><div>
<br></div><div>Thanks,</div><div>Hyokun Yun</div>
</div>
<br></div></div>
<br></blockquote></div><br></div>
</div><br></div>