[StarCluster] Fwd: Integration of MPICH2 plugin with SGE

Sergio Mafra sergiohmafra at gmail.com
Mon Aug 19 13:38:33 EDT 2013


Hi Hyokun,

I'm a user of MPICH2 and OGE.

It seems that your parallel environment is using $fill_up instead of
$round_robin as its allocation rule. If so, try changing it to $round_robin
with $ qconf -mp orte
You can learn more here:
http://star.mit.edu/cluster/docs/latest/plugins/sge.html#using-the-plugin
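
If you would rather not edit the setting interactively, here is a rough sketch of the same change done from a script. It assumes the parallel environment is named orte, as above, and that it is run on the master node where qconf is available; the function names are just placeholders I made up for illustration.

```shell
#!/bin/sh
# Sketch only: the PE name "orte" comes from the message above; the helper
# names and the temp-file handling are illustrative assumptions.

# Rewrite the allocation_rule line of a parallel-environment config file
# (in the format printed by `qconf -sp <pe>`) so that it reads $round_robin.
set_round_robin() {
    # $1: path to a PE config dump
    sed 's/^allocation_rule[[:space:]].*/allocation_rule    $round_robin/' "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# Dump a live PE, switch its allocation rule, and load it back.
# Only meaningful on a host where qconf can reach the SGE master.
update_pe() {
    pe="${1:-orte}"
    tmp=$(mktemp) || return 1
    qconf -sp "$pe" > "$tmp" &&      # dump the current PE settings
        set_round_robin "$tmp" &&    # change the allocation rule in the dump
        qconf -Mp "$tmp"             # load the modified settings from the file
    status=$?
    rm -f "$tmp"
    return $status
}
```

To check the current value first, run qconf -sp orte and look at the allocation_rule line; calling update_pe orte would then apply the change.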

Let me know if this helps.

All best.

Sergio


On Mon, Aug 19, 2013 at 1:53 AM, Hyokun Yun <yun3 at purdue.edu> wrote:

> Dear starcluster users,
>
>
> I am experiencing a problem using MPICH2 plugin with SGE.
>
> I am using the following image: ami-52a0c53b which uses Ubuntu 12.04
>
> When I use mpich2 plugin, it seems like mpich2 and SGE are not tightly
> integrated: when I execute my script using qsub, I get the following error
> message.
>
> error: executing task of job 1 failed: execution daemon on host "node001"
> didn't accept task
> error: executing task of job 1 failed: execution daemon on host "node002"
> didn't accept task
> error: executing task of job 1 failed: execution daemon on host "node003"
> didn't accept task
> error: executing task of job 1 failed: execution daemon on host "nodef004"
> didn't accept task
>
> It runs fine when I simply execute 'mpirun' myself, instead of relying on
> SGE.
> Also, the same script runs fine when I use OpenMPI instead of
> MPICH2.  That's why I suspect it is an MPICH2 & SGE integration issue.
>
> The problem is that I need multi-thread support, and it is by default
> disabled in OpenMPI.  I also prefer to use MPICH2 instead of OpenMPI.
>
> I was able to reproduce the problem when I restarted the cluster from
> scratch.  Would any of you please take a look at the problem by trying the
> same image with the MPICH2 plugin?
>
>
> Thanks,
> Hyokun Yun