<div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hi,</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">
When using AMI ami-6b211202 in us-east, I stumbled across the same issue you're experiencing.

The symbolic links in the alternatives system are mixing MPICH and OpenMPI:

<font face="arial, helvetica, sans-serif">root@master:/etc/alternatives# update-alternatives --display mpi</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">mpi - auto mode</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> link currently points to /usr/include/mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">/usr/include/mpich2 - priority 40</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpi++.so: /usr/lib/libmpichcxx.so</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpi.so: /usr/lib/libmpich.so</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpif77.so: /usr/lib/libfmpich.so</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpif90.so: /usr/lib/libmpichf90.so</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpic++: /usr/bin/mpic++.mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpic++.1.gz: /usr/share/man/man1/mpic++.mpich2.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicc: /usr/bin/mpicc.mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicc.1.gz: /usr/share/man/man1/mpicc.mpich2.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicxx: /usr/bin/mpicxx.mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicxx.1.gz: /usr/share/man/man1/mpicxx.mpich2.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif77: /usr/bin/mpif77.mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif77.1.gz: /usr/share/man/man1/mpif77.mpich2.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif90: /usr/bin/mpif90.mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif90.1.gz: /usr/share/man/man1/mpif90.mpich2.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif">/usr/lib/openmpi/include - priority 40</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpi++.so: /usr/lib/openmpi/lib/libmpi_cxx.so</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpi.so: /usr/lib/openmpi/lib/libmpi.so</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpif77.so: /usr/lib/openmpi/lib/libmpi_f77.so</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave libmpif90.so: /usr/lib/openmpi/lib/libmpi_f90.so</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpiCC: /usr/bin/mpic++.openmpi</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpiCC.1.gz: /usr/share/man/man1/mpiCC.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpic++: /usr/bin/mpic++.openmpi</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpic++.1.gz: /usr/share/man/man1/mpic++.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicc: /usr/bin/mpicc.openmpi</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicc.1.gz: /usr/share/man/man1/mpicc.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicxx: /usr/bin/mpic++.openmpi</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpicxx.1.gz: /usr/share/man/man1/mpicxx.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif77: /usr/bin/mpif77.openmpi</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif77.1.gz: /usr/share/man/man1/mpif77.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif90: /usr/bin/mpif90.openmpi</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpif90.1.gz: /usr/share/man/man1/mpif90.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">Current 'best' version is '/usr/include/mpich2'.</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif">root@master:/etc/alternatives# update-alternatives --display mpirun</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">mpirun - auto mode</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> link currently points to /usr/bin/mpirun.openmpi</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">/usr/bin/mpirun.mpich2 - priority 40</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpiexec: /usr/bin/mpiexec.mpich2</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpiexec.1.gz: /usr/share/man/man1/mpiexec.mpich2.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpirun.1.gz: /usr/share/man/man1/mpirun.mpich2.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">/usr/bin/mpirun.openmpi - priority 50</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpiexec: /usr/bin/mpiexec.openmpi</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpiexec.1.gz: /usr/share/man/man1/mpiexec.openmpi.1.gz</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"> slave mpirun.1.gz: /usr/share/man/man1/mpirun.openmpi.1.gz</font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">Current 'best' version is '/usr/bin/mpirun.openmpi'.</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"><br></font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">You do compile it with MPICH and try to run with OpenMPI. The solution is to change the symbolic links by using the update-alternatives command. For the runtime link (mpirun), it must be done in all the nodes of the cluster.</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"><br></font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">No doubt this will be corrected in upcoming versions of the AMIs.</font></div>
<div class="gmail_default" style><font face="arial, helvetica, sans-serif"><br></font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">Regards,</font></div><div class="gmail_default" style>
<font face="arial, helvetica, sans-serif"><br></font></div><div class="gmail_default" style><font face="arial, helvetica, sans-serif">Gonçalo</font></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
On Mon, Apr 28, 2014 at 1:09 PM, Torstein Fjermestad <tfjermestad@gmail.com> wrote:

<div dir="ltr"><div>Dear Justin,<br> <br></div>during the compilation, the cluster only consisted of the master node which is of instance type c3.large. In order to run a test parallel calculation, I added a node of instance type c3.4xlarge (16 processors). <br>

The cluster is created from the following AMI:
[0] ami-044abf73 eu-west-1 starcluster-base-ubuntu-13.04-x86_64 (EBS)

Executing the application outside the queuing system, like

mpirun -np 2 -hostfile hosts ./pw.x -in inputfile.inp

did not change anything.

The output of the command "mpirun --version" is the following:

mpirun (Open MPI) 1.4.5

Report bugs to http://www.open-mpi.org/community/help/

After investigating the matter a little, I found that mpif90 likely belongs to a different MPI version than mpirun.

The first line of the output of the command "mpif90 -v" is the following:

mpif90 for MPICH2 version 1.4.1

Furthermore, the output of the command "ldd pw.x" indicates that pw.x is compiled with MPICH2 and not with Open MPI. The output is the following:

linux-vdso.so.1 => (0x00007fffd35fe000)
liblapack.so.3 => /usr/lib/liblapack.so.3 (0x00007ff38fb18000)
libopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007ff38e2f5000)
libmpich.so.3 => /usr/lib/libmpich.so.3 (0x00007ff38df16000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff38dcf9000)
libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007ff38d9e5000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ff38d6df000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ff38d4c9000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff38d100000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ff38cef7000)
libcr.so.0 => /usr/lib/libcr.so.0 (0x00007ff38cced000)
libmpl.so.1 => /usr/lib/libmpl.so.1 (0x00007ff38cae8000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff390820000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007ff38c8b2000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ff38c6ae000)

The feedback I got from the Quantum Espresso mailing list suggested that the cause of the error could be that pw.x (the executable) was not compiled with the same version of MPI as mpirun.

The outputs of the commands "mpirun --version", "mpif90 -v", and "ldd pw.x" above have led me to suspect that this is indeed the case.

I therefore wonder whether it is possible to control which MPI version I compile my applications with.

If, with the current MPI installation, applications are compiled with a different MPI version than mpirun, then I will likely have similar problems when compiling other applications as well. I would therefore very much appreciate it if you could give me some hints on how I can solve this problem.

Thanks in advance.

Regards,
Torstein


On Thu, Apr 24, 2014 at 5:13 PM, Justin Riley <jtriley@mit.edu> wrote:

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Torstein,<br>
<br>
Can you please describe your cluster configuration (ie size, image id(s),<br>
instance type(s))? Also, you're currently using the SGE/OpenMPI<br>
integration. Have you tried just using mpirun only as described in the<br>
first part of:<br>
<br>
<a href="http://star.mit.edu/cluster/docs/latest/guides/sge.html#submitting-openmpi-jobs-using-a-parallel-environment" target="_blank">http://star.mit.edu/cluster/docs/latest/guides/sge.html#submitting-openmpi-jobs-using-a-parallel-environment</a><br>
<br>
Also, what does 'mpirun --version' show?<br>
<br>
~Justin<br>
<div><br>
On Thu, Apr 17, 2014 at 07:19:28PM +0200, Torstein Fjermestad wrote:
> Dear all,
>
> I recently tried to compile an application (Quantum Espresso,
> http://www.quantum-espresso.org/) to be used for parallel computations
> on StarCluster. The installation procedure of the application consists of
> the standard "./configure + make" steps. At the end of the output from
> ./configure, the statement "Parallel environment detected successfully.
> Configured for compilation of parallel executables." appears.
>
> The compilation with "make" completes without errors. I then run the
> application in the following way:
>
> I first write a submit script (submit.sh) with the following content:
>
> cp /path/to/executable/pw.x .
> mpirun ./pw.x -in input.inp
>
> I then submit the job to the queueing system with the following command:
>
> qsub -cwd -pe orte 16 ./submit.sh
>
> However, in the output of the calculation, the following line is repeated
> 16 times:
>
> Parallel version (MPI), running on 1 processors
>
> It therefore seems like the program runs 16 one-processor calculations that
> all write to the same output.
> I wrote about this problem to the mailing list of Quantum Espresso, and I<br>
> got the suggestion that perhaps the mpirun belonged to a different MPI<br>
> library than pw.x (a particular package of Quantum Espresso) was compiled<br>
> with.<br>
><br>
> I compiled pw.x on the same cluster as I executed mpirun. Are there<br>
> several versions of openMPI on the AMIs provided by StarCluster? In that<br>
> case, how can I choose the correct one.<br>
><br>
> Perhaps the problem has a different cause. Does anyone have suggestions on<br>
> how to solve it?<br>
><br>
> Thanks in advance for your help.<br>
><br>
> Yours sincerely,<br>
> Torstein Fjermestad<br>
><br>

_______________________________________________
StarCluster mailing list
StarCluster@mit.edu
http://mailman.mit.edu/mailman/listinfo/starcluster