<div dir="ltr">We benchmarked AWS enhanced networking late last year &amp; beginning of this year:<br><div><br><a href="http://blogs.scalablelogic.com/2013/12/enhanced-networking-in-aws-cloud.html">http://blogs.scalablelogic.com/2013/12/enhanced-networking-in-aws-cloud.html</a><br>
<a href="http://blogs.scalablelogic.com/2014/01/enhanced-networking-in-aws-cloud-part-2.html">http://blogs.scalablelogic.com/2014/01/enhanced-networking-in-aws-cloud-part-2.html</a><br><br></div><div>There are a few things that can affect MPI performance of AWS with enhanced networking:<br>
<br></div><div>1) Make sure that you are using a VPC, because instances outside a VPC fall back to standard networking.<br><br>2) Make sure that your instances are all in an AWS Placement Group, or else the latency will be much higher.<br>
<br></div><div>3) Finally, you didn&#39;t specify the instance type -- it&#39;s important to know what kind of instances you used to perform the test...<br></div><div><div class="gmail_extra"><br clear="all"><div>Rayson<br>
<br>==================================================<br>Open Grid Scheduler - The Official Open Source Grid Engine<br><a href="http://gridscheduler.sourceforge.net/" target="_blank">http://gridscheduler.sourceforge.net/</a><br>
<a href="http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html" target="_blank">http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html</a></div>
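As a quick way to verify points 1&ndash;3 on a running instance, the checks described in the AWS doc linked below look roughly like this (the instance ID is a placeholder; adjust to your setup):

```shell
# On the instance itself: with enhanced networking active, the NIC
# should report the ixgbevf driver; plain paravirtual "vif" means
# you fell back to standard networking.
ethtool -i eth0

# From any machine with the AWS CLI configured: sriovNetSupport
# should come back as "simple". i-0123456789 is a placeholder ID.
aws ec2 describe-instance-attribute \
    --instance-id i-0123456789 \
    --attribute sriovNetSupport
```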
<br><br><div class="gmail_quote">On Thu, May 8, 2014 at 1:30 PM, Torstein Fjermestad <span dir="ltr">&lt;<a href="mailto:tfjermestad@gmail.com" target="_blank">tfjermestad@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr"><div><div><div><div><div><div><div><div><div><div><div><div><div>Dear all,<br><br></div>I am planning to use StarCluster to run Quantum Espresso (<a href="http://www.quantum-espresso.org/" target="_blank">http://www.quantum-espresso.org/</a>) calculations. For those who are not familiar with Quantum Espresso: it is a code for running quantum mechanical calculations on materials. In order for these types of calculations to achieve good scaling with respect to the number of CPUs, fast communication hardware is necessary. <br>

<br></div>For this reason, I configured a cluster based on the HVM-EBS image:<br><br>[1] ami-ca4abfbd eu-west-1 starcluster-base-ubuntu-13.04-x86_64-hvm (HVM-EBS)<br><br></div>Then I followed the instructions on this site <br>

<br><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html#test-enhanced-networking" target="_blank">http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html#test-enhanced-networking</a><br>

<br></div>to check that &quot;enhanced networking&quot; was indeed enabled. Running the suggested commands gave me the same output as in the examples, which indicates that &quot;enhanced networking&quot; is enabled in the image. <br>

<br></div>On this image I installed Quantum Espresso (via apt-get) and generated a new, modified image from which I created the final cluster. <br><br></div>On this cluster, I carried out some parallelization tests by running the same Quantum Espresso calculation on different numbers of CPUs. The results are presented below:<br>

<br>
<table cols="3" border="0" cellspacing="0">
        <colgroup span="2" width="85"></colgroup>
        <colgroup width="99"></colgroup>
        <tbody><tr>
                <td align="LEFT" height="17"># proc</td>
                <td align="LEFT">CPU time</td>
                <td align="LEFT">wall time</td>
        </tr>
        <tr>
                <td align="RIGHT" height="16">4</td>
                <td align="LEFT">4m23.98s</td>
                <td align="LEFT">5m 0.10s</td>
        </tr>
        <tr>
                <td align="RIGHT" height="16">8</td>
                <td align="LEFT">2m46.25s</td>
                <td align="LEFT">2m49.30s</td>
        </tr>
        <tr>
                <td align="RIGHT" height="16">16</td>
                <td align="LEFT">1m40.98s</td>
                <td align="LEFT">4m 2.82s</td>
        </tr>
        <tr>
                <td align="RIGHT" height="16">32</td>
                <td align="LEFT">0m57.70s</td>
                <td align="LEFT">3m36.15s</td>
        </tr>
</tbody></table>



<br></div>Except for the test run with 8 CPUs, the wall time is significantly longer than the CPU time. This is usually an indication of slow communication between the CPUs/nodes.<br><br></div>My question is therefore whether there is a way to measure the communication speed between the nodes/CPUs.<br>
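One common way to check this is a ping-pong latency test between two endpoints; MPI benchmark suites such as the OSU micro-benchmarks do exactly this over the real interconnect. As a minimal, hypothetical illustration of the pattern, the sketch below times round-trips between two local processes over a Pipe; on a cluster the same send/receive/echo loop would run over the MPI link instead:

```python
# Ping-pong latency sketch: send a small message, have the peer echo
# it back, and average the round-trip time over many iterations.
# The Pipe here is a local stand-in for the network link.
import time
from multiprocessing import Process, Pipe

N_ITER = 1000
MSG = b"x" * 8  # small 8-byte message, as in a latency benchmark

def pong(conn):
    # Peer side: echo every message straight back to the sender.
    for _ in range(N_ITER):
        conn.send_bytes(conn.recv_bytes())
    conn.close()

def measure_latency():
    parent, child = Pipe()
    p = Process(target=pong, args=(child,))
    p.start()
    t0 = time.perf_counter()
    for _ in range(N_ITER):
        parent.send_bytes(MSG)
        parent.recv_bytes()
    elapsed = time.perf_counter() - t0
    p.join()
    # One-way latency is half the average round-trip time, in seconds.
    return elapsed / N_ITER / 2

if __name__ == "__main__":
    print("avg one-way latency: %.1f us" % (measure_latency() * 1e6))
```

Between EC2 instances, a large gap between this kind of measured latency inside and outside a placement group would point at the network rather than the cluster configuration.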

<br></div>The large difference between the CPU time and wall time may also be caused by an incorrect configuration of the cluster. Is there something I have done wrong / forgotten? <br><br></div>Does anyone have suggestions on how I can fix this parallelization issue?<br>

<br></div>Thanks in advance for your help.<br><br></div>Regards,<br></div>Torstein Fjermestad<br><br><div><div><div><div><div><div><div><div><br><br></div></div></div></div></div></div></div></div></div>
<br>_______________________________________________<br>
StarCluster mailing list<br>
<a href="mailto:StarCluster@mit.edu">StarCluster@mit.edu</a><br>
<a href="http://mailman.mit.edu/mailman/listinfo/starcluster" target="_blank">http://mailman.mit.edu/mailman/listinfo/starcluster</a><br>
<br></blockquote></div><br></div></div></div>