<div dir="ltr">Hi Folks,<div><br></div><div style>I hope someone can shed light on the following new failure mode (crash report attached). Btw, a prior, similar attempt to add two nodes to this cluster hung slightly earlier in the NFS sharing process.</div>
<div style><br></div><div style><div>root@AWS-VTMXvcl /opt/awsutils/VI-utils</div><div># tail -f /var/log/VI-addnodes/addnode.log</div><div>StarCluster - (<a href="http://star.mit.edu/cluster">http://star.mit.edu/cluster</a>) (v. 0.94.3)</div>
<div>Software Tools for Academics and Researchers (STAR)</div><div>Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div><div><br></div><div>>>> Launching node(s): node002, node003</div>
<div>Reservation:r-288a114b</div><div>>>> Waiting for instances to propagate... </div><div>>>> Waiting for node(s) to come up... (updating every 30s)</div><div>>>> Waiting for all nodes to be in a 'running' state...</div>
<div>4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div><div>>>> Waiting for SSH to come up on all nodes...</div><div>4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Waiting for cluster to come up took 1.739 mins</div><div>>>> Running plugin starcluster.clustersetup.DefaultClusterSetup</div><div>>>> Configuring hostnames...</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Configuring /etc/hosts on each node</div><div>4/4 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div><div>>>> Configuring NFS exports path(s):</div><div>/home /usr/share/jobs/</div>
<div>>>> Mounting all NFS export path(s) on 1 worker node(s)</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div><div>!!! ERROR - Error occured while running plugin 'starcluster.clustersetup.DefaultClusterSetup':</div>
<div>!!! ERROR - error occurred in job (id=node002): remote command 'source /etc/profile && mount /home' failed with status 32:</div><div>mount.nfs: access denied by server while mounting master:/home</div>
<div>Traceback (most recent call last):</div><div> File "/usr/lib/python2.6/site-packages/StarCluster-0.94.3-py2.6.egg/starcluster/threadpool.py", line 48, in run</div><div> job.run()</div><div> File "/usr/lib/python2.6/site-packages/StarCluster-0.94.3-py2.6.egg/starcluster/threadpool.py", line 75, in run</div>
<div> r = self.method(*self.args, **self.kwargs)</div><div> File "/usr/lib/python2.6/site-packages/StarCluster-0.94.3-py2.6.egg/starcluster/node.py", line 731, in mount_nfs_shares</div><div> self.ssh.execute('mount %s' % path)</div>
<div> File "/usr/lib/python2.6/site-packages/StarCluster-0.94.3-py2.6.egg/starcluster/sshutils/__init__.py", line 555, in execute</div><div> msg, command, exit_status, out_str)</div><div>RemoteCommandFailed: remote command 'source /etc/profile && mount /home' failed with status 32:</div>
<div>mount.nfs: access denied by server while mounting master:/home</div><div><br></div><div>!!! ERROR - Oops! Looks like you've found a bug in StarCluster</div><div><br></div><div>!!! ERROR - Crash report written to: /root/.starcluster/logs/crash-report-15021.txt</div>
<div>!!! ERROR - Please remove any sensitive data from the crash report</div><div>!!! ERROR - and submit it to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div><div><br></div><div style>Thanks in advance for any advice, fix, workaround -- anything.</div>
<div style><br></div><div style>Regards,</div><div style>Lyn</div><div><br></div><div><br></div></div></div>