<div dir="ltr">Sorry, I forgot to include the list.<br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Milton Pividori</b> <span dir="ltr"><<a href="mailto:miltondp@gmail.com">miltondp@gmail.com</a>></span><br>
Date: 2013/11/5<br>Subject: Re: [StarCluster] Recover cluster after an error when starting<br>To: "MacMullan, Hugh" <<a href="mailto:hughmac@wharton.upenn.edu">hughmac@wharton.upenn.edu</a>><br><br><br><div dir="ltr">
Thank you, Hugh. I just discovered what "restart" does, and I will try it next time.<div><br></div><div>However, what I did this time was increase the mount timeout in the file starcluster/node.py, in the mount_nfs_shares function, line 725 (I am using StarCluster 0.94.2). I added the NFS mount option "timeo=20", and it worked.</div>
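<div><br></div><div>The change described above can be sketched roughly as follows. This is a minimal illustration, not StarCluster's actual code: it assumes the NFS options end up in an /etc/fstab entry on each node (which the bare "mount /home" command in the quoted traceback suggests), and the names build_fstab_entry and DEFAULT_NFS_OPTIONS, as well as the default option list, are hypothetical.</div>
<pre>
```python
# Hypothetical helper, NOT part of StarCluster: it only illustrates how an
# NFS "timeo" mount option could be made configurable.

# Assumed default NFS options, chosen for illustration only.
DEFAULT_NFS_OPTIONS = ("vers=3", "user", "rw", "exec", "noauto")


def build_fstab_entry(server, path, timeo=None, options=DEFAULT_NFS_OPTIONS):
    """Return an /etc/fstab line that mounts server:path at path over NFS.

    timeo is passed through to the NFS client unchanged; nfs(5) documents
    its unit as tenths of a second.
    """
    opts = list(options)
    if timeo is not None:
        opts.append("timeo=%d" % timeo)
    return "%s:%s %s nfs %s 0 0" % (server, path, path, ",".join(opts))


# Same value as in the message above.
entry = build_fstab_entry("master", "/home", timeo=20)
print(entry)
```
</pre>
<div>With timeo left as None the entry is unchanged, so a value read from a config setting could be passed straight through.</div>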
<div><br></div><div>Maybe it would be good to have a "timeout" option in the config file.</div><div><br></div><div>Thank you again!</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2013/11/5 MacMullan, Hugh <span dir="ltr"><<a href="mailto:hughmac@wharton.upenn.edu" target="_blank">hughmac@wharton.upenn.edu</a>></span><div>
<div class="h5"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi Milton:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I would generally do a restart (starcluster restart mycluster).<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">-Hugh<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> <a href="mailto:starcluster-bounces@mit.edu" target="_blank">starcluster-bounces@mit.edu</a> [mailto:<a href="mailto:starcluster-bounces@mit.edu" target="_blank">starcluster-bounces@mit.edu</a>]
<b>On Behalf Of </b>Milton Pividori<br>
<b>Sent:</b> Tuesday, November 05, 2013 11:50 AM<br>
<b>To:</b> <a href="mailto:starcluster@mit.edu" target="_blank">starcluster@mit.edu</a><br>
<b>Subject:</b> [StarCluster] Recover cluster after an error when starting<u></u><u></u></span></p><div><div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Hi all,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I am a new user of StarCluster. First of all, thank you for this great software!<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">My question is how to recover a cluster after an error occurs while starting it. After I ran "starcluster start mycluster", I got a timeout error when mounting the /home directory (an EBS volume). Is it possible to run the plugin again?
In this case, I think the plugin is "starcluster.clustersetup.DefaultClusterSetup".<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">This is the last part of the error output (the cluster has 10 nodes, all t1.micro):<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<pre>>>> Starting NFS server on master
>>> Configuring NFS exports path(s):
/home
>>> Mounting all NFS export path(s) on 9 worker node(s)
9/9 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
!!! ERROR - Error occured while running plugin 'starcluster.clustersetup.DefaultClusterSetup':
!!! ERROR - error occurred in job (id=node009): remote command 'source /etc/profile && mount /home' failed with status 32:
mount.nfs: mount to NFS server 'master:/home' failed: timed out, giving up
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94.2-py2.7.egg/starcluster/threadpool.py", line 48, in run
    job.run()
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94.2-py2.7.egg/starcluster/threadpool.py", line 75, in run
    r = self.method(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94.2-py2.7.egg/starcluster/node.py", line 731, in mount_nfs_shares
    self.ssh.execute('mount %s' % path)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94.2-py2.7.egg/starcluster/sshutils/__init__.py", line 555, in execute
    msg, command, exit_status, out_str)
RemoteCommandFailed: remote command 'source /etc/profile && mount /home' failed with status 32:
mount.nfs: mount to NFS server 'master:/home' failed: timed out, giving up
</pre>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Thank you!<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal">-- <br>
<span style="font-family:"Tahoma","sans-serif"">Milton Pividori<br>
Blog: <a href="http://www.miltonpividori.com.ar" target="_blank">www.miltonpividori.com.ar</a></span><u></u><u></u></p>
</div>
</div>
</div></div></div>
</div>
</blockquote></div></div></div><div><div class="h5"><br>
</div></div></div>
</div>
</div>