<div dir="ltr">Hi all - I have a 50-node spot cluster running. I tried to add 10 more nodes, and the operation failed partway through. Only 2 nodes were added to the cluster, and they aren't getting SGE jobs. I tried re-adding the nodes using '-x -a', but that fails. I then tried to remove the nodes, and that fails as well. How do I fix this? Here's the output:<div>
<br></div><div><p style="margin:0px;font-size:11px;font-family:Menlo">[ec2-user@awsmicro plugins]$ starcluster removenode ngscluster node060</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">StarCluster - (<a href="http://star.mit.edu/cluster">http://star.mit.edu/cluster</a>) (v. 0.9999)</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">Software Tools for Academics and Researchers (STAR)</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></p>
<p style="margin:0px;font-size:11px;font-family:Menlo;min-height:13px"><br></p>
<p style="margin:0px;font-size:11px;font-family:Menlo">>>> Running plugin tagger.TaggerPlugin</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">>>> Running plugin setupuserenv.SetupUserEnvironment</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">>>> Running plugin starcluster.plugins.users.CreateUsers</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">>>> Running plugin starcluster.plugins.sge.SGEPlugin</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">>>> Removing node060 from SGE</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">!!! ERROR - Error occured while running plugin 'starcluster.plugins.sge.SGEPlugin':</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">!!! ERROR - remote command 'source /etc/profile && qconf -dattr</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">!!! ERROR - hostgroup hostlist node060 @allhosts' failed with status 1:</p>
<p style="margin:0px;font-size:11px;font-family:Menlo">!!! ERROR - error writing object "@allhosts" to spooling database</p></div><div><br></div><div><br></div><div>At this point, I have to go into the AWS web console and remove the nodes myself, since StarCluster can't.</div>
<div><br></div></div>