Hi,<br><br>Two issues here, as reported earlier. On the first one, running with the new logging<br>turned on, I see an intermittent failure of 'starcluster addnode <clustername>'.<br>The error trace from the log file is below.<br>
<br>Second, the ELB adds too many nodes when it is set to add more than one node<br>per iteration. The code at StarCluster/starcluster/plugins/sge/__init__.py<br>at line 637 caps the count only by add_nodes_per_iteration:<br><br> if need_to_add > 0:<br> need_to_add = min(self.add_nodes_per_iteration, need_to_add)<br>
<br>The fix could be as simple as:<br><br> if need_to_add > 0:<br> head_room = self.max_nodes - self.stat.hosts<br> need_to_add = min(self.add_nodes_per_iteration, need_to_add, head_room)<br>
<br>depending upon what you know about self.max_nodes and self.stat.hosts.<br><br>Regards,<br><br>Don<br><br>PID: 12630 cluster.py:686 - DEBUG - returning self._nodes = [<Node: master (i-13d0707d)>, <Node: node001 (i-11d0\<br>
707f)>, <Node: node002 (i-efd07081)>, <Node: node003 (i-edd07083)>]<br>PID: 12630 cluster.py:670 - DEBUG - existing nodes: {u'i-13d0707d': <Node: master (i-13d0707d)>, u'i-11d0707f': \<br>
<Node: node001 (i-11d0707f)>, u'i-efd07081': <Node: node002 (i-efd07081)>, u'i-edd07083': <Node: node003 (i-edd0\<br>7083)>}<br>PID: 12630 cluster.py:673 - DEBUG - updating existing node i-13d0707d in self._nodes<br>
PID: 12630 cluster.py:673 - DEBUG - updating existing node i-11d0707f in self._nodes<br>PID: 12630 cluster.py:673 - DEBUG - updating existing node i-efd07081 in self._nodes<br>PID: 12630 cluster.py:673 - DEBUG - updating existing node i-edd07083 in self._nodes<br>
PID: 12630 cluster.py:686 - DEBUG - returning self._nodes = [<Node: master (i-13d0707d)>, <Node: node001 (i-11d0\<br>707f)>, <Node: node002 (i-efd07081)>, <Node: node003 (i-edd07083)>]<br>PID: 12630 clustersetup.py:96 - INFO - Configuring hostnames...<br>
PID: 12630 cli.py:182 - DEBUG - Traceback (most recent call last):<br> File "build/bdist.linux-i686/egg/starcluster/cli.py", line 160, in main<br> sc.execute(args)<br> File "build/bdist.linux-i686/egg/starcluster/commands/addnode.py", line 37, in execute<br>
self.cm.add_node(tag, aliases)<br> File "build/bdist.linux-i686/egg/starcluster/cluster.py", line 119, in add_node<br> cl.add_node(alias)<br> File "build/bdist.linux-i686/egg/starcluster/cluster.py", line 770, in add_node<br>
self.add_nodes(1, aliases=aliases)<br> File "build/bdist.linux-i686/egg/starcluster/cluster.py", line 805, in add_nodes<br> self.volumes)<br> File "build/bdist.linux-i686/egg/starcluster/clustersetup.py", line 510, in on_add_node<br>
self._setup_hostnames(nodes=[node])<br> File "build/bdist.linux-i686/egg/starcluster/clustersetup.py", line 98, in _setup_hostnames<br> self.pool.simple_job(node.set_hostname, (), jobid=node.alias)<br>AttributeError: 'NoneType' object has no attribute 'set_hostname'<br>
<br>PID: 12630 cli.py:129 - ERROR - Oops! Looks like you've found a bug in StarCluster<br>PID: 12630 cli.py:130 - ERROR - Debug file written to: /tmp/starcluster-debug-staruser.log<br>PID: 12630 cli.py:131 - ERROR - Look for lines starting with PID: 12630<br>
PID: 12630 cli.py:132 - ERROR - Please submit this file, minus any private information,<br>PID: 12630 cli.py:133 - ERROR - to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>PID: 12630 ssh.py:536 - DEBUG - __del__ called<br>
PID: 12630 ssh.py:536 - DEBUG - __del__ called<br><br>
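P.S. For the second issue, here is a minimal standalone sketch of the head-room clamp suggested above.<br>
The function name nodes_to_add and its parameters are hypothetical, not StarCluster's API, and it assumes<br>
the current host count is an integer (if self.stat.hosts is a list, pass its length instead).<br>

```python
def nodes_to_add(need_to_add, add_nodes_per_iteration, max_nodes, current_hosts):
    """Return how many nodes to add this iteration without exceeding max_nodes.

    Hypothetical helper for illustration only. current_hosts is assumed to be
    an integer count of hosts already in the cluster.
    """
    if need_to_add <= 0:
        return 0
    # Remaining room under the cluster size limit; never negative.
    head_room = max(0, max_nodes - current_hosts)
    # Cap by the per-iteration limit, the demand, and the remaining room.
    return min(add_nodes_per_iteration, need_to_add, head_room)


if __name__ == "__main__":
    # 9 hosts of a 10-host limit: only one more node fits.
    print(nodes_to_add(5, 3, 10, 9))
```

The max(0, ...) guard is an extra precaution so that a cluster already over its limit yields zero rather than a negative count.<br>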