[StarCluster] Bug.. need urgent workaround

Sergio Mafra sergiohmafra at gmail.com
Wed Jul 10 08:58:56 EDT 2013


Dear Friends,

I'm trying to set up a new cluster and StarCluster hangs at this point:

It's a fresh creation of two cr1.8xlarge instances using spot instances.
I'm using the latest version of SC, 0.9999 (as shown in the traceback paths).

----
ERROR - Error occured while running plugin 'starcluster.clustersetup.DefaultClusterSetup':
!!! ERROR - error occurred in job (id=master): remote command 'source /etc/profile && groupadd -o -g 1000 sgeadmin' failed with status 9:
groupadd: group sgeadmin exists
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 31, in run
    job.run()
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 58, in run
    r = self.method(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/clustersetup.py", line 193, in _add_user_to_node
    node.add_user(self._user, uid, gid, self._user_shell)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/node.py", line 422, in add_user
    self.ssh.execute('groupadd -o -g %s %s' % (gid, name))
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 538, in execute
    msg, command, exit_status, out_str)
RemoteCommandFailed: remote command 'source /etc/profile && groupadd -o -g 1000 sgeadmin' failed with status 9:
groupadd: group sgeadmin exists

error occurred in job (id=node001): remote command 'source /etc/profile && groupadd -o -g 1000 sgeadmin' failed with status 9:
groupadd: group sgeadmin exists
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 31, in run
    job.run()
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 58, in run
    r = self.method(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/clustersetup.py", line 193, in _add_user_to_node
    node.add_user(self._user, uid, gid, self._user_shell)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/node.py", line 422, in add_user
    self.ssh.execute('groupadd -o -g %s %s' % (gid, name))
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 538, in execute
    msg, command, exit_status, out_str)
RemoteCommandFailed: remote command 'source /etc/profile && groupadd -o -g 1000 sgeadmin' failed with status 9:
groupadd: group sgeadmin exists


!!! ERROR - Oops! Looks like you've found a bug in StarCluster
!!! ERROR - Crash report written to:
/home/ubuntu/.starcluster/logs/crash-report-8819.txt
!!! ERROR - Please remove any sensitive data from the crash report
!!! ERROR - and submit it to starcluster at mit.edu
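
For reference, the failing step is the unconditional `groupadd -o -g 1000 sgeadmin`: the `-o` flag only permits a non-unique GID, so groupadd still exits with status 9 when the group name is already taken. Below is a minimal sketch of the kind of guarded command I am hoping for as a workaround. The helper name `guarded_groupadd` is hypothetical and not part of StarCluster, and I am assuming `getent` is available on the nodes. It uses `shlex.quote`, which needs Python 3; on Python 2.7 `pipes.quote` would be the equivalent.

```python
import shlex


def guarded_groupadd(name, gid):
    """Build a shell command that creates the group only if it is missing.

    Hypothetical helper for illustration; StarCluster itself runs an
    unconditional groupadd, as seen in the traceback.
    """
    quoted = shlex.quote(name)
    # `getent group NAME` exits non-zero when the group does not exist,
    # so groupadd runs only in that case.
    return "getent group %s >/dev/null || groupadd -o -g %d %s" % (
        quoted, int(gid), quoted)


if __name__ == "__main__":
    print(guarded_groupadd("sgeadmin", 1000))
```

Running the printed command by hand on a node should leave an existing sgeadmin group untouched instead of failing, although it does not check that the existing group actually has GID 1000.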

All the best,

Sergio