[StarCluster] starcluster (potential) bug report

Yevgeny Popkov ypopkov at gmail.com
Wed Mar 13 22:02:30 EDT 2013


Just FYI: once I changed CLUSTER_USER in the config back to sgeadmin, the
error disappeared.
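
For anyone who wants to check the same setting, here is a minimal sketch that lists the CLUSTER_USER value for each cluster template. It assumes the usual INI-style StarCluster config with `[cluster NAME]` sections. The sample config text and the helper name `cluster_users` are illustrative and not taken from the actual config in this report, and the sketch uses Python 3's `configparser` rather than the Python 2.7 environment shown in the log.

```python
import configparser

# Illustrative config text (an assumption); the real file is not shown in this thread.
SAMPLE_CONFIG = """
[cluster smallcluster]
CLUSTER_SIZE = 2
CLUSTER_USER = sgeadmin
"""


def cluster_users(config_text):
    """Return {template_name: CLUSTER_USER} for every [cluster ...] section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    users = {}
    for section in parser.sections():
        if section.startswith("cluster "):
            name = section.split(None, 1)[1]
            # configparser lowercases option names, so lookups are case-insensitive.
            # None means the template does not set CLUSTER_USER explicitly.
            users[name] = parser.get(section, "cluster_user", fallback=None)
    return users


if __name__ == "__main__":
    for name, user in cluster_users(SAMPLE_CONFIG).items():
        print(name, user)
```

A template that reports anything other than sgeadmin would be the first place to look when reproducing the error quoted below.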

Thanks,
Yevgeny

On Wed, Mar 13, 2013 at 9:44 PM, Yevgeny Popkov <ypopkov at gmail.com> wrote:

> ubuntu@ip-10-149-30-54:~$ sc start smallcluster
> StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to starcluster at mit.edu
>
> >>> Using default cluster template: smallcluster
> >>> Validating cluster template settings...
> >>> Cluster template settings are valid
> >>> Starting cluster...
> >>> Launching a 2-node cluster...
> >>> Launching master node (ami: ami-c4801ead, type: m3.xlarge)...
> >>> Creating security group @sc-smallcluster...
> Reservation:r-a3a18bd9
> >>> Launching node001 (ami: ami-c4801ead, type: m3.xlarge)
> SpotInstanceRequest:sir-f3ac7014
> >>> Waiting for cluster to come up... (updating every 30s)
> >>> Waiting for all nodes to be in a 'running' state...
> 1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
> >>> Waiting for SSH to come up on all nodes...
> 1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
> >>> Waiting for cluster to come up took 2.215 mins
> >>> The master node is ec2-50-19-195-49.compute-1.amazonaws.com
> >>> Configuring cluster...
> >>> Attaching volume vol-79faa709 to master node on /dev/sdz ...
> >>> Waiting for vol-79faa709 to transition to: attached...
> >>> Running plugin starcluster.clustersetup.DefaultClusterSetup
> >>> Configuring hostnames...
> 2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
> !!! ERROR - Error occured while running plugin 'starcluster.clustersetup.DefaultClusterSetup':
> !!! ERROR - error occurred in job (id=node001): failed to connect to host ec2-23-22-91-158.compute-1.amazonaws.com on port 22
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 31, in run
>     job.run()
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 58, in run
>     r = self.method(*self.args, **self.kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/node.py", line 788, in set_hostname
>     hostname_file = self.ssh.remote_file("/etc/hostname", "w")
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 291, in remote_file
>     rfile = self.sftp.open(file, mode)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 181, in sftp
>     self._sftp = paramiko.SFTPClient.from_transport(self.transport)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 130, in transport
>     port=self._port, timeout=self._timeout)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 97, in connect
>     raise exception.SSHConnectionError(host, port)
> SSHConnectionError: failed to connect to host ec2-23-22-91-158.compute-1.amazonaws.com on port 22
>
>
> !!! ERROR - Oops! Looks like you've found a bug in StarCluster
> !!! ERROR - Crash report written to: /home/ubuntu/.starcluster/logs/crash-report-2092.txt
> !!! ERROR - Please remove any sensitive data from the crash report
> !!! ERROR - and submit it to starcluster at mit.edu
>
>