[StarCluster] Bug in Starcluster / paramiko

Paul Koerbitz paul.koerbitz at gmail.com
Tue Jan 3 17:17:02 EST 2012


Dear Starcluster Team,

it looks like I stumbled upon a bug in Starcluster / paramiko. I was just
trying to set up my first cluster, so it is also possible that I did
something wrong.
Please find the bug report attached.

I am running Ubuntu 10.04 amd64 with python 2.6 running under virtualenv.
Please let me know if you need any more information.

kind regards
Paul
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/pipermail/starcluster/attachments/20120103/c34940de/attachment.htm
-------------- next part --------------
---------- CRASH DETAILS ----------
2012-01-03 23:00:31,190 PID: 26980 config.py:551 - DEBUG - Loading config
2012-01-03 23:00:31,190 PID: 26980 config.py:119 - DEBUG - Loading file: /home/paul/.starcluster/config
2012-01-03 23:00:31,193 PID: 26980 awsutils.py:55 - DEBUG - creating self._conn w/ connection_authenticator kwargs = {'path': '/', 'region': RegionInfo:eu-west-1, 'port': None, 'is_secure': True}
2012-01-03 23:00:31,942 PID: 26980 start.py:178 - INFO - Using default cluster template: smallcluster
2012-01-03 23:00:31,942 PID: 26980 base.py:149 - WARNING - ************************************************************
2012-01-03 23:00:31,947 PID: 26980 base.py:153 - WARNING - Waiting 5 seconds before continuing...
2012-01-03 23:00:31,948 PID: 26980 base.py:154 - WARNING - Press CTRL-C to cancel...
2012-01-03 23:00:36,955 PID: 26980 cluster.py:1511 - INFO - Validating cluster template settings...
2012-01-03 23:00:38,018 PID: 26980 cluster.py:1526 - INFO - Cluster template settings are valid
2012-01-03 23:00:38,019 PID: 26980 cluster.py:1403 - INFO - Starting cluster...
2012-01-03 23:00:38,019 PID: 26980 cluster.py:929 - INFO - Launching a 1-node cluster...
2012-01-03 23:00:38,019 PID: 26980 cluster.py:982 - INFO - Launching master node (ami: ami-953b06e1, type: t1.micro)...
2012-01-03 23:00:38,180 PID: 26980 awsutils.py:162 - INFO - Creating security group @sc-spotcluster...
2012-01-03 23:00:40,486 PID: 26980 cluster.py:767 - INFO - SpotInstanceRequest:sir-ac743c13
2012-01-03 23:00:40,487 PID: 26980 cluster.py:1213 - INFO - Waiting for cluster to come up... (updating every 30s)
2012-01-03 23:00:40,700 PID: 26980 cluster.py:1143 - INFO - Waiting for open spot requests to become active...
2012-01-03 23:04:43,722 PID: 26980 cluster.py:660 - DEBUG - existing nodes: {}
2012-01-03 23:04:43,722 PID: 26980 cluster.py:668 - DEBUG - adding node i-04a43d4d to self._nodes list
2012-01-03 23:04:44,349 PID: 26980 cluster.py:676 - DEBUG - returning self._nodes = [<Node: master (i-04a43d4d)>]
2012-01-03 23:04:44,349 PID: 26980 cluster.py:1171 - INFO - Waiting for all nodes to be in a 'running' state...
2012-01-03 23:04:44,529 PID: 26980 cluster.py:660 - DEBUG - existing nodes: {u'i-04a43d4d': <Node: master (i-04a43d4d)>}
2012-01-03 23:04:44,530 PID: 26980 cluster.py:663 - DEBUG - updating existing node i-04a43d4d in self._nodes
2012-01-03 23:04:44,531 PID: 26980 cluster.py:676 - DEBUG - returning self._nodes = [<Node: master (i-04a43d4d)>]
2012-01-03 23:05:15,032 PID: 26980 cluster.py:660 - DEBUG - existing nodes: {u'i-04a43d4d': <Node: master (i-04a43d4d)>}
2012-01-03 23:05:15,032 PID: 26980 cluster.py:663 - DEBUG - updating existing node i-04a43d4d in self._nodes
2012-01-03 23:05:15,032 PID: 26980 cluster.py:676 - DEBUG - returning self._nodes = [<Node: master (i-04a43d4d)>]
2012-01-03 23:05:15,032 PID: 26980 cluster.py:1189 - INFO - Waiting for SSH to come up on all nodes...
2012-01-03 23:05:15,477 PID: 26980 cluster.py:660 - DEBUG - existing nodes: {u'i-04a43d4d': <Node: master (i-04a43d4d)>}
2012-01-03 23:05:15,477 PID: 26980 cluster.py:663 - DEBUG - updating existing node i-04a43d4d in self._nodes
2012-01-03 23:05:15,478 PID: 26980 cluster.py:676 - DEBUG - returning self._nodes = [<Node: master (i-04a43d4d)>]
2012-01-03 23:05:15,971 PID: 26980 ssh.py:69 - DEBUG - loading private key /home/paul/.ssh/paulkoer_aws.pem
2012-01-03 23:05:15,971 PID: 26980 ssh.py:76 - DEBUG - specified key does not end in either rsa or dsa, trying both
2012-01-03 23:05:15,972 PID: 26980 ssh.py:154 - DEBUG - Using private key /home/paul/.ssh/paulkoer_aws.pem (rsa)
2012-01-03 23:05:15,972 PID: 26980 ssh.py:91 - DEBUG - connecting to host ec2-176-34-221-54.eu-west-1.compute.amazonaws.com on port 22 as user root
2012-01-03 23:05:47,828 PID: 26980 cluster.py:660 - DEBUG - existing nodes: {u'i-04a43d4d': <Node: master (i-04a43d4d)>}
2012-01-03 23:05:47,829 PID: 26980 cluster.py:663 - DEBUG - updating existing node i-04a43d4d in self._nodes
2012-01-03 23:05:47,829 PID: 26980 cluster.py:676 - DEBUG - returning self._nodes = [<Node: master (i-04a43d4d)>]
2012-01-03 23:05:48,041 PID: 26980 ssh.py:91 - DEBUG - connecting to host ec2-176-34-221-54.eu-west-1.compute.amazonaws.com on port 22 as user root
2012-01-03 23:05:49,059 PID: 26980 utils.py:88 - INFO - Waiting for cluster to come up took 5.143 mins
2012-01-03 23:05:49,060 PID: 26980 cluster.py:1430 - INFO - The master node is ec2-176-34-221-54.eu-west-1.compute.amazonaws.com
2012-01-03 23:05:49,061 PID: 26980 cluster.py:1431 - INFO - Setting up the cluster...
2012-01-03 23:05:49,253 PID: 26980 cluster.py:660 - DEBUG - existing nodes: {u'i-04a43d4d': <Node: master (i-04a43d4d)>}
2012-01-03 23:05:49,254 PID: 26980 cluster.py:663 - DEBUG - updating existing node i-04a43d4d in self._nodes
2012-01-03 23:05:49,255 PID: 26980 cluster.py:676 - DEBUG - returning self._nodes = [<Node: master (i-04a43d4d)>]
2012-01-03 23:05:49,255 PID: 26980 clustersetup.py:96 - INFO - Configuring hostnames...
2012-01-03 23:05:49,273 PID: 26980 ssh.py:173 - DEBUG - creating sftp connection
2012-01-03 23:05:49,273 PID: 26980 threadpool.py:136 - DEBUG - unfinished_tasks = 1
2012-01-03 23:05:50,276 PID: 26980 threadpool.py:136 - DEBUG - unfinished_tasks = 1
2012-01-03 23:05:51,278 PID: 26980 threadpool.py:136 - DEBUG - unfinished_tasks = 1
2012-01-03 23:05:52,280 PID: 26980 threadpool.py:124 - INFO - Shutting down threads...
2012-01-03 23:05:52,282 PID: 26980 threadpool.py:136 - DEBUG - unfinished_tasks = 20
2012-01-03 23:05:53,285 PID: 26980 cli.py:263 - DEBUG - error occured in job (id=master): Garbage packet received
Traceback (most recent call last):
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/starcluster/threadpool.py", line 32, in run
    job.run()
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/starcluster/threadpool.py", line 59, in run
    r = self.method(*self.args, **self.kwargs)
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/starcluster/node.py", line 661, in set_hostname
    hostname_file = self.ssh.remote_file("/etc/hostname", "w")
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/starcluster/ssh.py", line 284, in remote_file
    rfile = self.sftp.open(file, mode)
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/starcluster/ssh.py", line 174, in sftp
    self._sftp = paramiko.SFTPClient.from_transport(self.transport)
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/paramiko/sftp_client.py", line 106, in from_transport
    return cls(chan)
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/paramiko/sftp_client.py", line 87, in __init__
    server_version = self._send_version()
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/paramiko/sftp.py", line 108, in _send_version
    t, data = self._read_packet()
  File "/home/paul/.virtualenvs/np/lib/python2.6/site-packages/paramiko/sftp.py", line 179, in _read_packet
    raise SFTPError('Garbage packet received')
SFTPError: Garbage packet received

---------- SYSTEM INFO ----------
StarCluster: 0.92.1
Python: 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)  [GCC 4.4.3]
Platform: Linux-2.6.32-37-generic-x86_64-with-Ubuntu-10.04-lucid
boto: 2.0
paramiko: 1.7.7.1 (George)
Crypto: 2.4.1
jinja2: 2.5.5
decorator: 3.3.1


More information about the StarCluster mailing list