[StarCluster] poor behavior from addnode
Cedar McKay
cmckay at uw.edu
Fri Jun 6 00:10:19 EDT 2014
If I use addnode to add multiple nodes to a cluster (see below), and one of them fails to launch because I've already reached my maximum spot instance count, the command crashes with a traceback, yet some of the requested nodes still boot. Those nodes come up without ever being integrated into the cluster.
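To make the expectation concrete, here is a minimal sketch of the behavior I would have hoped for. It is not StarCluster's code: `add_nodes`, `launch`, `integrate`, and `QuotaExceeded` are hypothetical stand-ins I made up for illustration. The idea is simply that a failure on one node should be recorded, and any node that did launch should still be integrated instead of being left orphaned.

```python
class QuotaExceeded(Exception):
    """Hypothetical stand-in for an error such as MaxSpotInstanceCountExceeded."""


def add_nodes(aliases, launch, integrate):
    """Try to launch every alias, then integrate whichever ones succeeded.

    `launch` and `integrate` are hypothetical callables supplied by the
    caller; they are assumptions for this sketch, not StarCluster functions.
    Returns a (launched, failed) pair.
    """
    launched, failed = [], []
    for alias in aliases:
        try:
            launch(alias)
        except QuotaExceeded as exc:
            # Record the failure and keep going instead of crashing.
            failed.append((alias, str(exc)))
        else:
            launched.append(alias)

    # Nodes that did boot are still added to the cluster.
    for alias in launched:
        integrate(alias)
    return launched, failed
```

An equally reasonable alternative would be to terminate the nodes that did launch, so that a failed addnode leaves nothing behind. Either way, the point is that no instance ends up running outside the cluster.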
Also, is this even the proper venue for reports like this? Do the creators even read this list? Or should I be submitting bug reports?
Thanks!
Cedar
rocaplab:~ cedar$ sc an -n 2 -I r3.8xlarge -b 0.40 v17b
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.5)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster at mit.edu
>>> Launching node(s): v17b-node005, v17b-node006
!!! ERROR - MaxSpotInstanceCountExceeded: Max spot instance count exceeded
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/cli.py", line 274, in main
    sc.execute(args)
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/commands/addnode.py", line 128, in execute
    no_create=self.opts.no_create)
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/cluster.py", line 189, in add_nodes
    no_create=no_create)
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/cluster.py", line 1033, in add_nodes
    spot_bid=spot_bid)
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/cluster.py", line 964, in create_nodes
    resvs.extend(self.ec2.request_instances(image_id, **kwargs))
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/awsutils.py", line 517, in request_instances
    **shared_kwargs)
  File "/Library/Python/2.7/site-packages/StarCluster-0.95.5-py2.7.egg/starcluster/awsutils.py", line 534, in request_spot_instances
    return self.conn.request_spot_instances(**kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto-2.27.0-py2.7.egg/boto/ec2/connection.py", line 1597, in request_spot_instances
    verb='POST')
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/boto-2.27.0-py2.7.egg/boto/connection.py", line 1157, in get_list
    raise self.ResponseError(response.status, response.reason, body)
EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>MaxSpotInstanceCountExceeded</Code><Message>Max spot instance count exceeded</Message></Error></Errors><RequestID>98a71653-d53d-46d4-9f07-987103dfe74d</RequestID></Response>