[StarCluster] Error creating/deleting security groups

Avner May avnermay at cs.columbia.edu
Tue Feb 24 16:55:43 EST 2015


Hi all,

I'm getting the errors below when I try to start a cluster or run
listclusters.  This all started after I terminated a cluster.  I got an
error during termination, and the message said to use the "-f" flag to force
termination.  I did that, but deleting the security group was taking a very
long time, so I interrupted the "terminate -f" command.  I've been having
issues ever since.  When I try to start a cluster, it hangs at the step that
says "Waiting for security group @sc-<cluster-name>" (I've waited 20+
minutes for a cluster to start) and then usually fails with an error like
the first one below.  The "listclusters" command fails as well.  The
failures all seem to come from the get_all_security_groups and
create_security_group functions in
"C:\Python27\lib\site-packages\boto\ec2\connection.py".  Any idea what might
be going on?  Help would be much appreciated, as this is completely blocking
my work.

Thanks a lot,
Avner

C:\Windows\system32>starcluster start babel
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster at mit.edu

>>> Using default cluster template: main
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 20-node cluster...
>>> Creating security group @sc-babel...
>>> Waiting for security group @sc-babel...
!!! ERROR - InternalError: An internal error has occurred
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cli.py", line 274, in main
    sc.execute(args)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\commands\start.py", line 244, in execute
    validate_running=validate_running)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1628, in start
    return self._start(create=create, create_only=create_only)
  File "<string>", line 2, in _start
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\utils.py", line 112, in wrap_f
    res = func(*arg, **kargs)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1643, in _start
    self.create_cluster()
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1163, in create_cluster
    self._create_flat_rate_cluster()
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1185, in _create_flat_rate_cluster
    force_flat=True)[0]
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 926, in create_nodes
    cluster_sg = self.cluster_group.name
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 655, in cluster_group
    vpc_id=vpc_id)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\awsutils.py", line 300, in create_group
    while not self.get_group_or_none(name):
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\awsutils.py", line 333, in get_group_or_none
    return self.get_security_group(name)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\awsutils.py", line 357, in get_security_group
    filters={'group-name': groupname})[0]
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\awsutils.py", line 369, in get_security_groups
    return self.conn.get_all_security_groups(filters=filters)
  File "C:\Python27\lib\site-packages\boto\ec2\connection.py", line 2968, in get_all_security_groups
    [('item', SecurityGroup)], verb='POST')
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1169, in get_list
    response = self.make_request(action, params, path, verb)
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1115, in make_request
    return self._mexe(http_request)
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1027, in _mexe
    raise BotoServerError(response.status, response.reason, body)
BotoServerError: BotoServerError: 500 Internal Server Error
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InternalError</Code><Message>An internal error has occurred</Message></Error></Errors><RequestID>808ce646-9203-412f-8fa9-0d994e74e418</RequestID></Response>
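
For reference, here is a minimal sketch of how the error body above could be picked apart to get the error code and request ID, which are probably the most useful pieces to quote when reporting the problem. It uses only the Python standard library. The function name parse_ec2_error is made up for illustration and is not part of StarCluster or boto; the body is the one from the traceback above, rejoined onto one line.

```python
import xml.etree.ElementTree as ET

# Error body copied from the traceback above, with the wrapped lines rejoined.
BODY = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<Response><Errors><Error><Code>InternalError</Code>'
    b'<Message>An internal error has occurred</Message></Error></Errors>'
    b'<RequestID>808ce646-9203-412f-8fa9-0d994e74e418</RequestID></Response>'
)


def parse_ec2_error(body):
    """Return (code, message, request_id) from an EC2-style error body.

    Hypothetical helper for illustration only; it assumes the
    Response/Errors/Error layout seen in the output above.
    """
    root = ET.fromstring(body)
    return (
        root.findtext("Errors/Error/Code"),
        root.findtext("Errors/Error/Message"),
        root.findtext("RequestID"),
    )


code, message, request_id = parse_ec2_error(BODY)
print(code)
print(message)
print(request_id)
```

The code value here is InternalError with HTTP status 500, which points at a server-side failure rather than a problem with the request itself.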

I am also seeing the following error:
C:\Windows\system32>starcluster listclusters
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster at mit.edu

!!! ERROR - InternalError: An internal error has occurred
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cli.py", line 274, in main
    sc.execute(args)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\commands\listclusters.py", line 36, in execute
    show_ssh_status=self.opts.show_ssh_status)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 280, in list_clusters
    cluster_groups = self.get_cluster_security_groups()
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 253, in get_cluster_security_groups
    sgs = self.ec2.get_security_groups(filters={'group-name': glob})
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\awsutils.py", line 369, in get_security_groups
    return self.conn.get_all_security_groups(filters=filters)
  File "C:\Python27\lib\site-packages\boto\ec2\connection.py", line 2968, in get_all_security_groups
    [('item', SecurityGroup)], verb='POST')
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1169, in get_list
    response = self.make_request(action, params, path, verb)
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1115, in make_request
    return self._mexe(http_request)
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1027, in _mexe
    raise BotoServerError(response.status, response.reason, body)
BotoServerError: BotoServerError: 500 Internal Server Error
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InternalError</Code><Message>An internal error has occurred</Message></Error></Errors><RequestID>c18d1a11-a6a6-4a8d-a742-f3f69b593189</RequestID></Response>

I also got this error recently:
C:\Windows\system32>starcluster start babel2
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster at mit.edu

>>> Using default cluster template: main
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 20-node cluster...
>>> Creating security group @sc-babel2...
!!! ERROR - VPCIdNotSpecified: No default VPC for this user
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cli.py", line 274, in main
    sc.execute(args)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\commands\start.py", line 244, in execute
    validate_running=validate_running)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1628, in start
    return self._start(create=create, create_only=create_only)
  File "<string>", line 2, in _start
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\utils.py", line 112, in wrap_f
    res = func(*arg, **kargs)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1643, in _start
    self.create_cluster()
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1163, in create_cluster
    self._create_flat_rate_cluster()
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 1185, in _create_flat_rate_cluster
    force_flat=True)[0]
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 926, in create_nodes
    cluster_sg = self.cluster_group.name
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\cluster.py", line 655, in cluster_group
    vpc_id=vpc_id)
  File "C:\Python27\lib\site-packages\starcluster-0.95.6-py2.7.egg\starcluster\awsutils.py", line 296, in create_group
    sg = self.conn.create_security_group(name, description, vpc_id=vpc_id)
  File "C:\Python27\lib\site-packages\boto\ec2\connection.py", line 3003, in create_security_group
    SecurityGroup, verb='POST')
  File "C:\Python27\lib\site-packages\boto\connection.py", line 1207, in get_object
    raise self.ResponseError(response.status, response.reason, body)
EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>VPCIdNotSpecified</Code><Message>No default VPC for this user</Message></Error></Errors><RequestID>8884b8a7-ad31-4ca9-837a-2d5982dbda43</RequestID></Response>
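
The two kinds of failure above differ in a way that may matter for any workaround: the 500 InternalError responses are server-side and might be transient, while the 400 VPCIdNotSpecified response is about the request itself and would usually not go away on a retry. Below is a rough sketch of a retry wrapper that treats them differently. It is not how StarCluster or boto handle this. FakeServerError, call_with_retry and flaky_list_groups are all hypothetical names, and the failing call is simulated so the example runs without AWS access.

```python
import time


class FakeServerError(Exception):
    """Hypothetical stand-in for an HTTP error that carries a status code."""

    def __init__(self, status, code):
        Exception.__init__(self, "%s %s" % (status, code))
        self.status = status
        self.code = code


def call_with_retry(func, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); retry 5xx errors with exponential backoff.

    4xx errors are re-raised immediately, and a 5xx error is re-raised
    once the allowed number of attempts is used up.
    """
    for attempt in range(attempts):
        try:
            return func()
        except FakeServerError as err:
            last_try = attempt == attempts - 1
            if err.status < 500 or last_try:
                raise
            # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
            sleep(base_delay * 2 ** attempt)


# Simulated call: two 500 responses followed by a successful result.
responses = [
    FakeServerError(500, "InternalError"),
    FakeServerError(500, "InternalError"),
    ["@sc-babel"],
]


def flaky_list_groups():
    item = responses.pop(0)
    if isinstance(item, Exception):
        raise item
    return item


delays = []  # record the waits instead of actually sleeping
result = call_with_retry(flaky_list_groups, sleep=delays.append)
print(result)
print(delays)
```

Passing sleep as a parameter keeps the sketch quick to run. With real calls the default time.sleep would apply, and the attempt count and delay would need tuning.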