[Starcluster] can't start an m1.large 64 bit ubuntu cluster

Justin Riley jtriley at MIT.EDU
Fri Jun 4 13:21:30 EDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Dean,

Sorry to hear you're having issues. From the error output:

"We currently do not have sufficient m1.large capacity in
the Availability Zone you requested (us-east-1c). Our system will be
working on provisioning additional capacity. You can currently get
m1.large capacity by not specifying an Availability Zone in your request
or choosing us-east-1d, us-east-1a, us-east-1b. "

This means that the us-east-1c zone didn't have enough capacity to
fulfill your request, hence the error. I need to handle these errors
better so that it's clearer to users when they happen. Thanks for
submitting the debug file; it will help me do that.
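
As a rough illustration of what clearer handling could look like, here
is a minimal sketch that pulls the error code and message out of an EC2
error body like the one in the debug output. The function name
summarize_ec2_error is hypothetical and not part of StarCluster or boto;
it only assumes the XML layout shown in the traceback.

```python
import xml.etree.ElementTree as ET


def summarize_ec2_error(body):
    """Return (code, message) parsed from an EC2 error response body.

    Hypothetical helper: assumes the <Response><Errors><Error> layout
    seen in the debug output. Returns (None, body) if no error is found.
    """
    root = ET.fromstring(body.strip())
    error = root.find("Errors/Error")
    if error is None:
        return None, body.strip()
    code = error.findtext("Code")
    message = error.findtext("Message", default="")
    # Collapse the line wrapping inside the message into single spaces.
    return code, " ".join(message.split())
```

With something like this, the tool could print just the code and the
message instead of a full traceback and the "found a bug" notice.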

With that said, do you have an AVAILABILITY_ZONE setting in your cluster
template in the config? If so, try commenting that out and starting the
cluster again. If not, then unfortunately, for the time being, you'll
just have to keep retrying until the zone has capacity again.
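
The "try again" advice can be sketched as a small retry loop. This is
only an illustration under stated assumptions: launch is a hypothetical
callable that starts instances in a given zone (None meaning "let Amazon
choose"), and InsufficientCapacity is a stand-in exception, not a real
boto or StarCluster class.

```python
import time


class InsufficientCapacity(Exception):
    """Stand-in for an EC2 InsufficientInstanceCapacity error (hypothetical)."""


def launch_with_fallback(launch, zones, retries=3, delay=60, sleep=time.sleep):
    """Try each zone in order, repeating the whole list up to `retries` times.

    `launch` is a hypothetical callable taking a zone name (or None to
    let Amazon pick) and raising InsufficientCapacity when the zone is full.
    """
    for attempt in range(retries):
        for zone in zones:
            try:
                return launch(zone)
            except InsufficientCapacity:
                continue  # this zone is full, move on to the next one
        if attempt < retries - 1:
            sleep(delay)  # wait before going through the zones again
    raise InsufficientCapacity("no capacity in any of: %r" % (zones,))
```

For the error above, zones could be the ones the message suggests, for
example ["us-east-1d", "us-east-1a", "us-east-1b"], or [None] to leave
the choice to Amazon.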

This problem uncovers a bigger issue, which is that the StarCluster
images are only available in the us-east-1 region. StarCluster lets
Amazon choose the availability zone to launch the cluster in if you do
not specify the AVAILABILITY_ZONE setting. However, given that the
StarCluster images are only available in us-east-1, Amazon always
chooses a zone in us-east-1, which is a problem for anyone not on the
East Coast and also overloads us-east-1. I'm working on replicating the
32-bit/64-bit images in all regions to resolve this issue.

However, I believe rebundling a new image from the StarCluster images
using StarCluster's "createimage" command should place your new image in
a local region (e.g. us-west-1 if you're closer to the West Coast) as
opposed to always us-east-1.

> I need to create a custom ami based on your ami-a5c42dcc image, but
> starting fails with the following error output:

To create a new image, you do not need to start a new cluster with
StarCluster. In fact, it is recommended that you not create a new AMI
from a currently running instance started by StarCluster. The software
*should* warn you when you're attempting to do so. Instead, launch a
single instance of the StarCluster AMI using ElasticFox or the AWS
management console, log in and customize the instance, and then use
StarCluster's createimage command to create a new AMI. You can then use
this new AMI in your cluster template. See here for more details:

http://web.mit.edu/stardev/cluster/docs/create_new_ami.html

Hope that helps,

~Justin

P.S. Would you mind joining the mailing list? Thanks!




On 06/03/2010 03:48 PM, Dean Snyder wrote:
> I need to create a custom ami based on your ami-a5c42dcc image, but
> starting fails with the following error output:
> 
> config.py:281 - DEBUG - Loading config
> config.py:99 - DEBUG - Loading file: /Users/dean/.starcluster/config
> config.py:210 - DEBUG - partition setting not specified. Defaulting to 1
> cli.py:213 - INFO - Using default cluster template: smallcluster
> awsutils.py:47 - DEBUG - creating self._conn w/ connection_authenticator
> kwargs = {'path': '/', 'region': None, 'port': None, 'is_secure': True}
> cli.py:228 - INFO - Validating cluster template settings...
> awsutils.py:47 - DEBUG - creating self._conn w/ connection_authenticator
> kwargs = {'path': '/', 'region': None, 'port': None, 'is_secure': True}
> cli.py:230 - INFO - Cluster template settings are valid
> cluster.py:677 - INFO - Starting cluster...
> cluster.py:578 - INFO - Launching a 2-node cluster...
> cluster.py:583 - INFO - Launching master node...
> cluster.py:584 - INFO - Master AMI: ami-a5c42dcc
> awsutils.py:108 - INFO - Creating security group @sc-cidrcluster...
> cluster.py:605 - INFO - Launching worker nodes...
> cluster.py:606 - INFO - Node AMI: ami-a5c42dcc
> cli.py:1096 - DEBUG - Traceback (most recent call last):
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py", line
> 1075, in main
>     sc.execute(args)
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py", line
> 239, in execute
>     scluster.start(create=not self.opts.no_create)
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/utils.py",
> line 27, in wrapper
>     res = func(*arg, **kargs)
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
> line 679, in start
>     self.create_cluster()
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
> line 617, in create_cluster
>     placement=zone)
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
> line 575, in run_instances
>     placement=placement)
>   File "build/bdist.macosx-10.6-universal/egg/starcluster/awsutils.py",
> line 161, in run_instances
>     placement=placement)
>   File "/Library/Python/2.6/site-packages/boto-1.9b-py2.6.egg/boto/ec2/
> connection.py", line 463, in run_instances
>     return self.get_object('RunInstances', params, Reservation, verb='POST')
>   File "/Library/Python/2.6/site-packages/boto-1.9b-py2.6.egg/boto/
> connection.py", line 620, in get_object
>     response = self.make_request(action, params, path, verb)
>   File "/Library/Python/2.6/site-packages/boto-1.9b-py2.6.egg/boto/
> connection.py", line 591, in make_request
>     headers=headers)
>   File "/Library/Python/2.6/site-packages/boto-1.9b-py2.6.egg/boto/
> connection.py", line 459, in make_request
>     return self._mexe(method, path, data, headers, host, sender)
>   File "/Library/Python/2.6/site-packages/boto-1.9b-py2.6.egg/boto/
> connection.py", line 435, in _mexe
>     raise BotoServerError(response.status, response.reason, body)
> BotoServerError: BotoServerError: 500 Internal Server Error
> <?xml version="1.0"?>
> <Response><Errors><Error><Code>InsufficientInstanceCapacity</
> Code><Message>We currently do not have sufficient m1.large capacity in
> the Availability Zone you requested (us-east-1c). Our system will be
> working on provisioning additional capacity. You can currently get
> m1.large capacity by not specifying an Availability Zone in your request
> or choosing us-east-1d, us-east-1a, us-east-1b.</Message></Error></
> Errors><RequestID>33860e59-186d-4fec-a8fa-dae188ca0a0b</RequestID></Response>
> 
> cli.py:1098 - ERROR - Oops! Looks like you've found a bug in StarCluster
> cli.py:1099 - ERROR - Debug file written to: /var/folders/Su/Su
> +FAzOUGKCkPaVlRRfiyE+++TI/-Tmp-/starcluster-debug.log
> cli.py:1100 - ERROR - Please submit this file, minus any private information,
> cli.py:1101 - ERROR - to starcluster at mit.edu
> 
> What do I need to do to get this to work?
> 
> Thanks,
> 
> Dean A. Snyder
> Senior Programmer/Analyst
> Center for Inherited Disease Research (CIDR)
> Johns Hopkins School of Medicine
> Bayview Research Campus
> 333 Cassell Dr, Triad Bldg, Suite 2000
> Baltimore, MD 21224
> cell:717 668-3048 office:410-550-4629
> www.cidr.jhmi.edu
> 
> 
> _______________________________________________
> Starcluster mailing list
> Starcluster at mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwJNhoACgkQ4llAkMfDcrn7YACffsSuaz8pjCgUdNbS8iMHGMhO
V7cAn0l8CO9brJiYLqt8s3VhuIixAWBy
=b9zW
-----END PGP SIGNATURE-----


