[StarCluster] Star cluster not creating instances on Amazon's East Region

Wed Aug 1 11:39:08 EDT 2012

Thanks for the feedback. Cluster compute instances are and always will
be launched in the same availability zone out of necessity. This change
would only be applied to instances that do not require a placement
group. I agree that, for some, having nodes spread out over multiple
availability zones could be a good thing. Those who prefer a single
zone for all nodes would be able to toggle this via a command line
flag and/or config setting.
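
As a rough illustration only, the per-node zone choice under such a toggle might look like the sketch below. The names plan_node_zones, SAME_ZONE and ANY_ZONE are hypothetical and are not existing StarCluster code or config options; None stands for "let Amazon pick the zone".

```python
from typing import List, Optional

# Hypothetical strategy names, made up for this sketch.
SAME_ZONE = "same-zone"  # old behavior: all nodes follow the master's zone
ANY_ZONE = "any-zone"    # proposed default: let EC2 place each node


def plan_node_zones(num_nodes: int,
                    master_zone: str,
                    strategy: str = ANY_ZONE,
                    needs_placement_group: bool = False) -> List[Optional[str]]:
    """Return the zone to request for each non-master node.

    None means no zone is requested, so EC2 chooses one.
    """
    if strategy not in (SAME_ZONE, ANY_ZONE):
        raise ValueError("unknown zone launch strategy: %r" % (strategy,))
    # Cluster compute instances need a placement group, so they always
    # share the master's zone regardless of the chosen strategy.
    if needs_placement_group or strategy == SAME_ZONE:
        return [master_zone] * num_nodes
    return [None] * num_nodes


if __name__ == "__main__":
    print(plan_node_zones(3, "us-east-1a"))
    print(plan_node_zones(3, "us-east-1a", strategy=SAME_ZONE))
```

With this shape the same-zone behavior becomes an opt-in optimization, while instance types that require a placement group ignore the toggle entirely.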

~Justin

On Wed, Aug 01, 2012 at 04:46:09PM +0200, sebastian rooks wrote:
> On Wed, Aug 1, 2012 at 7:39 AM, Justin Riley <jtriley at mit.edu> wrote:
> > There would be some impact in terms of network latency given that, in
> > theory, nodes *might* be assigned to different availability zones which
> > means different data centers and thus higher network latency. This could
> > affect the performance of things like NFS-shares on the cluster as well
> > as any applications that require a lot of network communication between
> > nodes but it's hard to quantify the exact performance hit in this case
> > due to the randomness of where instances are allocated in the data
> > centers.
> >
> > So depending on whether Amazon spreads the instances across multiple
> > zones or not there could be a trade-off between network performance vs
> > launch reliability.
> >
> > With that said I'm leaning towards changing the default in favor of
> > launch reliability given that others have occasionally encountered
> > similar errors from AWS concerning oversubscribed zones. Not to mention
> > the network latency between nodes in general is suboptimal even when
> > launching in the same zone given that nodes are often separated by many
> > routers even within the same data center. My guess is that if possible,
> > Amazon will choose the same zone for instances by default but I need to
> > test this to make sure.
> >
> > If we made this switch I would also add a flag that enables the old
> > behavior of trying to launch all nodes in the same availability zone for
> > those that care and understand the risks. This way it's an optimization
> > flag instead of a 'deoptimization' flag and the default behavior is more
> > likely to 'just work' for users. I could also add an optional setting to
> > the config to set the 'zone launch strategy' by default for different
> > cluster configs.
> >
> > What do folks think?
> 
> When launching a cluster of cluster compute instances I certainly
> expect them to be in the same availability zone.
> I actually see the possibility of having nodes in several availability
> zones as an optimisation for use cases that can afford the network
> latency...
> 
> Regards,
> 
>   S
> 
> >
> > ~Justin
> >
> > On Wed, Aug 01, 2012 at 10:00:42AM +0530, Vipin Shankar wrote:
> >> Thanks Justin for the detailed response.
> >> Would there be any performance impact if nodes are created across AZs
> >> (with the optional flag implementation)?
> >>
> >> -Vipin
> >>
> >> -----Original Message-----
> >> From: Justin Riley
> >> Sent: Tuesday, July 31, 2012 11:16 PM
> >> To: Erik Gafni
> >> Cc: Ramit Bhardwaj ; starcluster at mit.edu ; Vipin Shankar
> >> Subject: Re: [StarCluster] Star cluster not creating instances on Amazon's
> >> East Region
> >>
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> Hi Ramit,
> >>
> >> That error comes directly from AWS.
> >>
> >> In the case that you don't specify an availability zone in your
> >> cluster config StarCluster first launches the master node, then
> >> determines which zone Amazon chose *within* the us-east-1 region for
> >> the master, and then launches the rest of the nodes using that zone in
> >> order to improve cluster locality. Sometimes if the zone is
> >> overloaded, as you encountered, the request to start the nodes fails.
> >>
> >> If you're using external EBS volumes with your cluster then you're
> >> implicitly specifying an availability zone given that volumes can only
> >> be attached to instances within their zone and thus StarCluster must
> >> pick the volume's availability zone as the zone for the entire cluster.
> >>
> >> I think it's clear now that we should have an optional flag that
> >> disables trying to launch all instances within the same zone and
> >> simply lets Amazon pick the zone for each node (except for master in
> >> case of EBS volumes). If folks have opinions on this I'd be happy to
> >> hear them...
> >>
> >> For the time being, if you're not using external EBS volumes then you
> >> can specify the zone to use when launching a cluster:
> >>
> >> $ starcluster start -a us-east-1a mycluster
> >>
> >> Otherwise if you're using external EBS volumes you will need to
> >> snapshot your volume and recreate it in another zone that's available
> >> in order for StarCluster to launch the instances in an alternate zone...
> >>
> >> HTH,
> >>
> >> ~Justin
> >>
> >> On 07/31/2012 01:20 PM, Erik Gafni wrote:
> >> > It looks like that AWS region is at capacity and physically out of
> >> > those instance types.  Either try different instance types or
> >> > switch regions.
> >> >
> >> > On Tue, Jul 31, 2012 at 4:58 AM, Ramit Bhardwaj
> > >> > <ramit.bhardwaj at sicadinc.com>
> >> > wrote:
> >> >
> >> > Hello,
> >> >
> > >> > We are using 'Star cluster' for creating clusters on Amazon Cloud.
> > >> > But of late, we are seeing the following error in our logs when we
> > >> > try to create a cluster in the US-East region:
> >> >
> >> > "!!! ERROR - Unsupported: The requested Availability Zone is
> >> > currently constrained and we are no longer accepting new customer
> >> > requests for t1/m1/c1/m2 instance types. Please retry your request
> >> > by not specifying an Availability Zone or choosing us-east-1a,
> >> > us-east-1d, us-east-1c."
> >> >
> > >> > Is there anything that we are missing here? The exact setup was
> > >> > working perfectly until last month. We have been seeing this problem
> > >> > for the past month.
> > >> >
> > >> > Also, I did not see anyone reporting this issue in the forums. Are
> > >> > there any changes to any policy or something? Please advise.
> >> >
> > >> > Best Regards
> > >> > Ramit
> > >> >
> > >> > _______________________________________________
> > >> > StarCluster mailing list
> > >> > StarCluster at mit.edu
> > >> > http://mailman.mit.edu/mailman/listinfo/starcluster
> >> >
> >> >
> >>
> >> -----BEGIN PGP SIGNATURE-----
> >> Version: GnuPG v2.0.19 (GNU/Linux)
> >> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> >>
> >> iEYEARECAAYFAlAYGf0ACgkQ4llAkMfDcrmhkACeLNvrLUPsEWOpIX98E8r2rt6m
> >> MkEAn2LlTsgjJ1iSrPkRiwgKXzciIpPb
> >> =IDiw
> >> -----END PGP SIGNATURE-----
> >>
> >