[Starcluster] Syntax extension preference poll

Dan Yamins dyamins at gmail.com
Wed Jun 9 15:15:57 EDT 2010


I've added an extension to the starcluster config syntax in which
NODE_INSTANCE_TYPE can now also be a list, allowing the user to specify a
more structured cluster with multiple different instance types, each with
its own instance count.

The syntax I currently use in my extension is to overload NODE_INSTANCE_TYPE
so that it can represent a "compound type": a comma-separated list of
types, each followed by the number of instances of that type desired.  E.g.:

   NODE_INSTANCE_TYPE = m1.xlarge 3, m1.large 2

means that 3 m1.xlarge and 2 m1.large node instances -- in addition to the
master -- are desired.
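
As a rough illustration of the syntax above, here is a minimal Python
sketch of how such a value could be split into (type, count) pairs.  The
function name parse_node_instance_type is hypothetical and is not the
code in my extension; a missing count is returned as None.

```python
def parse_node_instance_type(value):
    """Split 'm1.xlarge 3, m1.large 2' into [(type, count), ...].

    Hypothetical helper for illustration only. A type listed without
    a count is returned with count None.
    """
    entries = []
    for item in value.split(","):
        parts = item.split()
        if len(parts) == 1:
            # Type name only, count left unspecified.
            entries.append((parts[0], None))
        elif len(parts) == 2:
            # Type name followed by an integer instance count.
            entries.append((parts[0], int(parts[1])))
        else:
            raise ValueError("bad NODE_INSTANCE_TYPE entry: %r" % item)
    return entries


print(parse_node_instance_type("m1.xlarge 3, m1.large 2"))
```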

I'd like to poll the following questions:

1) Is the basic syntax reasonable?

2) I allow a simplification, namely that you don't have to specify the
instance count of the last type, e.g.

   NODE_INSTANCE_TYPE = m1.xlarge 3, m1.large

is a valid config.    If left unspecified, the instance count of the last
type is determined during config.load() simply by setting

   count of last instance type = CLUSTER_SIZE - [total number of instances
specified for the non-last instance types listed] - 1

where the -1 is done to deal with the fact that there's a master.
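
To make the arithmetic concrete, here is a minimal Python sketch of that
rule.  It is not the actual config.load() code; the function name
fill_last_count and the list-of-tuples representation are assumptions
made for the example.

```python
def fill_last_count(entries, cluster_size):
    """Fill in the last type's count when it was left unspecified.

    entries is a list of (type, count) pairs in which only the last
    count may be None. Hypothetical helper, for illustration only.
    """
    head = entries[:-1]
    last_type, last_count = entries[-1]
    if last_count is None:
        specified = sum(count for _, count in head)
        # The extra -1 accounts for the master node.
        last_count = cluster_size - specified - 1
        if last_count < 0:
            raise ValueError("CLUSTER_SIZE too small for the listed counts")
    return head + [(last_type, last_count)]


# With CLUSTER_SIZE = 6, 'm1.xlarge 3, m1.large' leaves 6 - 3 - 1 = 2.
print(fill_last_count([("m1.xlarge", 3), ("m1.large", None)], 6))
```

Note that with a single listed type and no count, this reduces to
CLUSTER_SIZE - 1 nodes of that type plus the master.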

My questions are:
       a) Is this simplification desirable?  One point in its favor is that
it is completely compatible with current configuration syntax -- the special
case when only one node_instance_type is desired remains unchanged, and
easy.

       b) Should I add a further simplification, in which _other_ (that is,
non-last) instance types can have non-specified numbers of instances,
defaulting to 1?  E.g. like having the following be valid:

    NODE_INSTANCE_TYPE = m1.xlarge, c1.large, m1.large

which would mean that 1 m1.xlarge, 1 c1.large and CLUSTER_SIZE - 3  m1.large
instances were desired.    Though it might make things easier, I'm uneasy
about this syntax because leaving the last instance type count unspecified
would have a different meaning than leaving the other instance type counts
unspecified.   [Unless *all* unspecified instance counts defaulted to 1,
which is even worse since it is not compatible with current syntax and makes
what should be the easier case more complicated.]
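
For comparison, a minimal Python sketch of what option (b) would mean:
unspecified non-last counts become 1, while an unspecified last count
still absorbs the remainder.  The name fill_counts_option_b is
hypothetical, and none of this is implemented in my extension.

```python
def fill_counts_option_b(entries, cluster_size):
    """Resolve counts under the proposed option (b).

    entries is a list of (type, count) pairs where any count may be
    None. Hypothetical helper, for illustration only.
    """
    # Non-last types with no count default to a single instance.
    head = [(name, 1 if count is None else count)
            for name, count in entries[:-1]]
    last_type, last_count = entries[-1]
    if last_count is None:
        # The last type takes whatever remains after the master (-1).
        last_count = cluster_size - sum(count for _, count in head) - 1
    return head + [(last_type, last_count)]


entries = [("m1.xlarge", None), ("c1.large", None), ("m1.large", None)]
print(fill_counts_option_b(entries, 10))
```

The same "no count" notation ends up meaning 1 in one position and
"the rest" in another, which is the asymmetry I'm uneasy about.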

3) If you don't specify master_instance_type, starcluster currently defaults
this to the node_instance_type.   Now that node_instance_type is compound,
I've set it to default to the *first*  listed instance type, e.g. m1.xlarge
in the above example.   Is this the correct behavior?  Should it instead
default to the "smallest" (i.e. cheapest) instance type listed, e.g.
'm1.large' in the above example?   Or to the *last* listed type?
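
The three candidate defaults can be sketched in Python as follows.  The
function name default_master_type, the policy strings, and the cost
mapping are all assumptions for the example; starcluster has no such
price table, so a "cheapest" policy would need the costs supplied from
somewhere.

```python
def default_master_type(type_names, policy="first", cost=None):
    """Pick a master instance type from the listed node types.

    type_names is the ordered list of node instance types; cost is an
    optional caller-supplied mapping of type name to relative price.
    Hypothetical helper, for illustration only.
    """
    if not type_names:
        raise ValueError("no node instance types listed")
    if policy == "first":
        return type_names[0]
    if policy == "last":
        return type_names[-1]
    if policy == "cheapest":
        if cost is None:
            raise ValueError("'cheapest' needs a cost mapping")
        return min(type_names, key=lambda name: cost[name])
    raise ValueError("unknown policy: %r" % policy)


# Current behavior in my extension corresponds to policy="first".
print(default_master_type(["m1.xlarge", "m1.large"]))
```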

Even when there are multiple instance types, I'm still having all those
types use the single specified NODE_IMAGE_ID ami.   In the future it might
also be possible to allow listing multiple amis along with the instance
types.

Any thoughts on these issues before I submit a pull request would be
great.   And sorry for the long email,
Dan