[Starcluster] Five suggestions / requests / questions for the new AMI

Justin Riley jtriley at MIT.EDU
Wed Feb 10 11:22:13 EST 2010


Hi Robert,

I really appreciate your questions/feedback/requests, but could you please 
start sending these to the mailing list rather than using the web feedback 
form? The answers to your questions are useful to others and really belong on 
the mailing list. It's more work for me to have to copy your message each time 
and remember to cc the list. Thanks.

Regarding your questions/comments:

1) s3sync is a good idea. I'll add it and other similar things (e.g. s3mount) 
to the new AMI. By the way, I created a "Cookbook" page for creating the new 
AMI so that others can at least see what's involved:

http://starcluster.scripts.mit.edu/~starcluster/wiki/index.php?title=StarCluster_AMI_Cookbook

2) User data scripts will certainly be possible with the new Alestic AMI (and 
may already be possible with the current AMI). However, I want to do something 
a little more sophisticated than just that. I'm working on the ability to add 
"plugins" to StarCluster. A plugin is essentially a subclass of the 
ClusterSetup class that StarCluster uses to do all the cluster configuration. 
You create a new subclass of ClusterSetup and then run any special 
configuration routines you want in your new class. You will then be able to 
specify your plugin in the config and have it run when creating the cluster.

The advantage of this approach is that I hand you a collection of root ssh 
connections, one to each of the nodes. This allows more sophisticated setup 
routines than user-data alone, because you have programmatic access to each 
node with the ability to execute commands and create/modify/copy/delete files 
on each node. I will have examples of these plugins in the next version of 
StarCluster.
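
To give a rough idea of the shape I have in mind, here is a minimal sketch. 
Everything in it is an assumption for illustration only: the plugin API is not 
released yet, so the ClusterSetup, Node and FakeSSH classes below are 
stand-ins, and the run() signature, the execute() method and the package names 
are hypothetical. The fake ssh object only records commands so the sketch runs 
without a cluster.

```python
# Hypothetical sketch only: these classes are stand-ins, NOT the real
# StarCluster API. The real plugin interface may look different.


class FakeSSH(object):
    """Stand-in for a root ssh connection; records commands instead of running them."""

    def __init__(self):
        self.commands = []

    def execute(self, command):
        # A real connection would run the command on the remote node.
        self.commands.append(command)
        return []


class Node(object):
    """Stand-in for a cluster node that carries its own ssh connection."""

    def __init__(self, alias):
        self.alias = alias
        self.ssh = FakeSSH()


class ClusterSetup(object):
    """Stand-in for the base class that plugins would subclass."""

    def run(self, nodes, master, user):
        raise NotImplementedError


class InstallPackagesSetup(ClusterSetup):
    """Example plugin: install extra packages on every node."""

    def __init__(self, packages):
        self.packages = packages

    def run(self, nodes, master, user):
        install_cmd = "apt-get -y install " + " ".join(self.packages)
        for node in nodes:
            node.ssh.execute(install_cmd)
        # Master-only step: leave a note once all nodes have been handled.
        master.ssh.execute("echo 'plugin finished for %s' > /root/plugin.log" % user)


# Simulate what would happen when the cluster is created.
nodes = [Node("master"), Node("node001")]
plugin = InstallPackagesSetup(["htop", "s3cmd"])
plugin.run(nodes, nodes[0], "mpiuser")
```

The point of the design is that the plugin gets one connection per node, so 
per-node and master-only steps live in the same place instead of being split 
across separate user-data scripts.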

3) I'm not sure why you're having issues with the $PATH as CLUSTER_USER. How 
are you logging in? Does it do this if you're ssh'ing in as that user? How 
about when you "su - mpiuser"? Also, have you changed the CLUSTER_SHELL for 
that user?

4) The image size is already made as small as possible by the ec2-bundle-vol 
scripts. Pretty much the entire operation you wait on is the machine image 
being compressed and split into chunks. I will not do anything extra in this 
space. You're welcome to give it a try and report back if you're successful 
and it actually saves a significant amount of space on S3.

5) The latest Perl is on the new AMI. Python 2.6 is on the AMI as well. If 
enough people are interested I could put Python 3 on there; however, I doubt 
many are using it just yet.

Thanks,

~Justin


>Again, really excellent work (and ignore my prior comments about releasing 
the code, I realized it's all there on github already)!
>
>Five suggestions / requests / questions for the new AMI, here goes...
>
>1) s3sync
>
>or any other popular command line tool could be included (or I haven't found 
it yet) - almost everyone will have to communicate with S3 at one point. SC 
could hand over the secret key + id.
>
>2) option for user data scripts
>
>it would largely render rebundling unnecessary if SC would have an option to 
provide a user data script (see e.g. 
http://alestic.com/2009/06/ec2-user-data-scripts) to hand over to EC2 for 
execution when it launches the master node (or, probably better, to exec it 
after running the SC setup on the master). This can be, for example, a 
(locally stored and developed) user-provided shell script that customizes the 
master AMI by things like wget, compile and make source code etc, and - the 
most useful - by pulling data and all the latest versions of your own scripts 
from S3 (see point 1).
>
>3) non-root $PATH
>
>could well be my fault, but whenever I become the non-root user specified in 
starclustercfg I have to add the SGE bin path (/opt/sge/bin/lx...) to PATH 
before things work
>
>4) AMI size
>
>could the image be made smaller using dd before bundling it (that's said 
without knowing how big it is so disregard in case)
>
>5) interpreter versions
>
>just generally make sure the latest Perl is installed (even though I also 
like Python way more :)
>


