[Starcluster] Five suggestions / requests / questions for the new AMI
Justin Riley
jtriley at MIT.EDU
Wed Feb 10 11:22:13 EST 2010
Hi Robert,
I really appreciate your questions/feedback/requests, but could you please
start sending these to the mailing list rather than using the web feedback
form? The answers to your questions are useful to others and really belong in
the mailing list. It's more work for me to have to copy your message each time
and remember to cc the list. Thanks.
Regarding your questions/comments:
1) s3sync is a good idea; I'll add it and other similar tools (e.g. s3mount) to
the new AMI. By the way, I created a "Cookbook" page for creating the new AMI
so that others could at least see what's involved:
http://starcluster.scripts.mit.edu/~starcluster/wiki/index.php?title=StarCluster_AMI_Cookbook
2) User data scripts will certainly be possible with the new Alestic AMI (and
maybe already possible with the current AMI?). However, I want to do something
a little more sophisticated than just that. I'm working on the ability to add
"plugins" to starcluster. A plugin is essentially a subclass of my
ClusterSetup class used in StarCluster to do all the cluster configuration.
You create a new subclass on top of ClusterSetup and then run any special
configuration routines you want in your new class. You will then be able to
specify your plugin in the config and have it run when creating the cluster.
The advantage to this approach is that I hand you a collection of root ssh
connections to each of the nodes. This lets you do more sophisticated setup
routines than just user-data given that you have programmatic access to each
node with the ability to execute commands and create/modify/copy/delete files
on each node. I will have examples of these plugins in the next version of
StarCluster.
3) I'm not sure why you're having issues with the $PATH as CLUSTER_USER. How
are you logging in? Does it do this if you're ssh'ing in as that user? How
about when you "su - mpiuser"? Also, have you changed the CLUSTER_SHELL for
that user?
4) The image size is made as small as possible by the ec2-bundle-vol scripts
already. Pretty much the entire operation you wait on is the machine image
being compressed and split into chunks. I will not do anything
extra in this space. You're welcome to give it a try and report back if you're
successful and it actually saves a ton of space on S3.
5) The latest Perl is on the new AMI. Python 2.6 is on the AMI as well. If
enough people are interested I could put Python 3 on there; however, I doubt
many are using it just yet.
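To make the plugin idea in (2) a bit more concrete, here is a rough sketch of
what such a plugin might look like. Everything in it is an assumption for
illustration: the run() signature, the FakeNode class and its execute() method,
and the InstallS3Tools plugin are all hypothetical stand-ins, since the plugin
interface has not been released yet. The ClusterSetup class below is only a
placeholder for the real one, and the fake nodes record commands instead of
running them over ssh.

```python
# Hypothetical sketch only: none of these names or signatures are
# StarCluster's released API.

class FakeNode:
    """Stand-in for a node handle that wraps a root ssh connection."""

    def __init__(self, alias):
        self.alias = alias
        self.commands = []  # records commands instead of running them over ssh

    def execute(self, command):
        self.commands.append(command)


class ClusterSetup:
    """Placeholder for the base class; the real interface may differ."""

    def run(self, nodes, master, user):
        raise NotImplementedError


class InstallS3Tools(ClusterSetup):
    """Example plugin: run the same setup command on every node."""

    def __init__(self, packages):
        self.packages = list(packages)

    def run(self, nodes, master, user):
        # Install the requested packages on each node in the cluster.
        for node in nodes:
            node.execute("apt-get -y install " + " ".join(self.packages))
        # Leave a marker on the master so the user can see the plugin ran.
        master.execute("echo 'setup done for %s' > /tmp/plugin.log" % user)


def run_demo():
    master = FakeNode("master")
    nodes = [master, FakeNode("node001")]
    InstallS3Tools(["s3cmd"]).run(nodes, master, "mpiuser")
    return nodes
```

Calling run_demo() returns the two fake nodes with the commands each one would
have received, which is a cheap way to check a plugin's logic before pointing
it at a real cluster.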
Thanks,
~Justin
>Again, really excellent work (and ignore my prior comments about releasing
the code, I realized it's all there on github already)!
>
>Five suggestions / requests / questions for the new AMI, here goes...
>
>1) s3sync
>
>or any other popular command line tool could be included (or I haven't found
it yet) - almost everyone will have to communicate with S3 at one point. SC
could hand over the secret key + id.
>
>2) option for user data scripts
>
>it would largely render rebundling unnecessary if SC would have an option to
provide a user data script (see e.g.
http://alestic.com/2009/06/ec2-user-data-scripts)
to hand over to EC2 for execution when it launches the master node
(or, probably better, to exec it after running the SC setup on the master).
This can be, for example, a (locally stored and developed) user-provided shell
script that customizes the master AMI by things like wget, compile and make
source code etc, and - the most useful - by pulling data and all the latest
versions of your own scripts from S3 (see point 1).
>
>3) non-root $PATH
>
>could well be my fault, but whenever I become the non-root user specified in
starclustercfg I have to add the SGE bin path (/opt/sge/bin/lx...) to PATH
before things work
>
>4) AMI size
>
>could the image be made smaller using dd before bundling it (that's said
without knowing how big it is so disregard in case)
>
>5) interpreter versions
>
>just generally make sure the latest Perl is installed (even though I also
like Python way more :)
>