[Starcluster] How to update numpy / rebundle Starcluster
Justin Riley
jtriley at MIT.EDU
Tue Dec 8 11:37:58 EST 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi Dan,
I'm cc'ing this list on this email, hope you don't mind :D
Thanks for your suggestions concerning numpy/scipy/software suite on the
StarCluster base AMIs. I will certainly put them to good use and will
likely have you test the AMIs before I release. As of the first release
I did very little testing of scipy/numpy other than importing them and
running a few trivial calculations. This time around I'll put more
effort into testing them, especially since this will pique the interest
of the SciPy/NumPy community.
> -- the python-drmaa module, which contains bindings for the drmaa
> interface to the SGE engine (much better than SGE's native interface,
> in my opinion. have you tried it).
I am using the python-drmaa library for a couple of projects and it is
extremely useful. I will certainly include it in the next set of
StarCluster AMIs.
> Will you include (optional) command-line parameters for things like
> nodenumber and maybe root/master AMI?
To answer your question about command-line parameters, the new version
of StarCluster now allows you to completely override anything that's in
the config besides the AWS credentials. Those must live either in a
config file or in environment variables. It's also now possible to
specify the config for StarCluster to use at the command line. So you should
find you have considerably more flexibility in the next version ;)
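
To make the precedence concrete, here is a rough shell sketch of the idea
(not StarCluster's actual code). The helper names first_nonempty,
resolve_setting and resolve_credential are made up for illustration, as are
the example values. Using AWS_ACCESS_KEY_ID as the environment variable name
and checking the environment before the config file are both assumptions.

```shell
#!/bin/sh
# Illustrative only: these helpers are NOT part of StarCluster.

# Print the first non-empty argument; fail if all are empty.
first_nonempty() {
    for candidate in "$@"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Ordinary setting: a command-line value overrides the config file.
resolve_setting() {
    cli_value=$1
    config_value=$2
    first_nonempty "$cli_value" "$config_value"
}

# Credential: there is deliberately no command-line argument here.
# Assumption: the environment is consulted before the config file.
resolve_credential() {
    env_value=$1
    config_value=$2
    first_nonempty "$env_value" "$config_value"
}

# Example values (made up for the demonstration).
cluster_size=$(resolve_setting "8" "2")
node_ami=$(resolve_setting "" "ami-from-config")
access_key=$(resolve_credential "${AWS_ACCESS_KEY_ID:-}" "key-from-config")

echo "cluster size: $cluster_size"
echo "node AMI:     $node_ami"
```

Here the cluster size given on the command line wins over the config file,
the AMI falls back to the config file because nothing was passed, and the
credential is never read from the command line at all.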
> Great -- although I don't find the current EBS docs that cumbersome
I will certainly leave the old EBS documentation for whoever's
interested but I will advertise more heavily the new createami action
that will automate the process of rebundling AMIs.
> Actually, what about attaching multiple EBS volumes? ...
Having the ability to mount multiple NFS'd EBS volumes on StarCluster is
a feature that I'd like to add; however, it may not make it into the
very next release. I'll add this as a feature request on github's issue
tracker. BTW, feel free to post suggestions/issues on the github issue
tracker for StarCluster:
http://github.com/jtriley/StarCluster/issues
Thanks again for your interest in StarCluster!
~Justin
Dan Yamins wrote:
>
> I'm currently working on the next beta release of StarCluster that
> should include multiple cluster support.
>
>
> Excellent. That will be great.
>
>
>
> With that in mind, I'll be
> updating the StarCluster base AMIs before then with the latest versions
> of ubuntu/sge/mpi/numpy/scipy/ipython etc and will likely ping you for
> help in testing things like atlas/lapack support in numpy/scipy, etc.
>
>
> I'm more than happy to help. Let me know ... I definitely suggest
> using numpy 1.4 when you make this upgrade, as it represents a major
> advance over 1.3, especially in the bugfix arena ...
>
> Also, what about including:
> -- python2.6-dev, which is needed for building a lot of things
> -- apache2
> -- python-setuptools, for people who like to use easy_install
> -- the most recent version of the nose testing package, since it is
> necessary to numpy and scipy test suites (what do you do now to test
> your scipy installations?)
> -- also, maybe consider creating symbolic links from your existing
> lapack/atlas/blas files (which end with .3gf.0) to the "standard names"
> as recognized by the numpy install (which don't have that ending), and
> doing the analogous thing with amd and camd packages.
> -- the python-drmaa module, which contains bindings for the drmaa
> interface to the SGE engine (much better than SGE's native interface,
> in my opinion. have you tried it). The SGE build already has the
> drmaa c library, it's just a matter of getting the python bindings.
>
> I've included at the end of this email a brief summary of my build notes
> for the AMIs that I've bundled from yours. Let me know if they're too
> cryptic.
>
>
> Also, I plan to add couchdb (with python bindings) and hadoop to the
> list of base software.
>
>
> Excellent.
>
>
> 2. Regarding rebundling the StarCluster AMI with your own software,
> there is a script, although not documented...I
>
>
> I had no problem following the on-line instructions provided by Amazon
> to do the rebundling ... although it is a bit cumbersome. I've
> rebundled clean versions of your AMIs a few times, with a variety of
> additional software, and have had generally good results.
>
> I did have one problem when I tried to rebundle an AMI that I had myself
> created by rebundling your 64bit AMI. Instances of the resulting
> re-re-bundled AMI wouldn't get through the starcluster initialization
> process, stalling either when trying to mount EBS volumes or installing
> the SGE. (I could start instances of the AMI directly with ec2-run, but
> not via starcluster.) I've found through a number of experiences with
> other AMIs as well, that re-rebundling seems in general to be quite
> problematic. I have no idea why, though it always seems to have to do
> with SSH or other startup things. This problem is not _that_ crucial
> since I rarely need to build new machines, but as of now, when I do have
> to, I always start from scratch (meaning, a clean version of your AMI).
>
> I will likely build this in as an action in the command line interface
> in the next version
>
>
> Excellent, that should make things simpler.
>
>
>
>
> StarCluster will switch to an action-based command line interface
> (similar to manage.py if you're familiar with django) rather than
> option-based:
>
>
> This seems like a good design choice. Will you include (optional)
> command-line parameters for things like nodenumber and maybe root/master
> AMI? I find myself needing to start clusters of widely varying sizes
> at different points, and varying AMIs (since I switch back and forth
> between 32-bit and 64-bit depending on what has to be done).
>
>
> There should also appear an action for initializing a new EBS volume
> which will lessen the EBS volume docs on the web site.
>
>
> Great -- although I don't find the current EBS docs that cumbersome to
> use ... I actually think that having them is helpful, because it
> introduces you to the elasticfox tools and gives you a better idea of
> how EBS works in general ... this is not to say that you shouldn't build
> an EBS initialization action, but maybe don't take down those docs.
>
> Actually, what about attaching multiple EBS volumes? Configured
> either in the config file or at the command line? I'm rapidly
> approaching the point where I need to have multiple TBs of drive space,
> and would like to attach several EBS volumes to my cluster.
>
>
> If you'd be interested in testing these (and other) new features,
> please let me know!
>
>
> Delighted to help if I can! Just let me know how.
>
> Dan
>
>
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= BUILD NOTES -=-=-=-=-=-=-=-=-=-=-=-=
>
> start with starcluster 32-bit or 64-bit base
>
> sudo apt-get install python2.6-dev
>
> get numpy 1.3
> set up site.cfg with /usr/lib and /usr/include as defaults
>
> in /usr/lib:
>
> ln -s libblas.so.3gf.0 libblas.so
> ln -s libcblas.so.3gf.0 libcblas.so
> ln -s libf77blas.so.3gf.0 libf77blas.so
> ln -s liblapack.so.3gf.0 liblapack.so
> ln -s liblapack_atlas.so.3gf.0 liblapack_atlas.so
>
> ln -s libamd.so.3.2.0 libamd.so
> ln -s libcamd.so.3.2.0 libcamd.so
> ln -s libcolamd.so.3.2.0 libcolamd.so
> ln -s libccolamd.so.3.2.0 libccolamd.so
> ln -s libumfpack.so.3.2.0 libumfpack.so
>
> in /usr/lib/atlas:
>
> ln -s libblas.so.3gf.0 libblas.so
> ln -s liblapack_atlas.so.3gf.0 liblapack_atlas.so
>
> edit site.cfg in numpy, then build
>
> sudo apt-get install python-setuptools
> easy_install nose
> -- test numpy
> beautifulsoup 3.0.7
> pp and clientform and mechanize via easy_install
> sudo apt-get install python-tk
> installed PIL
> sudo apt-get install r-base r-base-dev
> easy_install rpy2
> easy_install feedparser
> sudo apt-get install libttf-dev
> installed reportlab2.3
> easy_installed html5lib
> easy_install pisa
>
> sudo apt-get install apache2
> sudo apt-get install mercurial
>
> sudo apt-get install subversion
>
> easy_install networkx
>
> sudo apt-get install tk-dev
>
> sudo apt-get install python-lxml
>
> sudo apt-get install graphviz
>
>
> I also use python-drmaa bindings for using the drmaa interface to SGE
> ... but I run the egg from the user directory in /home as opposed to
> installing it.
>
>
>
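
The symlink steps in the build notes above could probably be scripted along
these lines. This is only a sketch: link_unversioned is a made-up helper
name, and the demonstration runs against dummy files in a scratch directory
rather than the real /usr/lib, so nothing on the system is touched. To use it
for real you would point it at /usr/lib or /usr/lib/atlas as root, after
checking that the suffixes match what is installed on your AMI.

```shell
#!/bin/sh
# Create "libfoo.so" symlinks next to versioned files such as
# "libfoo.so.3gf.0". link_unversioned is a hypothetical helper name.
link_unversioned() {
    dir=$1
    suffix=$2
    for versioned in "$dir"/lib*.so."$suffix"; do
        # If the glob matched nothing, the pattern itself comes back; skip it.
        [ -e "$versioned" ] || continue
        target=$(basename "$versioned")
        link=${versioned%".$suffix"}
        # Leave any existing file or link alone.
        [ -e "$link" ] || [ -L "$link" ] || ln -s "$target" "$link"
    done
}

# Dry run in a scratch directory with empty stand-in files.
scratch=$(mktemp -d)
touch "$scratch/libblas.so.3gf.0" \
      "$scratch/liblapack.so.3gf.0" \
      "$scratch/libamd.so.3.2.0"

link_unversioned "$scratch" 3gf.0
link_unversioned "$scratch" 3.2.0

ls -l "$scratch"
```

The links use a relative target, which matches running "ln -s
libblas.so.3gf.0 libblas.so" from inside the library directory as in the
notes.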
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAksegOYACgkQ4llAkMfDcrmg0ACgi+BxfAAimcCbwnKupu3oSOcI
7I0AoIzYQrdTJOW5SPKpXbxN7kRGL3tl
=qTzy
-----END PGP SIGNATURE-----