[StarCluster] 100 nodes cluster
Mark Gordon
mark.gordon at ualberta.ca
Fri Oct 28 10:24:32 EDT 2011
Hi Paolo:
I wonder, what percentage of the launch time do you think is spent
configuring the nodes?
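
If nobody has numbers yet, here is a minimal sketch of how one might collect them. It is plain Python and does not touch StarCluster itself: the PhaseTimer class and the phase names "boot" and "configure" are made up for illustration, and the sleeps only stand in for the real work.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock time per named phase (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.totals = {}

    @contextmanager
    def phase(self, name):
        # Time everything executed inside the "with" block.
        start = self.clock()
        try:
            yield
        finally:
            elapsed = self.clock() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def percentages(self):
        # Share of the total measured time spent in each phase.
        total = sum(self.totals.values())
        if total == 0:
            return {name: 0.0 for name in self.totals}
        return {name: 100.0 * t / total for name, t in self.totals.items()}


if __name__ == "__main__":
    timer = PhaseTimer()
    with timer.phase("boot"):
        time.sleep(0.02)  # stand-in for waiting on the EC2 instances
    with timer.phase("configure"):
        time.sleep(0.01)  # stand-in for NFS/SGE/ssh setup
    for name, pct in sorted(timer.percentages().items()):
        print("%-10s %5.1f%%" % (name, pct))
```

Wrapping the boot wait and the setup step in two such phases would tell us whether the configuration step is really the part worth optimizing.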
cheers,
Mark
On Fri, Oct 28, 2011 at 4:57 AM, Paolo Di Tommaso <Paolo.DiTommaso at crg.eu> wrote:
> Dear All,
>
> I'm still struggling with the problem of large clusters that take so
> long to launch.
>
> I think some improvements are possible with better multithread
> handling, but I'm not a Python guru, so I cannot say much about that in detail.
>
> Anyway, I'm looking for a more "radical" approach. My idea is to launch a
> 2-node cluster, save the master and slave nodes as two separate AMIs, and use
> these to deploy a cluster of any size without having to install and
> configure everything from scratch (NFS, SGE, passwordless access, etc.),
> modifying only what has changed.
>
>
> So my question is: what is the "delta" in the configuration files
> between two different cluster instances of X and Y nodes?
>
> Knowing this, it could be quite easy to write a StarCluster plugin that
> applies only these changes, achieving a much faster launch time.
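
To make the "delta" idea concrete, here is a rough sketch of the bookkeeping only. It assumes nodes are aliased master, node001, node002, ... and that the size-dependent part of the configuration boils down to the list of hosts; I have not checked either assumption against the StarCluster source. The function names node_aliases and cluster_delta are hypothetical, and a real plugin would still have to apply the result to NFS, SGE and the ssh setup.

```python
def node_aliases(size):
    """Aliases for a cluster of `size` nodes (assumed naming scheme)."""
    if size < 1:
        raise ValueError("a cluster needs at least a master node")
    return ["master"] + ["node%03d" % i for i in range(1, size)]


def cluster_delta(old_size, new_size):
    """Return (to_add, to_remove): hosts that differ between two sizes."""
    old = node_aliases(old_size)
    new = node_aliases(new_size)
    old_set, new_set = set(old), set(new)
    to_add = [alias for alias in new if alias not in old_set]
    to_remove = [alias for alias in old if alias not in new_set]
    return to_add, to_remove


if __name__ == "__main__":
    # Going from the saved 2-node image to a 5-node cluster.
    added, removed = cluster_delta(2, 5)
    print("hosts to add:   ", added)
    print("hosts to remove:", removed)
```

If the delta really is just a host list like this, the plugin would only need to register the added hosts instead of reconfiguring every node from scratch.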
>
>
> Thank you,
>
> Paolo Di Tommaso
> Software Engineer
> Comparative Bioinformatics Group
> Centre de Regulacio Genomica (CRG)
> Dr. Aiguader, 88
> 08003 Barcelona, Spain
>
>
>
>
> On Oct 20, 2011, at 9:48 PM, Rayson Ho wrote:
>
> > ----- Original Message -----
> >> However, if one can wrap around the real ssh with a fake ssh script
> >> that sleeps 30 seconds and then runs the real ssh, then we can see how
> >> good (or bad) the Workerpool handles long latency commands - and we
> >> will start from there to optimize the launch performance.
> >
> > Replying to myself - after quickly reading the code...
> >
> > StarCluster uses Paramiko instead of executing ssh, so wrapping around a
> > long latency ssh script won't work.
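
Since the ssh binary cannot be wrapped, the same experiment could perhaps be done at the Python level by wrapping the callable instead. The sketch below is only an illustration of that idea: it uses the standard concurrent.futures thread pool as a stand-in for StarCluster's own worker pool, and fake_remote_command is a hypothetical placeholder, not a Paramiko call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_latency(func, delay):
    """Wrap `func` so every call sleeps `delay` seconds first."""
    def wrapper(*args, **kwargs):
        time.sleep(delay)  # artificial network/ssh latency
        return func(*args, **kwargs)
    return wrapper


def fake_remote_command(node):
    # Hypothetical placeholder for a command run on a node over ssh.
    return "configured %s" % node


def run_on_all(nodes, command, workers):
    """Run `command` on every node with `workers` threads; time the batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(command, nodes))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    nodes = ["node%03d" % i for i in range(1, 9)]
    slow = with_latency(fake_remote_command, 0.05)
    for workers in (1, 8):
        _, elapsed = run_on_all(nodes, slow, workers)
        print("%d worker(s): %.2f s" % (workers, elapsed))
```

Comparing the one-worker and many-worker timings shows how much of the artificial latency the pool manages to overlap.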
> >
> > And there are quite a lot of discussions about issues with multithreaded
> > programs that call Paramiko -- just google: Paramiko+multithreading
> >
> >
> > Rayson
> >
> > =================================
> > Grid Engine / Open Grid Scheduler
> > http://gridscheduler.sourceforge.net
> > _______________________________________________
> > StarCluster mailing list
> > StarCluster at mit.edu
> > http://mailman.mit.edu/mailman/listinfo/starcluster
>
--
Mark Gordon
Systems Analyst
Department of Physics
University of Alberta