[StarCluster] Are heterogeneous clusters possible in StarCluster?
Fran Campillo
fran at spitch.ch
Thu Dec 11 07:06:29 EST 2014
Thank you very much, Jennifer and Jin. This is very useful
information. The only detail I am still missing is how to add nodes and
stop them from the master (that is, from inside the cluster). With that,
I could easily automate the whole process. Do any of you happen to know
whether this is possible?
Cheers!
Fran.
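
The add/remove step asked about above could be sketched roughly as
follows. This is only a sketch under assumptions: it builds the command
lines and does not run them, it assumes the `starcluster` command is
installed and configured on whatever machine runs it (whether the master
can be that machine is exactly the open question), and the cluster tag,
node aliases, instance type and bid are hypothetical placeholders. The
-a, -I, -i and -b flags are the ones `starcluster addnode --help` is
understood to list; they should be checked against the installed version.

```python
def addnode_cmd(cluster, alias, instance_type, image_id=None, spot_bid=None):
    """Build (but do not run) a `starcluster addnode` command line."""
    # -a: alias for the new node, -I: EC2 instance type
    cmd = ["starcluster", "addnode", "-a", alias, "-I", instance_type]
    if image_id is not None:
        cmd += ["-i", image_id]        # -i: AMI id for the new node
    if spot_bid is not None:
        cmd += ["-b", str(spot_bid)]   # -b: spot bid in $ per hour
    cmd.append(cluster)                # the cluster tag comes last
    return cmd


def removenode_cmd(cluster, *aliases):
    """Build (but do not run) a `starcluster removenode` command line."""
    return ["starcluster", "removenode", cluster, *aliases]


if __name__ == "__main__":
    # Hypothetical names: switch from the non-GPU stage to the GPU stage.
    print(" ".join(removenode_cmd("mycluster", "cpu001")))
    print(" ".join(addnode_cmd("mycluster", "gpu001", "g2.2xlarge", spot_bid=0.25)))
```

The resulting lists can be passed to subprocess.check_call once the
flags have been confirmed.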
On 09.12.2014 18:35, Jennifer Staab wrote:
> I am not sure exactly what you are trying to do regarding a single
> cluster. But in my experience the setup of the "master" node is
> propagated to the worker nodes. I would set up the master as I wanted
> and then add "workers" that had different attributes (different AMIs
> and EC2 types, either spot or on-demand). The workers always
> inherited attributes from the master node; specifically, NFS volumes
> from the master were always shared amongst the workers (this included
> other EBS volumes I had mounted using the StarCluster config file). When
> you submit the "starcluster addnode" command you just have to make
> sure you specify the attributes (AMI, instance type, spot/on-demand,
> etc.) you want for the added nodes.
>
> In my experience, the added worker nodes have all their resources
> dedicated to the SGE queue(s). You would use the "qconf" command to adjust
> how you want things set up regarding each added node and how you want
> to set up your queues. One good thing about this setup is you can have
> your master as an on-demand or reserved instance and your workers as
> spot instances (bid with cheaper hourly rates). This way jobs running
> on spots that are terminated (due to bid pricing) are just resubmitted
> back to the queue as long as you issue the resubmit option in your
> qsub calls.
>
> One word of caution is that I didn't mix and match OS types or
> virtualization types (PV, HVM) within a single cluster. My thought was
> that there might be underlying incompatibilities in the systems such
> that the propagation of attributes from master to worker nodes might
> not work seamlessly.
>
> Good Luck.
>
> -Jennifer
>
>
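
The qsub side of Jennifer's advice above could be sketched as follows.
Assumptions, stated loudly: the "resubmit option" is taken here to be
SGE's `-r y` (rerunnable) flag, and the queue name gpu.q and the script
name are hypothetical; the queue itself would first have to be created
with qconf. The sketch only builds the command line and does not submit
anything.

```python
def qsub_cmd(script, queue, rerunnable=True):
    """Build (but do not run) an SGE `qsub` command line."""
    cmd = ["qsub", "-q", queue]    # -q: bind the job to a specific queue
    if rerunnable:
        # -r y: mark the job as rerunnable, so it can be rescheduled if
        # its node goes away (assumed to be the "resubmit option").
        cmd += ["-r", "y"]
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Hypothetical queue and script names.
    print(" ".join(qsub_cmd("train_gpu.sh", "gpu.q")))
```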
> On 12/9/14 11:30 AM, Fran Campillo wrote:
>>
>> Hi Jin!
>>
>> Thank you so much for your answer :) . Yes, I can try to do that,
>> but then I would need to manually do some things on the new nodes that
>> StarCluster usually takes care of (like password-less ssh and
>> sharing /home and potentially other EBS volumes). Is my assumption right?
>>
>> Thanks again!
>>
>> Fran.
>>
>> On 09.12.2014 16:52, Jin Yu wrote:
>>> Hi Fran,
>>>
>>> You can start a master node first and then add different types of
>>> nodes later. You can then set up SGE to define how jobs are
>>> allocated among these nodes.
>>>
>>> -Jin
>>>
>>> On Mon, Dec 8, 2014 at 6:19 AM, Fran Campillo <fran at spitch.ch> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I began using StarCluster a couple of weeks ago for my research,
>>> and I find it a really useful framework. I have had to set up SGE
>>> myself several times in the past, and StarCluster makes our lives
>>> much easier.
>>>
>>> I still don't know many of StarCluster's features, and I would
>>> like to ask the community whether certain things I want to do are
>>> actually possible with the current version of StarCluster. In
>>> particular, I would like to create a heterogeneous cluster with
>>> StarCluster (that is, one with different kinds of instances that I
>>> could put in different SGE queues). The problem I have to solve has
>>> some stages that need a GPU and others that do not, and I would like
>>> to set up the complete cluster at the beginning of the process and
>>> work like this:
>>>
>>> ------------
>>>
>>> 1.- Init: setup of the heterogeneous cluster {gpu_clust, no_gpu_clust}.
>>>
>>> 2.- No cuda tasks:
>>> 2.1.- stop gpu_clust instances.
>>> 2.2.- run tasks in no_gpu_clust.
>>>
>>> 3.- Cuda tasks:
>>> 3.1.- stop no_gpu_clust.
>>> 3.2.- start gpu_clust.
>>> 3.3.- run tasks in gpu_clust.
>>>
>>> 4.- No cuda tasks:
>>> ...
>>>
>>> ------------
>>>
>>> Is this currently possible with StarCluster? I guess I could
>>> already fake this behavior by creating both clusters from another
>>> Amazon instance and running the tasks on one or the other via ssh,
>>> but I find that a worse solution.
>>>
>>> Thank you very much in advance!
>>>
>>> Cheers!
>>>
>>> Fran.
>>>
>>> _______________________________________________
>>> StarCluster mailing list
>>> StarCluster at mit.edu <mailto:StarCluster at mit.edu>
>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>>
>>>
>>
>>
>>
>