[StarCluster] Are heterogeneous clusters possible in StarCluster?

Fran Campillo fran at spitch.ch
Thu Dec 11 07:06:29 EST 2014


     Thank you very much, Jennifer and Jin. This is very useful 
information. The only detail I am missing now is how to add nodes and 
stop them from the master (that is, from inside the cluster). With that, 
I would be able to easily automate the whole process. Do any of you 
happen to know whether this is possible?
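One way to sketch this, assuming you install StarCluster on the master and copy over your config and AWS credentials (the flags and names below are assumptions, not verified setup; check `starcluster --help` on your version):

```shell
# On the master node: install StarCluster so the cluster can manage itself.
# Assumes pip is available and ~/.starcluster/config (with AWS credentials)
# has been copied from your workstation -- both are assumptions.
pip install starcluster

# Add a worker to the running cluster ("mycluster" is a hypothetical name):
starcluster addnode mycluster

# Remove/terminate a worker by its alias (the -a flag is an assumption;
# verify with `starcluster removenode --help`):
starcluster removenode -a node001 mycluster
```

From there the add/remove calls could be scripted around your job pipeline.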

     Cheers!

     Fran.

On 09.12.2014 18:35, Jennifer Staab wrote:
> I am not sure exactly what you are trying to do regarding a single 
> cluster. But in my experience the setup of the "master" node is 
> propagated to the worker nodes. I would set up the master as I wanted 
> and then add "workers" with different attributes (different AMIs 
> and EC2 instance types; they could be spot and/or on-demand). The 
> workers always inherited attributes from the master node; in particular, 
> NFS volumes from the master were always shared amongst the workers (this 
> included other EBS volumes I had mounted using the StarCluster config 
> file). When you run the "starcluster addnode" command you just have to 
> make sure you specify the attributes (AMI, instance type, spot/on-demand, 
> etc.) you want for the added nodes.
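For example (the exact flags here are assumptions about StarCluster's CLI; confirm them with `starcluster addnode --help`):

```shell
# Add two spot workers with a specific AMI and instance type.
# The AMI id, instance type, bid price, and cluster name are all
# hypothetical placeholders, and the -i/-I/-b flags are assumptions.
starcluster addnode -n 2 -i ami-xxxxxxxx -I g2.2xlarge -b 0.30 mycluster

# Omitting the bid (-b) would request on-demand instances instead of spot.
```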
>
> In my experience, the added worker nodes have all their resources 
> dedicated to the SGE queue(s). You would use the "qconf" command to 
> adjust how each added node is configured and how your queues are set 
> up. One good thing about this setup is that you can have your master 
> as an on-demand or reserved instance and your workers as spot 
> instances (bid at cheaper hourly rates). That way, jobs running on 
> spot instances that are terminated (due to bid pricing) are simply 
> resubmitted to the queue, as long as you pass the resubmit option in 
> your qsub calls.
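As a concrete illustration of the qconf/qsub workflow described above (the queue and host names are hypothetical; the SGE commands themselves are standard Grid Engine):

```shell
# Inspect the default queue that StarCluster creates:
qconf -sq all.q

# Add a worker's hostname to a dedicated queue's host list
# (queue "spot.q" and host "node001" are hypothetical names):
qconf -aattr queue hostlist node001 spot.q

# Submit a job with the resubmit option: if the node running it goes
# away (e.g. a spot instance is reclaimed), SGE puts the job back in
# the queue instead of marking it failed.
qsub -r y myjob.sh
```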
>
> One word of caution: I didn't mix and match OS type or 
> virtualization type (PV, HVM) within a single cluster. My thought was 
> that there might be underlying incompatibilities between the systems 
> such that the propagation of attributes from master to worker nodes 
> might not work seamlessly.
>
> Good Luck.
>
> -Jennifer
>
>
> On 12/9/14 11:30 AM, Fran Campillo wrote:
>>
>>     Hi Jin!
>>
>>     Thank you so much for your answer :) . Yes, I can try to do that, 
>> but then I would need to do some things manually on the new nodes that 
>> StarCluster usually takes care of (like password-less SSH and 
>> sharing /home and potentially other EBS volumes). Is my assumption right?
>>
>>     Thanks again!
>>
>>     Fran.
>>
>> On 09.12.2014 16:52, Jin Yu wrote:
>>> Hi Fran,
>>>
>>> You can start a master node first and then add different types of 
>>> nodes later. You can then set up SGE to define the job-allocation 
>>> behavior among these nodes.
>>>
>>> -Jin
>>>
>>> On Mon, Dec 8, 2014 at 6:19 AM, Fran Campillo <fran at spitch.ch 
>>> <mailto:fran at spitch.ch>> wrote:
>>>
>>>
>>>          Hi,
>>>
>>>          I began using StarCluster a couple of weeks ago for my
>>>     research, and I find it a really useful framework. I had to set
>>>     up SGE myself several times in the past, and StarCluster makes
>>>     our life much easier.
>>>
>>>          I still don't know many of StarCluster's features, and I
>>>     would like to ask the community whether certain things I want
>>>     to do are actually possible with the current version of
>>>     StarCluster. In particular, I would like to create a
>>>     heterogeneous cluster on StarCluster (that is, with different
>>>     kinds of instances that I could have in different SGE queues).
>>>     In the problem I have to solve there are some stages that need
>>>     a GPU and others that do not, and I would like to be able to
>>>     set up the complete cluster at the beginning of the process and
>>>     work like this:
>>>
>>>     ------------
>>>
>>>          1.- Init: setup of the heterogeneous cluster (gpu_clust,
>>>     no_gpu_clust).
>>>
>>>          2.- No cuda tasks:
>>>          2.1.- stop gpu_clust instances.
>>>          2.2.- run tasks in no_gpu_clust.
>>>
>>>          3.- Cuda tasks:
>>>          3.1.- stop no_gpu_clust.
>>>          3.2.- start gpu_clust.
>>>          3.3.- run tasks in gpu_clust.
>>>
>>>          4.- No cuda tasks:
>>>          ...
>>>
>>>     ------------
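Driven from the StarCluster CLI, the cycle above could look roughly like the sketch below. The cluster name, node alias, GPU instance type, and the -I/-a flags are all assumptions; note that `starcluster start`/`stop` operate on whole clusters, so "stopping" only the GPU part generally means adding and removing those nodes:

```shell
# 1. Init: start the base (non-GPU) cluster.
starcluster start no_gpu_clust

# 2. Non-CUDA stage: run jobs on the CPU nodes (e.g. via qsub on the master).

# 3. CUDA stage: add a GPU worker, run GPU jobs, then remove it again.
#    Instance type "g2.2xlarge" and alias "node001" are hypothetical.
starcluster addnode -I g2.2xlarge no_gpu_clust
# ... run the GPU jobs in a GPU-specific SGE queue ...
starcluster removenode -a node001 no_gpu_clust

# 4. Repeat steps 2-3 as the pipeline alternates between stages.
```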
>>>
>>>          Is this currently possible with StarCluster? I guess I
>>>     could already fake this behavior by creating both clusters from
>>>     another Amazon instance and running the tasks on one or the
>>>     other via ssh, but I find that a worse solution.
>>>
>>>          Thank you very much in advance!
>>>
>>>          Cheers!
>>>
>>>          Fran.
>>>
>>>     _______________________________________________
>>>     StarCluster mailing list
>>>     StarCluster at mit.edu <mailto:StarCluster at mit.edu>
>>>     http://mailman.mit.edu/mailman/listinfo/starcluster
>>>
>>>
>>
>>
>>
>
