[StarCluster] minimal cost with loadbalance

Rayson Ho raysonlogin at gmail.com
Thu May 29 07:54:03 EDT 2014


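You can set the Grid Engine "queue_sort_method" parameter to "seq_no" in
sched_conf, so that the scheduler picks queue instances in ascending
sequence-number order instead of picking the least-loaded host first: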

http://gridscheduler.sourceforge.net/htmlman/htmlman5/sched_conf.html
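One possible way to script that change, as a rough sketch only: dump the
scheduler configuration with "qconf -ssconf", rewrite the one line, and load
the result back with "qconf -Msconf <file>". The helper name below is made up
for illustration, and it only edits text; running qconf is left to the caller.

```python
def set_queue_sort_method(sched_conf_text, method="seq_no"):
    """Return sched_conf text with queue_sort_method set to `method`.

    `sched_conf_text` is assumed to be the output of `qconf -ssconf`,
    i.e. one "name value" pair per line separated by whitespace.
    This helper is hypothetical and not part of Grid Engine or StarCluster.
    """
    out = []
    found = False
    for line in sched_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "queue_sort_method":
            # Replace the existing value (typically "load").
            out.append("%-33s %s" % ("queue_sort_method", method))
            found = True
        else:
            out.append(line)
    if not found:
        # Parameter missing from the dump: append it.
        out.append("%-33s %s" % ("queue_sort_method", method))
    return "\n".join(out) + "\n"
```

The returned text would be written to a temporary file on the master and
passed to "qconf -Msconf".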

For this to work, each instance needs a different "seq_no", so a small
StarCluster plugin will need to be developed -- i.e., the plugin will
assign a new seq_no when an instance gets created.
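A minimal sketch of the seq_no assignment such a plugin might do, under some
assumptions: nodes are named with the usual StarCluster aliases ("master",
"node001", "node002", ...), the queue is "all.q", and the per-host value is
set with "qconf -mattr queue seq_no <n> <queue>@<host>". The function names
are hypothetical. Inside a real plugin (a subclass of
starcluster.clustersetup.ClusterSetup), the command would presumably be run on
the master from the on_add_node hook, e.g. via master.ssh.execute(...).

```python
import re


def seq_no_for_alias(alias, master_seq_no=0):
    """Derive a seq_no from a node alias such as 'node007'.

    Assumption: aliases are unique within the cluster and worker
    aliases end in digits, so the trailing number is a distinct seq_no.
    Aliases without trailing digits (e.g. 'master') get `master_seq_no`.
    """
    match = re.search(r"(\d+)$", alias)
    if match is None:
        return master_seq_no
    return int(match.group(1))


def seq_no_command(alias, queue="all.q"):
    """Build the qconf command that sets seq_no for one queue instance."""
    return "qconf -mattr queue seq_no %d %s@%s" % (
        seq_no_for_alias(alias), queue, alias)


if __name__ == "__main__":
    for name in ["master", "node001", "node002"]:
        print(seq_no_command(name))
```

With queue_sort_method set to seq_no, lower-numbered nodes are then preferred
while they still have free slots, which leaves the higher-numbered nodes idle
and available for removal by the load balancer.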

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html


On Thu, May 29, 2014 at 3:10 AM, David Mrva <davidm at cantabresearch.com>
wrote:

> Hello,
>
> I started using StarCluster with Amazon spot instances. I expect that the
> workload of my application will fluctuate a lot and I aim to minimise
> the cost of running the spot instances. StarCluster's loadbalancer seems
> to go some way in this direction. It adds more spot instances when the
> SGE queue is busy and removes unused nodes. The removal of the nodes
> interacts with SGE's strategy for assigning jobs to queues. SGE chooses
> the node with the lowest load average to assign a job to. If there are
> more nodes in the cluster than are necessary to execute the jobs, this
> strategy will result in spreading the jobs that need to be executed
> across as many nodes as possible. This behaviour reduces the chances of
> some of the nodes staying unused and potentially being removed by the
> load balancer.
>
> I'd like to configure StarCluster in such a way that SGE jobs go to node
> A for as long as there are slots available on it and they go to node B
> only if there is no vacant slot on node A. For example, on a cluster
> with nodes A and B and 8 slots on each node if there are 4 slots being
> used on node A and 4 more jobs arrive to SGE, I'd like all 4 of these
> new jobs to go to node A. Using the "orte" parallel environment with
> "fill_up" allocation strategy does not achieve this. For the above
> example, using the "fill_up" allocation strategy will pick node B
> (lowest load average node) and assign all 4 new jobs to it, resulting in
> nodes A and B running 4 jobs each instead of A running 8 jobs and B none.
>
> How can I use StarCluster's built-in load balancer to minimise the cost
> of running spot instances by minimising the number of unused CPUs in the
> way described above?
>
> Many thanks,
> David