[StarCluster] Can job terminate its cluster?

Rajat Banerjee rajatb at post.harvard.edu
Mon Jul 7 21:18:34 EDT 2014


There is also an experimental feature to terminate the cluster (we wrote it
but didn't spend much time testing it or collecting feedback on it):

Experimental Features

The load balancer, by default, will not kill the master node in order to
keep the cluster alive and functional. However, there are times when you
might want to destroy the master if the cluster is completely idle and
there are no more nodes left to remove. For example, you may wish to launch
10000 jobs and have the cluster shut down when the last job has completed.
In this case you can use the experimental *-K*, or *--kill-cluster*, option:

$ starcluster loadbalance --kill-cluster mycluster

The above command will load balance *mycluster* as usual. However, once
all jobs have completed and all worker nodes have been shut down by the load
balancer, the cluster will be terminated.
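
Putting this together with Rayson's qsub suggestion quoted below, an
unattended run might look roughly like the following sketch. It is untested
and rests on several assumptions: the cluster name, the job script path
(/home/sgeadmin/myjob.sh, assumed to already be on the cluster), the job
names work_1 and notify, and the e-mail address are all placeholders, and
Grid Engine can only send mail if the master node is able to deliver it.
The run() helper is just a dry-run wrapper for this example, not part of
StarCluster.

```shell
#!/bin/sh
# Sketch only: every name below is a placeholder, adjust to your setup.
CLUSTER="${CLUSTER:-mycluster}"
MAILTO="${MAILTO:-user@example.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

unattended_run() {
    # 1. Start the cluster (its size comes from the StarCluster config).
    run starcluster start "$CLUSTER"

    # 2. Submit the real work from the master node. The script path is an
    #    assumption; /etc/profile is sourced so qsub is on the PATH.
    run starcluster sshmaster -u sgeadmin "$CLUSTER" \
        "source /etc/profile && qsub -N work_1 /home/sgeadmin/myjob.sh"

    # 3. Submit a trivial job that is held until every job named work_*
    #    has finished, and ask Grid Engine to mail when it ends (-m e -M).
    run starcluster sshmaster -u sgeadmin "$CLUSTER" \
        "source /etc/profile && qsub -N notify -hold_jid 'work_*' -m e -M $MAILTO -b y /bin/true"

    # 4. Load balance; with --kill-cluster the cluster is terminated once
    #    all jobs are done and all worker nodes have been removed.
    run starcluster loadbalance --kill-cluster "$CLUSTER"
}

unattended_run
```

With the default DRY_RUN=1 the script only prints the four commands, so you
can check them before running with DRY_RUN=0. Copying results to an EBS
volume would have to happen inside the job script itself, before the last
job exits.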



On Mon, Jul 7, 2014 at 7:59 PM, Rayson Ho <raysonlogin at gmail.com> wrote:

> You can get most of the functionality from the Elastic Load Balancer:
>
> http://star.mit.edu/cluster/docs/latest/manual/load_balancer.html
>
> As for emailing the user when all jobs are done, you can either write a
> small script that does the actual emailing when the jobs are done... or
> you can submit a job that waits for all other jobs, and then ask Grid
> Engine to send you an email (qsub -M) when that last job is done:
>
> http://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html
>
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
> http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html
>
>
> On Mon, Jul 7, 2014 at 5:58 PM, James <jamesqf at charter.net> wrote:
>
>> Hi,
>>
>> I have a question which doesn't seem to be addressed in the
>> documentation, or on this list.  It seems that from the
>> examples, StarCluster expects a bunch of human interaction,
>> manually creating a cluster, submitting jobs, then terminating
>> the cluster.  I need an essentially batch system: the user starts
>> a job which creates a cluster (internal info determines how large
>> a cluster is needed); it runs unattended for hours/days (periodically
>> checkpointing itself); then when it finishes it copies results to
>> an EBS volume, sends an email message to the user, and terminates
>> the cluster.
>>
>> 1) Will StarCluster allow me to do this easily?  If so, a pointer
>> to docs/examples would be appreciated!
>>
>> 2) If not, is there some other application that will do this, so I
>> don't have to re-invent the wheel?
>>
>> 3) Or do I have to build my own wheel from scratch?
>>
>> Thanks,
>> James
>>
>>
>>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>
>
>