[StarCluster] Starcluster SGE usage
John St. John
johnthesaintjohn at gmail.com
Thu Oct 18 13:07:09 EDT 2012
On Oct 18, 2012, at 9:04 AM, Justin Riley <jtriley at MIT.EDU> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hey Guys,
>
> Glad you figured out what needed to be changed in the SGE
> configuration. I've been meaning to add a bunch more options to the
> SGE Plugin to configure things like this along with other SGE tuning
> parameters for some time now but simply haven't had the time. If
> either of you are interested in working on a PR to do this that'd be
> awesome. All of the SGE magic is here:
>
> https://github.com/jtriley/StarCluster/blob/develop/starcluster/plugins/sge.py
>
> and here's the SGE install and parallel environment templates used by
> StarCluster:
>
> https://github.com/jtriley/StarCluster/blob/develop/starcluster/templates/sge.py
>
> I'm happy to discuss the plugin and some of the changes that would be
> needed on IRC (freenode: #starcluster).
>
> ~Justin
>
>
> On 10/18/2012 08:30 AM, Gavin W. Burris wrote:
>> Hi John,
>>
>> You got it. Keeping all on the same node requires $pe_slots. This
>> is the same setting you would use for something like OpenMP.
>>
>> As for configuring the queue automatically, maybe there is an
>> option in the SGE plugin that we can place in the
>> ~/.starcluster/config file? I'd like to know, too. If not, we
>> could maybe add some code. Or we could keep a shell script on a
>> persistent volume and run it to do the needed qconf commands after
>> starting a new head node.
>>
>> Cheers.
>>
>>
>> On 10/17/2012 05:23 PM, John St. John wrote:
>>> Hi Gavin, Thanks for pointing me in the right direction. I found
>>> a great solution though that seems to work really well. Since the
>>> "slots" is already set up to be equal to the core count on each
>>> node, I just needed access to a parallel environment that allowed
>>> me to submit jobs to nodes, but request a certain number of slots
>>> on a single node rather than spread out across N nodes. Changing
>>> the allocation rule to "fill" would probably still overflow into
>>> multiple nodes at the edge case. The way to do this properly is
>>> with the $pe_slots allocation rule in the parallel environment
>>> config file. Here is what I did:
>>>
>>> qconf -sp by_node (create this with qconf -ap [name])
>>>
>>> pe_name            by_node
>>> slots              9999999
>>> user_lists         NONE
>>> xuser_lists        NONE
>>> start_proc_args    /bin/true
>>> stop_proc_args     /bin/true
>>> allocation_rule    $pe_slots
>>> control_slaves     TRUE
>>> job_is_first_task  TRUE
>>> urgency_slots      min
>>> accounting_summary FALSE
>>>
>>>
>>> Then I modify the parallel environment list in all.q:
>>>
>>> qconf -mq all.q
>>> pe_list make orte by_node
>>>
>>> That does it! Wahoo!
>>>
>>> Ok now the problem is that I want this done automatically
>>> whenever a cluster is booted up, and if a node is added I want to
>>> make sure these configurations aren't clobbered. Any suggestions
>>> on making that happen?
>>>
>>> Thanks everyone for your time!
>>>
>>> Best, John
>>>
>>>
>>> On Oct 17, 2012, at 8:16 AM, Gavin W. Burris <bug at sas.upenn.edu>
>>> wrote:
>>>
>>>> Hi John,
>>>>
>>>> The default configuration will distribute jobs based on load,
>>>> meaning new jobs land on the least loaded node. If you want to
>>>> fill nodes, you can change the load formula on the scheduler
>>>> config:
>>>>
>>>> # qconf -msconf
>>>> load_formula slots
>>>>
>>>> If you are using a parallel environment, the default can be
>>>> changed to fill a node, as well:
>>>>
>>>> # qconf -mp orte
>>>> allocation_rule $fill_up
>>>>
>>>> You may want to consider making memory consumable to prevent
>>>> over-subscription. An easy option may be to make an arbitrary
>>>> consumable complex resource, say john_jobs, and set it to the
>>>> max number you want running at one time:
>>>>
>>>> # qconf -mc
>>>> john_jobs jj INT <= YES YES 0 0
>>>> # qconf -me global
>>>> complex_values john_jobs=10
>>>>
>>>> Then, when you submit a job, specify the resource:
>>>>
>>>> $ qsub -l jj=1 ajob.sh
>>>>
>>>> Each job submitted in this way will consume one count of
>>>> john_jobs, effectively limiting you to ten.
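
For a scripted setup, the interactive edits above could probably be applied
without an editor. The following is an untested sketch. It assumes that
`qconf -Mc` (load the complex list from a file) and `qconf -aattr` (add to a
list attribute) behave as documented in the qconf man page. The QCONF variable
and the function names are made up for this example.

```shell
# QCONF is a hypothetical override so the sketch can be dry-run (QCONF=echo).
QCONF="${QCONF:-qconf}"

# Append the john_jobs consumable to a dump of the complex list
# (as produced by `qconf -sc`), unless it is already present.
add_john_jobs_line() {
    grep -q '^john_jobs[[:space:]]' "$1" ||
        echo 'john_jobs jj INT <= YES YES 0 0' >> "$1"
}

# Dump the complex list, add the consumable, load it back, then give the
# global host ten units of it.
setup_john_jobs() {
    tmp="${TMPDIR:-/tmp}/complexes.$$"
    $QCONF -sc > "$tmp"
    add_john_jobs_line "$tmp"
    $QCONF -Mc "$tmp"
    $QCONF -aattr exechost complex_values john_jobs=10 global
    rm -f "$tmp"
}
```

Running `QCONF=echo setup_john_jobs` prints the qconf calls instead of making
them, which is a cheap way to review them before touching a live cluster.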
>>>>
>>>> Cheers.
>>>>
>>>>
>>>> On 10/16/2012 06:32 PM, John St. John wrote:
>>>>> Thanks Jesse!
>>>>>
>>>>> This does seem to work. I don't need to define -pe in this
>>>>> case b/c the slots are actually limited per node.
>>>>>
>>>>> My only problem with this solution is that all jobs are now
>>>>> limited to this hard-coded number of slots. Also, when nodes
>>>>> are added to the cluster while it is running, the file is
>>>>> modified and the line would need to be edited again. On other
>>>>> systems I have seen the ability to specify that a job will
>>>>> use a specific number of CPUs without being in a special
>>>>> parallel environment. I have seen the "-l ncpus=X" option
>>>>> working elsewhere, but it doesn't seem to work with the
>>>>> default StarCluster setup. Also, it looks like the "orte"
>>>>> parallel environment has some stuff very specific to MPI, and
>>>>> doesn't have a problem splitting the requested number of slots
>>>>> between multiple nodes, which I definitely don't want. I just
>>>>> want to limit the number of jobs per node, but be able to
>>>>> specify that at runtime.
>>>>>
>>>>> It looks like the grid engine is somehow aware of the number
>>>>> of CPUs available on each node. I get this by running
>>>>> `qhost`:
>>>>>
>>>>> HOSTNAME  ARCH       NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>>>>> -------------------------------------------------------------------------------
>>>>> global    -          -     -     -       -       -       -
>>>>> master    linux-x64  8     0.88  67.1G   1.5G    0.0     0.0
>>>>> node001   linux-x64  8     0.36  67.1G   917.3M  0.0     0.0
>>>>> node002   linux-x64  8     0.04  67.1G   920.4M  0.0     0.0
>>>>> node003   linux-x64  8     0.04  67.1G   887.3M  0.0     0.0
>>>>> node004   linux-x64  8     0.06  67.1G   911.4M  0.0     0.0
>>>>>
>>>>>
>>>>> So it seems like there should be a way to tell qsub that job
>>>>> X is using some subset of the available CPU, or RAM, so that
>>>>> it doesn't oversubscribe the node.
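
With a $pe_slots parallel environment such as the by_node one described
earlier in this thread, the per-node job limit can be chosen at submit time by
requesting enough slots per job. A small sketch of the arithmetic, assuming 8
slots per node (the NCPU value qhost reports); the helper name is made up for
this example:

```shell
# slots_to_request SLOTS_PER_NODE MAX_JOBS_PER_NODE
# Prints the smallest per-job slot request k for which
# floor(SLOTS_PER_NODE / k) <= MAX_JOBS_PER_NODE, i.e. no more than
# MAX_JOBS_PER_NODE such jobs fit on one node at the same time.
slots_to_request() {
    echo $(( $1 / ($2 + 1) + 1 ))
}

# Example (needs the by_node PE to exist): at most 2 jobs per 8-slot node.
# qsub -pe by_node "$(slots_to_request 8 2)" ajob.sh
```

For 8 slots and a limit of 2 jobs this prints 3: two jobs use 6 slots and a
third no longer fits on the same node.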
>>>>>
>>>>> Thanks for your time!
>>>>>
>>>>> Best, John
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Oct 16, 2012, at 2:12 PM, Jesse Lu <jesselu at stanford.edu>
>>>>> wrote:
>>>>>
>>>>>> You can modify the all.q queue to assign a fixed number of
>>>>>> slots to each node.
>>>>>>
>>>>>> * If I remember correctly, "$ qconf -mq all.q" will bring
>>>>>>   up the configuration of the all.q queue in an editor.
>>>>>> * Under the "slots" attribute should be a fairly lengthy
>>>>>>   string such as "[node001=16],[node002=16],...".
>>>>>> * Try replacing the entire string with a single number such
>>>>>>   as "2". This should assign each host to have only two slots.
>>>>>> * Save the configuration and try a simple submission with
>>>>>>   the 'orte' parallel environment and let me know if it works.
>>>>>>
>>>>>> Jesse
>>>>>>
>>>>>> On Tue, Oct 16, 2012 at 1:37 PM, John St. John
>>>>>> <johnthesaintjohn at gmail.com> wrote:
>>>>>>
>>>>>> Hello, I am having issues telling qsub to limit the number
>>>>>> of jobs run at any one time on each node of the cluster.
>>>>>> There are sometimes ways to do this with things like "qsub
>>>>>> -l node=1:ppn=1" or "qsub -l procs=2" or something. I even
>>>>>> tried "qsub -l slots=2" but that gave me an error and told
>>>>>> me to use the parallel environment. When I tried to use the
>>>>>> "orte" parallel environment like "-pe orte 2" I see
>>>>>> "slots=2" in my qstat list, but everything gets executed
>>>>>> on one node at the same parallelization as before. How do I
>>>>>> limit the number of jobs per node? I am running a process
>>>>>> that consumes a very large amount of RAM.
>>>>>>
>>>>>> Thanks, John
>>>>>>
>>>>>>
>>>>>> _______________________________________________ StarCluster
>>>>>> mailing list StarCluster at mit.edu
>>>>>> <mailto:StarCluster at mit.edu>
>>>>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Gavin W. Burris
>>>> Senior Systems Programmer
>>>> Information Security and Unix Systems
>>>> School of Arts and Sciences
>>>> University of Pennsylvania
>>>
>>>
>>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.19 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://www.enigmail.net/
>
> iEYEARECAAYFAlCAKJAACgkQ4llAkMfDcrkzmgCgkXOBPBXw5Q41RF+qABuPH2NH
> seQAoIqVmbTjgIrsPfFIJpj7POwbxcKf
> =wRr3
> -----END PGP SIGNATURE-----
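
Until the plugin grows options for this, the shell script approach Gavin
mentioned might look roughly like the sketch below. It recreates the by_node
parallel environment from the settings quoted above and attaches it to all.q.
It is untested: it assumes `qconf -Ap` (add a PE from a file), `qconf -spl`
(list PEs) and `qconf -aattr` (add to a list attribute) behave as the qconf
man page describes. The QCONF variable and the function names are made up for
this example.

```shell
#!/bin/sh
# QCONF is a hypothetical override so the sketch can be dry-run (QCONF=echo).
QCONF="${QCONF:-qconf}"

# Write the by_node PE definition from this thread to the given file.
write_by_node_pe() {
    cat > "$1" <<'EOF'
pe_name            by_node
slots              9999999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF
}

# Add the PE if it is not listed yet, then append it to all.q's pe_list.
setup_by_node() {
    pe_file="${TMPDIR:-/tmp}/by_node.pe.$$"
    write_by_node_pe "$pe_file"
    if ! $QCONF -spl 2>/dev/null | grep -qx 'by_node'; then
        $QCONF -Ap "$pe_file"
    fi
    $QCONF -aattr queue pe_list by_node all.q
    rm -f "$pe_file"
}
```

Calling `setup_by_node` on the master after the cluster starts would apply
both changes, and `QCONF=echo setup_by_node` only prints the qconf calls.
Whether the settings survive adding a node is not something this sketch
addresses, so re-running it after adding nodes may still be needed.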