[StarCluster] OpenMP and Sun Grid Engine

Damian Eads eads at soe.ucsc.edu
Sun May 1 06:20:25 EDT 2011


Hi Justin,

Thank you very much for your help. I meant to send a follow-up e-mail
earlier but it slipped my mind. The orte parallel environment uses the
$round_robin allocation rule by default. This causes Sun Grid Engine to
invoke the OpenMP-threaded process several times on multiple nodes.
Running 'top' shows dozens of job processes on each node, and the
cluster becomes unresponsive to SSH requests.

To get around this, I created a new parallel environment 'smp' and
replaced $round_robin with $pe_slots.

This is achieved by typing

    $ qconf -ap smp

and setting the allocation_rule field in the parallel environment
specification that opens in the editor. An existing environment can be
edited later with

    $ qconf -mp smp

The file should look something like this:

pe_name            smp
slots              XXX
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE

where XXX is the number of nodes times the number of cores per node.
Then I edit the all.q queue configuration

    $ qconf -mq all.q

and add the parallel environment 'smp' to the 'pe_list' field.
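
The same setup can also be scripted instead of typed into an editor. The
following is only a minimal sketch: NODES and CORES_PER_NODE are made-up
example values, the queue name all.q is assumed to be the cluster's default
queue, and the qconf calls are skipped when qconf is not on the PATH.

```shell
#!/bin/bash
# Hypothetical cluster size; replace with your own values.
NODES=4
CORES_PER_NODE=8
SLOTS=$((NODES * CORES_PER_NODE))

# Write the parallel environment specification to a temporary file.
PE_FILE="$(mktemp)"
cat > "$PE_FILE" <<EOF
pe_name            smp
slots              $SLOTS
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    \$pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF

if command -v qconf >/dev/null 2>&1; then
    # Add the parallel environment from the file (no editor involved).
    qconf -Ap "$PE_FILE"
    # Append 'smp' to the pe_list of the queue (all.q is an assumption).
    qconf -aattr queue pe_list smp all.q
else
    echo "qconf not found; PE specification left in $PE_FILE"
fi
```

Running it on the master node as a Grid Engine admin user should be
equivalent to the interactive steps; 'qconf -sp smp' prints the result.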

Running qsub -pe smp 8 my-open-mp-script.sh then works smoothly. I am
just speculating, but if NSLOTS does not evenly divide the number of
cores on a machine, there is a possibility that a job is erroneously
spread across multiple nodes.

To attain better CPU utilization, I set the number of threads to a
number higher than the number of slots:

   export OMP_NUM_THREADS=$((NSLOTS+4))
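
Put together, a job script might look roughly like the sketch below. It is
an illustration, not my exact script: EXTRA_THREADS and OPENMP_BINARY are
hypothetical variables introduced here, and NSLOTS falls back to 1 so the
script can also be tried outside the scheduler.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -pe smp 8

# NSLOTS is set by Grid Engine; default to 1 when run by hand.
NSLOTS="${NSLOTS:-1}"

# EXTRA_THREADS is a hypothetical knob; 4 matches the oversubscription above.
EXTRA_THREADS="${EXTRA_THREADS:-4}"

export OMP_NUM_THREADS=$((NSLOTS + EXTRA_THREADS))
echo "slots=$NSLOTS threads=$OMP_NUM_THREADS"

# OPENMP_BINARY is a placeholder for the path to your OpenMP program.
# Nothing is run unless it points at an executable file.
if [ -x "${OPENMP_BINARY:-}" ]; then
    "$OPENMP_BINARY" "$@"
fi
```

With the '#$ -pe smp 8' line in the script, a plain 'qsub' of the script
requests the 8 slots; a -pe option on the qsub command line overrides it.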

Cheers,

Damian

On Mon, Apr 18, 2011 at 9:31 PM, Justin Riley <jtriley at mit.edu> wrote:
> Hi Damian,
>
> It's been a little while since I've played with OpenMP but from what I remember you need to set OMP_NUM_THREADS equal to the number of slots you allocate using the parallel environment. In theory, you should be able to use the same command:
>
> $ qsub -pe orte X open-mp-script.sh [args]
>
> And inside open-mp-script.sh you would need to export OMP_NUM_THREADS=$NSLOTS and then run your OpenMP binary like so:
>
> $ cat open-mp-script.sh
> export OMP_NUM_THREADS=$NSLOTS
> ...
> /path/to/my/openmp/binary "$@"
> ....
>
> Don't forget to make your binary executable (chmod +x <binary>). Let me know how this goes. If that doesn't work I'll look into this further.
>
> HTH,
>
> ~Justin
>
> On Apr 18, 2011, at 11:38 PM, Damian Eads wrote:
>
>> Hi,
>>
>> Last year, I was using MPI and it was suggested by Justin to use
>>
>>     qsub -pe orte X mpi-job-script.sh [mpi job arguments]
>>
>> to add an MPI job to the queue (where X is the number of slots for the job).
>>
>> Now, my situation is slightly different. I am no longer using MPI but
>> OpenMP (you know, #pragma omp parallel for before certain for loops). What
>> process manager should I use with Sun Grid Engine in this case? How
>> would I specify how many slots the job should use?
>>
>> Thank you in advance.
>>
>> Kind regards,
>>
>> Damian
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster



