[StarCluster] Delay when using Sun Grid Engine

Jesse Lu jesselu at stanford.edu
Wed Sep 5 19:32:52 EDT 2012


Hi Rayson,

Let me first say thanks for OGS, it's a super useful tool!

So, an update:
I realized that the relevant parameter is load_report_time in the global
configuration.
The delay was almost exactly equal to load_report_time, so I set it to 0,
and the delay is essentially gone.

Rayson, here is my global configuration (qconf -sconf). Any comments?
In particular, is it okay to have a value of zero for load_report_time?

$ qconf -sconf
#global:
execd_spool_dir              /opt/sge6/default/spool
mailer                       /bin/mail
xterm                        /usr/bin/X11/xterm
load_sensor                  none
prolog                       none
epilog                       none
shell_start_mode             posix_compliant
login_shells                 sh,bash,ksh,csh,tcsh
min_uid                      0
min_gid                      0
user_lists                   none
xuser_lists                  none
projects                     none
xprojects                    none
enforce_project              false
enforce_user                 auto
load_report_time             00:00:00
max_unheard                  00:05:00
reschedule_unknown           02:00:00
loglevel                     log_warning
administrator_mail           none at none.edu
set_token_cmd                none
pag_cmd                      none
token_extend_time            none
shepherd_cmd                 none
qmaster_params               none
execd_params                 none
reporting_params             accounting=false reporting=false \
                             flush_time=00:00:15 joblog=false sharelog=00:00:00
finished_jobs                100
gid_range                    20000-20100
qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin
max_aj_instances             2000
max_aj_tasks                 75000
max_u_jobs                   0
max_jobs                     0
max_advance_reservations     0
auto_user_oticket            0
auto_user_fshare             0
auto_user_default_project    none
auto_user_delete_time        86400
delegated_file_staging       false
reprioritize                 0
jsv_url                      none
jsv_allowed_mod              ac,h,i,e,o,j,M,N,p,w
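
For anyone who wants to check the same setting, here is a minimal sketch of
pulling one parameter out of `qconf -sconf` style output. The helper name
sge_param is hypothetical (it is not part of Grid Engine); it only assumes
the "name, whitespace, value" layout shown above, with a trailing backslash
marking a continued line.

```shell
# sge_param NAME: read `qconf -sconf` style text on stdin and print the
# value of parameter NAME. Hypothetical helper, not a Grid Engine command.
sge_param() {
    awk -v key="$1" '
        # While inside a continued entry, drop the leading indentation.
        { if (buf != "") sub(/^[ \t]+/, "") }
        # A trailing backslash means the entry continues on the next line.
        /\\$/ { sub(/[ \t]*\\$/, ""); buf = buf $0 " "; next }
        # Otherwise the entry is complete: join it with any buffered part.
        { line = buf $0; buf = "" }
        { split(line, f, /[ \t]+/) }
        # Print everything after the parameter name for the requested key.
        f[1] == key { sub(/^[^ \t]+[ \t]+/, "", line); print line }
    '
}
```

On a cluster, `qconf -sconf | sge_param load_report_time` would print the
current value, and `qconf -mconf` opens the global configuration in an
editor to change it.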


On Wed, Sep 5, 2012 at 12:52 PM, Rayson Ho <raysonlogin at gmail.com> wrote:

> On Wed, Sep 5, 2012 at 1:10 PM, Jesse Lu <jesselu at stanford.edu> wrote:
> > However, if I run in a parallel environment (e.g. qsub -pe orte ...) then
> > there is an approximately 40 sec delay after job completion. That is to say,
> > the job has technically finished, although qstat still lists it as running,
> > and subsequent jobs are held up. Any ideas?
>
> That's fixed in the update release.
>
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
>
>
> >
> > Thanks in advance!
> >
> >
> > On Tue, Sep 4, 2012 at 5:33 PM, Rayson Ho <raysonlogin at gmail.com> wrote:
> >>
> >> That's the default scheduling time, and if you really want the
> >> scheduler to react to your qsub requests ASAP, you can turn on
> >> "scheduling-on-demand":
> >>
> >> http://gridscheduler.sourceforge.net/howto/tuning.html
> >>
> >> And in OGS/GE 2011.11 u1 p1 (we need a better name), the time it takes
> >> to report job done should be reduced.
> >>
> >> Rayson
> >>
> >> ==================================================
> >> Open Grid Scheduler - The Official Open Source Grid Engine
> >> http://gridscheduler.sourceforge.net/
> >>
> >>
> >>
> >> On Tue, Sep 4, 2012 at 8:05 PM, Jesse Lu <mr.jesselu at gmail.com> wrote:
> >> > Yes! Exactly.
> >> >
> >> > -- Jesse
> >> > ________________________________
> >> > On Sep 4, 2012 4:19 PM, Rayson Ho <raysonlogin at gmail.com> wrote:
> >> >
> >> > Hi Jesse,
> >> >
> >> > Are you referring to the scheduling time of Grid Engine??
> >> >
> >> > Rayson
> >> >
> >> > ==================================================
> >> > Open Grid Scheduler - The Official Open Source Grid Engine
> >> > http://gridscheduler.sourceforge.net/
> >> >
> >> >
> >> > On Tue, Sep 4, 2012 at 6:37 PM, Jesse Lu <jesselu at stanford.edu> wrote:
> >> >> Hi StarCluster users,
> >> >>
> >> >> I've noticed long delays with Sun Grid Engine when submitting jobs and
> >> >> especially after job execution. Even running a simple "hostname" job
> >> >> takes several seconds. Moreover, running an MPI version of "hostname"
> >> >> can take 2 minutes!!
> >> >>
> >> >> Can someone help me get rid of this delay? Thank you.
> >> >>
> >> >> Jesse
> >> >>
> >> >> _______________________________________________
> >> >> StarCluster mailing list
> >> >> StarCluster at mit.edu
> >> >> http://mailman.mit.edu/mailman/listinfo/starcluster
> >> >>
> >
> >
>