[StarCluster] jobs on slave nodes disappear

Justin Riley jtriley at MIT.EDU
Sat Dec 31 15:03:22 EST 2011


Hi Liang,

Does this happen consistently, even after restarting the cluster with
"starcluster restart mycluster"? Also, is there anything in your jobs'
error logs? Given the output you provided, these would most likely be
in the directory you submitted the job from and should be named
something like "single.sh.e23".
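
A minimal sketch of that check, assuming SGE's default naming of
<jobname>.e<jobid> for stderr and <jobname>.o<jobid> for stdout. The
helper name show_job_logs is hypothetical and not part of SGE or
StarCluster:

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE or StarCluster): print the stderr (.e)
# and stdout (.o) files of one job, assuming the default <jobname>.e<jobid>
# and <jobname>.o<jobid> naming.
# Usage: show_job_logs DIR JOBNAME JOBID
show_job_logs() {
    dir=$1
    name=$2
    id=$3
    for kind in e o; do
        f="$dir/$name.$kind$id"
        if [ ! -f "$f" ]; then
            echo "== $f: not found"
        elif [ ! -s "$f" ]; then
            echo "== $f: empty"
        else
            echo "== $f:"
            cat "$f"
        fi
    done
}
```

From the submit directory this could be run as
"show_job_logs . single.sh 23". If the files are not there, "$HOME" is
worth trying as well, since that is where SGE writes them when a job is
not submitted with -cwd.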

~Justin


On 12/30/2011 08:58 PM, liang cheng wrote:
> Greetings!
> 
> I created a StarCluster cluster on EC2 and use qsub to submit jobs. It
> used to work well. Since this afternoon, after I requested an
> additional EC2 instance from Amazon, a problem has appeared.
> 
> Only the jobs submitted to the master node are executed. Other
> jobs disappear almost immediately. Some diagnostics are below. Any
> help is appreciated!
> 
> Happy New Year!
> 
> 
> root@master:/# qacct -j 23
> ==============================================================
> qname        all.q
> hostname     node006
> group        root
> owner        root
> project      NONE
> department   defaultdepartment
> jobname      single.sh out 3
> jobnumber    23
> taskid       undefined
> account      sge
> priority     0
> qsub_time    Sat Dec 31 01:38:32 2011
> start_time   Sat Dec 31 01:38:39 2011
> end_time     Sat Dec 31 01:38:39 2011
> granted_pe   NONE
> slots        1
> failed       0
> exit_status  0
> ru_wallclock 0
> ru_utime     0.010
> ru_stime     0.010
> ru_maxrss    2276
> ru_ixrss     0
> ru_ismrss    0
> ru_idrss     0
> ru_isrss     0
> ru_minflt    2648
> ru_majflt    0
> ru_nswap     0
> ru_inblock   0
> ru_oublock   272
> ru_msgsnd    0
> ru_msgrcv    0
> ru_nsignals  0
> ru_nvcsw     12
> ru_nivcsw    3
> cpu          0.020
> mem          0.000
> io           0.000
> iow          0.000
> maxvmem      0.000
> arid         undefined
> 
> =========================
> 
> Thanks, -Liang
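
The accounting record quoted above is long, and only a few of its fields
bear on the question of why the job vanished. As a rough sketch, the
relevant ones can be pulled out with awk. The function name qacct_summary
is hypothetical, and the filter assumes one "key value" pair per line,
which is how qacct prints a record on a terminal:

```shell
#!/bin/sh
# Hypothetical helper: reduce `qacct -j` output (read from stdin) to the
# fields that show where a job ran and how it ended.
# Assumes one whitespace-separated "key value" pair per line.
qacct_summary() {
    awk '$1 == "hostname" || $1 == "failed" ||
         $1 == "exit_status" || $1 == "ru_wallclock" { print $1 "=" $2 }'
}
```

Usage would be "qacct -j 23 | qacct_summary". For job 23 the record shows
hostname node006, failed 0, exit_status 0 and ru_wallclock 0: the job was
dispatched to a slave node and returned within the same second without
SGE recording a failure, which is why the job's own error log is the
next place to look.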


