<div>Hi Justin,</div>
<div> </div>
<div>Thanks for your reply. There is no error log or output log, even when I use the "-e" or "-o" options.</div>
<div> </div>
<div>I created a cluster with one master and 10 slaves. I made a minor change on the master node and ran "starcluster createimage i-xxxx AAA BBB", where "i-xxxx" is the instance ID of the master. After I got ami-yyyy, I ran "starcluster start ami-yyyy". I found that all jobs submitted to the slave nodes finish instantly, as you can see in the log I sent earlier. The jobs on the master node run normally.</div>
<div> </div>
<div>I haven't used the "restart" command yet, but I will give it a try.</div>
<div> </div>
<div>-Liang<br><br></div>
<div class="gmail_quote">On Sat, Dec 31, 2011 at 12:03 PM, Justin Riley <span dir="ltr">&lt;<a href="mailto:jtriley@mit.edu">jtriley@mit.edu</a>&gt;</span> wrote:<br>
<blockquote style="BORDER-LEFT:#ccc 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">-----BEGIN PGP SIGNED MESSAGE-----<br>Hash: SHA1<br><br>Hi Liang,<br><br>Is this happening consistently even after restarting the cluster using<br>
"starcluster restart mycluster"? Also, is there anything in your<br>job(s) error logs? Given the output you provided these would most<br>likely be located in the directory you submitted the job from and<br>should be named something like "single.sh.e23".<br>
<br>~Justin<br>
<div>
<div></div>
<div class="h5"><br><br>On 12/30/2011 08:58 PM, liang cheng wrote:<br>> Greetings !<br>><br>> I created a star cluster on EC2 and use qsub to submit jobs. It<br>> used to work well. From this afternoon, after I requested for<br>
> additional EC2 instance from Amazon, the issue comes out.<br>><br>> Only the jobs submitted to the master node are executed. Other<br>> jobs disappear almost immediately. Some diagnostic output is below. Any<br>
> helps are appreciated !<br>><br>> Happy New Year !<br>><br>><br>> root@master:/# qacct -j 23<br>> ==============================================================<br>
> qname all.q<br>
> hostname node006<br>
> group root<br>
> owner root<br>
> project NONE<br>
> department defaultdepartment<br>
> jobname single.sh out 3<br>
> jobnumber 23<br>
> taskid undefined<br>
> account sge<br>
> priority 0<br>
> qsub_time Sat Dec 31 01:38:32 2011<br>
> start_time Sat Dec 31 01:38:39 2011<br>
> end_time Sat Dec 31 01:38:39 2011<br>
> granted_pe NONE<br>
> slots 1<br>
> failed 0<br>
> exit_status 0<br>
> ru_wallclock 0<br>
> ru_utime 0.010<br>
> ru_stime 0.010<br>
> ru_maxrss 2276<br>
> ru_ixrss 0<br>
> ru_ismrss 0<br>
> ru_idrss 0<br>
> ru_isrss 0<br>
> ru_minflt 2648<br>
> ru_majflt 0<br>
> ru_nswap 0<br>
> ru_inblock 0<br>
> ru_oublock 272<br>
> ru_msgsnd 0<br>
> ru_msgrcv 0<br>
> ru_nsignals 0<br>
> ru_nvcsw 12<br>
> ru_nivcsw 3<br>
> cpu 0.020<br>
> mem 0.000<br>
> io 0.000<br>
> iow 0.000<br>
> maxvmem 0.000<br>
> arid undefined<br>
><br>> =========================<br>><br>> Thanks, -Liang<br><br></div></div>-----BEGIN PGP SIGNATURE-----<br>Version: GnuPG v2.0.17 (GNU/Linux)<br>Comment: Using GnuPG with Mozilla - <a href="http://enigmail.mozdev.org/" target="_blank">http://enigmail.mozdev.org/</a><br>
<br>iEYEARECAAYFAk7/aooACgkQ4llAkMfDcrmFegCfULuLAaDIrEvDi1257HZR3ico<br>B5wAn2rGWD5D9c4rETIq07d6jKq/jrCs<br>=pb1b<br>-----END PGP SIGNATURE-----<br></blockquote></div><br>