[StarCluster] Jobs not writing output files

Ashish Jain ashishj at usc.edu
Wed Nov 21 20:04:42 EST 2012


Hi Rayson,

The exact command is this -

ssh -i key root@publicDns << EOD
qsub -N bt-mz.A.2 -b y -cwd -pe orte 2 mpirun ~/NPB3.3.1-MZ/NPB3.3-MZ-MPI/bin/bt-mz.A.2
EOD
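
For reference, a minimal sketch of how the same qsub line could be generated for a batch of benchmarks, with a pause between submissions. This is not the exact script that was used: the function names, the DELAY variable, and the "bench class nprocs" input format are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the qsub line for one benchmark binary.
# Arguments: benchmark name (e.g. bt-mz), class (e.g. A), MPI process count.
build_qsub_cmd() {
    name="$1.$2.$3"
    # The ~ stays literal here; it is expanded by the shell that runs the line.
    printf 'qsub -N %s -b y -cwd -pe orte %s mpirun ~/NPB3.3.1-MZ/NPB3.3-MZ-MPI/bin/%s\n' \
        "$name" "$3" "$name"
}

# Hypothetical driver: read "bench class nprocs" triples from stdin and print
# one qsub line per triple, sleeping DELAY seconds (default 15) after each.
submit_all() {
    delay="${DELAY:-15}"
    while read -r bench class nprocs; do
        build_qsub_cmd "$bench" "$class" "$nprocs"
        sleep "$delay"
    done
}
```

The printed lines could then be piped to the master node, for example `printf 'bt-mz A 2\nbt-mz A 4\n' | submit_all | ssh -i key root@publicDns`, so that each qsub call reaches the remote shell roughly DELAY seconds after the previous one.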

1) I'm running the NAS Parallel Benchmarks. Each benchmark has a class (A to
F) that determines the problem size, and the last number in its name is the
number of MPI processes to run on (1, 2, 4, 8...128). Out of the 43 such
benchmarks, 22 gave the correct result. For the rest, the output file is
either empty, half complete, or not written at all. If I run any of
these failed benchmarks individually, they run correctly.

2) I've found a few bugs and have collected some log files (around 22). What
is the best way to submit them?
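
On point 1, a minimal sketch of how missing or empty output files could be tallied after a batch. It assumes the jobs were submitted with -cwd and -N <name>, so that Grid Engine writes stdout to <name>.o<jobid> in the submission directory. The function name and its arguments are illustrative, and a half-complete file would still be reported as "ok" because only the size is checked.

```shell
#!/bin/sh
# Hypothetical check: for each job name, report whether its stdout file
# (<dir>/<name>.o<jobid>) is present and non-empty.
# Usage: check_outputs <dir> <jobname>...
check_outputs() {
    dir="$1"
    shift
    for name in "$@"; do
        found=0
        for f in "$dir/$name".o[0-9]*; do
            [ -e "$f" ] || continue   # the glob matched nothing
            found=1
            if [ -s "$f" ]; then
                echo "ok $name"
            else
                echo "empty $name"
            fi
        done
        [ "$found" -eq 1 ] || echo "missing $name"
    done
}
```

For example, `check_outputs ~ bt-mz.A.2 bt-mz.A.4` would print one status line per output file found, or "missing" for a job that wrote none.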

Thanks
Ashish


On Wed, Nov 21, 2012 at 12:05 PM, Rayson Ho <raysonlogin at gmail.com> wrote:

> Hi Ashish,
>
> Can you list the qsub parameters you use to submit the jobs?
>
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
>
>
> On Tue, Nov 20, 2012 at 5:56 AM, Ashish Jain <ashishj at usc.edu> wrote:
> > Hi,
> >
> > I'm trying to submit many jobs in one go. I have 3 nodes, each an EC2 1.4x
> > cluster. There are a few glitches I have seen with this -
> >
> > 1) If I submit the jobs in one go (around 6 jobs, each needing one process),
> > apart from the first job, the rest of the jobs are put in a "t" state for a
> > long time.
> > 2) If I space out the jobs (a sleep of 15 seconds between calls), the jobs
> > run more smoothly. However, I'm seeing an issue where the jobs are not
> > writing the .o and .e files, and sometimes when they do write them, they
> > are either incomplete or empty.
> >
> > I would like to understand what is happening here. Is there a minimum time
> > between submitting jobs?
> >
> > Thanks
> > Ashish