Dear Justin,<br>

<br><div>Thank you very much for your clear and full answer.</div><div>Yes, I completely agree with you that for loosely bound tasks, especially when they are run in a routine, everyday mode, the &quot;queuing system&quot; is an excellent solution. My initial harshness on this question was influenced by the background I come from, namely MPI. I thought that once a user has &quot;on demand&quot; cluster computing nodes and MPI available, the &quot;queuing system&quot; as a class is eliminated from &quot;cloud computing&quot;, because MPI comes with its own task dispatcher and the user can directly acquire whatever powerful cluster configuration he needs for his task, without waiting for suitable resources to become available. Now I see that there are a lot of other applications that are better run in a cluster through a pre-configured &quot;queuing system&quot; than by hand on a heap of nodes. Thank you.</div>

<div><br></div><div>And could I just confirm once again: &quot;If a single user needs to run an MPI task only from time to time (not on a routine everyday basis), would he get any additional benefit from a &quot;queuing system&quot; in a cloud, or is it better to use MPI directly&quot;?</div>

<div><br></div><div>Thank you in advance, sincerely yours,</div><div>Alexey</div><div><br><div class="gmail_quote">On Sat, Oct 23, 2010 at 6:37 PM, Justin Riley <span dir="ltr">&lt;<a href="mailto:jtriley@mit.edu">jtriley@mit.edu</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

  <div bgcolor="#ffffff" text="#000000">

    Alexey,<br>

    <br>

    The Sun Grid Engine queueing system is useful when you have a lot of

    tasks to execute and not just one at a time interactively. For

    example, you might need to convert 300 videos from one format to

    another. You could either<br>

    <br>

    1. Write a script that gets the list of nodes from /etc/hosts and

    then loops over the jobs and the nodes, ssh&#39;ing commands to be

    executed on each node. A big problem with this approach is that the

    task execution and management all depend on this script executing

    successfully all the way through. What happens if the script fails?

    You would then lose all task accounting information. Also, what if

    you suddenly discover you need to do another batch of 300 videos

    while the previous batch is still processing? Are you going to

    re-execute your script and overload the cluster? This would

    definitely slow down all of your jobs. How will you write your

    script to avoid overloading the cluster in this situation without

    losing the fact that you want to submit new jobs *now*?<br>

    <br>

    OR<br>

    <br>

    2. Skip needing to get the list of nodes and ssh&#39;ing commands to

    them and instead just write a loop that sends 300 jobs to the

    queuing system using &quot;qsub&quot;. The queuing system will then do the

    work to find an available node, execute the job, and store its

    accounting information (status, start time, end time, which node

    executed the job, etc.). The queuing system will also handle load

    balancing your tasks across the cluster so that any one node doesn&#39;t

    get significantly overloaded compared to the other nodes in the

    cluster. If you suddenly discover you need 300 more videos processed

    you could simply &quot;qsub&quot; 300 more jobs. These jobs will be

    &#39;queued-up&#39; and executed when a node becomes available. This

    approach reduces your concerns to just executing a task on a node

    rather than managing multiple jobs and nodes.<br>
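    <br>
    As a rough illustration of option 2, here is a minimal Python sketch of such a submission loop. It is only a sketch under some assumptions: the converter script name (./convert_video.sh), the video file names, and the job names are all hypothetical, and it assumes the script is reachable from every node (for example on a shared filesystem). The only Sun Grid Engine pieces it relies on are "qsub" with its -N, -cwd and -b options.<br>

```python
import subprocess


def build_qsub_commands(video_paths, convert_script="./convert_video.sh"):
    """Return one qsub argument list per video (nothing is executed here).

    convert_script is a hypothetical converter that takes a single video path.
    """
    commands = []
    for index, path in enumerate(video_paths, start=1):
        commands.append([
            "qsub",
            "-N", "convert_%04d" % index,  # job name shown by qstat
            "-cwd",                        # run the job from the submit directory
            "-b", "y",                     # treat the command as an executable, not a job script
            convert_script, path,
        ])
    return commands


def submit_all(commands, run=subprocess.run):
    """Hand every command to the queuing system; return how many were submitted."""
    for command in commands:
        run(command, check=True)  # raises CalledProcessError if qsub exits with an error
    return len(commands)


if __name__ == "__main__":
    # Hypothetical input: 300 files named video_001.avi ... video_300.avi.
    videos = ["video_%03d.avi" % n for n in range(1, 301)]
    for command in build_qsub_commands(videos):
        print(" ".join(command))  # dry run: print the commands instead of submitting
```

    Run as-is, this only prints the 300 qsub command lines. Passing the result of build_qsub_commands to submit_all on the master node would actually queue the jobs, and calling it again for a second batch simply adds 300 more jobs to the queue.<br>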

    <br>

    Also it is true that you can create &quot;as many clusters as you want&quot;

    with cloud computing. However, in many cases it could get *very*

    expensive launching multiple clusters for every single task or set

    of tasks. Whether it&#39;s more cost-effective to launch multiple

    clusters or just queue a ton of jobs on a single cluster depends

    highly on the sort of tasks you&#39;re executing.<br>

    <br>

    Of course, just because a queueing system is installed doesn&#39;t mean

    you *have* to use it at all. You can run things however

    you want on the cluster. Hopefully I&#39;ve made it clear that there are

    significant advantages to using a queuing system to execute jobs on

    a cluster rather than a home-brewed script.<br>

    <br>

    Hope that helps...<br><font color="#888888">

    <br>

    ~Justin</font><div><div></div><div class="h5"><br>

    <br>

    On 10/22/10 5:02 PM, Alexey PETROV wrote:

    <blockquote type="cite">Yes, StarCluster is great.

      <div>But what do we need any &quot;<span style="font-family:sans-serif;font-size:13px;line-height:19px"><i>queuing system&quot;</i></span> for?

        <div><span style="font-family:sans-serif;font-size:13px;line-height:19px">Surely, in

            cloud computing, user can create as many clusters as he

            wants, each for his particular tasks.</span></div>

        <div><span style="font-family:sans-serif;font-size:13px;line-height:19px">So, why?!</span></div>

      </div>

    </blockquote>

    <br>

  </div></div></div>

<br>_______________________________________________<br>

StarCluster mailing list<br>

<a href="mailto:StarCluster@mit.edu">StarCluster@mit.edu</a><br>

<a href="http://mailman.mit.edu/mailman/listinfo/starcluster" target="_blank">http://mailman.mit.edu/mailman/listinfo/starcluster</a><br>

<br></blockquote></div><br></div>