<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
Hi Wei,<br>
<br>
The problem is that qconf is not on the default PATH when the command
is run through execute, which is why it fails with status 127
(command not found). SGE is installed in /opt/sge6 and relies on
/etc/profile.d to set up the paths correctly. Unfortunately, those
profile scripts aren't loaded automatically when executing commands
over SSH. For now you can fix this using:<br>
<br>
node.ssh.execute('source /etc/profile && qconf -mattr queue
load_thresholds np_load_avg=1.5 all.q')<br>
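<br>
If you run several SGE commands from a plugin, the same workaround can be wrapped in a small helper. This is only a sketch: with_profile and set_load_threshold are hypothetical names, not part of StarCluster, and the only assumption about node is that it provides ssh.execute(cmd) as in the line above:<br>
<br>
<pre>
```python
def with_profile(cmd, profile="/etc/profile"):
    """Prefix cmd so the profile is sourced first (hypothetical helper)."""
    return "source %s && %s" % (profile, cmd)


def set_load_threshold(node, queue="all.q", np_load_avg=1.5):
    """Set np_load_avg on a queue via qconf.

    Assumes only that `node` exposes ssh.execute(cmd), as in the
    one-line workaround above.
    """
    qconf = "qconf -mattr queue load_thresholds np_load_avg=%s %s" % (
        np_load_avg, queue)
    return node.ssh.execute(with_profile(qconf))
```
</pre>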
<br>
In the upcoming version you can pass source_profile=True as a
parameter to execute, which will do this for you.<br>
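<br>
You can reproduce the 127 from the error message locally: a POSIX shell returns that status when it cannot find a command. A quick sketch, assuming Python 3 and an sh on your machine (the command name is deliberately made up, and exit_status is just an illustrative helper); it should print 127:<br>
<br>
<pre>
```python
import subprocess


def exit_status(command):
    """Run command through `sh -c` and return its exit status."""
    return subprocess.call(["sh", "-c", command], stderr=subprocess.DEVNULL)


# A command that does not exist anywhere on PATH.
missing = exit_status("surely_not_an_installed_command_12345")
print(missing)
```
</pre>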
<br>
With that said, I'm working on an SGE (now OGS) plugin that will
allow you to apply custom SGE settings like the one you're
applying above. For those interested in contributing, the latest
progress is in the 'sge-plugin' branch on GitHub:<br>
<br>
<a class="moz-txt-link-freetext" href="https://github.com/jtriley/StarCluster/blob/sge-plugin/starcluster/plugins/sge.py">https://github.com/jtriley/StarCluster/blob/sge-plugin/starcluster/plugins/sge.py</a><br>
<br>
This will allow you to do what your plugin is doing now:<br>
<br>
[plugin sge]<br>
setup_class = starcluster.plugins.sge.SGEPlugin<br>
load_avg = 1.5<br>
create_queues = myqueue, gpu<br>
scheduling_interval = 5<br>
....<br>
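<br>
To illustrate how settings like these could be read, here is a sketch using Python 3's standard configparser. StarCluster has its own config loader and passes plugin settings to the setup class constructor (as with queue_to_config in your plugin), so this only shows the type conversions involved; parse_sge_settings is a hypothetical name and the option names are just the ones from the example above:<br>
<br>
<pre>
```python
import configparser

# Same options as the example section above (the elided ones are left out).
SAMPLE = """
[plugin sge]
setup_class = starcluster.plugins.sge.SGEPlugin
load_avg = 1.5
create_queues = myqueue, gpu
scheduling_interval = 5
"""


def parse_sge_settings(text, section="plugin sge"):
    """Return the section's options converted to Python types."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser[section]
    queues = [q.strip() for q in raw["create_queues"].split(",") if q.strip()]
    return {
        "setup_class": raw["setup_class"],
        "load_avg": raw.getfloat("load_avg"),
        "create_queues": queues,
        "scheduling_interval": raw.getint("scheduling_interval"),
    }


settings = parse_sge_settings(SAMPLE)
print(settings)
```
</pre>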
<br>
If anyone's interested in contributing patches that implement such
options, please fork the project on GitHub, check out the sge-plugin
branch, make the changes to SGEPlugin, and submit a pull request.<br>
<br>
~Justin<br>
<br>
On 12/21/11 3:33 AM, Wei Tao wrote:<br>
<span style="white-space: pre;">> Hi Don,<br>
><br>
> The plugin picked up the queue_to_config (all.q) as evidenced
in the error message:<br>
><br>
> !!! ERROR - command 'qconf -mattr queue load_thresholds
np_load_avg=1.5 *all.q*' failed with status 127<br>
><br>
> My intention is to config the SGE at the cluster boot up time
using the plugin. Since I executed "starcluster runplugin" after
the cluster already booted up, it apparently is not an issue of
plugin execution timing.<br>
><br>
> The only reason I run the plugin or the plugin command after
cluster already booted up is for debugging purposes.<br>
><br>
> It's just very strange to me that as root I can execute the
exact same command on the master node without any issue, but
running as starcluster plugin would fail.<br>
><br>
> Also, what is status 127 anyway??<br>
><br>
> Thanks!<br>
><br>
> -Wei<br>
> <br>
><br>
> On Wed, Dec 21, 2011 at 1:42 AM, Don MacMillen <<a class="moz-txt-link-abbreviated" href="mailto:macd@nimbic.com">macd@nimbic.com</a>> wrote:<br>
><br>
> The only difference that I can see is that I have not used
arguments to<br>
> the plugin. I guess you did remember to set the argument
"queue_to_config"<br>
> in your config file?<br>
><br>
> Another possible issue is if you are trying to reconfig a
cluster that is just<br>
> in the process of coming up. If you try that command early
on, it will fail because<br>
> sge has not been installed yet. Why do you want to config the
cluster afterwards<br>
> rather than just on the initial bring up? HTH and let us know
what you find out.<br>
> Regards.<br>
><br>
> Don<br>
><br>
><br>
> On Tue, Dec 20, 2011 at 10:02 PM, Wei Tao <<a class="moz-txt-link-abbreviated" href="mailto:wei.tao@tsibiocomputing.com">wei.tao@tsibiocomputing.com</a>> wrote:<br>
><br>
> Hi all,<br>
><br>
> I tried to implement the queue configuration suggested by Don
MacMillen a while ago. Here is my plugin code:<br>
><br>
> from starcluster.clustersetup import ClusterSetup<br>
><br>
> class SgeConfig(ClusterSetup):<br>
>     def __init__(self, queue_to_config):<br>
>         self.queue_to_config = queue_to_config<br>
><br>
>     def run(self, nodes, master, user, user_shell, volumes):<br>
>         cmd_strg = 'qconf -mattr queue load_thresholds np_load_avg=1.5 %s' % self.queue_to_config<br>
>         output = master.ssh.execute(cmd_strg)<br>
><br>
> When I execute "starcluster runplugin <myplugin>
<mycluster>", I got:<br>
><br>
> >>> Running plugin <myplugin><br>
> !!! ERROR - command 'qconf -mattr queue load_thresholds
np_load_avg=1.5 all.q' failed with status 127<br>
><br>
> If I sshmaster and run the command directly as this:<br>
><br>
> root@master:~# qconf -mattr queue load_thresholds
np_load_avg=1.5 all.q<br>
> root@master modified "all.q" in cluster queue list<br>
><br>
> It works fine. Could someone please point out why the plugin
would have a status code 127 when direct execution of the command
apparently works fine?<br>
><br>
> Thanks for the help!<br>
><br>
><br>
> -Wei<br>
><br>
> -- <br>
> Wei Tao, Ph.D.<br>
> TSI Biocomputing LLC<br>
> 617-564-0934</span><br>
<br>
</body>
</html>