[StarCluster] ? re CreateUsers error in .95.6

Christopher Clearfield chris.clearfield at system-logic.com
Fri Aug 7 18:51:42 EDT 2015


I wonder if, instead of zeroing-out the file, '> known_hosts' actually
created it?

I noticed that add_to_known_hosts originally opens the file with:

    khostsf = self.ssh.remote_file(known_hosts_file, 'a')

That is mode 'a' rather than 'a+', so it will fail if the file doesn't
exist for some reason.
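
A rough local-filesystem sketch of a more defensive append follows. It
assumes Python's built-in open() and os calls rather than StarCluster's
remote_file (which goes over SFTP and may treat modes differently), and
append_known_hosts is a hypothetical helper name, not a StarCluster
function:

```python
import os


def append_known_hosts(known_hosts_file, entries):
    """Append entries to known_hosts_file, creating the file and its
    parent directory first if they are missing (local files only)."""
    parent = os.path.dirname(known_hosts_file)
    if parent and not os.path.isdir(parent):
        # ~/.ssh is conventionally private to its owner
        os.makedirs(parent, 0o700)
    # For a local file, mode 'a' appends and creates the file if needed
    with open(known_hosts_file, 'a') as khostsf:
        for entry in entries:
            # Write exactly one newline per entry
            khostsf.write(entry.rstrip('\n') + '\n')
```

The idea is only that a missing file or a missing ~/.ssh directory gets
handled before the append, instead of surfacing as ENOENT.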

–
C


On Fri, Aug 7, 2015 at 3:47 PM Lyn Gerner <schedulerqueen at gmail.com> wrote:

> Update/Close: Strangely, this particular issue was resolved by going to
> the master and zeroing the known_hosts file (as in "> known_hosts").
>
> On Fri, Aug 7, 2015 at 11:39 AM, Lyn Gerner <schedulerqueen at gmail.com>
> wrote:
>
>> Hi Developers,
>>
>> Sorry for the Fri afternoon query, but I'm getting an error never before
>> seen on an addnode, and it recurs even on a -x retry.   Appreciate any
>> workaround/recovery suggestions for the following:
>>
>> *# sc an -x -a node002 w2c*
>>
>> *StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)*
>>
>> *Software Tools for Academics and Researchers (STAR)*
>>
>> *Please submit bug reports to starcluster at mit.edu*
>>
>>
>> *>>> Waiting for node(s) to come up... (updating every 30s)*
>>
>> *>>> Waiting for all nodes to be in a 'running' state...*
>>
>> *3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Waiting for SSH to come up on all nodes...*
>>
>> *3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Waiting for cluster to come up took 0.206 mins*
>>
>> *>>> Running plugin starcluster.clustersetup.DefaultClusterSetup*
>>
>> *>>> Configuring hostnames...*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Configuring /etc/hosts on each node*
>>
>> *3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Configuring NFS exports path(s):*
>>
>> */home /jobs/ /usr/share/jobs/ /pipe/*
>>
>> *>>> Mounting all NFS export path(s) on 1 worker node(s)*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Setting up NFS took 0.021 mins*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Configuring scratch space for user(s): sgeadmin*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%  *
>>
>> *>>> Configuring passwordless ssh for root*
>>
>> *>>> Configuring passwordless ssh for sgeadmin*
>>
>> *>>> Running plugin swap_addnode_w2c.VISwapConfigurator*
>>
>> *>>> Configuring Swap on node002*
>>
>> *>>> Running plugin starcluster.plugins.users.CreateUsers*
>>
>> *>>> Creating 1 users on node002*
>>
>> *>>> Adding node002 to known_hosts for 1 users*
>>
>> *!!! ERROR - Error occured while running plugin
>> 'starcluster.plugins.users.CreateUsers':*
>>
>> *!!! ERROR - Unhandled exception occured*
>>
>> *Traceback (most recent call last):*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cli.py", line 274, in main*
>>
>> *    sc.execute(args)*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/commands/addnode.py", line 128, in execute*
>>
>> *    no_create=self.opts.no_create)*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 189, in add_nodes*
>>
>> *    no_create=no_create)*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1042, in add_nodes*
>>
>> *    self.run_plugins(method_name="on_add_node", node=node)*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1690, in run_plugins*
>>
>> *    self.run_plugin(plug, method_name=method_name, node=node)*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1715, in run_plugin*
>>
>> *    func(*args)*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/plugins/users.py", line 164, in on_add_node*
>>
>> *    master.add_to_known_hosts(user, [node])*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/node.py", line 578, in add_to_known_hosts*
>>
>> *    khostsf = self.ssh.remote_file(known_hosts_file, 'a')*
>>
>> *  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/sshutils.py", line 320, in remote_file*
>>
>> *    rfile = self.sftp.open(file, mode)*
>>
>> *  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 327, in open*
>>
>> *    t, msg = self._request(CMD_OPEN, filename, imode, attrblock)*
>>
>> *  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 729, in _request*
>>
>> *    return self._read_response(num)*
>>
>> *  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 776, in _read_response*
>>
>> *    self._convert_status(msg)*
>>
>> *  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 802, in _convert_status*
>>
>> *    raise IOError(errno.ENOENT, text)*
>>
>> *IOError: [Errno 2] No such file*
>>
>>
>> *!!! ERROR - Oops! Looks like you've found a bug in StarCluster*
>>
>> *!!! ERROR - Crash report written to:
>> /root/.starcluster/logs/crash-report-11317.txt*
>>
>> *!!! ERROR - Please remove any sensitive data from the crash report*
>>
>> *!!! ERROR - and submit it to starcluster at mit.edu*
>>
>>
>> There's not much more in the crash report, but I can send it, if it will
>> help.  Thanks in advance.
>>
>>
>> Best,
>>
>> Lyn
>>
>
> _______________________________________________
> StarCluster mailing list
> StarCluster at mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>