[StarCluster] SGE issue with hostnames
Robert Tomkiewicz
tomkiewicz.b at gmail.com
Mon Jul 25 23:25:38 EDT 2011
Hi there,
I started a 4-node EC2 cluster using StarCluster 0.92rc2 and ami-a5c42dcc, the
standard StarCluster 9.04 x64 AMI.
I ran into the following issue while doing some basic SGE setup. At first
qconf worked fine, then a few minutes later:
root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qconf -sq all.q
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").
After issuing
root@master: ~# hostname master
I was able to proceed normally and launch my SGE jobs. They were running
normally, as confirmed by the output of qstat.
However, some minutes later, when checking on them with another qstat, I got
the same thing again.
root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").
This time, resetting the hostname was to no avail:
root@master: ~ # hostname
master
root@master: ~ # hostname master
root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").
So I tried this:
root@master: ~# hostname domU-12-31-39-09-80-C1.compute-1.internal
which yielded the same error with the two names swapped:
root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"master". This is not identical to clients host name
"domU-12-31-39-09-80-C1.compute-1.internal")
Setting the hostname back to master ("hostname master") at this point yields
correct operation for a few minutes.
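For anyone trying to reproduce this, here is a minimal Python sketch of the kind of comparison the error message describes: resolve the local name to an IP, reverse-resolve that IP, and see whether the two names match. This is only an approximation, since I do not know exactly how SGE's commlib performs its lookup. The function names reverse_name_for and names_agree are made up for illustration, and the injectable forward/reverse parameters exist only so the logic can be exercised without touching the real resolver.

```python
import socket


def reverse_name_for(hostname,
                     forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name) for hostname.

    forward maps a name to an IPv4 address string; reverse maps an
    address to a (name, aliases, addresses) tuple, like the socket
    module functions used as defaults.
    """
    ip = forward(hostname)
    reverse_name = reverse(ip)[0]
    return ip, reverse_name


def names_agree(hostname,
                forward=socket.gethostbyname,
                reverse=socket.gethostbyaddr):
    """True if reverse-resolving hostname's IP gives back hostname."""
    _, reverse_name = reverse_name_for(hostname, forward, reverse)
    return reverse_name == hostname
```

Calling names_agree(socket.gethostname()) on the master while the error is occurring, and again while things work, should show whether the reverse lookup is what changes between the two states.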
It seems clear the problem has to do with doubled hostnames, but where are
they set? Has anyone else had a similar problem?
Thank you,
Robert Tomkiewicz
/etc/hostname is simply
master
/etc/hosts is below:
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
10.210.135.47 master
10.66.83.219 node001
10.193.155.175 node002
10.206.70.15 node003
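
To double-check which names this file assigns to the master's address, a small Python sketch that parses hosts-style text may help. The helper name names_for_ip and the SAMPLE string (a copy of a few of the lines above) are illustrative only, and the parsing assumes the usual "address name [alias...]" layout with "#" comments.

```python
SAMPLE = """\
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
10.210.135.47 master
10.66.83.219 node001
"""


def names_for_ip(hosts_text, ip):
    """Return every name listed for ip in hosts-style text."""
    names = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == ip:
            names.extend(fields[1:])
    return names


print(names_for_ip(SAMPLE, "10.210.135.47"))
```

With the file as posted, only "master" is listed for 10.210.135.47, so the domU-... name is presumably coming from somewhere other than /etc/hosts. That is a guess on my part, not something I have confirmed.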