[StarCluster] SGE issue with hostnames

Robert Tomkiewicz tomkiewicz.b at gmail.com
Mon Jul 25 23:25:38 EDT 2011


Hi there,

I started a 4-node EC2 cluster using StarCluster 0.92rc2 and ami-a5c42dcc, the
standard StarCluster 9.04 x64 AMI.

I ran into the following issue while doing some basic sge setup.  At first
qconf worked fine, then a few minutes later...

root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qconf -sq all.q
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").

After issuing

 root@master: ~# hostname master

I was able to proceed normally, and launch my sge jobs.  They were running
normally, confirmed by the output of qstat.

However, some minutes later, when checking on them with another qstat, I got
the same thing again.

root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").

Resetting the hostname was to no avail.

root@master: ~ # hostname
master
root@master: ~ # hostname master
root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").
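
For anyone trying to reproduce this: the transcript above shows `hostname` reporting master while the error names something else for the client IP. A minimal sketch for comparing the two side by side follows. It assumes the name in the error comes from a reverse lookup of the master's IP, which the message suggests but does not prove. `check_host`, `reverse_name` and the injectable `resolver` parameter are illustrative names, not part of SGE or StarCluster.

```python
import socket


def reverse_name(ip, resolver=socket.gethostbyaddr):
    """Return the primary name the resolver gives for ip, or None on failure."""
    try:
        return resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None


def check_host(local_name, ip, resolver=socket.gethostbyaddr):
    """Compare the locally set hostname with the reverse lookup of ip."""
    resolved = reverse_name(ip, resolver)
    return {
        "local": local_name,
        "resolved": resolved,
        "match": resolved == local_name,
    }


# Example call on the master itself (IP taken from the /etc/hosts in this post):
#   check_host(socket.gethostname(), "10.210.135.47")
```

Running the example call every minute or so and logging the result might show whether the reverse lookup changes at the moment qstat starts failing.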

So I tried this

 root@master: ~# hostname domU-12-31-39-09-80-C1.compute-1.internal

which yielded, vice versa...

root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"master". This is not identical to clients host name
"domU-12-31-39-09-80-C1.compute-1.internal")

Setting the hostname back to master ("hostname master") at this point yields
correct operation for a few minutes.
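
Since the failure comes and goes, it may help to pull the two names out of each denial message when scanning saved qstat or qconf output. The sketch below relies only on the message wording quoted in this post. The function name `parse_commlib_denial` is made up for illustration, and other SGE versions may word the error differently.

```python
import re

# Wording copied from the error text quoted in this post.
_DENIAL = re.compile(
    r'client IP resolved to host name "([^"]+)"\. '
    r'This is not identical to clients host name "([^"]+)"'
)


def parse_commlib_denial(message):
    """Return (resolved_name, claimed_name), or None if the text does not match."""
    # Collapse line wraps and repeated spaces so the pattern sees a single line.
    flat = " ".join(message.split())
    found = _DENIAL.search(flat)
    if found is None:
        return None
    return found.group(1), found.group(2)
```

The first element is the name SGE got by resolving the client IP and the second is the name the client reported, so a log of these pairs shows which side changed.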


It seems clear the problem has to do with doubled hostnames, but where are
they set?  Has anyone else had a similar problem?

Thank you,

Robert Tomkiewicz



/etc/hostname is simply

master


/etc/hosts is below:

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
10.210.135.47 master
10.66.83.219 node001
10.193.155.175 node002
10.206.70.15 node003
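
The /etc/hosts above lists only the short names, so the long domU name in the error is presumably coming from another resolver source (DNS would be a guess, not something this post confirms). A small sketch that makes the check explicit by parsing the file contents follows. `parse_hosts` and `names_for` are hypothetical helpers, and `HOSTS` is copied from the listing above with the IPv6 lines left out.

```python
HOSTS = """\
127.0.0.1 localhost
10.210.135.47 master
10.66.83.219 node001
10.193.155.175 node002
10.206.70.15 node003
"""


def parse_hosts(text):
    """Map each IP address to the list of names on its lines, in file order."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        table.setdefault(ip, []).extend(names)
    return table


def names_for(text, name):
    """Return (ip, names) for the entry that contains name, else (None, [])."""
    for ip, names in parse_hosts(text).items():
        if name in names:
            return ip, names
    return None, []
```

Looking up "master" finds 10.210.135.47 with no alias, while looking up the domU name finds nothing, which is consistent with the two names coming from different sources.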