[StarCluster] SGE issue with hostnames
Justin Riley
justin.t.riley at gmail.com
Tue Jul 26 11:02:27 EDT 2011
Hi Robert,
Sorry to hear you're having issues. The 0.92rc2 version sets the
(user-friendly) hostnames itself in /etc/hostname and /etc/hosts as
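you've seen. The only thing I can think of that would cause this issue
is /etc/nsswitch.conf not preferring "files" before "dns" for hosts and
networks. If DNS is consulted first, the node's IP would reverse-resolve
to the EC2 internal name instead of the /etc/hosts entry, which would
match the error you're seeing. I'm launching a small test cluster to
check but I would bet this is the problem.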
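In the meantime, a quick way to check the lookup order on a node might
look like the sketch below. The function name files_before_dns is made
up for illustration, and it assumes the usual nsswitch.conf layout (a
database name followed by a colon, then a space-separated list of
sources):

```shell
# Hypothetical helper (not part of StarCluster): succeeds if "files" is
# listed before "dns" on the given database line of an nsswitch.conf-style
# file. Defaults: /etc/nsswitch.conf and the "hosts" database.
files_before_dns() {
    conf="${1:-/etc/nsswitch.conf}"
    db="${2:-hosts}"
    awk -v db="$db:" '
        $1 == db {
            # Record the field positions of the two sources we care about.
            for (i = 2; i <= NF; i++) {
                if ($i == "files") { f = i }
                if ($i == "dns")   { d = i }
            }
            found = 1
            exit
        }
        END {
            # OK when "files" is listed and "dns" is absent or comes later.
            if (found && f && (!d || f < d)) exit 0
            exit 1
        }
    ' "$conf"
}

# Check both databases mentioned above on the local machine.
for db in hosts networks; do
    if files_before_dns /etc/nsswitch.conf "$db" 2>/dev/null; then
        echo "$db: files is consulted before dns"
    else
        echo "$db: files is missing or listed after dns"
    fi
done
```

If the order looks right, running "getent hosts master" on the node
shows what the resolver actually returns for that name.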
I can make a patch for this if this is the case. For your immediate
needs, do you need the 9.04 AMI specifically? The latest AMI is 10.04
which should work fine. You can browse the latest available AMIs using
the 'listpublic' command:
$ starcluster listpublic
Let me know,
~Justin
On 07/25/2011 11:25 PM, Robert Tomkiewicz wrote:
> Hi there,
>
> I started a 4-node EC2 cluster using 0.92rc2, and ami-a5c42dcc, standard
> starcluster 9.04 x64 ami.
>
> I ran into the following issue while doing some basic sge setup. At
> first qconf worked fine, then a few minutes later...
>
> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qconf -sq all.q
> error: commlib error: access denied (client IP resolved to host name
> "domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
> clients host name "master").
>
> after issuing
>
> root@master: ~# hostname master
>
> I was able to proceed normally, and launch my sge jobs. They were
> running normally, confirmed by the output of qstat.
>
> However, some minutes later, when checking on them with another qstat, I
> got the same thing again.
>
> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
> error: commlib error: access denied (client IP resolved to host name
> "domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
> clients host name "master").
>
> resetting the hostname was to no avail.
>
> root@master: ~ # hostname
> master
> root@master: ~ # hostname master
> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
> error: commlib error: access denied (client IP resolved to host name
> "domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
> clients host name "master").
>
> So I tried this
>
> root@master: ~# hostname domU-12-31-39-09-80-C1.compute-1.internal
>
> which yielded, vice versa...
>
> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
> error: commlib error: access denied (client IP resolved to host name
> "master". This is not identical to clients host name
> "domU-12-31-39-09-80-C1.compute-1.internal")
>
> Setting the hostname back to master ("hostname master") at this point
> yields correct operation for a few minutes.
>
>
> It seems clear the problem has to do with doubled hostnames, but where
> are they set? Has anyone else had a similar problem?
>
> Thank you,
>
> Robert Tomkiewicz
>
>
>
> /etc/hostname is simply
>
> master
>
>
> /etc/hosts is below:
>
> 127.0.0.1 localhost
>
> # The following lines are desirable for IPv6 capable hosts
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> ff02::3 ip6-allhosts
> 10.210.135.47 master
> 10.66.83.219 node001
> 10.193.155.175 node002
> 10.206.70.15 node003
>
>
>
>
>
>
> _______________________________________________
> StarCluster mailing list
> StarCluster at mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster