[StarCluster] SGE issue with hostnames

"Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." laotsao at gmail.com
Tue Jul 26 08:38:19 EDT 2011


Sorry, I forgot to mention:
restart the master and all execd daemons after creating the file.
The host_aliases file should be present on all nodes.
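
A minimal sketch of the fix described in this thread, using the two names from the error message. The scratch directory and the alias_line helper are hypothetical and only there for illustration. Putting "master" first is an assumption: it should be whichever name the cell was installed with. The restart commands are left as comments and assume the sgemaster and sgeexecd startup scripts that SGE 6.x normally places in the common directory.

```shell
#!/bin/sh
# Sketch only. On a real cluster, common_dir would be "$SGE_ROOT/$SGE_CELL/common";
# a scratch directory is used here so nothing live is touched.
common_dir="$(mktemp -d)"

# One line per host: the name SGE should use first, then its alias.
# (alias_line is a hypothetical helper, not part of SGE.)
alias_line() {
    printf '%s %s\n' "$1" "$2"
}

# Names taken from the error message in this thread.
alias_line master domU-12-31-39-09-80-C1.compute-1.internal > "$common_dir/host_aliases"
cat "$common_dir/host_aliases"

# Afterwards, make sure every node sees the same file, then restart the daemons.
# Assuming the usual startup scripts exist in the common directory:
#   "$SGE_ROOT/$SGE_CELL/common/sgemaster" stop; "$SGE_ROOT/$SGE_CELL/common/sgemaster" start   # on the master
#   "$SGE_ROOT/$SGE_CELL/common/sgeexecd" stop;  "$SGE_ROOT/$SGE_CELL/common/sgeexecd" start    # on each exec node
```

If the common directory is already shared across nodes (for example over NFS), a single copy of the file may be enough; otherwise it would need to be copied to each node before the restart.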



On 7/26/2011 8:37 AM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
> In $SGE_ROOT/$SGE_CELL/common,
> create a host_aliases file containing a line like:
> <hostname>.privatenet <hostname>.pubnet
>
>
> On 7/25/2011 11:25 PM, Robert Tomkiewicz wrote:
>> Hi there,
>>
>> I started a 4-node EC2 cluster using 0.92rc2, and ami-a5c42dcc, 
>> standard starcluster 9.04 x64 ami.
>>
>> I ran into the following issue while doing some basic sge setup.  At 
>> first qconf worked fine, then a few minutes later...
>>
>> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qconf -sq all.q
>> error: commlib error: access denied (client IP resolved to host name 
>> "domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to 
>> clients host name "master").
>>
>> after issuing
>>
>>  root@master: ~# hostname master
>>
>> I was able to proceed normally, and launch my sge jobs.  They were 
>> running normally, confirmed by the output of qstat.
>>
>> However, some minutes later, when checking on them with another 
>> qstat, I got the same thing again.
>>
>> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
>> error: commlib error: access denied (client IP resolved to host name 
>> "domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to 
>> clients host name "master").
>>
>> Resetting the hostname did not help this time.
>>
>> root@master: ~ # hostname
>> master
>> root@master: ~ # hostname master
>> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
>> error: commlib error: access denied (client IP resolved to host name 
>> "domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to 
>> clients host name "master").
>>
>> So I tried this
>>
>>  root@master: ~# hostname domU-12-31-39-09-80-C1.compute-1.internal
>>
>> which yielded the same error with the two names swapped:
>>
>> root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
>> error: commlib error: access denied (client IP resolved to host name 
>> "master". This is not identical to clients host name 
>> "domU-12-31-39-09-80-C1.compute-1.internal")
>>
>> Setting the hostname back to master ("hostname master") at this point 
>> yields correct operation for a few minutes.
>>
>>
>> It seems clear the problem has to do with the host having two names, 
>> but where are they set?  Has anyone else had a similar problem?
>>
>> Thank you,
>>
>> Robert Tomkiewicz
>>
>>
>>
>> /etc/hostname is simply
>>
>> master
>>
>>
>> /etc/hosts is below:
>>
>> 127.0.0.1 localhost
>>
>> # The following lines are desirable for IPv6 capable hosts
>> ::1 ip6-localhost ip6-loopback
>> fe00::0 ip6-localnet
>> ff00::0 ip6-mcastprefix
>> ff02::1 ip6-allnodes
>> ff02::2 ip6-allrouters
>> ff02::3 ip6-allhosts
>> 10.210.135.47 master
>> 10.66.83.219 node001
>> 10.193.155.175 node002
>> 10.206.70.15 node003
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster