[StarCluster] Failure of addnode

Lyn Gerner schedulerqueen at gmail.com
Thu Nov 21 16:50:49 EST 2013


Hi Folks,

I am using 0.94.2 and experimenting with scaling.  I started a cluster with
two nodes, using the default names master and node001.  I added another
node, node002, and then ran removenode on node001.  Next, I attempted to
add a node of the c3.8xlarge type (supported by Rayson's mod; thanks,
Rayson), using the alias c3.8xlarge (-a c3.8xlarge).  Everything went fine
until it attempted to install OGS.  At that point, it tried to reference
the node being added as node001 instead of by its alias:

Gerner:.starcluster mary$ sc addnode e1d -a c3.8xlarge -I c3.8xlarge
StarCluster - (http://star.mit.edu/cluster) (v. 0.94.2)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster at mit.edu

>>> Launching node(s): c3.8xlarge
Reservation:r-7ad16218
>>> Waiting for instances to propagate...
>>> Waiting for node(s) to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Waiting for SSH to come up on all nodes...
3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Waiting for cluster to come up took 2.089 mins
>>> Running plugin starcluster.clustersetup.DefaultClusterSetup
>>> Configuring hostnames...
1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Configuring /etc/hosts on each node
3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Configuring NFS exports path(s):
/home /usr/share/jobs/
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Setting up NFS took 0.166 mins
1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Configuring scratch space for user(s): sgeadmin
1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Configuring passwordless ssh for root
>>> Configuring passwordless ssh for sgeadmin
>>> Running plugin starcluster.plugins.sge.SGEPlugin
>>> Adding c3.8xlarge to SGE
>>> Configuring NFS exports path(s):
/opt/sge6
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
100%
>>> Setting up NFS took 0.128 mins
!!! ERROR - Error occured while running plugin
'starcluster.plugins.sge.SGEPlugin':
!!! ERROR - remote command 'source /etc/profile && cd /opt/sge6 &&
!!! ERROR - TERM=rxvt ./inst_sge -x -noremote -auto ./ec2_sge.conf'
!!! ERROR - failed with status 1:
!!! ERROR - Reading configuration from file ./ec2_sge.conf
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
Gerner:.starcluster mary$ sc sm e1d

I don't plan on using this approach routinely, but thought you'd want to
know about the error.
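
For anyone who wants to confirm the symptom from a node, here is a minimal
sketch (Python 3, standard library only; it is not StarCluster code) that
reports which host names fail to resolve. The function name unresolved_hosts
and the host list are my own assumptions for illustration; the list simply
contains the names mentioned in this report.

```python
import socket


def unresolved_hosts(names, resolver=socket.gethostbyname):
    """Return the names that the resolver cannot look up, in input order."""
    missing = []
    for name in names:
        try:
            resolver(name)
        except OSError:
            # socket.gaierror and socket.herror are OSError subclasses.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical host list: the names that appear in this report.
    hosts = ["master", "node002", "c3.8xlarge", "node001"]
    for name in unresolved_hosts(hosts):
        print("cannot resolve: %s" % name)
```

The resolver is passed in as a parameter so the check can be exercised
without touching real DNS or /etc/hosts.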

Thanks,
Lyn