Hi, <br><br>I'm using Starcluster from the git repo. I think I have
everything configured properly. But when I try to start a 1-node cluster,
the process hangs at the "create user" step:<br><br>>>> Validating cluster settings...<br>>>> Cluster settings are valid<br>>>> Starting cluster...<br>>>> Launching a 1-node cluster...<br>
>>> Launching master node...<br>>>> Master AMI: ami-a19e71c8<br>>>> Creating security group @sc-testcluster...<br>Reservation:r-56c3ca3e<br>>>> Waiting for cluster to start...<br>>>> The master node is <a href="http://ec2-184-73-33-230.compute-1.amazonaws.com">ec2-184-73-33-230.compute-1.amazonaws.com</a><br>
<br>>>> Attaching volume vol-c3d927aa to master node...<br>>>> Setting up the cluster...<br>>>> Mounting EBS volume vol-c3d927aa on /home...<br>>>> Using private key /Users/danielyamins/amazon/id_rsa-gsg-keypair (rsa)<br>
>>> Creating cluster user: gotdata<br><br>... and that's where it hangs.<br><br>I CAN log into the individual nodes -- both as root AND as "gotdata" -- using passwordless ssh. Here's what the /etc/hosts file looks like:<br>
<br>127.0.0.1 localhost.localdomain localhost<br><br># The following lines are desirable for IPv6 capable hosts<br>::1 ip6-localhost ip6-loopback<br>fe00::0 ip6-localnet<br>ff00::0 ip6-mcastprefix<br>ff02::1 ip6-allnodes<br>
ff02::2 ip6-allrouters<br>ff02::3 ip6-allhosts<br><br>Since this is a 1-node cluster, I can't test the passwordless login between nodes. <br><br>I can reproduce this problem with both the 32-bit and 64-bit base starcluster AMIs as well as the AMIs that I created from those.<br>
<br>When I try to create a 2-node cluster, the process hangs a step later:<br><br>>>> Validating cluster settings...<br>>>> Cluster settings are valid<br>>>> Starting cluster...<br>>>> Launching a 2-node cluster...<br>
>>> Launching master node...<br>>>> Master AMI: ami-f129c798<br>>>> Creating security group @sc-testcluster...<br>Reservation:r-e8d9d080<br>>>> Launching worker nodes...<br>>>> Node AMI: ami-f129c798<br>
Reservation:r-ead9d082<br>>>> Waiting for cluster to start... <br>>>> The master node is <a href="http://ec2-184-73-111-239.compute-1.amazonaws.com">ec2-184-73-111-239.compute-1.amazonaws.com</a><br>>>> Attaching volume vol-c3d927aa to master node...<br>
>>> Setting up the cluster...<br>>>> Mounting EBS volume vol-c3d927aa on /home...<br>>>> Using private key /Users/danielyamins/amazon/id_rsa-gsg-keypair (rsa)<br>>>> Creating cluster user: gotdata<br>
>>> Using private key /Users/danielyamins/amazon/id_rsa-gsg-keypair (rsa)<br><br>.... and there it hangs. <br><br>In this case, I can:<br> -- log into the master and worker nodes as root: e.g. "starcluster sshmaster testcluster" and "starcluster sshnode testcluster 1" work fine<br>
-- log into the master as user gotdata, but NOT into the worker node, e.g. "starcluster sshnode -u gotdata testcluster 0" works but "starcluster sshnode -u gotdata testcluster 1" DOESN'T.<br>
<br><br>Thanks!<br>Dan<br>