<div dir="ltr">I have had limited success getting StarCluster to launch a cluster with EC2-VPC nodes under the development version (0.9999). With one AMI I can easily launch a StarCluster cluster with EC2-VPC nodes, but with a different AMI the launch fails. I set the config variables "VPC_ID" and "SUBNET_ID" in both cases, and the only difference between the two cluster templates is the AMI that is used.<div>
<br></div><div>Both AMIs successfully launch a StarCluster cluster with EC2-Classic nodes. The only difference I have noticed between the AMIs is that the one that successfully launches a StarCluster cluster with EC2-VPC nodes is a private AMI that is "shared" with the account I am running my VPC within. The AMI that doesn't work with StarCluster in a VPC is a private AMI "owned" by the account I am running my VPC within. </div>
<div><br></div><div>I believe the error has something to do with the tags, specifically the value of the "@sc-core" tag exceeding 255 characters, but I could be wrong. Below I have included an example of the successful launch, the failed launch (including the error message), and the output of "listclusters" after both commands.</div>
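<div><br></div><div>As a quick sanity check on that hypothesis, here is a minimal Python sketch that flags tag values longer than the 255-character limit mentioned in the EC2 error message. The helper name "oversized_tags" and the sample tag values are hypothetical, made up for illustration; it does not reproduce how StarCluster actually builds the "@sc-core" value.</div>

```python
# Limit quoted in the EC2 error message ("Tag value exceeds the maximum
# length of 255 characters").
MAX_TAG_VALUE_LEN = 255


def oversized_tags(tags, limit=MAX_TAG_VALUE_LEN):
    """Return {key: length} for every tag whose value is longer than limit.

    `tags` is a plain dict mapping tag names to string values.
    This is a hypothetical helper, not part of StarCluster or boto.
    """
    return {key: len(value) for key, value in tags.items()
            if len(value) > limit}


# Hypothetical tag values, for illustration only.
example_tags = {
    "@sc-core": "x" * 300,         # stand-in for a too-long settings string
    "example-tag": "short value",  # well within the limit
}

print(oversized_tags(example_tags))
```

<div>If the real "@sc-core" value shows up in a check like this, that would support the idea that the tag length, rather than the VPC settings themselves, is what triggers the failure.</div>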
<div><br></div><div>Any suggestions on how to address this issue would be greatly appreciated.</div><div><br></div><div>Thanks in advance for the help,</div><div><br></div><div>-Jennifer</div><div><br></div><div>-------------------------------------------------------------------------------------------------</div>
<div>------ Below is what it looks like when I have a successful launch ---</div><div>-------------------------------------------------------------------------------------------------</div><div><div>(starcluster)root@xxxxxxxxxxx:~# starcluster start -c testvpcA vpcA</div>
<div>StarCluster - (<a href="http://star.mit.edu/cluster">http://star.mit.edu/cluster</a>) (v. 0.9999)</div><div>Software Tools for Academics and Researchers (STAR)</div><div>Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div>
<div><br></div><div>>>> Validating cluster template settings...</div><div>>>> Cluster template settings are valid</div><div>>>> Starting cluster...</div><div>>>> Launching a 1-node cluster...</div>
<div>>>> Creating security group @sc-vpcA...</div><div>Reservation:r-2843fa4e</div><div>>>> Waiting for cluster to come up... (updating every 30s)</div><div>>>> Waiting for all nodes to be in a 'running' state...</div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div><div>>>> Waiting for SSH to come up on all nodes...</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Waiting for cluster to come up took 1.574 mins</div><div>>>> The master node is</div><div>>>> Configuring cluster...</div><div>>>> Running plugin starcluster.clustersetup.DefaultClusterSetup</div>
<div>>>> Configuring hostnames...</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div><div>>>> Creating cluster user: sgeadmin (uid: 1007, gid: 1000)</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Configuring scratch space for user(s): sgeadmin</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div><div>>>> Configuring /etc/hosts on each node</div><div>
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div><div>>>> Starting NFS server on master</div><div>>>> Setting up NFS took 0.113 mins</div><div>>>> Configuring passwordless ssh for root</div>
<div>>>> Configuring passwordless ssh for sgeadmin</div><div>>>> Running plugin starcluster.plugins.sge.SGEPlugin</div><div>>>> Configuring SGE...</div><div>>>> Setting up NFS took 0.000 mins</div>
<div>>>> Removing previous SGE installation...</div><div>>>> Installing Sun Grid Engine...</div><div>>>> Creating SGE parallel environment 'orte'</div><div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Adding parallel environment 'orte' to queue 'all.q'</div><div>>>> Configuring cluster took 0.679 mins</div><div>>>> Starting cluster took 2.307 mins</div><div><br></div>
<div>The cluster is now ready to use. To login to the master node</div><div>as root, run:</div><div><br></div><div> $ starcluster sshmaster vpcA</div><div><br></div><div>If you're having issues with the cluster you can reboot the</div>
<div>instances and completely reconfigure the cluster from</div><div>scratch using:</div><div><br></div><div> $ starcluster restart vpcA</div><div><br></div><div>When you're finished using the cluster and wish to terminate</div>
<div>it and stop paying for service:</div><div><br></div><div> $ starcluster terminate vpcA</div><div><br></div><div>Alternatively, if the cluster uses EBS instances, you can</div><div>use the 'stop' command to shutdown all nodes and put them</div>
<div>into a 'stopped' state preserving the EBS volumes backing</div><div>the nodes:</div><div><br></div><div> $ starcluster stop vpcA</div><div><br></div><div>WARNING: Any data stored in ephemeral storage (usually /mnt)</div>
<div>will be lost!</div><div><br></div><div>You can activate a 'stopped' cluster by passing the -x</div><div>option to the 'start' command:</div><div><br></div><div> $ starcluster start -x vpcA</div><div>
<br></div><div>This will start all 'stopped' nodes and reconfigure the</div><div>cluster.</div></div><div><div>-------------------------------------------------------------------------------------------------</div>
<div>------ Below is what it looks like when I have a FAILED launch ---</div><div>-------------------------------------------------------------------------------------------------</div></div><div><div>(starcluster)root@xxxxxxxxxxx:~# starcluster start -c testvpcB vpcB</div>
<div>StarCluster - (<a href="http://star.mit.edu/cluster">http://star.mit.edu/cluster</a>) (v. 0.9999)</div><div>Software Tools for Academics and Researchers (STAR)</div><div>Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div>
<div><br></div><div>>>> Validating cluster template settings...</div><div>>>> Cluster template settings are valid</div><div>>>> Starting cluster...</div><div>>>> Launching a 1-node cluster...</div>
<div>>>> Creating security group @sc-vpcB...</div><div>!!! ERROR - InvalidParameterValue: Tag value exceeds the maximum length of 255 characters</div><div>Traceback (most recent call last):</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cli.py", line 274, in main</div>
<div> sc.execute(args)</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/commands/start.py", line 220, in execute</div><div> validate_running=validate_running)</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1537, in start</div>
<div> return self._start(create=create, create_only=create_only)</div><div> File "<string>", line 2, in _start</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/utils.py", line 111, in wrap_f</div>
<div> res = func(*arg, **kargs)</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1552, in _start</div><div> self.create_cluster()</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1066, in create_cluster</div>
<div> self._create_flat_rate_cluster()</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1091, in _create_flat_rate_cluster</div><div> force_flat=True)[0]</div><div>
File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 859, in create_nodes</div><div> cluster_sg = self.cluster_group.name</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 657, in cluster_group</div>
<div> self._add_tags_to_sg(sg)</div><div> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 698, in _add_tags_to_sg</div><div> sg.add_tag(static.CORE_TAG, core_settings)</div>
<div> File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/ec2object.py", line 82, in add_tag</div><div> dry_run=dry_run</div><div> File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/connection.py", line 4026, in create_tags</div>
<div> return self.get_status('CreateTags', params, verb='POST')</div><div> File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/connection.py", line 1158, in get_status</div>
<div> raise self.ResponseError(response.status, response.reason, body)</div><div>EC2ResponseError: EC2ResponseError: 400 Bad Request</div><div><?xml version="1.0" encoding="UTF-8"?></div><div>
<Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Tag value exceeds the maximum length of 255 characters</Message></Error></Errors><RequestID>1f589605-8f30-472d-8989-22ea120aea14</RequestID></Response></div>
</div><div><br></div><div><div>-----------------------------------------------------------------------------------------------------------------</div><div>------ When it FAILS it creates only a security group; see "listclusters" below ---</div>
<div>-----------------------------------------------------------------------------------------------------------------</div></div><div><div>(starcluster)root@xxxxxxxxxxx:~# starcluster listclusters</div><div>StarCluster - (<a href="http://star.mit.edu/cluster">http://star.mit.edu/cluster</a>) (v. 0.9999)</div>
<div>Software Tools for Academics and Researchers (STAR)</div><div>Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div><div><br></div><div>-------------------------------</div><div>
vpcB (security group: @sc-vpcB)</div><div>-------------------------------</div><div>Launch time: N/A</div><div>Uptime: N/A</div><div>Zone: N/A</div><div>Keypair: N/A</div><div>EBS volumes: N/A</div><div>Cluster nodes: N/A</div>
<div><br></div><div>-------------------------------</div><div>vpcA (security group: @sc-vpcA)</div><div>-------------------------------</div><div>Launch time: 2013-12-10 14:39:36</div><div>Uptime: 0 days, 00:04:23</div><div>
Zone: us-east-1b</div><div>Keypair: Starcluster_VPC</div><div>EBS volumes: N/A</div><div>Cluster nodes:</div><div> master running i-1d745b65 10.0.0.138</div><div>Total nodes: 1</div><div><br></div><div>(starcluster)root@xxxxxxxxxxx:~#</div>
</div><div><br></div></div>