<div dir="ltr"><div>Thanks for getting back to me. I added the email below as a comment on the GitHub issue: <a href="https://github.com/jtriley/StarCluster/issues/348">https://github.com/jtriley/StarCluster/issues/348</a></div>
<div>----------------------------------</div><div>I think I figured out the problem, and I have a workaround that lets StarCluster create a cluster in the VPC. The catch is that you can't change the cluster's attributes after creation; the only operation left is a forced termination.</div>
<div><br></div><div>StarCluster works in EC2-Classic because tag values there aren't limited to a certain size, but in EC2-VPC tag values are limited to 255 characters. I think this was discussed before: follow <a href="https://github.com/jtriley/StarCluster/issues/21">https://github.com/jtriley/StarCluster/issues/21</a> and search for "3) StarCluster + VPC limitation:" to read more about this issue.</div>
<div><br></div><div>Basically, the @sc-core tag's value can be longer than 255 characters, and when it is, StarCluster fails to create a cluster in the VPC. That is why I could create a cluster in the VPC with my one AMI (its @sc-core tag value was shorter than 255 characters) but not with any of the other AMIs.</div>
<div><br></div><div>Temporary Solution:</div><div>1) Locate the .../starcluster/cluster.py file and open it in your favorite editor.</div><div>2) Go to line 681. This should be the first line of the method "_add_tags_to_sg" and should read "def _add_tags_to_sg(self, sg):".</div>
<div>3) 'static.CORE_TAG' holds the tag name '@sc-core', and the tag's value is 'core_settings'. The failure occurs when this tag is added on line 698, 'sg.add_tag(static.CORE_TAG, core_settings)': if 'core_settings' is longer than 255 characters, this is where the program fails. Add the following 3 lines of code (marked with (+)) between line 697, 'if not static.CORE_TAG in sg.tags:', and line 698, 'sg.add_tag(static.CORE_TAG, core_settings)'.</div>
<div>-------------------------------- Code Change -------------------------------------------------------------------------------</div><div>697 if not static.CORE_TAG in sg.tags:</div><div>698(+) &nbsp;&nbsp;&nbsp;&nbsp;if(len(core_settings) > 255):</div>
<div>699(+) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print "\nWarning: For ", static.CORE_TAG, " truncating core_settings from ", len(core_settings), " to length 255."</div><div>700(+) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;core_settings=core_settings[:255]</div>
<div>701 &nbsp;&nbsp;&nbsp;&nbsp;sg.add_tag(static.CORE_TAG, core_settings)</div><div>-------------------------------- Code Change -------------------------------------------------------------------------------</div><div><br></div>
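<div>The same guard can also be written as a small standalone helper, which is easier to try out away from cluster.py. This is only a sketch: the name clamp_tag_value and the constant MAX_TAG_VALUE_LEN are my own and are not part of StarCluster, and the 255 comes from the EC2 error message.</div>
<pre>
```python
import warnings

# Limit taken from the EC2 error message; the constant name is hypothetical.
MAX_TAG_VALUE_LEN = 255


def clamp_tag_value(name, value, limit=MAX_TAG_VALUE_LEN):
    """Return value cut to at most `limit` characters, warning if it was cut."""
    if len(value) > limit:
        warnings.warn(
            "For %s truncating value from %d to length %d."
            % (name, len(value), limit)
        )
        return value[:limit]
    return value
```
</pre>
<div>Inside _add_tags_to_sg it would presumably be used as sg.add_tag(static.CORE_TAG, clamp_tag_value(static.CORE_TAG, core_settings)). The same caveat about truncated values applies.</div>
<div><br></div>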
<div>WARNING - this code change has implications. IF the core_settings value was truncated (the warning appears during creation), you will not be able to change the cluster afterwards; the only option is a forced termination.</div>
<div><br></div><div>It seems that the @sc-core tag holds the cluster's settings, serialized and then compressed, and the tag value is that compressed string. Truncating this value lets you create a cluster, BUT you can't change its attributes once it is created, because the software needs the non-truncated value to link back to the cluster (I think?).</div>
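<div><br></div><div>To illustrate why a truncated value can't be read back, here is a minimal sketch. It assumes the settings are serialized as JSON, compressed with zlib, and base64-encoded; I have not confirmed that this is the exact encoding StarCluster uses. The function names and the sample settings are made up for the demonstration.</div>
<pre>
```python
import base64
import hashlib
import json
import zlib


def encode_settings(settings):
    """Serialize to JSON, compress with zlib, and base64-encode to text."""
    raw = json.dumps(settings, sort_keys=True).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_settings(encoded):
    """Reverse encode_settings; raises if the text is incomplete."""
    raw = zlib.decompress(base64.b64decode(encoded))
    return json.loads(raw.decode("utf-8"))


# Made-up settings, padded with hash strings so the encoded form is long.
settings = {
    "option_%d" % i: hashlib.sha256(str(i).encode("ascii")).hexdigest()
    for i in range(20)
}

encoded = encode_settings(settings)
assert len(encoded) > 255
assert decode_settings(encoded) == settings  # the full value round-trips

try:
    decode_settings(encoded[:255])  # what a truncated tag would hold
    truncated_ok = True
except (ValueError, zlib.error):
    # Either the base64 text or the zlib stream is incomplete.
    truncated_ok = False
```
</pre>
<div>Under these assumptions the truncated string cannot be decoded at all, which would fit the behaviour I see when trying to modify a cluster that was created with a truncated tag.</div>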
<div><br></div><div>Once up and running, the cluster works fine, but for StarCluster to work consistently with EC2-VPC the code would ultimately need to be changed so that the @sc-core value never exceeds 255 characters. Eliminating some of the options in the core tag, using a better compression algorithm, or ?? might be an easy fix that avoids a big rewrite of the code. My temporary solution works for a 'static' cluster in the VPC.</div>
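<div><br></div><div>One possible shape for the "??" option is to spread the encoded value over several numbered tags, each within the 255-character limit, and join them again when reading. This is only a sketch of the idea: tag names such as '@sc-core-0' and both function names are hypothetical and not something StarCluster defines, and EC2 also caps the number of tags per resource, so this only helps up to a point.</div>
<pre>
```python
# Limit taken from the EC2 error message; the constant name is hypothetical.
MAX_TAG_VALUE_LEN = 255


def split_into_tags(base_name, value, limit=MAX_TAG_VALUE_LEN):
    """Split value into numbered tags, each at most `limit` characters."""
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)]
    return {"%s-%d" % (base_name, n): chunk for n, chunk in enumerate(chunks)}


def join_from_tags(base_name, tags):
    """Reassemble the value from the numbered tags written by split_into_tags."""
    parts = []
    n = 0
    while "%s-%d" % (base_name, n) in tags:
        parts.append(tags["%s-%d" % (base_name, n)])
        n += 1
    return "".join(parts)
```
</pre>
<div>Each key/value pair returned by split_into_tags would then be written with its own add_tag call, and join_from_tags would be applied to the security group's tags before decoding.</div>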
<div><br></div><div>Thanks,</div><div>-Jennifer</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Dec 12, 2013 at 12:09 PM, Justin Riley <span dir="ltr"><<a href="mailto:jtriley@mit.edu" target="_blank">jtriley@mit.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Jennifer,<br>
<br>
Sorry you're having issues and thanks for reporting. I've created an<br>
issue on github to track this:<br>
<br>
<a href="https://github.com/jtriley/StarCluster/issues/348" target="_blank">https://github.com/jtriley/StarCluster/issues/348</a><br>
<br>
Would you mind commenting on that issue with a copy of your config so<br>
that I can take a look? Please remove all sensitive parts of your config<br>
first.<br>
<br>
Thanks!!<br>
<br>
~Justin<br>
<div><div class="h5"><br>
On Tue, Dec 10, 2013 at 11:23:19AM -0500, Jennifer Staab wrote:<br>
> I have had limited success getting Starcluster to successfully launch a<br>
> cluster with EC2-VPC nodes under the development version (0.9999). Using a<br>
> certain AMI I can easily launch a Starcluster cluster with EC2-VPC nodes,<br>
> but using a different AMI it fails to launch. I do set the config<br>
> variables "VPC_ID" and "SUBNET_ID" and the only difference between the two<br>
> cluster templates is the AMI that is used.<br>
> Both AMIs used successfully launch a Starcluster cluster with EC2-classic<br>
> nodes. The only noted difference between the AMIs is that the one that<br>
> successfully launches a Starcluster cluster with VPC-EC2 nodes is a<br>
> private AMI that is "shared" with the account that I am running my VPC<br>
> within. The AMI that doesn't work with Starcluster-VPC is a<br>
> private AMI "owned" by the account I am running my VPC within. <br>
> I believe the error I am getting has something to do with the Tags,<br>
> specifically the "@sc-core" tag's value being beyond 255 characters, but I<br>
> could be wrong. Below I have included an example of the successful<br>
> launch, the failed launch (including error message), and the listed<br>
> clusters after both commands.<br>
> Any suggestions on how to address this issue would be greatly appreciated.<br>
> Thanks in advance for the help,<br>
> -Jennifer<br>
> -------------------------------------------------------------------------------------------------<br>
> ------ Below is what it looks like when I have a successful launch ---<br>
> -------------------------------------------------------------------------------------------------<br>
> (starcluster)root@xxxxxxxxxxx:~# starcluster start -c testvpcA vpcA<br>
</div></div>> StarCluster - ([1]<a href="http://star.mit.edu/cluster" target="_blank">http://star.mit.edu/cluster</a>) (v. 0.9999)<br>
<div class="im">> Software Tools for Academics and Researchers (STAR)<br>
</div>> Please submit bug reports to [2]<a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
<div><div class="h5">> >>> Validating cluster template settings...<br>
> >>> Cluster template settings are valid<br>
> >>> Starting cluster...<br>
> >>> Launching a 1-node cluster...<br>
> >>> Creating security group @sc-vpcA...<br>
> Reservation:r-2843fa4e<br>
> >>> Waiting for cluster to come up... (updating every 30s)<br>
> >>> Waiting for all nodes to be in a 'running' state...<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Waiting for SSH to come up on all nodes...<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Waiting for cluster to come up took 1.574 mins<br>
> >>> The master node is<br>
> >>> Configuring cluster...<br>
> >>> Running plugin starcluster.clustersetup.DefaultClusterSetup<br>
> >>> Configuring hostnames...<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Creating cluster user: sgeadmin (uid: 1007, gid: 1000)<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Configuring scratch space for user(s): sgeadmin<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Configuring /etc/hosts on each node<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Starting NFS server on master<br>
> >>> Setting up NFS took 0.113 mins<br>
> >>> Configuring passwordless ssh for root<br>
> >>> Configuring passwordless ssh for sgeadmin<br>
> >>> Running plugin starcluster.plugins.sge.SGEPlugin<br>
> >>> Configuring SGE...<br>
> >>> Setting up NFS took 0.000 mins<br>
> >>> Removing previous SGE installation...<br>
> >>> Installing Sun Grid Engine...<br>
> >>> Creating SGE parallel environment 'orte'<br>
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||<br>
> 100%<br>
> >>> Adding parallel environment 'orte' to queue 'all.q'<br>
> >>> Configuring cluster took 0.679 mins<br>
> >>> Starting cluster took 2.307 mins<br>
> The cluster is now ready to use. To login to the master node<br>
> as root, run:<br>
> $ starcluster sshmaster vpcA<br>
> If you're having issues with the cluster you can reboot the<br>
> instances and completely reconfigure the cluster from<br>
> scratch using:<br>
> $ starcluster restart vpcA<br>
> When you're finished using the cluster and wish to terminate<br>
> it and stop paying for service:<br>
> $ starcluster terminate vpcA<br>
> Alternatively, if the cluster uses EBS instances, you can<br>
> use the 'stop' command to shutdown all nodes and put them<br>
> into a 'stopped' state preserving the EBS volumes backing<br>
> the nodes:<br>
> $ starcluster stop vpcA<br>
> WARNING: Any data stored in ephemeral storage (usually /mnt)<br>
> will be lost!<br>
> You can activate a 'stopped' cluster by passing the -x<br>
> option to the 'start' command:<br>
> $ starcluster start -x vpcA<br>
> This will start all 'stopped' nodes and reconfigure the<br>
> cluster.<br>
> -------------------------------------------------------------------------------------------------<br>
> ------ Below is what it looks like when I have a FAILED launch ---<br>
> -------------------------------------------------------------------------------------------------<br>
> (starcluster)root@xxxxxxxxxxx:~# starcluster start -c testvpcB vpcB<br>
</div></div>> StarCluster - ([3]<a href="http://star.mit.edu/cluster" target="_blank">http://star.mit.edu/cluster</a>) (v. 0.9999)<br>
<div class="im">> Software Tools for Academics and Researchers (STAR)<br>
</div>> Please submit bug reports to [4]<a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
<div><div class="h5">> >>> Validating cluster template settings...<br>
> >>> Cluster template settings are valid<br>
> >>> Starting cluster...<br>
> >>> Launching a 1-node cluster...<br>
> >>> Creating security group @sc-vpcB...<br>
> !!! ERROR - InvalidParameterValue: Tag value exceeds the maximum length of<br>
> 255 characters<br>
> Traceback (most recent call last):<br>
> File "/root/.virtualenvs/starcluster/starcluster/starcluster/cli.py",<br>
> line 274, in main<br>
> sc.execute(args)<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/commands/start.py",<br>
> line 220, in execute<br>
> validate_running=validate_running)<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 1537, in start<br>
> return self._start(create=create, create_only=create_only)<br>
> File "<string>", line 2, in _start<br>
> File "/root/.virtualenvs/starcluster/starcluster/starcluster/utils.py",<br>
> line 111, in wrap_f<br>
> res = func(*arg, **kargs)<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 1552, in _start<br>
> self.create_cluster()<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 1066, in create_cluster<br>
> self._create_flat_rate_cluster()<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 1091, in _create_flat_rate_cluster<br>
> force_flat=True)[0]<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 859, in create_nodes<br>
</div></div>> cluster_sg = self.cluster_group.name<br>
<div><div class="h5">> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 657, in cluster_group<br>
> self._add_tags_to_sg(sg)<br>
> File<br>
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line<br>
> 698, in _add_tags_to_sg<br>
> sg.add_tag(static.CORE_TAG, core_settings)<br>
> File<br>
> "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/ec2object.py",<br>
> line 82, in add_tag<br>
> dry_run=dry_run<br>
> File<br>
> "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/connection.py",<br>
> line 4026, in create_tags<br>
> return self.get_status('CreateTags', params, verb='POST')<br>
> File<br>
> "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/connection.py",<br>
> line 1158, in get_status<br>
> raise self.ResponseError(response.status, response.reason, body)<br>
> EC2ResponseError: EC2ResponseError: 400 Bad Request<br>
> <?xml version="1.0" encoding="UTF-8"?><br>
> <Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Tag<br>
> value exceeds the maximum length of 255<br>
> characters</Message></Error></Errors><RequestID>1f589605-8f30-472d-8989-22ea120aea14</RequestID></Response><br>
> -----------------------------------------------------------------------------------------------------------------<br>
> ------ When it FAILS it creates only a security group; see "listclusters"<br>
> below ---<br>
> -----------------------------------------------------------------------------------------------------------------<br>
> (starcluster)root@xxxxxxxxxxx:~# starcluster listclusters<br>
</div></div>> StarCluster - ([6]<a href="http://star.mit.edu/cluster" target="_blank">http://star.mit.edu/cluster</a>) (v. 0.9999)<br>
<div class="im">> Software Tools for Academics and Researchers (STAR)<br>
</div>> Please submit bug reports to [7]<a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
<div class="im">> -------------------------------<br>
> vpcB (security group: @sc-vpcB)<br>
> -------------------------------<br>
> Launch time: N/A<br>
> Uptime: N/A<br>
> Zone: N/A<br>
> Keypair: N/A<br>
> EBS volumes: N/A<br>
> Cluster nodes: N/A<br>
> -------------------------------<br>
> vpcA (security group: @sc-vpcA)<br>
> -------------------------------<br>
> Launch time: 2013-12-10 14:39:36<br>
> Uptime: 0 days, 00:04:23<br>
> Zone: us-east-1b<br>
> Keypair: Starcluster_VPC<br>
> EBS volumes: N/A<br>
> Cluster nodes:<br>
> master running i-1d745b65 10.0.0.138<br>
> Total nodes: 1<br>
> (starcluster)root@xxxxxxxxxxx:~#<br>
><br>
</div>> References<br>
><br>
> Visible links<br>
> 1. <a href="http://star.mit.edu/cluster" target="_blank">http://star.mit.edu/cluster</a><br>
> 2. mailto:<a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
> 3. <a href="http://star.mit.edu/cluster" target="_blank">http://star.mit.edu/cluster</a><br>
> 4. mailto:<a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
> 5. <a href="http://self.cluster_group.name/" target="_blank">http://self.cluster_group.name/</a><br>
> 6. <a href="http://star.mit.edu/cluster" target="_blank">http://star.mit.edu/cluster</a><br>
> 7. mailto:<a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
<br>
<br>
</blockquote></div><br></div>