[StarCluster] StarCluster Development VPC-Starclusters - possible bug relating to "Tag Value exceeds 255 characters"....
Justin Riley
jtriley at MIT.EDU
Thu Dec 12 12:09:07 EST 2013
Hi Jennifer,
Sorry you're having issues and thanks for reporting. I've created an
issue on github to track this:
https://github.com/jtriley/StarCluster/issues/348
Would you mind commenting on that issue with a copy of your config so
that I can take a look? Please remove all sensitive parts of your config
first.
Thanks!!
~Justin
On Tue, Dec 10, 2013 at 11:23:19AM -0500, Jennifer Staab wrote:
> I have had limited success getting StarCluster to launch a cluster with
> EC2-VPC nodes under the development version (0.9999). With one AMI I can
> easily launch a StarCluster cluster with EC2-VPC nodes, but with a
> different AMI the launch fails. I set the config variables "VPC_ID" and
> "SUBNET_ID" in both cases, and the only difference between the two
> cluster templates is the AMI that is used.
> Both AMIs used successfully launch a Starcluster cluster with EC2-classic
> nodes. The only noted difference between the AMIs is that the one that
> successfully launches a Starcluster cluster with VPC-EC2 nodes is a
> private AMI that is "shared" with the account that I am running my VPC
> within. The AMI that doesn't work with Starcluster-VPC is a private AMI
> "owned" by the account I am running my VPC within.
> I believe the error I am getting has something to do with the Tags,
> specifically the "@sc-core" tag's value being beyond 255 characters, but I
> could be wrong. Below I have included an example of the successful
> launch, the failed launch (including error message), and the listed
> clusters after both commands.
> Any suggestions on how to address this issue would be greatly appreciated.
> Thanks in advance for the help,
> -Jennifer
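One way to test the tag-length hypothesis above without launching anything is to measure the serialized tag value before it is sent to EC2. The sketch below is only an illustration and is not StarCluster's code: the 255-character limit is taken from the error message quoted in this thread, `fits_in_tag` and `split_tag_value` are hypothetical helper names, and the sample settings string is made up.

```python
MAX_TAG_VALUE_LEN = 255  # limit quoted in the EC2 error message in this thread


def fits_in_tag(value, limit=MAX_TAG_VALUE_LEN):
    """Return True if the value is short enough for a single tag."""
    return len(value) <= limit


def split_tag_value(value, limit=MAX_TAG_VALUE_LEN):
    """Split a value into consecutive chunks of at most `limit` characters."""
    return [value[i:i + limit] for i in range(0, len(value), limit)]


if __name__ == "__main__":
    # Made-up stand-in for the serialized settings stored in the core tag.
    core_settings = "x" * 600
    print("fits in one tag:", fits_in_tag(core_settings))
    print("chunk lengths:", [len(c) for c in split_tag_value(core_settings)])
```

Splitting a long value across several tags is just one possible workaround; compressing the value or storing fewer settings would be alternatives, and which one suits StarCluster is a separate question.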
> -------------------------------------------------------------------------------------------------
> ------ Below is what it looks like when I have a successful launch ---
> -------------------------------------------------------------------------------------------------
> (starcluster)root at xxxxxxxxxxx:~# starcluster start -c testvpcA vpcA
> StarCluster - ([1]http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to [2]starcluster at mit.edu
> >>> Validating cluster template settings...
> >>> Cluster template settings are valid
> >>> Starting cluster...
> >>> Launching a 1-node cluster...
> >>> Creating security group @sc-vpcA...
> Reservation:r-2843fa4e
> >>> Waiting for cluster to come up... (updating every 30s)
> >>> Waiting for all nodes to be in a 'running' state...
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Waiting for SSH to come up on all nodes...
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Waiting for cluster to come up took 1.574 mins
> >>> The master node is
> >>> Configuring cluster...
> >>> Running plugin starcluster.clustersetup.DefaultClusterSetup
> >>> Configuring hostnames...
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Creating cluster user: sgeadmin (uid: 1007, gid: 1000)
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Configuring scratch space for user(s): sgeadmin
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Configuring /etc/hosts on each node
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Starting NFS server on master
> >>> Setting up NFS took 0.113 mins
> >>> Configuring passwordless ssh for root
> >>> Configuring passwordless ssh for sgeadmin
> >>> Running plugin starcluster.plugins.sge.SGEPlugin
> >>> Configuring SGE...
> >>> Setting up NFS took 0.000 mins
> >>> Removing previous SGE installation...
> >>> Installing Sun Grid Engine...
> >>> Creating SGE parallel environment 'orte'
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Adding parallel environment 'orte' to queue 'all.q'
> >>> Configuring cluster took 0.679 mins
> >>> Starting cluster took 2.307 mins
> The cluster is now ready to use. To login to the master node
> as root, run:
> $ starcluster sshmaster vpcA
> If you're having issues with the cluster you can reboot the
> instances and completely reconfigure the cluster from
> scratch using:
> $ starcluster restart vpcA
> When you're finished using the cluster and wish to terminate
> it and stop paying for service:
> $ starcluster terminate vpcA
> Alternatively, if the cluster uses EBS instances, you can
> use the 'stop' command to shutdown all nodes and put them
> into a 'stopped' state preserving the EBS volumes backing
> the nodes:
> $ starcluster stop vpcA
> WARNING: Any data stored in ephemeral storage (usually /mnt)
> will be lost!
> You can activate a 'stopped' cluster by passing the -x
> option to the 'start' command:
> $ starcluster start -x vpcA
> This will start all 'stopped' nodes and reconfigure the
> cluster.
> -------------------------------------------------------------------------------------------------
> ------ Below is what it looks like when I have a FAILED launch ---
> -------------------------------------------------------------------------------------------------
> (starcluster)root at xxxxxxxxxxx:~# starcluster start -c testvpcB vpcB
> StarCluster - ([3]http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to [4]starcluster at mit.edu
> >>> Validating cluster template settings...
> >>> Cluster template settings are valid
> >>> Starting cluster...
> >>> Launching a 1-node cluster...
> >>> Creating security group @sc-vpcB...
> !!! ERROR - InvalidParameterValue: Tag value exceeds the maximum length of 255 characters
> Traceback (most recent call last):
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cli.py", line 274, in main
>     sc.execute(args)
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/commands/start.py", line 220, in execute
>     validate_running=validate_running)
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1537, in start
>     return self._start(create=create, create_only=create_only)
>   File "<string>", line 2, in _start
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/utils.py", line 111, in wrap_f
>     res = func(*arg, **kargs)
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1552, in _start
>     self.create_cluster()
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1066, in create_cluster
>     self._create_flat_rate_cluster()
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1091, in _create_flat_rate_cluster
>     force_flat=True)[0]
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 859, in create_nodes
>     cluster_sg = self.cluster_group.name
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 657, in cluster_group
>     self._add_tags_to_sg(sg)
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 698, in _add_tags_to_sg
>     sg.add_tag(static.CORE_TAG, core_settings)
>   File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/ec2object.py", line 82, in add_tag
>     dry_run=dry_run
>   File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/connection.py", line 4026, in create_tags
>     return self.get_status('CreateTags', params, verb='POST')
>   File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/connection.py", line 1158, in get_status
>     raise self.ResponseError(response.status, response.reason, body)
> EC2ResponseError: EC2ResponseError: 400 Bad Request
> <?xml version="1.0" encoding="UTF-8"?>
> <Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Tag value exceeds the maximum length of 255 characters</Message></Error></Errors><RequestID>1f589605-8f30-472d-8989-22ea120aea14</RequestID></Response>
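The XML body at the end of the traceback carries the error code, message, and request ID, which is usually what a bug report needs. As a rough sketch using only the Python standard library, the fields can be pulled out as follows; `parse_ec2_error` is a hypothetical helper name, and the body is copied from the output above with the wrapped lines rejoined.

```python
import xml.etree.ElementTree as ET

# Error body copied from the failed launch, with wrapped lines rejoined.
ERROR_BODY = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b"<Response><Errors><Error><Code>InvalidParameterValue</Code>"
    b"<Message>Tag value exceeds the maximum length of 255 characters"
    b"</Message></Error></Errors>"
    b"<RequestID>1f589605-8f30-472d-8989-22ea120aea14</RequestID></Response>"
)


def parse_ec2_error(body):
    """Extract code, message, and request ID from an EC2 error response body."""
    root = ET.fromstring(body)
    error = root.find("./Errors/Error")
    return {
        "code": error.findtext("Code"),
        "message": error.findtext("Message"),
        "request_id": root.findtext("RequestID"),
    }


if __name__ == "__main__":
    for field, value in parse_ec2_error(ERROR_BODY).items():
        print(field, "=", value)
```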
> -----------------------------------------------------------------------------------------------------------------
> ------ When it FAILS it creates only a security group; see "listclusters" below ---
> -----------------------------------------------------------------------------------------------------------------
> (starcluster)root at xxxxxxxxxxx:~# starcluster listclusters
> StarCluster - ([6]http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to [7]starcluster at mit.edu
> -------------------------------
> vpcB (security group: @sc-vpcB)
> -------------------------------
> Launch time: N/A
> Uptime: N/A
> Zone: N/A
> Keypair: N/A
> EBS volumes: N/A
> Cluster nodes: N/A
> -------------------------------
> vpcA (security group: @sc-vpcA)
> -------------------------------
> Launch time: 2013-12-10 14:39:36
> Uptime: 0 days, 00:04:23
> Zone: us-east-1b
> Keypair: Starcluster_VPC
> EBS volumes: N/A
> Cluster nodes:
> master running i-1d745b65 10.0.0.138
> Total nodes: 1
> (starcluster)root at xxxxxxxxxxx:~#
>
> References
>
> Visible links
> 1. http://star.mit.edu/cluster
> 2. mailto:starcluster at mit.edu
> 3. http://star.mit.edu/cluster
> 4. mailto:starcluster at mit.edu
> 5. http://self.cluster_group.name/
> 6. http://star.mit.edu/cluster
> 7. mailto:starcluster at mit.edu
> _______________________________________________
> StarCluster mailing list
> StarCluster at mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster