[StarCluster] StarCluster Development VPC-Starclusters - possible bug relating to "Tag Value exceeds 255 characters"....

Justin Riley jtriley at MIT.EDU
Thu Dec 12 12:09:07 EST 2013


Hi Jennifer,

Sorry you're having issues and thanks for reporting. I've created an
issue on github to track this:

https://github.com/jtriley/StarCluster/issues/348

Would you mind commenting on that issue with a copy of your config so
that I can take a look? Please remove all sensitive parts of your config
first.

Thanks!!

~Justin

On Tue, Dec 10, 2013 at 11:23:19AM -0500, Jennifer Staab wrote:
>    I have had limited success getting Starcluster to successfully launch a
>    cluster with EC2-VPC nodes under the development version (0.9999). Using a
>    certain AMI I can easily launch a Starcluster cluster with EC2-VPC nodes,
>    but using a different AMI it fails to launch.  I do set the config
>    variables "VPC_ID" and "SUBNET_ID" and the only difference between the two
>    cluster templates is the AMI that is used.
>    Both AMIs used successfully launch a Starcluster cluster with EC2-classic
>    nodes.  The only noted difference between the AMIs is that the one that
>    successfully launches a Starcluster cluster with VPC-EC2 nodes is a
>    private AMI that is "shared" with the account that I am running my VPC
>    within.  The AMI that doesn't work with Starcluster-VPC is a private AMI
>    that is "owned" by the account I am running my VPC within.
>    I believe the error I am getting has something to do with the Tags,
>    specifically the "@sc-core" tag's value being beyond 255 characters, but I
>    could be wrong.  Below I have included an example of the successful
>    launch, the failed launch (including error message), and the listed
>    clusters after both commands.
>    Any suggestions on how to address this issue would be greatly appreciated.
>    Thanks in advance for the help,
>    -Jennifer
>    -------------------------------------------------------------------------------------------------
>    ------ Below is what it looks like when I have a successful launch ---
>    -------------------------------------------------------------------------------------------------
>    (starcluster)root at xxxxxxxxxxx:~# starcluster start -c testvpcA vpcA
>    StarCluster - ([1]http://star.mit.edu/cluster) (v. 0.9999)
>    Software Tools for Academics and Researchers (STAR)
>    Please submit bug reports to [2]starcluster at mit.edu
>    >>> Validating cluster template settings...
>    >>> Cluster template settings are valid
>    >>> Starting cluster...
>    >>> Launching a 1-node cluster...
>    >>> Creating security group @sc-vpcA...
>    Reservation:r-2843fa4e
>    >>> Waiting for cluster to come up... (updating every 30s)
>    >>> Waiting for all nodes to be in a 'running' state...
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Waiting for SSH to come up on all nodes...
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Waiting for cluster to come up took 1.574 mins
>    >>> The master node is
>    >>> Configuring cluster...
>    >>> Running plugin starcluster.clustersetup.DefaultClusterSetup
>    >>> Configuring hostnames...
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Creating cluster user: sgeadmin (uid: 1007, gid: 1000)
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Configuring scratch space for user(s): sgeadmin
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Configuring /etc/hosts on each node
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Starting NFS server on master
>    >>> Setting up NFS took 0.113 mins
>    >>> Configuring passwordless ssh for root
>    >>> Configuring passwordless ssh for sgeadmin
>    >>> Running plugin starcluster.plugins.sge.SGEPlugin
>    >>> Configuring SGE...
>    >>> Setting up NFS took 0.000 mins
>    >>> Removing previous SGE installation...
>    >>> Installing Sun Grid Engine...
>    >>> Creating SGE parallel environment 'orte'
>    1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>    100%
>    >>> Adding parallel environment 'orte' to queue 'all.q'
>    >>> Configuring cluster took 0.679 mins
>    >>> Starting cluster took 2.307 mins
>    The cluster is now ready to use. To login to the master node
>    as root, run:
>        $ starcluster sshmaster vpcA
>    If you're having issues with the cluster you can reboot the
>    instances and completely reconfigure the cluster from
>    scratch using:
>        $ starcluster restart vpcA
>    When you're finished using the cluster and wish to terminate
>    it and stop paying for service:
>        $ starcluster terminate vpcA
>    Alternatively, if the cluster uses EBS instances, you can
>    use the 'stop' command to shutdown all nodes and put them
>    into a 'stopped' state preserving the EBS volumes backing
>    the nodes:
>        $ starcluster stop vpcA
>    WARNING: Any data stored in ephemeral storage (usually /mnt)
>    will be lost!
>    You can activate a 'stopped' cluster by passing the -x
>    option to the 'start' command:
>        $ starcluster start -x vpcA
>    This will start all 'stopped' nodes and reconfigure the
>    cluster.
>    -------------------------------------------------------------------------------------------------
>    ------ Below is what it looks like when I have a FAILED launch ---
>    -------------------------------------------------------------------------------------------------
>    (starcluster)root at xxxxxxxxxxx:~# starcluster start -c testvpcB vpcB
>    StarCluster - ([3]http://star.mit.edu/cluster) (v. 0.9999)
>    Software Tools for Academics and Researchers (STAR)
>    Please submit bug reports to [4]starcluster at mit.edu
>    >>> Validating cluster template settings...
>    >>> Cluster template settings are valid
>    >>> Starting cluster...
>    >>> Launching a 1-node cluster...
>    >>> Creating security group @sc-vpcB...
>    !!! ERROR - InvalidParameterValue: Tag value exceeds the maximum length of 255 characters
>    Traceback (most recent call last):
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cli.py", line 274, in main
>        sc.execute(args)
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/commands/start.py", line 220, in execute
>        validate_running=validate_running)
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1537, in start
>        return self._start(create=create, create_only=create_only)
>      File "<string>", line 2, in _start
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/utils.py", line 111, in wrap_f
>        res = func(*arg, **kargs)
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1552, in _start
>        self.create_cluster()
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1066, in create_cluster
>        self._create_flat_rate_cluster()
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 1091, in _create_flat_rate_cluster
>        force_flat=True)[0]
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 859, in create_nodes
>        cluster_sg = self.cluster_group.name
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 657, in cluster_group
>        self._add_tags_to_sg(sg)
>      File "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line 698, in _add_tags_to_sg
>        sg.add_tag(static.CORE_TAG, core_settings)
>      File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/ec2object.py", line 82, in add_tag
>        dry_run=dry_run
>      File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/connection.py", line 4026, in create_tags
>        return self.get_status('CreateTags', params, verb='POST')
>      File "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/connection.py", line 1158, in get_status
>        raise self.ResponseError(response.status, response.reason, body)
>    EC2ResponseError: EC2ResponseError: 400 Bad Request
>    <?xml version="1.0" encoding="UTF-8"?>
>    <Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Tag value exceeds the maximum length of 255 characters</Message></Error></Errors><RequestID>1f589605-8f30-472d-8989-22ea120aea14</RequestID></Response>
>    -----------------------------------------------------------------------------------------------------------------
>    ------ When it FAILS it creates only a security group; see "listclusters" below ---
>    -----------------------------------------------------------------------------------------------------------------
>    (starcluster)root at xxxxxxxxxxx:~# starcluster listclusters
>    StarCluster - ([6]http://star.mit.edu/cluster) (v. 0.9999)
>    Software Tools for Academics and Researchers (STAR)
>    Please submit bug reports to [7]starcluster at mit.edu
>    -------------------------------
>    vpcB (security group: @sc-vpcB)
>    -------------------------------
>    Launch time: N/A
>    Uptime: N/A
>    Zone: N/A
>    Keypair: N/A
>    EBS volumes: N/A
>    Cluster nodes: N/A
>    -------------------------------
>    vpcA (security group: @sc-vpcA)
>    -------------------------------
>    Launch time: 2013-12-10 14:39:36
>    Uptime: 0 days, 00:04:23
>    Zone: us-east-1b
>    Keypair: Starcluster_VPC
>    EBS volumes: N/A
>    Cluster nodes:
>         master running i-1d745b65 10.0.0.138
>    Total nodes: 1
>    (starcluster)root at xxxxxxxxxxx:~#
> 
> References
> 
>    Visible links
>    1. http://star.mit.edu/cluster
>    2. mailto:starcluster at mit.edu
>    3. http://star.mit.edu/cluster
>    4. mailto:starcluster at mit.edu
>    5. http://self.cluster_group.name/
>    6. http://star.mit.edu/cluster
>    7. mailto:starcluster at mit.edu

> _______________________________________________
> StarCluster mailing list
> StarCluster at mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
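
The traceback in the quoted message fails inside sg.add_tag(static.CORE_TAG, core_settings), and the error names a 255-character limit on tag values. A minimal sketch of how one might check which tag values are too long before sending them to EC2 follows. The helper name oversized_tags and the example values are hypothetical; they are not part of StarCluster or boto, and the real "@sc-core" value is built by StarCluster from the cluster template.

```python
# Hypothetical diagnostic helper; not part of StarCluster or boto.
# The 255-character limit is the one quoted in the EC2 error message.
EC2_TAG_VALUE_MAX = 255


def oversized_tags(tags, limit=EC2_TAG_VALUE_MAX):
    """Return {key: length} for every tag whose value is longer than `limit`."""
    return dict(
        (key, len(value)) for key, value in tags.items() if len(value) > limit
    )


# Made-up values for illustration only.
example_tags = {
    "@sc-core": "x" * 300,       # stands in for a long serialized settings string
    "example-tag": "short value",
}

print(oversized_tags(example_tags))
```

If something like this were run against the string that StarCluster is about to store in the "@sc-core" tag, a non-empty result would support the guess that the tag value, rather than the VPC settings themselves, is what EC2 rejects.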
