<div dir="ltr">We are trying to run the loadbalancing when launching a cluster of QIIME AMI's (a software for analysis of next-gen sequencing data) and are running into some errors.<div><br></div><div style>The loadbalancing works well when running the StarCluster AMI but we have not been able to get QIIME (easily) installed on that AMI.</div>
<div style><br></div><div style>Below is the error. Any info would be great.</div><div style><br></div><div style>Thanks.</div><div style><br></div><div style><font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size:10.5pt"><b><br>
StarCluster Configuration

####################################
## StarCluster Configuration File ##
####################################
[global]
DEFAULT_TEMPLATE=qiime
ENABLE_EXPERIMENTAL=True

###########################
## Defining Cluster      ##
###########################

[cluster QIIMETest]
# change this to the name of one of the keypair sections defined above
KEYNAME = StarCluster
# number of ec2 instances to launch
CLUSTER_SIZE = 2
# AMI to use for all cluster nodes
NODE_IMAGE_ID = ami-d5cc8fbc #FDA QIIME 11.10 image

# instance type for all cluster nodes
# (options: m1.medium, m3.2xlarge, cc2.8xlarge, m1.large, c1.xlarge, hs1.8xlarge, cr1.8xlarge, m1.small, c1.medium, cg1.4xlarge, m1.xlarge, m2.xlarge, hi1.4xlarge, t1.micro, m2.4xlarge, m2.2xlarge, m3.xlarge, cc1.4xlarge)
NODE_INSTANCE_TYPE = m2.2xlarge
VOLUMES = CFSANdata3
CLUSTER_SHELL = bash
CLUSTER_USER = ubuntu

#############################
## Configuring EBS Volumes ##
#############################

[volume CFSANdata3]
# attach the volume to /home/ubuntu/CFSANdata on the master node and NFS-share it to the worker nodes
VOLUME_ID = vol-xxxxxx
MOUNT_PATH = /home/ubuntu/CFSANdata

[plugin ipcluster]
SETUP_CLASS = starcluster.plugins.ipcluster.IPCluster

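For reference, once the cluster is up we sanity-check that the [volume CFSANdata3] section did what we expect, i.e. that the EBS volume is mounted at /home/ubuntu/CFSANdata on the master and exported over NFS to the worker. This is only a rough sketch using standard Linux tools (df, exportfs), nothing StarCluster-specific, and the root@master prompt is just illustrative:

$ starcluster sshmaster STAR-ELASTIC
# is the EBS volume mounted where MOUNT_PATH says it should be?
root@master:~# df -h /home/ubuntu/CFSANdata
# is that path (along with /home) listed among the NFS exports shared to the workers?
root@master:~# exportfs -v
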
Creating cluster

ubuntu@ip-10-181-159-232:~$ starcluster start -c QIIMETest STAR-ELASTIC
StarCluster - (http://star.mit.edu/cluster) (v. 0.94)
Software Tools for Academics and Researchers (STAR)

>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 2-node cluster...
>>> Creating security group @sc-STAR-ELASTIC...
Reservation:r-f8c93594
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 1.148 mins
>>> The master node is ec2-50-19-65-196.compute-1.amazonaws.com
>>> Configuring cluster...
>>> Attaching volume vol-12183458 to master node on /dev/sdz ...
>>> Waiting for vol-12183458 to transition to: attached...
>>> Running plugin starcluster.clustersetup.DefaultClusterSetup
>>> Configuring hostnames...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Mounting EBS volume vol-12183458 on /home/ubuntu/CFSANdata...
>>> Creating cluster user: ubuntu (uid: 1000, gid: 1000)
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring scratch space for user(s): ubuntu
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring /etc/hosts on each node
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Starting NFS server on master
>>> Configuring NFS exports path(s):
/home /home/ubuntu/CFSANdata
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.087 mins
>>> Configuring passwordless ssh for root
>>> Configuring passwordless ssh for ubuntu
>>> Running plugin starcluster.plugins.sge.SGEPlugin
>>> Configuring SGE...
>>> Configuring NFS exports path(s):
/opt/sge6
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.018 mins
>>> Installing Sun Grid Engine...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Creating SGE parallel environment 'orte'
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Adding parallel environment 'orte' to queue 'all.q'
>>> Configuring cluster took 0.867 mins
>>> Starting cluster took 2.060 mins

The cluster is now ready to use. To login to the master node
as root, run:

$ starcluster sshmaster STAR-ELASTIC

If you're having issues with the cluster you can reboot the
instances and completely reconfigure the cluster from
scratch using:

$ starcluster restart STAR-ELASTIC

When you're finished using the cluster and wish to terminate
it and stop paying for service:

$ starcluster terminate STAR-ELASTIC

Alternatively, if the cluster uses EBS instances, you can
use the 'stop' command to shutdown all nodes and put them
into a 'stopped' state preserving the EBS volumes backing
the nodes:

$ starcluster stop STAR-ELASTIC

WARNING: Any data stored in ephemeral storage (usually /mnt)
will be lost!

You can activate a 'stopped' cluster by passing the -x
option to the 'start' command:

$ starcluster start -x STAR-ELASTIC

This will start all 'stopped' nodes and reconfigure the
cluster.

Running the load balancer

ubuntu@ip-$ starcluster loadbalance -m 80 -a 2 -n 2 -d -w 60 STAR-ELASTIC
StarCluster - (http://star.mit.edu/cluster) (v. 0.94)
Software Tools for Academics and Researchers (STAR)

>>> Starting load balancer (Use ctrl-c to exit)
Maximum cluster size: 80
Minimum cluster size: 2
Cluster growth rate: 2 nodes/iteration

>>> Writing stats to file: /home/ubuntu/.starcluster/sge/STAR-ELASTIC/sge-stats.csv
>>> Loading full job history
*** WARNING - Failed to retrieve stats (1/5):
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/balancers/sge/__init__.py", line 536, in get_stats
    return self._get_stats()
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/balancers/sge/__init__.py", line 507, in _get_stats
    qstatxml = '\n'.join(master.ssh.execute(qstat_cmd))
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/sshutils/__init__.py", line 555, in execute
    msg, command, exit_status, out_str)
RemoteCommandFailed: remote command 'source /etc/profile && qstat -u \* -xml -f -r' failed with status 2:
qstat: invalid option -- 'm'
qstat: conflicting options.
usage:
qstat [-f [-1]] [-W site_specific] [-x] [ job_identifier... | destination... ]
qstat [-a|-i|-r|-e] [-u user] [-n [-1]] [-s] [-G|-M] [-R] [job_id... | destination...]
qstat -Q [-f [-1]] [-W site_specific] [ destination... ]
qstat -q [-G|-M] [ destination... ]
qstat -B [-f [-1]] [-W site_specific] [ server_name... ]
*** WARNING - Retrying in 60s
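
In case it is relevant: the usage text in that error does not look to us like Grid Engine's qstat, so the next thing we planned to check is which qstat binary a login shell on the master actually picks up (the load balancer runs 'source /etc/profile && qstat ...'). A rough sketch of that check is below; the /opt/sge6 path is taken from the NFS export listing above, and the root@master prompt is only illustrative:

$ starcluster sshmaster STAR-ELASTIC
# which qstat is first on PATH for a login shell?
root@master:~# which qstat
# does its help/usage output identify it as Grid Engine's qstat or some other scheduler's?
root@master:~# qstat -help 2>&1 | head -n 3
# SGE's own binaries should live under the exported install directory
root@master:~# ls /opt/sge6/bin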