[StarCluster] Issues with using an EBS volume
Jacob Barhak
jacob.barhak at gmail.com
Sun Jul 28 03:58:33 EDT 2013
Hello,
Perhaps someone in the group can help out with using an EBS volume.
I created an EBS volume and want to launch a cluster that uses it. I am
doing this to work around the disk space limitation I ran into earlier,
which is reported on this list at:
http://star.mit.edu/cluster/mlarchives/1795.html
However, I get the following error while starting the cluster:
>>> Setting up the cluster...
>>> Attaching volume vol-f0ae61cb to master node on /dev/sdz ...
>>> Configuring hostnames...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
!!! ERROR - volume has more than one partition, please specify which partition to use (e.g. partition=0, partition=1, etc.) in the volume's config
The full transcript is attached.
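Since the transcript is long, a short filter can pull out just the failures. The following is only a minimal sketch: it assumes every failure is reported on a line starting with `!!! ERROR`, as in this run. `error_lines` and `SAMPLE_LOG` are names made up for illustration, and the sample lines are copied from the transcript with the wrapped error rejoined and the trailing progress-bar fragment trimmed.

```python
ERROR_PREFIX = "!!! ERROR"

# A few lines copied from the transcript of this run (lightly trimmed).
SAMPLE_LOG = """\
>>> Attaching volume vol-f0ae61cb to master node on /dev/sdz ...
>>> Configuring hostnames...
!!! ERROR - volume has more than one partition, please specify which partition to use (e.g. partition=0, partition=1, etc.) in the volume's config
>>> Starting NFS server on master
!!! ERROR - command 'mount /mydata' failed with status 32
>>> Configuring passwordless ssh for root
"""


def error_lines(log_text):
    """Return (line_number, message) pairs for each error line in the log."""
    found = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if line.startswith(ERROR_PREFIX):
            # Drop the prefix and the " - " separator that follows it.
            message = line[len(ERROR_PREFIX):].lstrip(" -")
            found.append((number, message))
    return found


for number, message in error_lines(SAMPLE_LOG):
    print(number, message)
```

On this run the filter reports two errors: the partition error, and a later failure of 'mount /mydata' on the worker node.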
I have the following lines in my configuration file:
VOLUMES = mydata
...
[volume mydata]
VOLUME_ID = vol-f0ae61cb
MOUNT_PATH = /mydata
I tried adding PARTITION = 0, and then PARTITION = 1, to the volume definition
in the configuration file, but neither made the error go away.
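To rule out a misplaced setting, the volume definition can be checked mechanically. The sketch below rests on several assumptions: that the config file is plain INI that Python's `configparser` can read, that `VOLUMES` sits in the `[cluster smallcluster]` template (the template name comes from the transcript), and that `PARTITION = 1` is merely one of the two values tried, not a known fix. `check_volume` and `SAMPLE_CONFIG` are hypothetical names and are not part of StarCluster.

```python
import configparser

# Hypothetical reconstruction of the relevant config lines only; the
# rest of the real file is omitted. PARTITION = 1 is one of the values
# that was tried, not a confirmed solution.
SAMPLE_CONFIG = """
[cluster smallcluster]
VOLUMES = mydata

[volume mydata]
VOLUME_ID = vol-f0ae61cb
MOUNT_PATH = /mydata
PARTITION = 1
"""


def check_volume(config_text, template, volume):
    """Return a list of problems found for one volume definition."""
    # Disable interpolation so '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    problems = []

    cluster_section = "cluster " + template
    volume_section = "volume " + volume

    if not parser.has_section(volume_section):
        return ["missing section [%s]" % volume_section]

    # VOLUMES is treated here as a comma-separated list of volume names.
    listed = parser.get(cluster_section, "VOLUMES", fallback="")
    names = [name.strip() for name in listed.split(",") if name.strip()]
    if volume not in names:
        problems.append("%s is not listed in VOLUMES" % volume)

    partition = parser.get(volume_section, "PARTITION", fallback=None)
    if partition is None:
        problems.append("no PARTITION set in [%s]" % volume_section)
    elif not partition.strip().isdigit():
        problems.append("PARTITION is not an integer: %r" % partition)

    return problems


print(check_volume(SAMPLE_CONFIG, "smallcluster", "mydata"))
```

An empty list means the setting is at least in the right section and well-formed. It says nothing about whether StarCluster honors the setting, which is the open question here.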
I also tried creating the volume with "starcluster createvolume", but I hit
the same error regardless of how the 20 GB volume was created.
I am using ami-a4d64194 for my node images. My configuration and plugin are
derived from the files in https://github.com/ContinuumIO/anaconda-ec2
I am operating in us-west-2 and running StarCluster 0.93.3 on Windows 7.
If anyone has a quick solution or a diagnostic test to suggest, I would
appreciate the feedback.
Jacob
-------------- next part --------------
starcluster start -s 2 mycluster
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster at mit.edu
>>> Using default cluster template: smallcluster
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 2-node cluster...
>>> Creating security group @sc-mycluster...
>>> Opening tcp port range 8989-8989 for CIDR 0.0.0.0/0
Reservation:r-34fe9f03
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 4.245 mins
>>> The master node is ec2-54-212-118-205.us-west-2.compute.amazonaws.com
>>> Setting up the cluster...
>>> Attaching volume vol-f0ae61cb to master node on /dev/sdz ...
>>> Configuring hostnames...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
!!! ERROR - volume has more than one partition, please specify which partition to use (e.g. partition=0, partition=1, etc.) in the volume's config
>>> Creating cluster user: None (uid: 1003, gid: 1003)
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring scratch space for user(s):
>>> a_user_not_named_disco
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring /etc/hosts on each node
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Starting NFS server on master
>>> Configuring NFS exports path(s):
/home /mydata
>>> Mounting all NFS export path(s) on 1 worker node(s)
!!! ERROR - command 'mount /mydata' failed with status 32 | 0%
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.148 mins
>>> Configuring passwordless ssh for root
>>> Configuring passwordless ssh for a_user_not_named_disco
>>> Shutting down threads...
20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring SGE...
>>> Configuring NFS exports path(s):
/opt/sge6
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.071 mins
>>> Removing previous SGE installation...
>>> Installing Sun Grid Engine...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Creating SGE parallel environment 'orte'
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Adding parallel environment 'orte' to queue 'all.q'
>>> Shutting down threads...
20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Running plugin anaconda_plugin
node001
>>> Configuring cluster took 1.567 mins
>>> Starting cluster took 5.856 mins
The cluster is now ready to use. To login to the master node
as root, run:
$ starcluster sshmaster mycluster
If you're having issues with the cluster you can reboot the
instances and completely reconfigure the cluster from
scratch using:
$ starcluster restart mycluster
When you're finished using the cluster and wish to terminate
it and stop paying for service:
$ starcluster terminate mycluster
Alternatively, if the cluster uses EBS instances, you can
use the 'stop' command to shutdown all nodes and put them
into a 'stopped' state preserving the EBS volumes backing
the nodes:
$ starcluster stop mycluster
WARNING: Any data stored in ephemeral storage (usually /mnt)
will be lost!
You can activate a 'stopped' cluster by passing the -x
option to the 'start' command:
$ starcluster start -x mycluster
This will start all 'stopped' nodes and reconfigure the
cluster.