I've read the code in clustersetup.py and have retried this process with no tags or any other non-essential data associated with the latest created volume, vol-52fa8f23. Same failure mode as before:

.starcluster mary$ sc start -b 0.25 -i m1.small -I m1.small -c jobscluster jobscluster
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

*** WARNING - ************************************************************
*** WARNING - SPOT INSTANCES ARE NOT GUARANTEED TO COME UP
*** WARNING -
*** WARNING - Spot instances can take a long time to come up and may not
*** WARNING - come up at all depending on the current AWS load and your
*** WARNING - max spot bid price.
*** WARNING -
*** WARNING - StarCluster will wait indefinitely until all instances (2)
*** WARNING - come up. If this takes too long, you can cancel the start
*** WARNING - command using CTRL-C. You can then resume the start command
*** WARNING - later on using the --no-create (-x) option:
*** WARNING -
*** WARNING - $ starcluster start -x jobscluster
*** WARNING -
*** WARNING - This will use the existing spot instances launched
*** WARNING - previously and continue starting the cluster. If you don't
*** WARNING - wish to wait on the cluster any longer after pressing CTRL-C
*** WARNING - simply terminate the cluster using the 'terminate' command.
*** WARNING - ************************************************************

*** WARNING - Waiting 5 seconds before continuing...
*** WARNING - Press CTRL-C to cancel...
5...4...3...2...1...
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 2-node cluster...
>>> Launching master node (ami: ami-4b9f0a22, type: m1.small)...
>>> Creating security group @sc-jobscluster...
Reservation:r-22c1d659
>>> Launching node001 (ami: ami-4b9f0a22, type: m1.small)
SpotInstanceRequest:sir-654c2614
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for open spot requests to become active...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 5.990 mins
>>> The master node is ec2-50-16-56-237.compute-1.amazonaws.com
>>> Setting up the cluster...
>>> Attaching volume vol-52fa8f23 to master node on /dev/sdz ...
>>> Configuring hostnames...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
*** WARNING - Cannot find device /dev/xvdz for volume vol-52fa8f23
*** WARNING - Not mounting vol-52fa8f23 on /usr/share/jobs/
*** WARNING - This usually means there was a problem attaching the EBS volume to the master node
<snip>

However, starcluster listclusters shows the volume attached to the master:

starcluster mary$ sc listclusters
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

---------------------------------------------
jobscluster (security group: @sc-jobscluster)
---------------------------------------------
Launch time: 2013-02-12 18:51:26
Uptime: 0 days, 00:36:27
Zone: us-east-1c
Keypair: lapuserkey
EBS volumes:
    vol-52fa8f23 on master:/dev/sdz (status: attached)
    vol-e6e39697 on master:/dev/sda (status: attached)
    vol-bce99ccd on node001:/dev/sda (status: attached)
Spot requests: 1 active
Cluster nodes:
     master running i-859591f5 ec2-50-16-56-237.compute-1.amazonaws.com
    node001 running i-679d9917 ec2-54-234-176-219.compute-1.amazonaws.com (spot sir-654c2614)
Total nodes: 2

...but on the master itself, neither /dev/sdz nor /dev/xvdz shows up:

[root@master ~]# ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sda3  /dev/sdad  /dev/sdb

[root@master ~]# ls /dev/xvd*
/dev/xvdad  /dev/xvde  /dev/xvde1  /dev/xvde2  /dev/xvde3  /dev/xvdf
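One pattern in those listings that might matter (this is just my guess): the root volume attached at /dev/sda appears in the kernel as /dev/xvde, i.e. shifted by four letters, so a volume attached at /dev/sdz would presumably surface as /dev/xvdad; and /dev/xvdad (plus a matching /dev/sdad) is exactly the extra device that does exist. Here is a quick check, plus a manual mount as a stopgap, assuming that guess is right (the device name is only my assumption; the mount point is the one from my config):

[root@master ~]# blkid /dev/xvdad    # if this is the new volume, it should report the filesystem createvolume made
[root@master ~]# mkdir -p /usr/share/jobs
[root@master ~]# mount /dev/xvdad /usr/share/jobs    # manual workaround while the device-name mismatch is unresolved

If that mounts cleanly, it would at least confirm the volume itself is fine and that the failure is only in how StarCluster derives the device name.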
Thanks again for any suggestions on how to get this volume to successfully mount on the master.

Lyn

On Tue, Feb 12, 2013 at 1:50 PM, Lyn Gerner <schedulerqueen@gmail.com> wrote:
Hi All,

I've been receiving an error, consistently, from multiple attempts to boot a cluster that references an EBS volume I created with "starcluster createvolume".

Here is the output from the most recent createvolume; it looks like everything goes fine:

.starcluster mary$ alias sc=starcluster
.starcluster mary$ sc createvolume --name=usrsharejobs-cv5g-use1c 5 us-east-1c
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

>>> No keypair specified, picking one from config...
>>> Using keypair: lapuserkey
>>> Creating security group @sc-volumecreator...
>>> No instance in group @sc-volumecreator for zone us-east-1c, launching one now.
Reservation:r-de9f8aa5
>>> Waiting for volume host to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 1.447 mins
>>> Checking for required remote commands...
>>> Creating 5GB volume in zone us-east-1c
>>> New volume id: vol-53600b22
>>> Waiting for new volume to become 'available'...
>>> Attaching volume vol-53600b22 to instance i-6b714b1b...
>>> Formatting volume...
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
327680 inodes, 1310720 blocks
65536 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1342177280
40 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
mke2fs 1.41.14 (22-Dec-2010)

>>> Leaving volume vol-53600b22 attached to instance i-6b714b1b
>>> Not terminating host instance i-6b714b1b
*** WARNING - There are still volume hosts running: i-6b714b1b
*** WARNING - Run 'starcluster terminate volumecreator' to terminate *all* volume host instances once they're no longer needed
>>> Your new 5GB volume vol-53600b22 has been created successfully
>>> Creating volume took 1.871 mins

.starcluster mary$ sc terminate volumecreator
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

Terminate EBS cluster volumecreator (y/n)? y
>>> Detaching volume vol-53600b22 from volhost-us-east-1c
>>> Terminating node: volhost-us-east-1c (i-6b714b1b)
>>> Waiting for cluster to terminate...
>>> Removing @sc-volumecreator security group

.starcluster mary$ sc listvolumes
<snip>

volume_id: vol-53600b22
size: 5GB
status: available
availability_zone: us-east-1c
create_time: 2013-02-12 13:12:16
tags: Name=usrsharejobs-cv5g-use1c
<snip>

So here is the subsequent attempt to boot a cluster that tries to mount the new EBS volume:

.starcluster mary$ sc start -b 0.25 -i m1.small -I m1.small -c jobscluster jobscluster
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

*** WARNING - ************************************************************
*** WARNING - SPOT INSTANCES ARE NOT GUARANTEED TO COME UP
*** WARNING -
*** WARNING - Spot instances can take a long time to come up and may not
*** WARNING - come up at all depending on the current AWS load and your
*** WARNING - max spot bid price.
*** WARNING -
*** WARNING - StarCluster will wait indefinitely until all instances (2)
*** WARNING - come up. If this takes too long, you can cancel the start
*** WARNING - command using CTRL-C. You can then resume the start command
*** WARNING - later on using the --no-create (-x) option:
*** WARNING -
*** WARNING - $ starcluster start -x jobscluster
*** WARNING -
*** WARNING - This will use the existing spot instances launched
*** WARNING - previously and continue starting the cluster. If you don't
*** WARNING - wish to wait on the cluster any longer after pressing CTRL-C
*** WARNING - simply terminate the cluster using the 'terminate' command.
*** WARNING - ************************************************************

*** WARNING - Waiting 5 seconds before continuing...
*** WARNING - Press CTRL-C to cancel...
5...4...3...2...1...
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 2-node cluster...
>>> Launching master node (ami: ami-4b9f0a22, type: m1.small)...
>>> Creating security group @sc-jobscluster...
Reservation:r-ba8c99c1
>>> Launching node001 (ami: ami-4b9f0a22, type: m1.small)
SpotInstanceRequest:sir-a05ae014
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for open spot requests to become active...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 6.245 mins
>>> The master node is ec2-54-242-244-139.compute-1.amazonaws.com
>>> Setting up the cluster...
>>> Attaching volume vol-53600b22 to master node on /dev/sdz ...
>>> Configuring hostnames...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
*** WARNING - Cannot find device /dev/xvdz for volume vol-53600b22
*** WARNING - Not mounting vol-53600b22 on /usr/share/jobs
*** WARNING - This usually means there was a problem attaching the EBS volume to the master node
<snip>

So per the relevant past email threads, I'm using the createvolume command, and it still gives this error. I also tried creating the volume through the AWS console; the subsequent cluster boot fails at the same point, with the same problem of not finding the device.
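For completeness, the volume is referenced from my cluster template via a [volume] section along these lines (the section name below is just illustrative, but the volume id and mount path are the ones that appear in the output above):

[volume jobsvol]
VOLUME_ID = vol-53600b22
MOUNT_PATH = /usr/share/jobs

[cluster jobscluster]
# ...rest of the template omitted...
VOLUMES = jobsvol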
I'd appreciate any suggestions.

Thanks much,
Lyn