Thank you, Justin, for your reply.

1. I started with your GPU AMI, "ami-4583572c", as you can see in the first cluster template in my config file (now commented out and unused). That was when I first created the initial volume through the AWS console: it was placed by default in availability zone "us-east-1c" while the AMI's instance was in "us-east-1a", so the first attempt failed because of the zone mismatch. The error message was not clear, and I didn't understand the zone problem until I used the ec2 command-line tools, which stated it plainly; the replies I received on this mailing list over the following days confirmed that this was the reason. Then, when Rayson said I should use the starcluster createvolume command, I deleted the volume, recreated it as the instructions require, and terminated the volumecreator, but I don't think I saw the volume attached once I started sfmcluster; I believe it only appeared after I used the AWS console.
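For what it's worth, my understanding is that the zone-safe sequence would have been something like the following, so that the volume lands in the same availability zone as the cluster instances (the size and zone here are just my values):

$ starcluster createvolume 30 us-east-1a
$ starcluster terminate volumecreator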
I then did my configuration, installations, and downloads, and created a new image, "ami-fae74193", which is now public if you want to have a look. I used the command below, and yes, the volume was attached at the time:
ec2-create-image instanceID --name sfmimage --description 'GPU Cluster Ubuntu with VisualSFM, MeshlabServer, FFMPEG' -K mykeypath/pkfile.pem

Now I am running the second cluster, "mysfmcluster", with my AMI "ami-fae74193". I think I had to detach the volume from the AWS console, and even force-detach it, in order to attach it to the new cluster. I kept both clusters running for a while to test, and I am not sure whether trying to detach while the first cluster was running was the problem, but it took a while. I am also not certain whether I had to terminate the first cluster before attaching the volume to the second, but I remember that I did terminate it first.
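For reference, I believe the detach I did from the console corresponds to something like the ec2 tools command below (vol-69bd4807 is my volume; the --force flag is only what I resorted to when the normal detach hung):

$ ec2-detach-volume vol-69bd4807
$ ec2-detach-volume vol-69bd4807 --force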
2. As mentioned in point 1, I created the first volume through the AWS console, in a different availability zone, and then recreated it with the starcluster commands. In both cases it was 30 GB and unpartitioned, and I didn't see any errors.
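If it helps, the way I would double-check the "unpartitioned" part is to look at the device from the instance once the volume is attached; something like this (the log below shows /dev/sdz, which on the instance may appear as /dev/xvdz) should report that the device contains no partition table:

$ sudo fdisk -l /dev/xvdz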
3. Yes, as you can see in the attached file; the relevant wiring should look roughly like the sketch after point 4 below.

4. I just started the cluster now, and the volume is attached this time, so the earlier failures may have been my own mistakes, or me not waiting long enough for everything to become available. The bad news is that there is another problem. It didn't stop me from running sshmaster afterwards, but the screen output is copied below and, I believe, is in the attached debug.log.
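For completeness, this is roughly the shape I understand the volume wiring should take in the config, as mentioned under point 3 above (the volume name here is a placeholder, not my real one):

[volume sfmdata]
VOLUME_ID = vol-69bd4807
MOUNT_PATH = /home

[cluster mysfmcluster]
# ... other cluster settings ...
VOLUMES = sfmdata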
5. Both files are attached.

Thanks again for your support,

Manal


$ starcluster start mysfmcluster
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

>>> Using default cluster template: mysfmcluster
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 1-node cluster...
>>> Launching master node (ami: ami-fae74193, type: cg1.4xlarge)...
>>> Creating security group @sc-mysfmcluster...
>>> Creating placement group @sc-mysfmcluster...
SpotInstanceRequest:sir-eeb33011
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for open spot requests to become active...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for all nodes to be in a 'running' state...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 8.399 mins
>>> The master node is ec2-23-20-139-233.compute-1.amazonaws.com
>>> Setting up the cluster...
>>> Attaching volume vol-69bd4807 to master node on /dev/sdz ...
>>> Configuring hostnames...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Mounting EBS volume vol-69bd4807 on /home...
>>> Creating cluster user: None (uid: 1001, gid: 1001)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring scratch space for user(s): sgeadmin
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring /etc/hosts on each node
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Starting NFS server on master
>>> Setting up NFS took 0.073 mins
>>> Configuring passwordless ssh for root
>>> Shutting down threads...
20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Traceback (most recent call last):
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cli.py", line 255, in main
    sc.execute(args)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/commands/start.py", line 194, in execute
    validate_running=validate_running)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1414, in start
    return self._start(create=create, create_only=create_only)
  File "<string>", line 2, in _start
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/utils.py", line 87, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1437, in _start
    self.setup_cluster()
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1446, in setup_cluster
    self._setup_cluster()
  File "<string>", line 2, in _setup_cluster
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/utils.py", line 87, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1460, in _setup_cluster
    self.cluster_shell, self.volumes)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/clustersetup.py", line 350, in run
    self._setup_passwordless_ssh()
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/clustersetup.py", line 225, in _setup_passwordless_ssh
    auth_conn_key=True)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/node.py", line 418, in generate_key_for_user
    key = self.ssh.load_remote_rsa_key(private_key)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/sshutils/__init__.py", line 210, in load_remote_rsa_key
    key = ssh.RSAKey(file_obj=rfile)
  File "build/bdist.macosx-10.6-universal/egg/ssh/rsakey.py", line 48, in __init__
    self._from_private_key(file_obj, password)
  File "build/bdist.macosx-10.6-universal/egg/ssh/rsakey.py", line 167, in _from_private_key
    data = self._read_private_key('RSA', file_obj, password)
  File "build/bdist.macosx-10.6-universal/egg/ssh/pkey.py", line 323, in _read_private_key
    raise PasswordRequiredException('Private key file is encrypted')
PasswordRequiredException: Private key file is encrypted

!!! ERROR - Oops! Looks like you've found a bug in StarCluster
!!! ERROR - Crash report written to: /Users/manal/.starcluster/logs/crash-report-556.txt
!!! ERROR - Please remove any sensitive data from the crash report
!!! ERROR - and submit it to starcluster@mit.edu
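In case it helps narrow this down: the traceback seems to say that some private key file on the node is passphrase-protected. Since my AMI was created from a running cluster node, I am only guessing that an old encrypted key may have been baked into the image or left on the /home volume. A quick check I could run (the paths are just guesses) would be:

$ starcluster sshmaster mysfmcluster
# encrypted PEM keys carry a "Proc-Type: 4,ENCRYPTED" header
$ grep -l ENCRYPTED /root/.ssh/id_rsa /home/*/.ssh/id_rsa 2>/dev/null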
On 23 May 2012 04:49, Justin Riley <jtriley@mit.edu> wrote:
> Manal,
>
> StarCluster chooses which device to attach external EBS volumes on
> automatically - you do not and should not need to specify this in your
> config. Assuming you use 'createvolume' and update your config correctly
> things should "just work".
>
> You should not have to use the AWS console to attach volumes manually
> and if you're having to do this then I'd like to figure out why so we
> can fix it. This is a core feature of StarCluster and many users are
> using external EBS with StarCluster without issue so I'm extremely
> curious why you're having issues...
>
> With that said I'm having trouble pulling out all of the details I need
> from this long thread so I'll ask direct questions instead:
>
> 1. Which AMI are you using? Did you create the AMI yourself? If so how
> did you go about creating the AMI and did you have any external EBS
> volumes attached while creating the AMI?
>
> 2. How did you create the volume you were having issues mounting with
> StarCluster? StarCluster expects your volume to either be completely
> unpartitioned (format entire device) or only contain a single partition.
> If this isn't the case you should see an error when starting a cluster.
>
> 3. Did you add your volume to your cluster config correctly according to
> the docs? (ie add your volume to the VOLUMES list in your cluster
> config?)
>
> 4. StarCluster should be spitting out errors when creating the cluster
> if it fails to attach/mount/NFS-share any external EBS volumes - did you
> notice any errors? Can you please attach the complete screen output of a
> failed StarCluster run? Also it would be extremely useful if you could send
> me your ~/.starcluster/logs/debug.log for a failed run so that I can
> take a look.
>
> 5. Would you mind sending me a copy of your config with all of the
> sensitive data removed? I just want to make sure you've configured
> things as expected.
>
> Thanks,
>
> ~Justin