[Starcluster] SOLVED!! instance ssh problem...

Nicholas Ampazis n.ampazis at gmail.com
Wed Mar 31 11:05:42 EDT 2010


Justin,

For the 3 EMIs that I currently have I always get a "None" response:

In [4]: print ec2.conn.get_image('emi-A286143C')
------> print(ec2.conn.get_image('emi-A286143C'))
None

In [5]: print ec2.conn.get_image('emi-DD1F1052')
------> print(ec2.conn.get_image('emi-DD1F1052'))
None

In [6]: print ec2.conn.get_image('emi-39361D20')
------> print(ec2.conn.get_image('emi-39361D20'))
None

On the other hand this command seems to work:

print ec2.conn.get_all_images(owners=['self'])
------> print(ec2.conn.get_all_images(owners=['self']))
[Image:emi-A286143C, Image:emi-39361D20, Image:eri-07A01138,
Image:eki-DFFB2245, Image:emi-DD1F1052, Image:eri-B5F721BA,
Image:eki-F36110DA]
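
Since the listing works while the direct lookup does not, one possible stopgap is to match the ID against the listing. This is only a minimal sketch, assuming the connection exposes the get_image and get_all_images calls used above and that image objects carry an "id" attribute. find_image, FakeImage and FakeConn are hypothetical names for illustration and are not part of StarCluster or boto:

```python
def find_image(conn, image_id):
    """Return the image with the given id, or None if it cannot be found."""
    image = conn.get_image(image_id)
    if image is not None:
        return image
    # Fallback: scan the full listing and compare ids ourselves.
    for candidate in conn.get_all_images():
        if candidate.id == image_id:
            return candidate
    return None


class FakeImage(object):
    """Hypothetical stand-in for an image object (illustration only)."""

    def __init__(self, id):
        self.id = id


class FakeConn(object):
    """Hypothetical stand-in that mimics the behaviour reported above."""

    def __init__(self, images):
        self._images = images

    def get_image(self, image_id):
        # The direct lookup always comes back empty, as in the session above.
        return None

    def get_all_images(self):
        return list(self._images)


conn = FakeConn([FakeImage('emi-A286143C'), FakeImage('emi-39361D20')])
found = find_image(conn, 'emi-A286143C')
print(found.id)
```

With a real connection the call would be find_image(ec2.conn, 'emi-A286143C'). Whether this is acceptable inside StarCluster itself is a separate question.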

This one doesn't though:

print ec2.conn.get_all_images(owners=['000100729354'])
-------> print(ec2.conn.get_all_images(owners=['000100729354']))
[]


and finally, the following seems to work:

print ec2.conn.get_all_images()[0].ownerId
-------> print(ec2.conn.get_all_images()[0].ownerId)
admin
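
The ownerId here is "admin" rather than the numeric ID I passed to owners, which might be why that filter came back empty (this is a guess on my part). To see every owner value the cloud reports, the listing can be grouped by ownerId. A minimal sketch, assuming image objects carry "id" and "ownerId" attributes as above. images_by_owner and FakeImage are hypothetical names for illustration only:

```python
from collections import defaultdict


def images_by_owner(images):
    """Group image ids by the ownerId each image reports."""
    owners = defaultdict(list)
    for image in images:
        owners[image.ownerId].append(image.id)
    return dict(owners)


class FakeImage(object):
    """Hypothetical stand-in for an image object (illustration only)."""

    def __init__(self, id, ownerId):
        self.id = id
        self.ownerId = ownerId


sample = [
    FakeImage('emi-A286143C', 'admin'),
    FakeImage('eki-DFFB2245', 'admin'),
]
grouped = images_by_owner(sample)
print(grouped)
```

With a real connection the call would be images_by_owner(ec2.conn.get_all_images()).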


Thanks a lot!


Nicholas


On Wed, Mar 31, 2010 at 5:55 PM, Justin Riley <jtriley at mit.edu> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Nicholas,
>
> The warning about the zone not being available is not causing
> StarCluster to fail. The fact that ec2.conn.get_image() is not returning
> your image is what's causing the failure.
>
> The availability zone message comes from the fact that Eucalyptus does
> not report a "state" for zones whereas EC2 does. So, to allow Eucalyptus
> to work with StarCluster, I relaxed the validation criteria to simply
> print a warning if the state is not available rather than failing out
> completely provided that a zone object was able to be retrieved. You can
> safely ignore this warning when using Eucalyptus.
>
> Could you please see if the get_image command works at all for you by
> trying another ami? e.g.
>
> $ ipython
> ~> print ec2.conn.get_image('emi-39361D20')
>
> Also what's the output of:
>
> ~> print ec2.conn.get_all_images(owners=['self'])
>
> and of:
>
> ~> print ec2.conn.get_all_images(owners=['YOUR_EUCA_USER_ID_HERE'])
>
> and of:
>
> ~> print ec2.conn.get_all_images()[0].ownerId
>
> What's strange is that it's working for me. I'm curious if this has to
> do with Eucalyptus version/setup?
>
> Requiring that an image object exists for the specified AMI is not a
> condition that I'm willing to relax. StarCluster really needs to be able
> to retrieve the image from boto before moving forward.
>
> Thanks,
>
> ~Justin
>
> On 03/31/2010 08:30 AM, Nicholas Ampazis wrote:
>> Justin,
>>
>> This is what I got from within ipython:
>>
>>> ~> print ec2.conn.get_image('emi-A286143C')
>>
>> None
>>
>>> ~> print(ec2.conn.get_all_images())
>> [Image:emi-A286143C, Image:emi-39361D20, Image:eri-07A01138,
>> Image:eki-DFFB2245, Image:emi-DD1F1052, Image:eri-B5F721BA,
>> Image:eki-F36110DA]
>>
>>
>> I also downloaded the latest version from
>>
>> git clone http://github.com/jtriley/StarCluster.git
>>
>> and this is my current output:
>>
>> starcluster start smallcluster test
>>
>> /Library/Python/2.6/site-packages/pycrypto-2.0.1-py2.6-macosx-10.6-universal.egg/Crypto/Hash/SHA.py:6:
>> DeprecationWarning: the sha module is deprecated; use the hashlib
>> module instead
>>   import warnings
>> /Library/Python/2.6/site-packages/pycrypto-2.0.1-py2.6-macosx-10.6-universal.egg/Crypto/Hash/MD5.py:6:
>> DeprecationWarning: the md5 module is deprecated; use hashlib instead
>>   import warnings
>> Leopard libedit detected.
>> StarCluster - (http://web.mit.edu/starcluster)
>> Software Tools for Academics and Researchers (STAR)
>> Please submit bug reports to starcluster at mit.edu
>>
>> Validating cluster settings...
>> cluster.py:501 - WARNING - The AVAILABILITY_ZONE = Zone:UEC-TMOD is
>> not available at this time
>> cluster.py:458 - ERROR - NODE_IMAGE_ID emi-A286143C does not exist
>> cli.py:196 - ERROR - The cluster settings provided are not valid
>>
>> However the zone is indeed available (as verified by
>> "euca-describe-availability-zones verbose"), and other starcluster
>> commands such as listimages etc seem to work.
>>
>>
>> Thanks,
>>
>>
>> Nicholas
>>
>>
>> On Wed, Mar 31, 2010 at 7:20 AM, Justin Riley <jtriley at mit.edu> wrote:
>>> Hi Nicholas,
>>>
>>> Hmmm, this is strange, I don't get the issue with the architecture attribute
>>> using Eucalyptus. Could you verify that ec2.conn.get_image is returning None
>>> by running the following:
>>>
>>> $ ipython
>>> ~> from starcluster.config import get_config
>>> ~> cfg = get_config(); cfg.load()
>>> ~> ec2 = cfg.get_easy_ec2()
>>> ~> print ec2.conn.get_image('emi-A286143C')
>>>
>>> Also, what does ec2.conn.get_all_images() return?
>>>
>>> In any event, I've fixed the second error you encountered in github and this
>>> should get you to the point that I was discussing in my last post: "waiting
>>> for cluster to start".
>>>
>>> The security groups will be created and the instances will be started,
>>> however, the wait condition will never be met due to issues discussed in the
>>> last post with regards to eucalyptus/boto and instance ip addresses in the
>>> euca-describe-instances response.
>>>
>>> I have an idea for how to get around this but I need to think a bit about how
>>> to incorporate it without making a mess of the code. I'll let you know when I
>>> have something worth testing in github.
>>>
>>> ~Justin
>>>
>>> On Tuesday 30 March 2010 3:56:08 pm Nicholas Ampazis wrote:
>>>> Justin,
>>>>
>>>> Unfortunately I still get the architecture error:
>>>>> 1) File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>>> line 541, in __check_platform
>>>>> AttributeError: 'NoneType' object has no attribute 'architecture'
>>>>>
>>>>> I commented the line
>>>>>
>>>>> #image_platform = self.ec2.conn.get_image(image_id).architecture
>>>>
>>>> and I have to set 'x86_64' by hand.
>>>>
>>>> Past this point, this is how far I could get:
>>>>
>>>> cluster.py:526 - WARNING - The AVAILABILITY_ZONE = Zone:UEC-TMOD is
>>>> not available at this time
>>>>
>>>>>>> Starting cluster...
>>>>>>> Launching a 2-node cluster...
>>>>>>> Launching master node...
>>>>>>> Master AMI: emi-A286143C
>>>>>>> Creating security group @sc-masters...
>>>>>>> Creating security group @sc-test...
>>>>
>>>> Traceback (most recent call last):
>>>>   File "/usr/local/bin/starcluster", line 5, in <module>
>>>>     pkg_resources.run_script('StarCluster==0.9999', 'starcluster')
>>>>   File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 442, in run_script
>>>>     self.require(requires)[0].run_script(script_name, ns)
>>>>   File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.py", line 1167, in run_script
>>>>     exec script_code in namespace, namespace
>>>>   File "/Library/Python/2.6/site-packages/StarCluster-0.9999-py2.6.egg/EGG-INFO/scripts/starcluster", line 6, in <module>
>>>>
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>> line 588, in main
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>> line 187, in execute
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/utils.py",
>>>> line 23, in wrapper
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>> line 423, in start
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>> line 324, in create_cluster
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>> line 253, in cluster_group
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/awsutils.py",
>>>> line 125, in get_or_create_group
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/awsutils.py",
>>>> line 98, in create_group
>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/awsutils.py",
>>>> line 103, in get_group_or_none
>>>> IndexError: list index out of range
>>>>
>>>>
>>>> Please note that no instance of the "Master AMI: emi-A286143C"
>>>> actually starts. However new security groups have been created as
>>>> reported by "euca-describe-groups"
>>>>
>>>>  euca-describe-groups
>>>> GROUP admin   @sc-masters     StarCluster Master Nodes
>>>> PERMISSION    admin   @sc-masters     ALLOWS  tcp     22      22      FROM    CIDR    0.0.0.0/0
>>>> GROUP admin   @sc-test        Cluster requested at 201003302244
>>>> PERMISSION    admin   @sc-test        ALLOWS  tcp     22      22      FROM    CIDR    0.0.0.0/0
>>>> GROUP admin   default default group
>>>> PERMISSION    admin   default ALLOWS  tcp     22      22      FROM    CIDR    0.0.0.0/0
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Nicholas
>>>>
>>>> On Tue, Mar 30, 2010 at 10:30 PM, Justin Riley <jtriley at mit.edu> wrote:
>> Hi Nicholas,
>>
>> I just fixed the issues you were having that required you to modify a
>> bunch of the code by hand (related to security group stuff). Could you
>> please undo your modifications and pull the latest github code? If
>> necessary, just remove the working directory and re-clone:
>>
>> git clone http://github.com/jtriley/StarCluster.git
>>
>> If you could test that this code gets you to the point of "Waiting for
>> cluster to start" on Eucalyptus, that'd be great. Please let me know if
>> you still need to do this:
>>>>>>> 1) File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>>>>> line 541, in __check_platform
>>>>>>> AttributeError: 'NoneType' object has no attribute 'architecture'
>>>>>>>
>>>>>>> I commented the line
>>>>>>>
>>>>>>> #image_platform = self.ec2.conn.get_image(image_id).architecture
>>
>> However, beyond the "Waiting for cluster to start", I'm afraid I've now
>> reached as far as I can go with Eucalyptus and StarCluster due to the
>> fact that boto cannot report the private/public ip addresses of
>> eucalyptus instances (ie we get N/A's for ip addresses with $starcluster
>> listinstances).
>>
>> There's really not any way to address this cleanly without breaking EC2
>> support that I can think of. The ip address info that I need gets put
>> into the dns_name and private_dns_name by Eucalyptus, however, using
>> this would be a serious hack and certainly make the code uglier.
>>
>> As I said before, I'm not sure whether a more sophisticated dns setup
>> would get Eucalyptus to respond with these ip addresses or whether it's
>> not possible at all with Eucalyptus.
>>
>> In either case, I think we need to do some investigation on this before
>> moving forward.
>>
>> ~Justin
>>
>> On 03/29/2010 02:28 PM, Nicholas Ampazis wrote:
>>>>>>> Justin,
>>>>>>>
>>>>>>> I'm sorry, but I forgot to tell you that I've also made the following
>>>>>>> "manual" modifications before I was able to reach the point of my
>>>>>>> previous e-mail ('str' object has no attribute 'instances'" error)
>>>>>>>
>>>>>>>
>>>>>>> 1) File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>>>>> line 541, in __check_platform
>>>>>>> AttributeError: 'NoneType' object has no attribute 'architecture'
>>>>>>>
>>>>>>> I commented the line
>>>>>>>
>>>>>>> #image_platform = self.ec2.conn.get_image(image_id).architecture
>>>>>>>
>>>>>>> and added
>>>>>>>
>>>>>>> image_platform = "x86_64"
>>>>>>>
>>>>>>>
>>>>>>> 2) File "build/bdist.macosx-10.6-universal/egg/starcluster/awsutils.py",
>>>>>>> line 106, in get_or_create_group
>>>>>>> IndexError: list index out of range
>>>>>>>
>>>>>>> I commented out
>>>>>>>
>>>>>>> #sg = self.conn.get_all_security_groups(
>>>>>>> #                groupnames=[name])[0]
>>>>>>>
>>>>>>> and added
>>>>>>>
>>>>>>> sg='default'
>>>>>>>
>>>>>>>
>>>>>>> 3) File "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>>>>> line 323, in create_cluster
>>>>>>> AttributeError: 'str' object has no attribute 'name'
>>>>>>>
>>>>>>> I commented out the lines
>>>>>>>
>>>>>>> #master_sg = self.master_group.name
>>>>>>> #cluster_sg = self.cluster_group.name
>>>>>>>
>>>>>>> and added
>>>>>>>
>>>>>>> master_sg = 'default'
>>>>>>> cluster_sg = 'default'
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Nicholas
>>>>>>>
>>>>>>> On Mon, Mar 29, 2010 at 11:11 AM, Justin Riley <jtriley at mit.edu> wrote:
>>>>>>> Nicholas,
>>>>>>>
>>>>>>> I've fixed the VOLUMES issue on github. I'll report back once I've
>>>>>>> checked out the other "AttributeError: 'str' object has no attribute
>>>>>>> 'instances'" error.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> ~Justin
>>>>>>>
>>>>>>> On 03/29/2010 01:31 PM, Nicholas Ampazis wrote:
>>>>>>>>>> Justin,
>>>>>>>>>>
>>>>>>>>>> For some reason VOLUMES must be defined in the config file otherwise
>>>>>>>>>> "start" command ends with the following message:
>>>>>>>>>>
>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>   File "/usr/local/bin/starcluster", line 5, in <module>
>>>>>>>>>>     pkg_resources.run_script('StarCluster==0.9999', 'starcluster')
>>>>>>>>>>   File
>>>>>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/
>>>>>>>>>> python/pkg_resources.py", line 442, in run_script
>>>>>>>>>>     self.require(requires)[0].run_script(script_name, ns)
>>>>>>>>>>   File
>>>>>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/
>>>>>>>>>> python/pkg_resources.py", line 1167, in run_script
>>>>>>>>>>     exec script_code in namespace, namespace
>>>>>>>>>>   File
>>>>>>>>>> "/Library/Python/2.6/site-packages/StarCluster-0.9999-py2.6.egg/EGG-I
>>>>>>>>>> NFO/scripts/starcluster", line 6, in <module>
>>>>>>>>>>
>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>>>>>>>> line 585, in main
>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>>>>>>>> line 184, in execute
>>>>>>>>>>   File
>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py", line
>>>>>>>>>> 480, in is_valid
>>>>>>>>>>   File
>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py", line
>>>>>>>>>> 619, in _validate_ebs_settings
>>>>>>>>>> TypeError: 'NoneType' object is not iterable
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I defined a VOLUME in the config and re-ran the start command. This
>>>>>>>>>> is how far I got:
>>>>>>>>>>
>>>>>>>>>> starcluster start smallcluster test
>>>>>>>>>>
>>>>>>>>>>>>> Starting cluster...
>>>>>>>>>>>>> Launching a 2-node cluster...
>>>>>>>>>>>>> Launching master node...
>>>>>>>>>>>>> Master AMI: emi-A286143C
>>>>>>>>>>
>>>>>>>>>> Reservation:r-4BC1082B
>>>>>>>>>>
>>>>>>>>>>>>> Launching worker nodes...
>>>>>>>>>>>>> Node AMI: emi-A286143C
>>>>>>>>>>
>>>>>>>>>> Reservation:r-458E0899
>>>>>>>>>>
>>>>>>>>>>>>> Waiting for cluster to start...|Traceback (most recent call last):
>>>>>>>>>>
>>>>>>>>>>   File "/usr/local/bin/starcluster", line 5, in <module>
>>>>>>>>>>     pkg_resources.run_script('StarCluster==0.9999', 'starcluster')
>>>>>>>>>>   File
>>>>>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/
>>>>>>>>>> python/pkg_resources.py", line 442, in run_script
>>>>>>>>>>     self.require(requires)[0].run_script(script_name, ns)
>>>>>>>>>>   File
>>>>>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/
>>>>>>>>>> python/pkg_resources.py", line 1167, in run_script
>>>>>>>>>>     exec script_code in namespace, namespace
>>>>>>>>>>   File
>>>>>>>>>> "/Library/Python/2.6/site-packages/StarCluster-0.9999-py2.6.egg/EGG-I
>>>>>>>>>> NFO/scripts/starcluster", line 6, in <module>
>>>>>>>>>>
>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>>>>>>>> line 585, in main
>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>>>>>>>> line 186, in execute
>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/utils.py",
>>>>>>>>>> line 23, in wrapper
>>>>>>>>>>   File
>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py", line
>>>>>>>>>> 429, in start
>>>>>>>>>>   File
>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py", line
>>>>>>>>>> 374, in is_cluster_up
>>>>>>>>>>   File
>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py", line
>>>>>>>>>> 304, in running_nodes
>>>>>>>>>>   File
>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py", line
>>>>>>>>>> 272, in nodes
>>>>>>>>>> AttributeError: 'str' object has no attribute 'instances'
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Nicholas
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 29, 2010 at 10:24 AM, Justin Riley <jtriley at mit.edu> wrote:
>>>>>>>>>> Hi Nicholas,
>>>>>>>>>>
>>>>>>>>>> Hmmm, this is interesting. The reason this is happening is because
>>>>>>>>>> zone.state is returning the controller's ip address rather than
>>>>>>>>>> 'available' as it does when using EC2. I just tested this with my
>>>>>>>>>> local Eucalyptus. It appears that Eucalyptus does not have support
>>>>>>>>>> for availability zone 'states' like Amazon EC2 does.
>>>>>>>>>>
>>>>>>>>>> So, I've relaxed the check for availability zone to simply print a
>>>>>>>>>> warning rather than erroring out if the zone state is not
>>>>>>>>>> 'available'. As long as StarCluster can retrieve the zone it allows
>>>>>>>>>> the cluster validation to proceed.
>>>>>>>>>>
>>>>>>>>>> You will see a warning about the availability zone when using
>>>>>>>>>> Eucalyptus although it should be safe to ignore.
>>>>>>>>>>
>>>>>>>>>> Please try again with latest dev code and report back.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> ~Justin
>>>>>>>>>>
>>>>>>>>>> On 03/29/2010 12:08 PM, Nicholas Ampazis wrote:
>>>>>>>>>>>>> Justin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the update. I've downloaded the latest development
>>>>>>>>>>>>> version using git.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is what I got when I invoked "starcluster start smallcluster
>>>>>>>>>>>>> test" (there is a cluster template "smallcluster" defined in the
>>>>>>>>>>>>> configuration file):
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> cluster.py:525 - ERROR - The AVAILABILITY_ZONE = %s is not
>>>>>>>>>>>>> available at this time
>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>   File "/usr/local/bin/starcluster", line 5, in <module>
>>>>>>>>>>>>>     pkg_resources.run_script('StarCluster==0.9999', 'starcluster')
>>>>>>>>>>>>>   File
>>>>>>>>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/l
>>>>>>>>>>>>> ib/python/pkg_resources.py", line 442, in run_script
>>>>>>>>>>>>>     self.require(requires)[0].run_script(script_name, ns)
>>>>>>>>>>>>>   File
>>>>>>>>>>>>> "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/l
>>>>>>>>>>>>> ib/python/pkg_resources.py", line 1167, in run_script
>>>>>>>>>>>>>     exec script_code in namespace, namespace
>>>>>>>>>>>>>   File
>>>>>>>>>>>>> "/Library/Python/2.6/site-packages/StarCluster-0.9999-py2.6.egg/EG
>>>>>>>>>>>>> G-INFO/scripts/starcluster", line 6, in <module>
>>>>>>>>>>>>>
>>>>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>>>>>>>>>>> line 585, in main
>>>>>>>>>>>>>   File "build/bdist.macosx-10.6-universal/egg/starcluster/cli.py",
>>>>>>>>>>>>> line 184, in execute
>>>>>>>>>>>>>   File
>>>>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>>>>>>>>>>> line 478, in is_valid
>>>>>>>>>>>>>   File
>>>>>>>>>>>>> "build/bdist.macosx-10.6-universal/egg/starcluster/cluster.py",
>>>>>>>>>>>>> line 616, in _validate_ebs_settings
>>>>>>>>>>>>> TypeError: 'NoneType' object is not iterable
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nicholas
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 29, 2010 at 8:55 AM, Justin Riley <jtriley at mit.edu> wrote:
>>>>>>>>>>>>> Hi Nicholas,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Awesome, I'll try to wget that url on a local instance sometime
>>>>>>>>>>>>> today and see how it goes just to verify that this is the case (ie
>>>>>>>>>>>>> latest points to 1.0 by default on 1.6.2)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I meant to send you an announcement about this last night, but I made
>>>>>>>>>>>>> some modifications that should allow you to get past the
>>>>>>>>>>>>> credentials step when starting StarCluster on Eucalyptus. You'll
>>>>>>>>>>>>> need to pull in the latest code to test it out.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know how things go and what the next obstacles are.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am aware of one obstacle that guarantees things will not quite
>>>>>>>>>>>>> run successfully. When we do:
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ starcluster listinstances
>>>>>>>>>>>>>
>>>>>>>>>>>>> I noticed that both of our outputs of this command using
>>>>>>>>>>>>> Eucalyptus report private_ip_address and ip_address as N/A. These
>>>>>>>>>>>>> variables are used by StarCluster to setup things like /etc/hosts,
>>>>>>>>>>>>> Sun Grid Engine, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a feeling this is due to needing a more sophisticated DNS
>>>>>>>>>>>>> setup with Eucalyptus but I haven't tried to solve this just yet.
>>>>>>>>>>>>> In any event, things will almost certainly not work until we can
>>>>>>>>>>>>> get these values to be properly populated (ie starcluster
>>>>>>>>>>>>> listinstances should show the ip addresses and not N/A's).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hope that helps,
>>>>>>>>>>>>>
>>>>>>>>>>>>> ~Justin
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 03/29/2010 11:06 AM, Nicholas Ampazis wrote:
>>>>>>>>>>>>>>>> Justin,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I do have a "add_key.pl" in  the "/usr/share/eucalyptus"
>>>>>>>>>>>>>>>> directory.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, this might not be very relevant to the process of
>>>>>>>>>>>>>>>> copying the ssh key in later versions of Eucalyptus (i.e.
>>>>>>>>>>>>>>>> 1.6.x), since I've discovered that I could have achieved the
>>>>>>>>>>>>>>>> same fix if I had substituted
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> public_key_url=http://169.254.169.254/1.0/meta-data/public-keys
>>>>>>>>>>>>>>>> /0/openssh-key
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> public_key_url=http://169.254.169.254/latest/meta-data/public-k
>>>>>>>>>>>>>>>> eys/0/openssh-key
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (instead of
>>>>>>>>>>>>>>>>  public_key_url=http://169.254.169.254/2008-02-01/meta-data/pub
>>>>>>>>>>>>>>>> lic-keys/0/openssh-key)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> in "/etc/init.d/ec2-get-credentials" of the starcluster iso.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Notice that in this case "latest" points to the same directory
>>>>>>>>>>>>>>>> as "api_ver" which in your eucalyptus installation (1.6.2) just
>>>>>>>>>>>>>>>> happens to be "1.0", so it works out of the box!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Nicholas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> P.S. Is there any progress in the starcluster git python code
>>>>>>>>>>>>>>>> with regards to commands that did not work with eucalyptus
>>>>>>>>>>>>>>>> credentials (e.g. starcluster start , etc)?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Mar 29, 2010 at 7:35 AM, Justin Riley <jtriley at mit.edu> wrote:
>>>>>>>>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>>>>>>>>>> Hash: SHA1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Nicholas,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Awesome, glad to hear you've got the StarCluster ami working
>>>>>>>>>>>>>>>>> with Eucalyptus. I'm still a little curious as to why I didn't
>>>>>>>>>>>>>>>>> need those modifications to /etc/init.d/ec2-get-credentials
>>>>>>>>>>>>>>>>> and you did.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My current theory on this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I believe that Eucalyptus is running the script
>>>>>>>>>>>>>>>>> $EUCALYPTUS/usr/share/eucalyptus/add_key.pl somewhere in the
>>>>>>>>>>>>>>>>> process of bringing the instance up.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Looking at this script it appears that they manually pipe the
>>>>>>>>>>>>>>>>> pub key into root's authorized_keys file (ie they're mounting
>>>>>>>>>>>>>>>>> the iso and creating the authorized_keys outside of the
>>>>>>>>>>>>>>>>> instance).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My only guess as to why my EMI worked out of the box with
>>>>>>>>>>>>>>>>> respect to ssh is because of this script. Maybe it's not being
>>>>>>>>>>>>>>>>> executed for some reason?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can you check if that script exists for you in
>>>>>>>>>>>>>>>>> /usr/share/eucalyptus?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks and in any event, thanks for tracking this down :D
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ~Justin
>>
>>>>
>>>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.14 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAkuzYn4ACgkQ4llAkMfDcrm5mwCghxJVZeuGvthrfe75alZWpkoO
> /T8AnA+Fd274HugcNCZYlJAio6UuL5bl
> =HKL6
> -----END PGP SIGNATURE-----
>



