<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    Hi Hugh,<br>
    <br>
    After many tries I was able to reproduce this and can confirm it's a
    transient issue related to polling for spot requests too quickly.<br>
    <br>
    I'm working on a patch now, but in the meantime, if this happens,
    simply press CTRL-C to interrupt the 'start' command and then run the
    same start command again with the -x option. The second run should
    work as expected because enough time will have gone by for the spot
    instance requests to be available.<br>
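    <br>
    To illustrate the timing issue, here is a minimal sketch of polling
    with a pause between attempts until every newly created spot request
    is visible. It is not the actual StarCluster patch: the function name
    wait_for_requests, its parameters, and the fetch_ids callable (a
    stand-in for a real EC2 query) are all hypothetical.<br>
    <pre>
```python
import time


def wait_for_requests(fetch_ids, expected_ids, attempts=10, delay=2.0,
                      sleep=time.sleep):
    """Poll fetch_ids() until every expected request ID is visible.

    fetch_ids is a hypothetical callable returning the spot request IDs
    the API currently reports; it stands in for a real EC2 query.
    Returns the number of polls that were needed.
    """
    expected = set(expected_ids)
    for attempt in range(attempts):
        visible = set(fetch_ids())
        if expected.issubset(visible):
            return attempt + 1
        # Pause so the API has time to catch up before the next query.
        sleep(delay)
    raise RuntimeError(
        "spot requests still not visible after %d polls" % attempts)


# Simulated API: the new requests only show up on the third query.
new_ids = ["sir-6cb0f014", "sir-b0ff9e11", "sir-2ef6f814"]
responses = iter([[], new_ids[:1], new_ids])
polls = wait_for_requests(lambda: next(responses), new_ids,
                          sleep=lambda seconds: None)
```
    </pre>
    The sleep argument is injectable only so the simulated run does not
    actually wait; a real caller would leave it at its default.<br>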
    <br>
    I've created an issue to keep track of this:<br>
    <br>
    <a class="moz-txt-link-freetext" href="http://web.mit.edu/star/cluster/issues/105">http://web.mit.edu/star/cluster/issues/105</a><br>
    <br>
    ~Justin<br>
    <br>
    On 4/12/12 1:30 PM, MacMullan, Hugh wrote:<br>
    <span style="white-space: pre;">&gt;<br>
      &gt; Folks:<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; First: any good way to search the archives? I tried various
      Google strings to no good effect. I hate to duplicate
      effort/messages ...<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; More importantly: A possible bug? Sometimes when starting
      SPOT_BID clusters (~30% of the time?) I'm seeing 'start' skip
      (apparently) "Waiting for open spot requests to become active..."
      and just process the master. When it works correctly, I see:<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching node001 (ami: ami-12b6477b, type:
      cc1.4xlarge)<br>
      &gt;<br>
      &gt; SpotInstanceRequest:sir-9f38a214<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching node002 (ami: ami-12b6477b, type:
      cc1.4xlarge)<br>
      &gt;<br>
      &gt; SpotInstanceRequest:sir-c4505a11<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching node003 (ami: ami-12b6477b, type:
      cc1.4xlarge)<br>
      &gt;<br>
      &gt; SpotInstanceRequest:sir-cbb32414<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Waiting for cluster to come up... (updating
      every 20s)<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Waiting for open spot requests to become
      active...<br>
      &gt;<br>
      &gt; 0/3 | | 0% <br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; When it doesn't work correctly, I see the following, where it
      skips the highlighted section above and goes straight to "Waiting
      for all nodes", and the count is /1 instead of /4 (or whatever the
      CLUSTER_SIZE is).<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; # starcluster start -c spottest spottest<br>
      &gt;<br>
      &gt; StarCluster - (<a class="moz-txt-link-freetext" href="http://web.mit.edu/starcluster">http://web.mit.edu/starcluster</a>) (v. 0.93.3)<br>
      &gt;<br>
      &gt; Software Tools for Academics and Researchers (STAR)<br>
      &gt;<br>
      &gt; Please submit bug reports to <a class="moz-txt-link-abbreviated" href="mailto:starcluster@mit.edu">starcluster@mit.edu</a><br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Validating cluster template settings...<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Cluster template settings are valid<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Starting cluster...<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching a 4-node cluster...<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching master node (ami: ami-12b6477b, type:
      cc1.4xlarge)...<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Creating security group @sc-spottest...<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Opening tcp port range 22-22 for CIDR
      XXXXXXXXXX/22<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Creating placement group @sc-spottest...<br>
      &gt;<br>
      &gt; Reservation:r-02fbac61<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching node001 (ami: ami-12b6477b, type:
      cc1.4xlarge)<br>
      &gt;<br>
      &gt; SpotInstanceRequest:sir-6cb0f014<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching node002 (ami: ami-12b6477b, type:
      cc1.4xlarge)<br>
      &gt;<br>
      &gt; SpotInstanceRequest:sir-b0ff9e11<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Launching node003 (ami: ami-12b6477b, type:
      cc1.4xlarge)<br>
      &gt;<br>
      &gt; SpotInstanceRequest:sir-2ef6f814<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Waiting for cluster to come up... (updating
      every 20s)<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Waiting for all nodes to be in a 'running'
      state...<br>
      &gt;<br>
      &gt; 1/1
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
      100% <br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Waiting for SSH to come up on all nodes...<br>
      &gt;<br>
      &gt; 1/1
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
      100% <br>
      &gt;<br>
      &gt; &gt;&gt;&gt; Waiting for cluster to come up took 3.547 mins<br>
      &gt;<br>
      &gt; &gt;&gt;&gt; The master node is
      ec2-184-72-156-11.compute-1.amazonaws.com<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; I haven't tried this with anything but "bigger" stuff (cc1
      &amp; cc2), so don't know if that has any bearing on the
      situation. My config:<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [global]<br>
      &gt;<br>
      &gt; DEFAULT_TEMPLATE=Rcluster<br>
      &gt;<br>
      &gt; ENABLE_EXPERIMENTAL=True<br>
      &gt;<br>
      &gt; REFRESH_INTERVAL=20<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [aws info]<br>
      &gt;<br>
      &gt; AWS_ACCESS_KEY_ID = XXXXXXXXXXXX<br>
      &gt;<br>
      &gt; AWS_SECRET_ACCESS_KEY = XXXXXXXXXXXXX<br>
      &gt;<br>
      &gt; AWS_USER_ID = XXXXXXXXXXX<br>
      &gt;<br>
      &gt; EC2_CERT = XXXXXXXXXXX.pem<br>
      &gt;<br>
      &gt; EC2_PRIVATE_KEY = XXXXXXXXXXXXX.pem<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [key mykey]<br>
      &gt;<br>
      &gt; KEY_LOCATION=XXXXXXXXXXXXXXX.pem<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [cluster spottest]<br>
      &gt;<br>
      &gt; KEYNAME = mykey<br>
      &gt;<br>
      &gt; CLUSTER_SIZE = 4<br>
      &gt;<br>
      &gt; CLUSTER_USER = sgeadmin<br>
      &gt;<br>
      &gt; CLUSTER_SHELL = bash<br>
      &gt;<br>
      &gt; NODE_IMAGE_ID = ami-12b6477b<br>
      &gt;<br>
      &gt; NODE_INSTANCE_TYPE = cc1.4xlarge<br>
      &gt;<br>
      &gt; AVAILABILITY_ZONE = us-east-1c<br>
      &gt;<br>
      &gt; VOLUMES = Rlocal-spottest<br>
      &gt;<br>
      &gt; PLUGINS = setup-centos<br>
      &gt;<br>
      &gt; PERMISSIONS = ssh-local<br>
      &gt;<br>
      &gt; SPOT_BID = 1.50<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [volume Rlocal-spottest]<br>
      &gt;<br>
      &gt; VOLUME_ID = vol-XXXXXXXXXX<br>
      &gt;<br>
      &gt; MOUNT_PATH = /usr/local<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [plugin setup-centos]<br>
      &gt;<br>
      &gt; setup_class = setup-centos.PackageInstaller<br>
      &gt;<br>
      &gt; pkg_to_install = R<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; [permission ssh-local]<br>
      &gt;<br>
      &gt; protocol = tcp<br>
      &gt;<br>
      &gt; from_port = 22<br>
      &gt;<br>
      &gt; to_port = 22<br>
      &gt;<br>
      &gt; cidr_ip = XXXXXXXXXXX/22<br>
      &gt;<br>
      &gt; <br>
      &gt;<br>
      &gt; This exact config works sometimes, other times not. Thanks
      for listening, or any advice you might have.<br>
      &gt;<br>
      &gt; -Hugh<br>
      &gt;<br>
      &gt;<br>
      &gt;<br>
      &gt; _______________________________________________<br>
      &gt; StarCluster mailing list<br>
      &gt; <a class="moz-txt-link-abbreviated" href="mailto:StarCluster@mit.edu">StarCluster@mit.edu</a><br>
      &gt; <a class="moz-txt-link-freetext" href="http://mailman.mit.edu/mailman/listinfo/starcluster">http://mailman.mit.edu/mailman/listinfo/starcluster</a></span><br>
    <br>
    <br>
  </body>
</html>