<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); ">
<div><span class="Apple-style-span" style="font-size: 15px;"><br>
</span></div>
<span id="OLK_SRC_BODY_SECTION" style="font-size: 14px; font-family: Calibri, sans-serif; ">
<div>
<div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif; ">
<div>Dear folks, </div>
<div><br>
</div>
<div>I have the following problem when creating a cluster and mounting an EBS volume </div>
<div>on /data. Here is the part of my config file corresponding to the template: </div>
<div><br>
</div>
<div>
<div>[cluster issm]</div>
<div># change this to the name of one of the keypair sections defined above</div>
<div>KEYNAME = ISSMStarCluster</div>
<div># number of ec2 instances to launch</div>
<div>CLUSTER_SIZE = 2</div>
<div># create the following user on the cluster</div>
<div>CLUSTER_USER = sgeadmin</div>
<div># optionally specify shell (defaults to bash)</div>
<div># (options: tcsh, zsh, csh, bash, ksh)</div>
<div>CLUSTER_SHELL = bash</div>
<div># AMI to use for cluster nodes. These AMIs are for the us-east-1 region.</div>
<div># Use the 'listpublic' command to list StarCluster AMIs in other regions</div>
<div># The base i386 StarCluster AMI is ami-899d49e0</div>
<div># The base x86_64 StarCluster AMI is ami-999d49f0</div>
<div># The base HVM StarCluster AMI is ami-4583572c</div>
<div>NODE_IMAGE_ID = ami-4583572c</div>
<div># instance type for all cluster nodes</div>
<div># (options: cg1.4xlarge, c1.xlarge, m1.small, c1.medium, m2.xlarge, t1.micro, cc1.4xlarge, m1.medium, cc2.8xlarge, m1.large, m1.xlarge, hi1.4xlarge, m2.4xlarge, m2.2xlarge)</div>
<div>NODE_INSTANCE_TYPE = cc2.8xlarge</div>
<div># Uncomment to disable installing/configuring a queueing system on the</div>
<div># cluster (SGE)</div>
<div>#DISABLE_QUEUE=True</div>
<div># Uncomment to specify a different instance type for the master node (OPTIONAL)</div>
<div># (defaults to NODE_INSTANCE_TYPE if not specified)</div>
<div>#MASTER_INSTANCE_TYPE = m1.small</div>
<div># Uncomment to specify a separate AMI to use for the master node. (OPTIONAL)</div>
<div># (defaults to NODE_IMAGE_ID if not specified)</div>
<div>#MASTER_IMAGE_ID = ami-899d49e0 (OPTIONAL)</div>
<div># availability zone to launch the cluster in (OPTIONAL)</div>
<div># (automatically determined based on volumes (if any) or</div>
<div># selected by Amazon if not specified)</div>
<div>#AVAILABILITY_ZONE = us-east-1c</div>
<div># list of volumes to attach to the master node (OPTIONAL)</div>
<div># these volumes, if any, will be NFS shared to the worker nodes</div>
<div># see "Configuring EBS Volumes" below on how to define volume sections</div>
<div>VOLUMES = issm</div>
</div>
<div><br>
</div>
<div>
<div># Sections starting with "volume" define your EBS volumes</div>
<div>[volume issm]</div>
<div>VOLUME_ID = vol-7d113b07</div>
<div>MOUNT_PATH = /data</div>
</div>
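<div><br>
</div>
<div>For reference, the volume's state can be checked outside of StarCluster with the EC2 API tools (sketch; assumes the ec2-api-tools are installed and credentials are configured for us-east-1): </div>
<div>
<div># show status, size, zone and any attachment for the volume</div>
<div>ec2-describe-volumes vol-7d113b07</div>
</div>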
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>When I first start this cluster with "starcluster start issm", everything works perfectly. </div>
<div><br>
</div>
<div>
<div> start issm</div>
<div>StarCluster - (<a href="http://web.mit.edu/starcluster">http://web.mit.edu/starcluster</a>) (v. 0.9999)</div>
<div>Software Tools for Academics and Researchers (STAR)</div>
<div>Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div>
<div><br>
</div>
<div>>>> Using default cluster template: issm</div>
<div>>>> Validating cluster template settings...</div>
<div>>>> Cluster template settings are valid</div>
<div>>>> Starting cluster...</div>
<div>>>> Launching a 2-node cluster...</div>
<div>>>> Creating security group @sc-issm...</div>
<div>>>> Creating placement group @sc-issm...</div>
<div>Reservation:r-e3538485</div>
<div>>>> Waiting for cluster to come up... (updating every 10s)</div>
<div>>>> Waiting for all nodes to be in a 'running' state...</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Waiting for SSH to come up on all nodes...</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Waiting for cluster to come up took 2.281 mins</div>
<div>>>> The master node is ec2-107-22-25-149.compute-1.amazonaws.com</div>
<div>>>> Setting up the cluster...</div>
<div>>>> Attaching volume vol-7d113b07 to master node on /dev/sdz ...</div>
<div>>>> Configuring hostnames...</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Mounting EBS volume vol-7d113b07 on /data...</div>
<div>>>> Creating cluster user: None (uid: 1001, gid: 1001)</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Configuring scratch space for user(s): sgeadmin</div>
<div>0/2 | | 0% </div>
<div><br>
</div>
<div><br>
</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Configuring /etc/hosts on each node</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Starting NFS server on master</div>
<div>>>> Configuring NFS exports path(s):</div>
<div>/home /data</div>
<div>>>> Mounting all NFS export path(s) on 1 worker node(s)</div>
</div>
<div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Setting up NFS took 0.152 mins</div>
<div>>>> Configuring passwordless ssh for root</div>
<div>>>> Configuring passwordless ssh for sgeadmin</div>
<div>>>> Shutting down threads...</div>
<div>20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Configuring SGE...</div>
<div>>>> Configuring NFS exports path(s):</div>
<div>/opt/sge6</div>
<div>>>> Mounting all NFS export path(s) on 1 worker node(s)</div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Setting up NFS took 0.102 mins</div>
<div>>>> Installing Sun Grid Engine...</div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Creating SGE parallel environment 'orte'</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Adding parallel environment 'orte' to queue 'all.q'</div>
<div>>>> Shutting down threads...</div>
<div>20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Configuring cluster took 1.506 mins</div>
<div>>>> Starting cluster took 3.877 mins</div>
<div><br>
</div>
<div>The cluster is now ready to use. To login to the master node</div>
<div>as root, run:</div>
<div><br>
</div>
<div> $ starcluster sshmaster issm</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>I checked: /data is correctly mounted on my EBS volume, and everything is fine. </div>
<div>Here is a df dump: </div>
<div><br>
</div>
<div>
<div>root@master:/data# df</div>
<div>Filesystem 1K-blocks Used Available Use% Mounted on</div>
<div>/dev/sda1 8246240 5386292 2441056 69% /</div>
<div>udev 31263832 4 31263828 1% /dev</div>
<div>tmpfs 12507188 220 12506968 1% /run</div>
<div>none 5120 0 5120 0% /run/lock</div>
<div>none 31267964 0 31267964 0% /run/shm</div>
<div>/dev/xvdb 866917368 205028 822675452 1% /mnt</div>
<div>/dev/xvdz 103212320 192268 97777172 1% /data</div>
</div>
<div><br>
</div>
<div>The EBS volume I'm mounting is 100 GB in size, so everything checks out.</div>
<div><br>
</div>
<div><br>
</div>
<div>Now, if I stop the cluster and start it again using the -x option, the cluster will boot </div>
<div>fine, but it will not attach the volume (it won't attempt it at all) and will not even try </div>
<div>to mount /data. It's as though the [volume issm] section of my config did not exist!</div>
<div><br>
</div>
<div><br>
</div>
<div>Here is the output of the "starcluster start -x issm" command: </div>
<div><br>
</div>
<div>
<div>st start -c issm -x issm</div>
<div>StarCluster - (<a href="http://web.mit.edu/starcluster">http://web.mit.edu/starcluster</a>) (v. 0.9999)</div>
<div>Software Tools for Academics and Researchers (STAR)</div>
<div>Please submit bug reports to <a href="mailto:starcluster@mit.edu">starcluster@mit.edu</a></div>
<div><br>
</div>
<div>>>> Validating existing instances...</div>
<div>>>> Validating cluster template settings...</div>
<div>>>> Cluster template settings are valid</div>
<div>>>> Starting cluster...</div>
<div>>>> Starting stopped node: node001</div>
<div>>>> Waiting for cluster to come up... (updating every 10s)</div>
<div>>>> Waiting for all nodes to be in a 'running' state...</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Waiting for SSH to come up on all nodes...</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Waiting for cluster to come up took 1.780 mins</div>
<div>>>> The master node is ec2-23-22-242-221.compute-1.amazonaws.com</div>
<div>>>> Setting up the cluster...</div>
<div>>>> Configuring hostnames...</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Creating cluster user: None (uid: 1001, gid: 1001)</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Configuring scratch space for user(s): sgeadmin</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Configuring /etc/hosts on each node</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%</div>
<div>>>> Starting NFS server on master</div>
<div>>>> Configuring NFS exports path(s):</div>
<div>/home</div>
<div>>>> Mounting all NFS export path(s) on 1 worker node(s)</div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Setting up NFS took 0.106 mins</div>
<div>>>> Configuring passwordless ssh for root</div>
<div>>>> Configuring passwordless ssh for sgeadmin</div>
<div>>>> Shutting down threads...</div>
<div>20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Configuring SGE...</div>
<div>>>> Configuring NFS exports path(s):</div>
<div>/opt/sge6</div>
<div>>>> Mounting all NFS export path(s) on 1 worker node(s)</div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Setting up NFS took 0.065 mins</div>
<div>>>> Removing previous SGE installation...</div>
<div>>>> Installing Sun Grid Engine...</div>
<div>1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Creating SGE parallel environment 'orte'</div>
<div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Adding parallel environment 'orte' to queue 'all.q'</div>
<div>>>> Shutting down threads...</div>
<div>20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div>
<div>>>> Configuring cluster took 0.846 mins</div>
</div>
<div>
<div>>>> Starting cluster took 2.647 mins</div>
<div><br>
</div>
<div>The cluster is now ready to use. To login to the master node</div>
<div>as root, run:</div>
<div><br>
</div>
<div> $ starcluster sshmaster issm</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>As you can see, no attempt was made to attach the EBS volume or to mount /data. </div>
<div>When I log in, there is no EBS volume device for /data either.</div>
<div> </div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>Any help or pointers would be appreciated! </div>
<div><br>
</div>
<div>Thanks in advance! </div>
<div><br>
</div>
<div>Eric L.</div>
<div><br>
</div>
<div>
<div>
<div>
<div style="font-family: Calibri, sans-serif; font-size: 14px; ">--------------------------------------------------------------------------</div>
<div>
<div style="font-family: Calibri, sans-serif; font-size: 14px; ">
<div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; ">Dr. Eric Larour, Software Engineer III,</span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; ">ISSM Task Manager (<a href="http://issm.jpl.nasa.gov/" style="color: blue; text-decoration: underline; ">http://issm.jpl.nasa.gov</a>) </span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; ">Mechanical division, Propulsion Thermal and Materials Section, Applied Low Temperature Physics Group.</span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; ">Jet Propulsion Laboratory.</span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; ">MS 79-24, 4800 Oak Grove Drive, Pasadena CA 91109.</span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; "><a href="mailto:eric.larour@jpl.nasa.gov" style="color: blue; text-decoration: underline; ">eric.larour@jpl.nasa.gov</a></span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; "><a href="http://issm.jpl.nasa.gov/" style="color: blue; text-decoration: underline; ">http://issm.jpl.nasa.gov</a></span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; ">Tel: 1 818 393 2435.</span></div>
<div class="x_MsoNormal" style="margin-top: 0in; margin-right: 0in; margin-bottom: 0.0001pt; margin-left: 0in; font-size: 12pt; font-family: 'Times New Roman', serif; ">
<span style="font-size: 11pt; color: black; font-family: Calibri, sans-serif; "> </span><span class="x_apple-style-span"><span style="font-size: 10.5pt; color: black; font-family: Calibri, sans-serif; ">--------------------------------------------------------------------------</span></span></div>
</div>
</div>
</div>
</div>
</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
</div>
</span>
</body>
</html>