<div dir="ltr">Hi All,<div><br></div><div style>I've been using StarCluster 0.93.3 with only minor, known operational issues. About a week ago I decided it was time to move to 0.94.2, and I haven't been operational since because of new, incorrect behavior in the way a volume gets attached. More precisely, the attach itself works; the problem is that StarCluster refers to the wrong /dev/xvd* device when it tries to mount the volume.</div>
<div style><br></div><div style>I have the following specified for the volume I'm telling starcluster to mount:</div><div style><br></div><div style><div>[volume jobspoolse1d]</div><div>VOLUME_ID = vol-f1a0d380</div><div>
MOUNT_PATH = /usr/share/jobs/</div><div>DEVICE = /dev/sdq</div><div><br></div></div><div style>Here is the excerpt of the failing vol attachment:</div><div style><br></div><div style><div>>>> Configuring cluster...</div>
<div>>>> Attaching volume vol-f1a0d380 to master node on /dev/sdq ...</div><div>>>> Waiting for vol-f1a0d380 to transition to: attached... </div><div>>>> Running plugin starcluster.clustersetup.DefaultClusterSetup</div>
<div>>>> Configuring hostnames...</div><div>2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% </div><div>*** WARNING - Cannot find device /dev/xvdq for volume vol-f1a0d380</div><div>*** WARNING - Not mounting vol-f1a0d380 on /usr/share/jobs/</div>
<div>*** WARNING - This usually means there was a problem attaching the EBS volume to the master node</div></div><div style><br></div><div style>Here is how the master node looks:</div><div style><br></div><div style><div>
[root@master ~]# ls -l /dev/sd*</div><div>lrwxrwxrwx 1 root root 4 Nov 5 22:31 /dev/sda -> xvde</div><div>lrwxrwxrwx 1 root root 5 Nov 5 22:31 /dev/sda1 -> xvde1</div><div>lrwxrwxrwx 1 root root 5 Nov 5 22:31 /dev/sda2 -> xvde2</div>
<div>lrwxrwxrwx 1 root root 5 Nov 5 22:31 /dev/sda3 -> xvde3</div><div>lrwxrwxrwx 1 root root 5 Nov 5 22:31 /dev/sdaa -> xvdaa</div><div>lrwxrwxrwx 1 root root 4 Nov 5 22:32 /dev/sdq -> xvdu</div><div>[root@master ~]# mount</div>
<div>/dev/xvde1 on / type ext4 (rw,relatime)</div><div>proc on /proc type proc (rw)</div><div>sysfs on /sys type sysfs (rw)</div><div>devpts on /dev/pts type devpts (rw,gid=5,mode=620)</div><div>tmpfs on /dev/shm type tmpfs (rw)</div>
<div>none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)</div><div>sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)</div><div>nfsd on /proc/fs/nfsd type nfsd (rw)</div><div>[root@master ~]# df</div><div>Filesystem 1K-blocks Used Available Use% Mounted on</div>
<div>/dev/xvde1 4909772 3344868 1315500 72% /</div><div>tmpfs 847740 0 847740 0% /dev/shm</div><div><br></div><div style>So the volume is getting attached at /dev/sdq, but /dev/sdq is a symlink to /dev/xvdu, not the /dev/xvdq that StarCluster is looking for. This "multi-letter offset" (sdq -> xvdu) in the device mapping was also present in 0.93.3, but that version handled it correctly.</div>
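<div style><br></div><div style>To illustrate what I'd expect the lookup to do, here is a minimal Python sketch. It is not StarCluster's actual code, and the function names guess_xvd_name and resolve_device are hypothetical (mine). The first function mimics the letter-for-letter substitution that the warning above seems to imply; the second follows the /dev/sdq symlink to whatever node it really points at.</div>

```python
import os


def guess_xvd_name(requested):
    """Letter-for-letter rename, e.g. /dev/sdq to /dev/xvdq.

    Assumption: this mimics what the warning suggests is happening;
    it is not taken from StarCluster's source.
    """
    dirname, name = os.path.split(requested)
    if name.startswith("sd"):
        name = "xvd" + name[2:]
    return os.path.join(dirname, name)


def resolve_device(requested):
    """Follow symlinks from the requested name to the real device node.

    Returns None when the requested name does not exist (or is a
    dangling symlink), so the caller can fall back or warn.
    """
    if not os.path.exists(requested):  # exists() follows symlinks
        return None
    return os.path.realpath(requested)
```

<div style>On the master node shown above, guess_xvd_name("/dev/sdq") yields /dev/xvdq, which does not exist, whereas resolve_device("/dev/sdq") should return /dev/xvdu, given the symlink in the ls output.</div>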
<div style><br></div><div style>I can manually mount /dev/xvdu and confirm that it's got the right stuff in it:</div><div style><br></div><div style><div>[root@master ~]# mount /dev/xvdu /usr/share/jobs</div><div>[root@master ~]# ls -l /usr/share/jobs</div>
<div>total 32</div><div>drwxrwsr-x 2 prod prod 4096 Jun 20 19:31 ami-bridge</div><div>drwxrwsr-x 3 prod prod 4096 Mar 20 2013 common</div><div>drwxrwsr-x 10 prod prod 4096 Jun 19 19:19 demo</div><div>drwxr-sr-x 3 root prod 4096 Mar 21 2013 etc</div>
<div>drwxrwsr-x 8 prod prod 4096 Feb 20 2013 internal</div><div>drwxrwsr-x 10 prod prod 4096 Jun 19 19:19 live</div><div>drwxrwsr-x 2 prod prod 4096 Jun 20 18:58 log</div><div>drwxrwsr-x 10 prod prod 4096 Jun 19 19:20 test</div>
<div><br></div><div style>Just what I wanted to see there.</div><div style><br></div><div style>I'd appreciate any help with a fix for this, whether it's a patch or a workaround.</div><div style><br></div>
<div style>Thanks much,</div><div style>Lyn</div></div></div></div>