[StarCluster] NFS Help !!!!
Jennifer Staab
jstaab at cs.unc.edu
Fri Jun 5 14:58:30 EDT 2015
Glad it worked with the new volume. It could be that there is some issue
with the data you have on the volume rather than with StarCluster itself.
If you get the same error after you copy the data over to the new volume
and try to mount it with StarCluster, I would guess the data is the source
of your issues. If you google your error message (like here
<https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=ls%3A+cannot+access+biodata%3A+Too+many+levels+of+symbolic+links>),
you might be able to figure out what is wrong with the data on the volume.
It may be as simple as a circular symbolic link somehow created/discovered
when NFS mounts the volumes on the client/worker nodes.
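
For example, running something like this on the master (using /geodata and
/biodata from your listings; adjust the paths as needed) should flag any
symlink that is broken or loops back on itself:

# with -L, find follows symlinks; anything still reported as type "l"
# could not be resolved (broken target or a loop)
find -L /geodata /biodata -type l
# list every symlink and its target, so a link pointing back into its
# own parent directory stands out
find /geodata /biodata -type l -exec ls -l {} \;
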
Good Luck!
-Jennifer
On Fri, Jun 5, 2015 at 2:33 PM, Andrey Evtikhov <Andrey_Evtikhov at epam.com>
wrote:
> Thank you , Jennifer !
>
>
>
> The volume built from scratch seems OK - both with and without StarCluster.
>
> Just need to copy the data over :)
>
>
>
> What troubled me is that on the master everything looked normal...
>
>
>
> All good now that I made a new volume...
>
>
>
> root@node001:/geodata# mkdir TESTFROMCLENT
>
> root@node001:/geodata# ls -l
>
> total 32
>
> drwx------ 2 root root 16384 Jun 5 17:47 lost+found
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:48 TEST4CHEVRON
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:48 TEST4CHEVRON2
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:48 TEST4CHEVRON3
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:51 TESTFROMCLENT
>
> root@node001:/geodata# exit
>
> logout
>
> Connection to node001 closed.
>
> root@master:/geodata# ls -l
>
> total 32
>
> drwx------ 2 root root 16384 Jun 5 17:47 lost+found
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:48 TEST4CHEVRON
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:48 TEST4CHEVRON2
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:48 TEST4CHEVRON3
>
> drwxr-xr-x 2 root root 4096 Jun 5 17:51 TESTFROMCLENT
>
>
>
>
>
> *Andrey Evtikhov*
>
> *Lead Software Maintenance Engineer*
>
>
>
> *Email: *andrey_evtikhov at epam.com
>
> *Saint-Petersburg,* *Russia *(GMT+3) *epam.com <http://www.epam.com>*
>
>
>
>
>
>
> *From:* Jennifer Staab [mailto:jstaab at cs.unc.edu]
> *Sent:* Friday, June 5, 2015 6:48 PM
> *To:* Andrey Evtikhov
> *Cc:* starcluster at mit.edu
> *Subject:* Re: [StarCluster] NFS Help !!!!
>
>
>
> Something doesn't seem right with how you are mounting your volumes. If
> you are mounting a volume, make sure that volume exists in your AWS
> account and is not attached to any EC2 instance. Next, make sure the
> mount point (let's say it's called /MyData) exists as an empty directory
> on both the Master and Worker AMIs. Make certain the OS is up to date on
> all AMIs (workers and master). Also make sure your volumes and EC2
> instances are in the same availability zone.
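>
> For example (the volume ID below is just a placeholder), the AWS CLI can
> show whether a volume is unattached and which zone it lives in:
>
> aws ec2 describe-volumes --volume-ids vol-xxxxxxx --query 'Volumes[*].[VolumeId,State,AvailabilityZone]'
>
> The state should be "available" (not "in-use") and the zone should match
> the zone your cluster instances run in.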
>
>
>
> Try this: create an empty volume and attach it to an EC2 instance based
> upon your Master node AMI. Then mount it on that instance (not using
> StarCluster) -- make sure it mounts successfully and you can write some
> temporary text files to it. Next, unmount it and detach it from the
> instance. Once it is unmounted and detached, create a new cluster and see
> if you can get that cluster to share that volume over NFS between the
> Master and Workers. If you can, then it is likely something about the
> original volumes themselves and/or how you are mounting them that is
> causing the problem.
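>
> A rough sketch of that manual test (the volume/instance IDs and the
> device name are placeholders; the device may show up as /dev/xvdf or
> similar on your instance):
>
> # attach the empty volume to the running master-AMI instance
> aws ec2 attach-volume --volume-id vol-xxxxxxx --instance-id i-xxxxxxx --device /dev/xvdf
> # on the instance: put a filesystem on the empty volume, mount it, test a write
> mkfs -t ext4 /dev/xvdf
> mkdir -p /MyData
> mount /dev/xvdf /MyData
> echo "hello" > /MyData/test.txt
> # clean up before handing the volume to StarCluster
> umount /MyData
> aws ec2 detach-volume --volume-id vol-xxxxxxx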
>
>
>
> If you can't get that empty volume shared over NFS between the Master and
> Workers, then the problem likely lies with NFS or the OS running on the
> AMIs. Make sure the OS and software are up to date and that your mount
> points are empty directories. Also, don't mount under an existing mount
> point unless you are using the crossmnt option in /etc/exports.
> StarCluster doesn't use the crossmnt option -- I altered my copy of
> StarCluster to allow it -- but it isn't a native feature.
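>
> For reference, a crossmnt export line in /etc/exports would look roughly
> like this (the network range is just an example; use your cluster's
> subnet):
>
> /MyData  10.0.0.0/16(rw,async,no_root_squash,no_subtree_check,crossmnt)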
>
>
>
> Next, I would try running NFS on an EC2 instance based upon the Master
> node AMI with that empty volume mounted, and see if you can successfully
> share that volume over NFS with an instance based upon your Worker node
> AMI. My guess is that if you can set NFS up by hand, without running
> StarCluster, you will likely see in the process where the NFS problem is.
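>
> As a sketch of that manual setup, assuming Ubuntu-based AMIs with the NFS
> packages installed (the path and the master's private IP are
> placeholders):
>
> # on the master-AMI instance: export the mounted volume
> echo "/MyData *(rw,async,no_root_squash,no_subtree_check)" >> /etc/exports
> exportfs -ra
> service nfs-kernel-server restart
>
> # on the worker-AMI instance: mount the export from the master
> mkdir -p /MyData
> mount -t nfs <master-private-ip>:/MyData /MyData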
>
>
>
> Good Luck!
>
>
>
> -Jennifer
>
>
>
>
>
> On Fri, Jun 5, 2015 at 9:28 AM, Andrey Evtikhov <Andrey_Evtikhov at epam.com>
> wrote:
>
> I really need help with this issue. I just created a new cluster and it
> is still the same mess: everything looks fine on the master and is a
> complete mess on the NFS client:
>
>
>
> Except for /home, which mounted OK!
>
>
>
> root@node001:/geodata# ls -l
>
> ls: cannot access biodata: Too many levels of symbolic links
>
> ls: cannot access geodata: Too many levels of symbolic links
>
>
>
> in config :
>
> VOLUMES = geodata, biodata
>
> [volume geodata]
>
> VOLUME_ID = vol-17feadfc
>
> MOUNT_PATH = /geodata
>
>
>
> [volume biodata]
>
> VOLUME_ID = vol-f3fdae18
>
> MOUNT_PATH = /biodata
>
>
>
> This is how it looks on the client for /biodata:
>
> total 8
>
> d????????? ? ? ? ? ? biodata
>
> d????????? ? ? ? ? ? geodata
>
> drwxr-xr-x 4 root root 4096 Jun 5 13:11 home
>
> drwxr-xr-x 4 root root 4096 Jun 5 13:11 opt
>
>
>
>
>
>
>
>
>
>
>
> *Andrey Evtikhov*
>
> *Lead Software Maintenance Engineer*
>
>
>
> *Email: *andrey_evtikhov at epam.com
>
> *Saint-Petersburg,* *Russia *(GMT+3) *epam.com <http://www.epam.com>*
>
>
>
>
>
>
> *From:* Andrey Evtikhov
> *Sent:* Tuesday, June 2, 2015 9:20 PM
> *To:* 'starcluster at mit.edu'
> *Subject:* Cannot mount properly /home over NFS
>
>
>
> I cannot properly mount /home.
>
>
>
> VOLUMES = mycompany-data
>
>
>
> [volume mycompany-data]
>
> VOLUME_ID = vol-xxxxxxx
>
> MOUNT_PATH = /home
>
>
>
> The master seems to be OK, but the nodes have home mounted inside home!
>
>
>
>
>
> root@prod-sc-triad-node001:/home# cd /home
>
> root@prod-sc-triad-node001:/home# ls -l
>
> ls: cannot access home: Too many levels of symbolic links
>
> total 4
>
> d????????? ? ? ? ? ? home
>
> drwxr-xr-x 4 root root 4096 Apr 30 17:55 opt
>
>
>
>
>
> *Andrey Evtikhov*
>
>
> _______________________________________________
> StarCluster mailing list
> StarCluster at mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>
>
>