[StarCluster] docker daemon not found when docker command executed with qsub

Xander Dunn xander.dunn at icloud.com
Mon Nov 16 19:15:50 EST 2015


I have StarCluster installed from the develop branch because I need to use c4 instance types, which aren’t in a released version yet.  I have Open Grid Scheduler 2011.11 installed on an Ubuntu 14.04 AMI.

I have Docker installed in that AMI and the daemon starts on boot.  If I manually ssh into my master node or any worker node and execute a Docker command, it works.  The Docker daemon is found and the command succeeds.  Furthermore, executing any docker command from the master node in the form `ssh node001 docker pull IMAGE` also works correctly.  

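However, those same commands fail when submitted with qsub, apparently because the job can’t connect to the running Docker daemon (note that the error reports a permission problem on the socket rather than a missing daemon):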
Post IMAGE: dial unix /var/run/docker.sock: permission denied.
* Are you trying to connect to a TLS-enabled daemon without TLS?
* Is your docker daemon up and running?

Example: `qsub -V -b y -cwd docker pull ubuntu:14.04`
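
One way to narrow this down might be to compare the user and group identity a command gets over ssh with the one it gets under qsub. The following is only a minimal sketch, not a confirmed diagnosis. It assumes that access to /var/run/docker.sock is granted through a `docker` group and that the working directory is shared between the master and the worker nodes. The helper name `in_docker_group` and the output file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the identity seen over ssh with the identity seen by a qsub job.
# Assumption: socket access depends on a group named "docker".

# in_docker_group ID_OUTPUT
# Succeeds if the given `id` output contains an entry named "docker",
# e.g. "groups=1000(ubuntu),999(docker)".
in_docker_group() {
    case "$1" in
        *"(docker)"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only attempt the comparison where the scheduler is actually available.
if command -v qsub >/dev/null 2>&1; then
    # Identity when the command is run directly over ssh.
    ssh node001 id > id_ssh.txt

    # Identity when the same command runs as a scheduled job.
    # -sync y waits for the job to finish; -j y merges stderr into the -o file.
    qsub -sync y -b y -cwd -j y -o id_qsub.txt id

    for f in id_ssh.txt id_qsub.txt; do
        if in_docker_group "$(cat "$f")"; then
            echo "$f: docker entry present"
        else
            echo "$f: docker entry missing"
        fi
    done
fi
```

If the two files differ in their group lists, that would suggest the scheduler starts jobs with a different set of supplementary groups than a login shell gets.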

What is different about the way qsub executes commands that would cause this?

Thanks,
Xander

