[StarCluster] docker daemon not found when docker command executed with qsub

Rayson Ho raysonlogin at gmail.com
Mon Nov 16 22:26:55 EST 2015


Can you check whether the Grid Engine job environment has the "docker"
group as one of the supplemental groups by submitting a job that runs "id"?
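
Something along these lines should do it. This is only a sketch: the helper name has_group and the output file id.out are made up for illustration, and the qsub call is skipped on machines where qsub is not installed.

```shell
# Hypothetical helper: succeed if the named group is among the
# groups of the current process (primary or supplemental).
has_group() {
    id -Gn | tr ' ' '\n' | grep -qxF "$1"
}

# What an interactive shell sees.
if has_group docker; then
    echo "interactive shell: docker group present"
else
    echo "interactive shell: docker group missing"
fi

# What a Grid Engine job sees: run "id" as a binary job and
# write stdout and stderr to id.out in the current directory.
if command -v qsub >/dev/null 2>&1; then
    qsub -b y -cwd -j y -o id.out id
fi
```

Once the job has finished, compare the groups listed in id.out with the output of `ssh node001 id`.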


IIRC, Docker requires a non-root process to be a member of the docker group
in order to connect to /var/run/docker.sock.
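
To see which group the socket actually belongs to on a node, a quick check like the following may help. It assumes GNU stat (as shipped with Ubuntu), and the helper name socket_info is hypothetical.

```shell
# Hypothetical helper: print the owning group and the octal
# permission bits of a path (GNU stat format string).
socket_info() {
    stat -c '%G %a' "$1"
}

sock=/var/run/docker.sock
if [ -e "$sock" ]; then
    socket_info "$sock"
else
    echo "$sock not found on this host"
fi
```

If the group printed here is missing from the "id" output of a qsub job, that would explain the "permission denied" error.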


Open Grid Scheduler - The Official Open Source Grid Engine

On Mon, Nov 16, 2015 at 7:15 PM, Xander Dunn <xander.dunn at icloud.com> wrote:
> I have star cluster installed from the develop branch because I need to
> use c4 instance types, which aren’t in a released version yet.  I have open
> grid scheduler 2011.11 installed on an Ubuntu 14.04 AMI.
> I have Docker installed in that AMI and the daemon starts on boot.  If I
> manually ssh into my master node or any worker node and execute a Docker
> command, it works.  The Docker daemon is found and the command succeeds.
> Furthermore, executing any docker command from the master node in the form
> `ssh node001 docker pull IMAGE` also works correctly.
> However, those same commands, when executed with qsub, will fail because
> the running Docker daemon can’t be found:
> Post IMAGE: dial unix /var/run/docker.sock: permission denied.
> * Are you trying to connect to a TLS-enabled daemon without TLS?
> * Is your docker daemon up and running?
> Example: `qsub -V -b y -cwd docker pull ubuntu:14.04`
> What’s the difference in the way qsub executes commands that’s causing
> Thanks,
> Xander