[StarCluster] Starcluster - Taking advantage of multiple cores on EC2

Bill Lennon blennon at shopzilla.com
Wed Aug 31 14:27:16 EDT 2011


processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           X5550  @ 2.67GHz
stepping        : 5
cpu MHz         : 2666.760
cache size      : 8192 KB
physical id     : 7
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 7
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 apic sep cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
bogomips        : 5335.08
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

-----Original Message-----
From: starcluster-bounces at mit.edu [mailto:starcluster-bounces at mit.edu] On Behalf Of Bill Lennon
Sent: Wednesday, August 31, 2011 11:24 AM
To: Rayson Ho; starcluster at mit.edu
Subject: Re: [StarCluster] Starcluster - Taking advantage of multiple cores on EC2

processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7

-----Original Message-----
From: Rayson Ho [mailto:raysonlogin at yahoo.com]
Sent: Wednesday, August 31, 2011 11:23 AM
To: starcluster at mit.edu; Bill Lennon
Subject: RE: [StarCluster] Starcluster - Taking advantage of multiple cores on EC2

--- On Wed, 8/31/11, Bill Lennon <blennon at shopzilla.com> wrote:
> When I run my app interactively outside of sge and look at htop it 
> only uses one core :(

OK, so at least we are not dealing with SGE... Looks like an OS/app issue now :-D

From the shell, run:

% cat /proc/cpuinfo

or:

% cat /proc/cpuinfo | grep processor

This should at least tell us the number of cores/threads on the EC2 node.
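A quick way to get just the count, assuming a Linux guest (`nproc` comes from coreutils and may be missing on older AMIs):

```shell
# Count logical processors: one "processor" stanza per logical CPU
grep -c '^processor' /proc/cpuinfo

# Same idea via coreutils, where available
nproc
```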

Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net


> 
> In my original message I tried to explain that I launched the
> StarCluster AMI on a single EC2 instance, so I'm not working with a
> cluster.  But I'd still like to take advantage of all the cores.
> 
> -----Original Message-----
> From: Rayson Ho [mailto:raysonlogin at yahoo.com]
> Sent: Wednesday, August 31, 2011 11:09 AM
> To: starcluster at mit.edu; Bill Lennon
> Subject: RE: [StarCluster] Starcluster - Taking advantage of multiple cores on EC2
> 
> --- On Wed, 8/31/11, Bill Lennon <blennon at shopzilla.com> wrote:
> > 1) What do you get when you run "qhost" on the EC2 cluster??
> > 
> > error: commlib error: got select error (Connection refused)
> > error: unable to send message to qmaster using port 6444 on host
> > "localhost": got send error
> 
> Looks like you are not able to connect to the SGE qmaster... did you
> actually submit jobs to SGE??
> 
> > 2) If you run your application outside of SGE on your EC2 cluster, do
> > you get the same behavior??
> > 
> > If I 'python job.py' I don't see those errors... if that's what you're
> > asking?
> 
> I mean, on one of your EC2 nodes, run your application interactively.
> Then run "top" or "uptime" and see if outside of SGE, your application
> is able to use all the cores on the node.
> 
> Rayson
> 
> =================================
> Grid Engine / Open Grid Scheduler
> http://gridscheduler.sourceforge.net
> 
> > 
> > 3) Intel MKL uses OpenMP internally, did you set the env. var.
> > OMP_NUM_THREADS on the laptop??
> > 
> > Nope.
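For reference, setting those variables is a one-liner before the run; a minimal sketch, assuming an 8-core node (MKL also reads its own MKL_NUM_THREADS, which takes precedence over OMP_NUM_THREADS for MKL code):

```shell
# Pin the OpenMP/MKL worker count for anything launched from this shell
export OMP_NUM_THREADS=8    # honored by OpenMP-backed BLAS builds
export MKL_NUM_THREADS=8    # MKL-specific override
# then start the job from the same shell, e.g.:  python job.py
```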
> > 
> > Hope that may give you a lead.  I'm unfortunately a noob.
> > 
> > -----Original Message-----
> > From: Rayson Ho [mailto:raysonlogin at yahoo.com]
> > Sent: Wednesday, August 31, 2011 10:57 AM
> > To: Bill Lennon; starcluster at mit.edu
> > Subject: Re: [StarCluster] Starcluster - Taking advantage of multiple cores on EC2
> > 
> > Bill,
> > 
> > 1) What do you get when you run "qhost" on the EC2 cluster??
> > 
> > 2) If you run your application outside of SGE on your EC2 cluster, do
> > you get the same behavior??
> > 
> > 3) Intel MKL uses OpenMP internally, did you set the env. var.
> > OMP_NUM_THREADS on the laptop??
> > 
> > Rayson
> > 
> > =================================
> > Grid Engine / Open Grid Scheduler
> > http://gridscheduler.sourceforge.net
> > 
> > --- On Wed, 8/31/11, Chris Dagdigian <dag at bioteam.net> wrote:
> > > Grid Engine just executes jobs and manages resources.
> > > 
> > > It's up to your code to use more than one core.
> > > 
> > > Maybe there is a config difference between your local scipy/numpy
> > > etc. install and how StarCluster deploys its version?
> > > 
> > > Grid Engine assumes by default a 1:1 ratio between job and CPU core
> > > unless you are explicitly submitting to a parallel environment.
> > > 
> > > If you are the only user on a small cluster you probably don't have
> > > to do much; the worst that could happen would be that SGE queues up
> > > and runs more than one of your threaded app jobs on the same host and
> > > they end up competing for CPU/memory resources to the detriment of
> > > all.
> > > 
> > > One way around that would be to configure exclusive job access and
> > > submit your job with the "exclusive" request.  That will ensure that
> > > your job, when it runs, will get an entire execution host.
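A sketch of that request, assuming the admin has already defined a boolean complex named `exclusive` (the complex name is site-specific, and it must be attached to the hosts or queues for the request to work):

```shell
# Ask SGE for a whole execution host for this job; requires an
# "exclusive" complex configured by the SGE admin beforehand
qsub -l exclusive=true my-job-script.sh
```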
> > > 
> > > Another way is to fake up a parallel environment.  For your
> > > situation it is very common for people to build a parallel
> > > environment called "threaded" or "SMP" so that they can run threaded
> > > apps without oversubscribing an execution host.
> > > 
> > > With a threaded PE set up you'd submit your job:
> > > 
> > >   $ qsub -pe threaded <# CPU> my-job-script.sh
> > > 
> > > ... and SGE would account for your single job using more than one
> > > CPU on a single host.
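Roughly, such a PE definition looks like the following; the field names are standard SGE, but the PE name and slot count here are illustrative:

```
pe_name            threaded
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
```

Create it with `qconf -ap threaded`, add `threaded` to a queue's pe_list (`qconf -mq all.q`), then submit with `qsub -pe threaded 8 my-job-script.sh`. The `$pe_slots` allocation rule keeps all granted slots on one host, which is what a threaded (non-MPI) job needs.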
> > > 
> > > FYI Grid Engine has recently picked up some Linux core binding
> > > enhancements that make it easier to pin jobs and tasks to specific
> > > cores.  I'm not sure if the version of GE that is built into
> > > StarCluster today has those features yet, but it should gain them
> > > eventually.
> > > 
> > > Regards,
> > > Chris
> > > 
> > > Bill Lennon wrote:
> > > > Dear Starcluster Gurus,
> > > >
> > > > I've successfully loaded the StarCluster AMI onto a single
> > > > high-memory quadruple extra large instance and am performing an
> > > > SVD on a large sparse matrix and then performing k-means on the
> > > > result.  However, I'm only taking advantage of one core when I do
> > > > this.  On my laptop (using scipy, numpy, Intel MKL), on a small
> > > > version of this, all cores are taken advantage of automagically.
> > > > Is there an easy way to do this with a single StarCluster instance
> > > > with ATLAS?  Or do I need to explicitly write my code to
> > > > multithread?
> > > >
> > > > My thanks,
> > > >
> > > > Bill
> > > >
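One quick check along these lines is to see which BLAS the instance's NumPy is actually linked against (the output format varies by NumPy version):

```shell
# Print the BLAS/LAPACK build configuration of the installed NumPy
python -c "import numpy; numpy.show_config()"
```

If ATLAS shows up here rather than MKL, note that ATLAS's thread count is fixed when the library is compiled, so environment variables alone may not change its behavior.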
> > > _______________________________________________
> > > StarCluster mailing list
> > > StarCluster at mit.edu
> > > http://mailman.mit.edu/mailman/listinfo/starcluster


_______________________________________________
StarCluster mailing list
StarCluster at mit.edu
http://mailman.mit.edu/mailman/listinfo/starcluster