[StarCluster] Is StarCluster still under active development?
Rajat Banerjee
rajatb at post.harvard.edu
Fri Apr 1 11:22:49 EDT 2016
Regarding:
How about we just call qacct every 5 mins, or if the qacct buffer is empty.
Calling qacct and getting the job stats is the first part of the load
balancer's loop to see what the cluster is up to. I prioritized knowing the
current state, and keeping the LB's loop as fast as possible
(2-10 seconds), so it could run on a 1-minute cycle and stay roughly
on schedule. It's easy to run the whole LB loop with 5 minutes between
iterations using the command line arg polling_interval, if that suits your
workload better. I do not mean to sound dismissive, but the command line
options (with reasonable defaults) are there so you can test and tweak for
your workload.
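For example, something like this should do it (I'm going from memory on the
exact flag name, so check starcluster loadbalance --help on your version):
starcluster loadbalance --interval 300 mycluster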
Regarding:
Three sorts of jobs, all of which should occur in the same numbers,
Have you tried testing your call to qacct to see if it's returning what you
want? You could modify it in your source if it's not representative of your
jobs:
https://github.com/jtriley/StarCluster/blob/develop/starcluster/balancers/sge/__init__.py#L528
qacct_cmd = 'qacct -j -b ' + qatime
Obviously one size doesn't fit all here, but if you find a set of args for
qacct that work better for you, let me know.
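For instance (untested, and "myjob" is just a placeholder pattern), you could
narrow the accounting records to one job name while keeping the same time
window:
qacct_cmd = 'qacct -j "myjob*" -b ' + qatime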
Thanks,
Raj
On Fri, Apr 1, 2016 at 11:08 AM, Tony Robinson <tonyr at speechmatics.com>
wrote:
> Hi Raj and all,
>
> I think that there is another problem as well, one that I haven't tracked
> down yet. I have three sorts of jobs, all of which should occur in the
> same numbers, but when I measure what's in the cache, one job name is
> massively under-represented.
>
> We have:
>
> lookback_window = 3
>
> which means we pull in three hours of history (by default). How about we
> just call qacct every 5 mins, or if the qacct buffer is empty. I don't
> think every 5 mins is a big overhead, and the "if empty" means that we can
> power up a new cluster and it'll just be a bit slower before it populates
> the job stats (but not that much slower, as it's parsing an empty buffer).
> Also, I don't see the need to be continually recalculating stats - these
> could be computed and stored every time qacct is called. If this is going
> to break something then do let me know.
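>
> Something like this is what I have in mind (a rough sketch with made-up
> names; refresh_jobstats() stands in for whatever currently runs and
> parses qacct):
>
> import time
>
> QACCT_REFRESH_SECS = 300          # re-run qacct at most every 5 minutes
> _last_qacct = 0.0
>
> def maybe_refresh(jobstats, refresh_jobstats):
>     """Run qacct only when the cached stats are empty or stale."""
>     global _last_qacct
>     if not jobstats or time.time() - _last_qacct > QACCT_REFRESH_SECS:
>         refresh_jobstats()        # runs qacct, re-parses and stores stats
>         _last_qacct = time.time()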
>
> I don't know when I'll next get time for this but when I get it working
> I'll report back my findings (I have an AWS cluster where nodes are brought
> up or down every few minutes so there is plenty of data to try this out on).
>
>
> Tony
>
>
> On 01/04/16 15:44, Rajat Banerjee wrote:
>
> I see what you are saying, but I did not test the case with a range of
> task IDs, so I did not see the problem you mentioned. The cache may have
> been a premature optimization to avoid doing large pulls from jobstat once
> every 30-60 seconds. When Justin and I were designing it, it seemed wise to
> cache some amount of SGE's output instead of doing a full pull every time,
> as a full pull got very slow when there were >100 jobs.
>
> Raj
>
> On Sat, Mar 26, 2016 at 3:29 PM, Tony Robinson <tonyr at speechmatics.com>
> wrote:
>
>> Following up (and hopefully not talking to myself), I've found at least
>> one problem with jobstats[]. The code says:
>>
>> if l.find('jobnumber') != -1:
>>     job_id = int(l[13:len(l)])
>> ...
>> hash = {'jobname': jobname, 'queued': qd, 'start': start,
>>         'end': end}
>> self.jobstats[job_id % self.jobstat_cachesize] = hash
>>
>> So it doesn't take into account array jobs, which have a range of task
>> IDs - it just counts one instance.
>>
>> That explains why the estimated job duration is wrong. The most obvious
>> solution is just to get rid of the cache. If compute time is a problem,
>> keep ru_wallclock, as at the moment most of the time is spent converting
>> time formats.
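>>
>> Another option (a standalone sketch, not the existing StarCluster code -
>> the field names follow qacct -j output and the keying scheme is only
>> illustrative) is to key each record on (jobnumber, taskid) so array-job
>> tasks aren't collapsed into one entry:
>>
>> def parse_qacct_records(qacct_output, cachesize):
>>     """One stats record per accounting record, keyed on (jobnumber, taskid)."""
>>     stats = {}
>>     record = {}
>>     for line in qacct_output.splitlines():
>>         if line.startswith('=='):            # separator between records
>>             if record:
>>                 key = (record.get('jobnumber'), record.get('taskid'))
>>                 stats[hash(key) % cachesize] = record
>>             record = {}
>>         elif line.startswith('jobnumber'):
>>             record['jobnumber'] = int(line.split()[-1])
>>         elif line.startswith('taskid'):
>>             record['taskid'] = line.split()[-1]   # 'undefined' if not an array job
>>         elif line.startswith('ru_wallclock'):
>>             record['wallclock'] = float(line.split()[-1])
>>     if record:
>>         key = (record.get('jobnumber'), record.get('taskid'))
>>         stats[hash(key) % cachesize] = record
>>     return stats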
>>
>> I'm also working on a GridEngine scheduler that works well with
>> StarCluster: it keeps the most recently booted nodes (and the master)
>> the most loaded, so when you get to nearly the end of the hour there are
>> nodes free that you can take down. I've just got this going. Next I'd like
>> to distribute the load evenly across all nodes that are up (these are
>> vCPUs; lightly loaded they run much faster) unless they are near the end
>> of the hour, in which case make sure the ones nearest the end are empty.
>> I'm happy to go into details, but I fear there aren't that many users of
>> StarCluster who really care about getting things running efficiently for
>> short-running jobs (or the above bug would have been fixed), so I'm
>> talking to myself.
>>
>>
>> Tony
>>
>>
>> On 25/03/16 19:56, Tony Robinson wrote:
>>
>> Hi Rajat,
>>
>> The main issue that I have with the load balancer is that sometimes
>> bringing up or taking down a node fails, and this causes the load balancer
>> to fall over. This is almost certainly an issue with boto - I just haven't
>> looked into it enough.
>>
>> I'm working on the load balancer right now. I'm running a few different
>> sorts of jobs; some take half a minute, some take five minutes. It takes
>> me about five minutes to bring a node up, so load balancing is quite a
>> hard task; certainly what's there at the moment isn't optimal.
>>
>> In your master's thesis you had a go at anticipating the future load based
>> on the queue, although I see no trace of this in the current code. What
>> seems like the most obvious approach to me is to look at what's running
>> and in the queue and see if it's all going to complete within some
>> specified period. If it is, then fine; if not, assume you are going to
>> bring n nodes up (start at n=1), see if it'll all complete, and if not
>> increment n.
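>>
>> Roughly this sort of search, as a sketch (the helper name and the crude
>> even-spread work model are mine, not StarCluster's):
>>
>> def extra_nodes_needed(n_jobs, avg_job_secs, current_nodes,
>>                        slots_per_node, deadline_secs, max_extra):
>>     """Smallest number of extra nodes so the backlog clears in time.
>>
>>     Crude model: remaining work is spread evenly over all slots."""
>>     total_work = n_jobs * avg_job_secs
>>     for extra in range(max_extra + 1):
>>         slots = (current_nodes + extra) * slots_per_node
>>         if slots and total_work / slots <= deadline_secs:
>>             return extra
>>     return max_extra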
>>
>> I've got a version of this running but it isn't complete because
>> avg_job_duration() consistently under-reports. I'm doing some debugging;
>> it seems that jobstats[] has a bug. I have three types of job - a start, a
>> middle and an end - and as they are all run in sequence, jobstats[] should
>> have equal numbers of each. It doesn't.
>>
>> This is a weekend (with unreliable time) activity for me. If you or
>> anyone else wants to help:
>>
>> a) getting avg_job_duration() working which probably means fixing
>> jobstats[]
>> b) getting a clean simple predictive load balancer working
>>
>> then please contact me.
>>
>>
>> Tony
>>
>> On 25/03/16 17:17, Rajat Banerjee wrote:
>>
>> I'll fix any issues with the load balancer if they come up.
>>
>>
>>
>> --
>> Speechmatics is a trading name of Cantab Research Limited
>> We are hiring: www.speechmatics.com/careers
>> Dr A J Robinson, Founder, Cantab Research Ltd
>> Phone direct: 01223 794096, office: 01223 794497
>> Company reg no GB 05697423, VAT reg no 925606030
>> 51 Canterbury Street, Cambridge, CB4 3QG, UK
>>
>>
>
>
> --
> Speechmatics is a trading name of Cantab Research Limited
> We are hiring: www.speechmatics.com/careers
> Dr A J Robinson, Founder, Cantab Research Ltd
> Phone direct: 01223 794096, office: 01223 794497
> Company reg no GB 05697423, VAT reg no 925606030
> 51 Canterbury Street, Cambridge, CB4 3QG, UK
>