<div dir="ltr"><div><div>There shouldn't be too much I/O, unless I'm missing something. <br><br>In IPython, I read the data from an HDF store on each node (once), then instantiate a class on each node with the data: <br><br>%%px
<br>store = pd.HDFStore(data_file, 'r')
rows = store.select('results', ['cv_score_mean > 0'])
rows = rows.sort('cv_score_mean', ascending=False)
rows['results_index'] = rows.index<br><br></div><div># This doesn't take too long.<br></div><div>model_analytics = ResultsAnalytics(rows, store['data_model'])
<br>---<br></div>## This dispatch takes between 1.5 and 5 min<br></div>## 66K jobs<br><div><div>ar = lview.map(lambda x: model_analytics.generate_prediction_heuristic(x), rows_index)
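<br><br>One possible way to shorten the dispatch step is to submit fewer, larger tasks. The following is only a minimal sketch, runnable locally without a cluster: it groups the 66K indices into batches so that each batch would travel as a single task. The names batched, run_batch and fake_heuristic, and the batch size of 500, are hypothetical and chosen for illustration; fake_heuristic merely stands in for model_analytics.generate_prediction_heuristic.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists holding at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_batch(func, batch):
    """Apply `func` to every item of one batch (meant to run as one task)."""
    return [func(x) for x in batch]


def fake_heuristic(i):
    """Hypothetical stand-in for the real per-row computation."""
    return i * 2


# 66230 matches the task count reported by wait_interactive in this thread.
indices = range(66230)
batches = list(batched(indices, 500))

# Locally, flatten the per-batch results back into one list, in order.
results = [r for b in batches for r in run_batch(fake_heuristic, b)]
```

On the cluster, the same batches could presumably be passed to lview.map with a function that loops over one batch, so the scheduler would see 133 tasks instead of 66K. Whether that actually reduces the dispatch time here would need to be measured.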
<br>---<br>ar.wait_interactive(interval=1.0)<br><pre style="overflow:auto;font-family:monospace;font-size:14px;display:block;padding:0px;margin:0px;line-height:17.0000591278076px;word-break:break-all;word-wrap:break-word;color:rgb(0,0,0);border:0px;border-radius:0px;white-space:pre-wrap;vertical-align:baseline;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;text-align:left;text-indent:0px;text-transform:none;word-spacing:0px;background-color:rgb(255,255,255)">63999/66230 tasks finished after 2181 s
done<br><br></pre>So the whole run takes a while, though each job itself is relatively short. But I don't understand why CPU isn't the limiting factor. <br><br></div><div>Rajat, thanks for recommending dstat. <br><br></div><div>Best,<br></div><div>Chris<br><br></div><div> <br><br><br><br><div><br><div class="gmail_quote"><div dir="ltr">On Thu, Jul 30, 2015 at 10:52 AM Jacob Barhak <<a href="mailto:jacob.barhak@gmail.com">jacob.barhak@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">Hi Christopher, </p>
<p dir="ltr">Do you have a lot of I/O? For example writing and reading many files to the same NFS location? </p>
<p dir="ltr">This may explain things. </p>
<p dir="ltr"> Jacob</p>
<div class="gmail_quote"></div><div class="gmail_quote">On Jul 30, 2015 2:34 AM, "Christopher Clearfield" <<a href="mailto:chris.clearfield@system-logic.com" target="_blank">chris.clearfield@system-logic.com</a>> wrote:<br type="attribution"></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div><div>Hi All, <br></div>I'm running a set of about 60K relatively short jobs that take about 30 minutes to run in total. This is through IPython parallel.<br><br></div>Yet my CPU utilization levels are relatively low: <br><br>queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@master BIP 0/0/2 0.98 linux-x64
---------------------------------------------------------------------------------
all.q@node001 BIP 0/0/8 8.01 linux-x64
---------------------------------------------------------------------------------
all.q@node002 BIP 0/0/8 8.07 linux-x64
---------------------------------------------------------------------------------
all.q@node003 BIP 0/0/8 7.96 linux-x64<br><br></div>(I disabled the ipython engines on master because I was having heartbeat timeout issues with the worker engines on my nodes, which explains why that is so low). <br><br></div>But ~8% utilization on the nodes. Is that expected? <br><br></div>Thanks,<br></div>Chris<br><br></div>
<br></blockquote></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">_______________________________________________<br>
StarCluster mailing list<br>
<a href="mailto:StarCluster@mit.edu" target="_blank">StarCluster@mit.edu</a><br>
<a href="http://mailman.mit.edu/mailman/listinfo/starcluster" rel="noreferrer" target="_blank">http://mailman.mit.edu/mailman/listinfo/starcluster</a><br>
<br></blockquote></div>
</blockquote></div></div></div></div></div>