[miso-users] MISO with multiple threads?

Marvin Jens marvin.jens at mdc-berlin.de
Fri Sep 16 08:23:10 EDT 2011


Hi Holger, hi Yarden

I faked multiprocessing support in the cheapest possible way. I emulate 
the qsub interface with a little script that just runs a local process 
in the background (in a shell, with '&' at the end).

Here is the relevant part of my settings/miso_settings.py:

[cluster]
#cluster_command = qsub
cluster_command = fake_qsub.py

The fake_qsub.py is here:

-------SNIP-----
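#!/usr/bin/env python
"""Stand-in for qsub: run the submitted command locally in the background."""
import os
from optparse import OptionParser

usage = """
usage: %prog [options] command
"""

# Accept the qsub-style options so that parsing the call does not fail.
parser = OptionParser(usage=usage)
parser.add_option("-o", "--out", dest="output", default=".",
                  help="output location (accepted but ignored)")
parser.add_option("-e", "--env", dest="environment", default=".",
                  help="directory to change into before running the command")
parser.add_option("-q", "--queue", dest="queue", default="long",
                  help="queue name (accepted but ignored)")
options, args = parser.parse_args()

if not args:
    parser.error("no command given")

# Run from the requested directory; the trailing '&' backgrounds the job,
# so this script returns immediately, like a real submission would.
os.chdir(options.environment)
os.system(args[0] + " &")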
-------SNAP-----

I put it in ~/bin and made it executable (you may need to add ~/bin to 
your PATH).
Now all you have to do is pass an appropriate --chunk-jobs size to 
the run_events_analysis.py call.
I have 16 cores, so I chose 2500 (~40,000 alt exons divided by 16).

For the record, I got the 40,000 doing this:

mjens@scarbo13:~/MISO$ find hg18_events/pickled/SE/ | wc -l
39270
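
The arithmetic behind the chunk size can be sketched as follows. This is only an illustration, assuming that --chunk-jobs is the number of events per job (which is how I used it above); the function names chunk_size and n_jobs are made up for this example and are not part of MISO. Note that the find count also includes directory entries, so it is a slight overestimate.

```python
def chunk_size(n_events, n_cores):
    """Smallest chunk size that splits n_events into at most n_cores jobs."""
    if n_events <= 0 or n_cores <= 0:
        raise ValueError("n_events and n_cores must be positive")
    # Integer ceiling division: round up so no events are left over.
    return (n_events + n_cores - 1) // n_cores


def n_jobs(n_events, size):
    """Number of jobs produced when n_events are split into chunks of size."""
    return (n_events + size - 1) // size


print(chunk_size(39270, 16))  # 2455
print(n_jobs(39270, 2500))    # 16
```

So rounding up to 2500 still gives 16 jobs, one per core.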

Hope this helps,

     -Marvin

P.S.: Excited to try out the new fastmiso! Hope my hack still works 
there... :)
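
A cleaner alternative to my hack would be the "dispatcher" idea Yarden mentions in the quoted message below: take the shell scripts that the cluster mode generates and run them locally, a fixed number at a time. Here is a minimal sketch of that idea. It is not part of MISO, the names run_script and dispatch are hypothetical, and it assumes the scripts can be run with plain "sh".

```python
import subprocess
from multiprocessing.pool import ThreadPool


def run_script(path):
    """Run one shell script with sh and return (path, exit code)."""
    return path, subprocess.call(["sh", path])


def dispatch(scripts, n_workers):
    """Run all scripts, with at most n_workers running at the same time.

    Threads are enough here: each worker just waits for its child
    process, so the real work happens in separate processes.
    """
    pool = ThreadPool(n_workers)
    try:
        # map() returns the results in the same order as the input list.
        return pool.map(run_script, scripts)
    finally:
        pool.close()
        pool.join()
```

It could be called as dispatch(sorted(glob.glob("scripts/*.sh")), 16), where the scripts/ directory is just a placeholder for wherever the generated scripts end up. Unlike the '&' trick, this caps the number of concurrent processes regardless of the chunk size and reports the exit code of each script.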


On 09/14/2011 08:44 PM, Yarden Katz wrote:
> Hi Holger,
>
> There is currently no support for multiple processors, though it should be possible to add that via Python's subprocess and multiprocessing modules (this is on our list.)  The built-in support for cluster usage simply generates a bunch of shell scripts that call MISO, and so these could be used as input to a "dispatcher" script that executes each shell script in a separate process on the same machine.  It's on our list of things to add, as I mentioned, but we don't have it yet as a feature.
>
> However, we recently released a fast version of MISO, written in the C language, that is 60-100x faster than the pure Python version.  See instructions below for how to use it.  This version comes with a Python interface that is identical to the current MISO interface, so you wouldn't have to change anything on your end, except compiling the C version.  We'd love any comments or feedback you might have on this version.
>
> Hope this helps.
>
> Best, --Yarden
>
> ==
> The C version of MISO ("fastmiso") is now available for testing.  The C version of MISO was written by Gábor Csárdi (Harvard Statistics Dept).
>
> The C code is 60-100x faster than the Python-only version per event/gene and scales much better to high coverage RNA-Seq samples.  A cluster shouldn't be necessary to use MISO anymore, but it still helps considerably since the problem is highly parallelizable.
>
> You can get the new version from the "fastmiso" branch of github (see instructions below for installation.)
>
> The C version comes with a Python interface which is identical to the original Python-only version of MISO.  The interface and output files are in an identical format, so no change will be required on the part of users apart from compiling the C code (which is invoked under the hood by the Python interface).
>
> There is an R interface to MISO as well (as an alternative to the Python one), written by Gábor, and documentation for this version will be made available in the future too.
>
> The current version still has some bottlenecks, which all arise from I/O costs (reading of GFF files, accessing of BAM files) and we are trying to optimize these as well.  But the good news is that the actual inference is no longer a bottleneck.
>
> We'd greatly appreciate any comments/issues/questions you might have.  If possible, please check these results against your previous MISO runs -- they should be highly correlated.
>
> ==
> USING THE C VERSION OF MISO:
>
> 1. Get the C version from github, as follows.  First check out the MISO repository, if you haven't already:
>
> $ git clone git@github.com:yarden/MISO.git
>
> Then checkout the "fastmiso" branch:
>
> $ git checkout fastmiso
>
> Now compile the relevant packages:
>
> $ make
>
> Then compile the Python interface
>
> $ cd pysplicing/
> $ python setup.py install
>
> Note: if you are compiling on a Unix system where you do not have root access, use the --prefix option, e.g.:
>
> $ python setup.py install --prefix=~/
>
> to install the Python interface in your home directory.
>
> Test that the version works through Python:
>
> $ cd ~/
> $ python
> >>> import pysplicing
> >>>
> If you get an error after the import, something went wrong.
>
> On Sep 14, 2011, at 3:44 AM, Holger Brandl wrote:
>
>> Hi,
>>
>> I've discovered MISO recently, figured out how to prepare a gff3, and did the
>> indexing as described in the manual. When computing the expression estimates
>> using run_events_analysis.py MISO just uses one single core (which takes
>> forever). Is there a way to convince it to use several cores? (similar to the -p
>> option in cufflinks) I'm aware of the cluster-option, but I don't know
>> whether/how this also would/could work locally.
>>
>> Best,
>> Holger Brandl
>>
>> -- 
>> Dr. Holger Brandl
>> Bioinformatics Service
>> Max Planck Institute of Molecular Cell Biology and Genetics
>> Pfotenhauerstrasse 108
>> 01307 Dresden, Germany
>>
>> Tel.:   +49/351/210-2738
>> Fax:    +49 351 210 2000
>> www:http://www.mpi-cbg.de
>>
>> _______________________________________________
>> miso-users mailing list
>> miso-users at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/miso-users
