[miso-users] Updates to MISO and release of sashimi-plot RNA-Seq visualization tool
Yarden Katz
yarden at MIT.EDU
Tue Dec 20 13:41:24 EST 2011
Hi all,
We've added some helpful new features to MISO, such as computing insert length distributions for paired-end reads, and released sashimi-plot, a tool for visualizing RNA-Seq reads along isoforms and for plotting MISO output. Brief descriptions follow:
1. sashimi-plot is an easy-to-use tool that plots raw RNA-Seq reads aligned to alternative events, along with MISO estimates, across multiple RNA-Seq samples. It automatically makes publication-quality plots, like the one attached, that quantitatively highlight the junctions supporting each event.
The documentation for it is here:
http://genes.mit.edu/burgelab/miso/docs/sashimi.html
sashimi-plot can be useful for examining candidate differential splicing events across samples.
In addition to the raw data, it can also plot MISO output, such as posterior distributions for selected events. The features of its plots can be extensively customized through a text settings file.
2. We've implemented some utilities for calculating the insert length distributions for paired-end RNA-Seq samples and computing their statistics.
If we have a GFF file containing all genes (easily downloadable from Ensembl, for example), then the insert length can be computed like this:
(A) Get all constitutive exons from the Ensembl file that are at least N bases in size:
python exon_utils.py --get-const-exons Mus_musculus.NCBIM37.65.gff --min-exon-size 1000 --output-dir exons/
This gets all constitutive exons at least 1 kb in size and saves them in a GFF file. It only has to be done once for each species.
(B) Compute insert length distribution for a BAM file "sample.bam" given the constitutive exons GFF:
python pe_utils.py --compute-insert-len sample.bam Mus_musculus.NCBIM37.65.min_1000.const_exons.gff --output-dir insert-dist/
This outputs a file (ending in ".insert_len") whose header line reports these parameters:
#mean=129.0,sdev=12.1,dispersion=1.1,num_pairs=862148
This gives the mean, standard deviation, and dispersion constant of the distribution, and tells you how many read pairs were used to compute it. For more information on these values, including how to interpret the dispersion statistic, see the MISO paper.
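If you want to read these values into your own scripts, a minimal sketch along the following lines may help. It assumes the header has exactly the comma-separated "#key=value" form shown above; the function name parse_insert_len_header is hypothetical and not part of MISO.

```python
def parse_insert_len_header(line):
    """Parse a '#key=value,...' header line into a dict of numbers.

    Hypothetical helper, not part of MISO. Assumes the header format
    shown in the example above.
    """
    fields = line.strip().lstrip("#").split(",")
    stats = {}
    for field in fields:
        key, value = field.split("=", 1)
        # num_pairs is a count; the remaining values are floats
        stats[key] = int(value) if key == "num_pairs" else float(value)
    return stats


# The example header line from above
header = "#mean=129.0,sdev=12.1,dispersion=1.1,num_pairs=862148"
stats = parse_insert_len_header(header)
print(stats["mean"], stats["sdev"], stats["dispersion"], stats["num_pairs"])
```

For a real file, you would pass in the header line read from the ".insert_len" file instead of the literal string.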
(C) The distribution can then be plotted using sashimi-plot as follows:
python plot.py --plot-insert-len sample.insert_len --output-dir plots/
which yields the second attached plot.
Hope these are helpful to you in your work.
Happy new year!
Best, --Yarden
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot_event_bar_posteriors.png
Type: image/png
Size: 96479 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/miso-users/attachments/20111220/67e42adb/attachment-0002.png
-------------- next part --------------
A non-text attachment was scrubbed...
Name: insert_length.png
Type: image/png
Size: 49111 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/miso-users/attachments/20111220/67e42adb/attachment-0003.png