[miso-users] Required SAM format

Yarden Katz yarden at mit.edu
Fri Aug 16 15:25:29 EDT 2013


Sounds great. Happy to help with problematic BAM files if they arise. Best, Yarden 


On Aug 16, 2013, at 2:49 PM, Alexander Kanitz <alexander.kanitz at alumni.ethz.ch> wrote:

> Dear Yarden,
> 
> thanks a lot for the info.
> 
> I have since looked at some recent TopHat outputs. 
> 
> Since we are following quite a complicated (read: "sophisticated" ;) mapping strategy (map against the genome and the transcriptome, map the transcriptome coordinates back to the genome, merge the files, keep the best alignments and discard non-unique primary alignments), re-creating the SAM file was required anyway, and the conversion to the required format should be easy.
> 
> Thanks a lot for your help and enjoy your travels!
> 
> Best,
> Alex
> 
> 
> 
> On Fri, Aug 16, 2013 at 8:29 PM, Yarden Katz <yarden at mit.edu> wrote:
> Hi Alex,
> 
> See comments below:
> 
> On Aug 13, 2013, at 6:56 AM, Alexander Kanitz wrote:
> 
> > Hi everyone,
> >
> > I am planning to use MISO but I want to be flexible with the aligner I use (read: don't want to use Bowtie/TopHat).
> >
> > Since I did not find any detailed specs regarding this issue, I have a few questions:
> > What does MISO need out of a SAM/BAM file? Specifically, which tags are considered and which are required (assuming all others will be ignored)?
> 
> MISO tries not to rely on too many binary SAM/BAM flags since, as you imply, different aligners follow slightly different conventions.
> 
> The main flags used are strandedness, if you have stranded data, and pairedness/orientation of reads, if you have paired-end data.  There's a flag to MISO that tells it to group read pairs by ID, e.g. assuming that the reads have the format:
> 
> readX/1
> readX/2
> 
> or something similar that makes it clear that they are both mates of the same sequenced molecule.
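
To make the naming convention concrete, here is a minimal sketch of grouping mates by read ID. It is not MISO's own code; the function names mate_key and group_by_fragment are hypothetical, and it assumes mates differ only by a trailing /1 or /2.

```python
from collections import defaultdict


def mate_key(read_name):
    """Strip a trailing /1 or /2 so both mates share one key (assumed convention)."""
    if read_name.endswith(("/1", "/2")):
        return read_name[:-2]
    return read_name


def group_by_fragment(read_names):
    """Group read names that belong to the same sequenced molecule."""
    groups = defaultdict(list)
    for name in read_names:
        groups[mate_key(name)].append(name)
    return dict(groups)


# readX/1 and readX/2 end up together; readY/1 has no mate in this input.
pairs = group_by_fragment(["readX/1", "readY/1", "readX/2"])
```

Aligners that use a different suffix scheme would need a different mate_key.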
> 
> > And most importantly: in what format (one line, separate lines, CIGAR string, MD tag, etc) do split/spliced alignments have to be represented?
> 
> CIGAR string.  I can add a note to the manual clarifying this.
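
As an illustration of what a spliced alignment looks like in a CIGAR string, the sketch below splits one alignment into its reference blocks, treating the N operation (skipped region) as the intron. This follows the SAM specification's operation semantics rather than MISO's actual parser; aligned_blocks is a hypothetical helper and the example coordinates are made up.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 1-based inclusive (start, end) reference blocks for an alignment.

    pos is the 1-based leftmost position (SAM POS column); N splits blocks.
    """
    blocks = []
    ref = pos
    block_start = None
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=XD":
            # These operations consume reference bases inside a block.
            if block_start is None:
                block_start = ref
            ref += length
        elif op == "N":
            # Skipped region (intron): close the current block, jump ahead.
            if block_start is not None:
                blocks.append((block_start, ref - 1))
                block_start = None
            ref += length
        # I, S, H and P do not consume reference bases.
    if block_start is not None:
        blocks.append((block_start, ref - 1))
    return blocks


# A 50-base read spliced across a 1000-base intron, starting at position 100.
exons = aligned_blocks(100, "30M1000N20M")
```

Here exons holds two blocks, one on each side of the skipped region, all on a single SAM line.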
> 
> 
> >
> > I would be very grateful for some help here and ideally a few lines of a MISO-compatible SAM file including some split/spliced alignments if possible (or a link to such a file).
> 
> Any BAM file produced with TopHat and Bowtie 1 should be an example of a MISO-accepted BAM. We normally use TopHat around here, though we have users who use other aligners.  I am traveling and don't have one handy right now, but it should be easy to find.
> 
> One thing to note: MISO ignores insertions and deletions as they do not match the annotation, so any reads with insertions/deletions in their CIGAR strings will be discarded.
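
To check ahead of time which reads would fall under this rule, something like the following sketch could be used. It only mirrors the behaviour described above, not MISO's implementation; has_indel is a hypothetical helper, and the CIGAR strings are invented examples.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """True if the CIGAR string contains an insertion (I) or deletion (D)."""
    return any(op in "ID" for _, op in CIGAR_OP.findall(cigar))


cigars = ["30M1000N20M", "25M2I23M", "10M1D40M", "50M"]

# Keep only alignments without insertions or deletions; spliced (N) reads stay.
kept = [c for c in cigars if not has_indel(c)]
```

Note that the N operation is not an indel, so spliced reads pass this filter.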
> 
> >
> > Thanks a lot!
> >
> > Best,
> > Alex
> >
> >
> > --
> > Alexander Kanitz, Ph.D.
> >
> > Winterthurerstrasse 358
> > 8057 Zurich
> > Switzerland
> >
> > Phone: +41 76 278 39 36
> > Email: alexander.kanitz at alumni.ethz.ch
> > _______________________________________________
> > miso-users mailing list
> > miso-users at mit.edu
> > http://mailman.mit.edu/mailman/listinfo/miso-users
> 