[miso-users] MISO Handling of Marked Duplicates

Yarden Katz yarden at MIT.EDU
Mon Nov 19 18:43:58 EST 2012


Hi Fong,

You're correct that they will be treated independently in the case you describe.

Best, --Yarden
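
Since the duplicate flag is ignored, one possible workaround is to drop flagged reads before the file reaches MISO. Below is a minimal sketch in plain Python, assuming the alignments are available as SAM-format text lines. The function names are hypothetical and not part of MISO; the only fact relied on is that the SAM FLAG bit 0x400 marks a PCR or optical duplicate.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: PCR or optical duplicate


def is_marked_duplicate(sam_line):
    """Return True if an alignment line has the duplicate bit set.

    Hypothetical helper for illustration; not a MISO function.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second mandatory SAM column
    return bool(flag & DUPLICATE_FLAG)


def drop_marked_duplicates(sam_lines):
    """Yield header lines and alignments not flagged as duplicates."""
    for line in sam_lines:
        # Header lines start with "@" and are passed through unchanged.
        if line.startswith("@") or not is_marked_duplicate(line):
            yield line


# Made-up records for illustration: read2 carries flag 1024 (0x400).
records = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t1024\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
kept = list(drop_marked_duplicates(records))
```

On a BAM file, the same filtering can be done with `samtools view -b -F 0x400`, which excludes reads that have that flag bit set. Whether removing duplicates is appropriate depends on the analysis.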



On Nov 19, 2012, at 6:17 PM, Fong Chun Chan wrote:

> Hi Yarden,
> 
> Thanks for the reply. 
> 
> Just to clarify, I was referring to PCR duplicates, so the marked duplicates will NOT have the same read ID; they simply carry a flag in the *.bam file indicating that they are PCR duplicates. So if I understand you correctly, if I pass in my GSNAP BAM file with duplicates marked using Picard tools, MISO will currently just treat each of the duplicates independently?
> 
> Fong
> 
> 
> On Mon, Nov 19, 2012 at 3:10 PM, Yarden Katz <yarden at mit.edu> wrote:
> Hi Fong,
> 
> There's no explicit handling of duplicates, but that can be added.  In general, if you have paired-end reads, they will be paired by the read ID, in which case duplicates that have the same ID will not be treated separately.  For single-end reads, duplicates with distinct IDs will be treated as independent reads, and a SAM flag marking duplicates will be ignored.  Please let me know if you have any questions about this.
> 
> Best, --Yarden
> 
> 
> 
> On Nov 19, 2012, at 4:57 PM, Fong Chun Chan wrote:
> 
> > Hi,
> >
> > I have some GSNAP-aligned BAM files that have been post-processed with GATK to mark duplicates. How does MISO handle *.bam files in which duplicates have been marked?  I wasn't able to find any documentation about this. Does it ignore them or treat them as normal reads?
> >
> > Thanks,
> >
> > Fong
> 
> 



