[miso-users] Accounting for batch effects

Yarden Katz yarden at MIT.EDU
Thu Jul 25 20:31:51 EDT 2013


Hi Larry,

If you were to normalize, I think it would make more sense to use normalized count values (e.g. from quantile normalization) rather than the RPKM/FPKM quantity. RPKM/FPKM is not really a discrete count value, and it already incorporates information about length and library size, which would violate the assumptions of our model.
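
To make the quantile normalization idea concrete, here is a minimal Python sketch. It is not part of MISO: the function name quantile_normalize, the sample names, and the toy counts are all made up for illustration, and ties are simply broken by position. Note that the normalized values are averages, so they are generally not integers; how best to feed them into a count-based model is a separate question this sketch does not settle.

```python
def quantile_normalize(counts):
    """Quantile-normalize per-sample count vectors.

    counts: dict mapping sample name -> list of counts, with the same
    features in the same order for every sample (hypothetical layout).
    Returns a dict of the same shape holding the normalized values.
    """
    samples = list(counts)
    n = len(counts[samples[0]])
    if any(len(counts[s]) != n for s in samples):
        raise ValueError("all samples must have the same number of features")

    # Reference distribution: the mean of the k-th smallest value across samples.
    sorted_cols = [sorted(counts[s]) for s in samples]
    reference = [sum(col[k] for col in sorted_cols) / len(samples)
                 for k in range(n)]

    normalized = {}
    for s in samples:
        # Feature indices ordered by count; ties keep their original order.
        order = sorted(range(n), key=lambda i: counts[s][i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized[s] = out
    return normalized


# Toy data: lib_b was sequenced roughly 10x deeper than lib_a.
toy = {
    "lib_a": [10, 20, 30, 40],
    "lib_b": [100, 300, 200, 400],
}
normalized_toy = quantile_normalize(toy)
```

After normalization both libraries share the same set of values, so the uniform depth difference is gone while the within-library ranking of features is preserved.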

Since the Psi value is a percentage, I think it's less likely to suffer from the scaling artifacts that batch effects produce in RPKMs.  For example, it's common to compare the raw RPKMs of two independent RNA-Seq libraries and find that one library has uniformly higher RPKMs than the other (which would need to be normalized away).  With Psi values, I think that's less likely to happen, since the inclusion value of an exon (for example) is measured as a percentage relative to the flanking exons, and that creates a kind of internally controlled measure, at least most of the time.  My suggestion is to try the Psi values as is, and see if there's statistical evidence for a batch effect.
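
As a rough illustration of both points, here is a small Python sketch using only the standard library. Everything in it is an assumption made for the example: naive_psi is a plain read-count ratio and not MISO's estimator, batch_permutation_pvalue is a hypothetical helper that handles exactly two batches for a single event, and the Psi values and batch labels are invented. A permutation test is just one simple way to look for evidence of a batch effect; other tests may suit your design better.

```python
import random
from statistics import mean


def naive_psi(inclusion_reads, exclusion_reads):
    """Crude inclusion ratio (NOT MISO's estimate): a uniform scaling
    of both read counts cancels out of the ratio."""
    return inclusion_reads / (inclusion_reads + exclusion_reads)


def batch_permutation_pvalue(psi_values, batch_labels, n_perm=10000, seed=0):
    """Permutation p-value for a difference in mean Psi between two batches.

    psi_values: Psi estimates for one event, one per sample.
    batch_labels: batch label per sample, in the same order.
    """
    batches = sorted(set(batch_labels))
    if len(batches) != 2:
        raise ValueError("this sketch handles exactly two batches")

    def mean_difference(labels):
        first = [p for p, b in zip(psi_values, labels) if b == batches[0]]
        second = [p for p, b in zip(psi_values, labels) if b == batches[1]]
        return abs(mean(first) - mean(second))

    observed = mean_difference(batch_labels)
    rng = random.Random(seed)
    shuffled = list(batch_labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffling keeps the batch sizes fixed but breaks any real
        # association between Psi and batch.
        rng.shuffle(shuffled)
        if mean_difference(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# A 10x deeper library gives the same naive ratio.
same_ratio = naive_psi(30, 70) == naive_psi(300, 700)

# Invented Psi values for one event across two batches of four samples.
psi = [0.10, 0.12, 0.11, 0.13, 0.60, 0.62, 0.61, 0.63]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
p_value = batch_permutation_pvalue(psi, labels)
```

A small p-value for many events would suggest that the batches differ systematically; with many events you would also want to correct for multiple testing.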

Best, --Yarden


On Jul 25, 2013, at 8:23 PM, Singh, Larry (NIH/NHGRI) [E] wrote:

> Hi Yarden,
> 
> Thanks very much for your response.  This may be a naïve question, but
> what about using normalized read counts to start with?  For instance,
> instead of computing PSI_MISO with read counts, use RPKM (FPKM).
> I haven't read the methods in your paper completely, so I apologize if
> this suggestion doesn't make sense. :)
> 
> Thanks again for getting back to me so promptly.
> 
> Regards,
> -Larry.
> 
> On 7/25/13 8:02 PM, "Yarden Katz" <yarden at mit.edu> wrote:
> 
>> Hi Larry,
>> 
>> The problem of batch effects is very similar to the problem of modeling
>> variability within biological replicates.  MISO currently doesn't have a
>> built-in solution for that.  See this
>> (http://genes.mit.edu/burgelab/miso/docs/#answer13) for a discussion of
>> the issue and how to deal with it outside of MISO.
>> 
>> We're working on modeling this problem, but in the meantime the ways to
>> address it are discussed in the link above. For what it's worth, the Psi
>> quantity is internally normalized.  That doesn't mean it can't suffer
>> from batch effects, but (anecdotally) we've found that it suffers less
>> from them than RPKM or other units of gene expression, which are more
>> sensitive to the composition of the RNA, etc., although gene expression
>> values are certainly easier overall to estimate reliably than Psi
>> values.  Again, I'm sure batch effects can creep in, but they're less
>> obvious in this context.
>> 
>> In any case, it's worth seeing what the extent of the batch effects is
>> and whether they can be normalized as a post-processing step.  We're
>> working on native probabilistic models for this, but I imagine that the
>> right solution will depend on the kind of batch effects and variability
>> in your experiment, and a generic solution that fits all experimental
>> designs is, in my view, unlikely.
>> 
>> Best, --Yarden
>> 
>> On Jul 25, 2013, at 5:25 PM, Singh, Larry (NIH/NHGRI) [E] wrote:
>> 
>>> Dear MISO users,
>>> 
>>> I'm new to MISO, but would like to use it for differential expression
>>> and eQTL analyses of a large number of samples.  Initial analyses have
>>> shown, though, that there are likely batch effects.  Is there a method in
>>> MISO for accounting for batch effects?  I've searched the web and the
>>> miso-users archives and couldn't find an answer.
>>> 
>>> Thank you kindly for your attention.
>>> -Larry.
>>> 
>>> --
>>> Larry N. Singh, Ph.D.
>>> Research Fellow
>>> Genetic Diseases Research Branch
>>> National Human Genome Research Institute, NIH
>>> Building 49, Room 4A52
>>> 49 Convent Dr., Bethesda, MD 20892-8004
>>> (301) 451-4699
>>> 
>>> _______________________________________________
>>> miso-users mailing list
>>> miso-users at mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/miso-users
>> 
> 



