[miso-users] Accounting for batch effects

Singh, Larry (NIH/NHGRI) [E] larry.singh at nih.gov
Thu Jul 25 21:02:34 EDT 2013


Hi Yarden,

That makes sense to me.  Correct me if I'm wrong, but the only case where
I can think that this approach may not work is if you have RNA degradation
for a set of samples and part of the transcript is gone, but I think that
would be a problem for most approaches.
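
To make the caveat above concrete, here is a minimal sketch, assuming a simplified read-count ratio as a stand-in for Psi. This is not MISO's estimator; `naive_psi` and the read counts are hypothetical and for illustration only. A uniform scaling of the reads for an event leaves the ratio unchanged, while losing reads from only one part of the transcript (as with degradation) shifts it.

```python
def naive_psi(inclusion_reads, exclusion_reads):
    """Fraction of reads supporting inclusion (simplified ratio, not MISO's model)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this event")
    return inclusion_reads / total

# Hypothetical read counts for a single exon-skipping event.
inc, exc = 120, 80

baseline = naive_psi(inc, exc)

# Batch artifact that scales all reads for the event by the same factor:
# numerator and denominator scale together, so the ratio is unchanged.
uniform = naive_psi(3 * inc, 3 * exc)

# Degradation that removes half of the inclusion-supporting reads only:
# the ratio is no longer internally controlled and drops.
degraded = naive_psi(0.5 * inc, exc)

print("baseline:", baseline)
print("uniform scaling:", uniform)
print("partial degradation:", degraded)
```

The same reasoning suggests why a per-sample scaling factor matters for RPKM-like quantities but tends to cancel in a ratio.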

I will try your suggestion and see if we find any evidence of batch
effects after PSI computations.  Thank you once again for your helpful
e-mails and advice.
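
One possible way to look for "statistical evidence for a batch effect" in the Psi values of a single event is a label-permutation test, sketched below. This is only an assumption about how such a check could be done outside of MISO, not something MISO provides; the function names, the Psi values, and the batch labels are all hypothetical.

```python
import random
from statistics import mean

def between_batch_spread(psi_values, batch_labels):
    """Sum over batches of n_batch * (batch mean - grand mean)^2."""
    grand = mean(psi_values)
    groups = {}
    for psi, batch in zip(psi_values, batch_labels):
        groups.setdefault(batch, []).append(psi)
    return sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())

def batch_permutation_pvalue(psi_values, batch_labels, n_permutations=2000, seed=0):
    """Permutation p-value: how often shuffled batch labels give a spread
    at least as large as the observed one."""
    rng = random.Random(seed)
    observed = between_batch_spread(psi_values, batch_labels)
    labels = list(batch_labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)
        # Small tolerance so floating-point ties count as "at least as large".
        if between_batch_spread(psi_values, labels) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical Psi estimates for one event across ten samples in two batches.
psi = [0.31, 0.28, 0.35, 0.30, 0.33, 0.62, 0.58, 0.65, 0.60, 0.61]
batch = ["A"] * 5 + ["B"] * 5

p_value = batch_permutation_pvalue(psi, batch)
print("permutation p-value:", p_value)
```

In a real analysis this would be run per event with a multiple-testing correction, and batch would need to be separated from the biological variable of interest, which this sketch does not attempt.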

Much appreciated,
-Larry.

On 7/25/13 8:31 PM, "Yarden Katz" <yarden at mit.edu> wrote:

>Hi Larry,
>
>If you were to normalize, I think it would make more sense to use the
>normalized count values (e.g. by quantile normalization) and not the
>RPKM/FPKM quantity, since that's not really a discrete count value and it
>already incorporates information about length and library size which
>would mess up assumptions of our model.
>
>Since the Psi value is a percentage, I think it's less likely to have
>various scaling artifacts that result from batch effects on RPKMs.  For
>example, it's common to compare the raw RPKMs of two independent RNA-Seq
>libraries and find that one library has way higher RPKMs than another
>uniformly (which would need to be normalized away.)  With Psi values, I
>think it's less likely to happen, since the inclusion value of an exon
>(for example) is measured as a percentage of the flanking exons, and that
>creates a kind of internally controlled measure, at least most of the
>time.  My suggestion is to try the Psi values as is, and see if there's
>statistical evidence for a batch effect.
>
>Best, --Yarden
>
>
>On Jul 25, 2013, at 8:23 PM, Singh, Larry (NIH/NHGRI) [E] wrote:
>
>> Hi Yarden,
>> 
>> Thanks very much for your response.  This may be a naïve question, but
>> what about using normalized read counts to start with?  For instance,
>> instead of computing PSI_MISO with read counts, use RPKM (FPKM).
>> I haven't read the methods in your paper completely, so I apologize if
>> this suggestion doesn't make sense. :)
>> 
>> Thanks again for getting back to me so promptly.
>> 
>> Regards,
>> -Larry.
>> 
>> On 7/25/13 8:02 PM, "Yarden Katz" <yarden at mit.edu> wrote:
>> 
>>> Hi Larry,
>>> 
>>> The problem of batch effects is very similar to the problem of modeling
>>> variability within biological replicates.  MISO currently doesn't have a
>>> built-in solution for that.  See this
>>> (http://genes.mit.edu/burgelab/miso/docs/#answer13) for a discussion of
>>> the issue and how to deal with it outside of MISO.
>>> 
>>> We're working on modeling this problem, but in the meantime the ways to
>>> address it are discussed in the link above. For what it's worth, the
>>> Psi quantity is internally normalized.  It doesn't mean that it will
>>> not suffer potentially from batch effects, but (anecdotally) we've
>>> found that this quantity suffers less from batch effects compared with
>>> RPKM or units of gene expression which will be more sensitive to the
>>> composition of the RNA, etc. -- although gene expression values are
>>> certainly easier overall to estimate reliably than Psi values.  Again,
>>> I'm sure batch effects can creep in, but they're less obvious in this
>>> context.
>>> 
>>> In any case, it's worth seeing what the extent of the batch effects is
>>> and whether they can be normalized as a post-processing step.  We're
>>> working on native probabilistic models for this but I imagine that the
>>> right solution will depend on the kind of batch effects and variability
>>> in your experiment, and a generic solution that fits all experimental
>>> designs is in my view unlikely.
>>> 
>>> Best, --Yarden
>>> 
>>> On Jul 25, 2013, at 5:25 PM, Singh, Larry (NIH/NHGRI) [E] wrote:
>>> 
>>>> Dear MISO users,
>>>> 
>>>> I'm new to MISO, but would like to use it for differential expression
>>>> and eQTL analyses of a large number of samples.  Initial analyses have
>>>> shown though that there are likely batch effects.  Is there a method in
>>>> MISO for accounting for batch effects?  I've searched the web and the
>>>> miso-users archives and couldn't find an answer.
>>>> 
>>>> Thank you kindly for your attention.
>>>> -Larry.
>>>> 
>>>> --
>>>> Larry N. Singh, Ph.D.
>>>> Research Fellow
>>>> Genetic Diseases Research Branch
>>>> National Human Genome Research Institute, NIH
>>>> Building 49, Room 4A52
>>>> 49 Convent Dr., Bethesda, MD 20892-8004
>>>> (301) 451-4699
>>>> 
>>>> _______________________________________________
>>>> miso-users mailing list
>>>> miso-users at mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/miso-users
>>> 
>> 
>