[miso-users] problem with GFF file

Yarden Katz yarden at mit.edu
Mon Jul 20 12:03:06 EDT 2015


Hi Nirmala,

That is the problem, then.  MISO has no way of knowing which transcripts belong to which genes without "gene" entries.  The GFF3 format is hierarchical: it describes genes as parent nodes that have "mRNA" entries as their children, with each "mRNA" entry having "exon" nodes as children.  A "transcript" entry can substitute for "mRNA", but "gene" entries are required.
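
For a file like the one quoted below, where each "transcript" line carries a geneID attribute but no "gene" line exists, one possible workaround is to synthesize the missing parents. The sketch below is only an illustration under assumptions: the helper names (parse_attrs, add_gene_entries) are hypothetical, the grouping key is assumed to be the geneID attribute seen in the quoted gffread output, and columns are assumed to be tab-separated. It has not been checked against MISO itself, so treat the result as a starting point rather than a guaranteed fix.

```python
TRANSCRIPT_TYPES = ("transcript", "mRNA")


def parse_attrs(field):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def add_gene_entries(lines, gene_key="geneID"):
    """Return GFF3 lines with a synthesized "gene" record per gene_key value.

    Assumption: every transcript/mRNA line names its gene in the attribute
    given by gene_key (geneID in the quoted gffread output).
    """
    lines = [ln.rstrip("\n") for ln in lines]

    # Pass 1: a gene spans from the smallest start to the largest end
    # of all transcripts that share the same gene identifier.
    genes = {}
    for ln in lines:
        cols = ln.split("\t")
        if ln.startswith("#") or len(cols) != 9 or cols[2] not in TRANSCRIPT_TYPES:
            continue
        gene_id = parse_attrs(cols[8]).get(gene_key)
        if gene_id is None:
            continue
        start, end = int(cols[3]), int(cols[4])
        if gene_id in genes:
            genes[gene_id]["start"] = min(genes[gene_id]["start"], start)
            genes[gene_id]["end"] = max(genes[gene_id]["end"], end)
        else:
            genes[gene_id] = {"chrom": cols[0], "source": cols[1],
                              "start": start, "end": end, "strand": cols[6]}

    # Pass 2: emit a gene line before its first transcript and point each
    # transcript at that gene through a Parent attribute.
    out, emitted = [], set()
    for ln in lines:
        cols = ln.split("\t")
        if ln.startswith("#") or len(cols) != 9 or cols[2] not in TRANSCRIPT_TYPES:
            out.append(ln)  # comments, exons, and anything else pass through
            continue
        attrs = parse_attrs(cols[8])
        gene_id = attrs.get(gene_key)
        if gene_id is None:
            out.append(ln)
            continue
        if gene_id not in emitted:
            g = genes[gene_id]
            out.append("\t".join([g["chrom"], g["source"], "gene",
                                  str(g["start"]), str(g["end"]),
                                  ".", g["strand"], ".", "ID=" + gene_id]))
            emitted.add(gene_id)
        if "Parent" not in attrs:
            cols[8] = cols[8].rstrip(";") + ";Parent=" + gene_id
        out.append("\t".join(cols))
    return out
```

The function takes any iterable of lines, so an open file handle for merged_gtfToGff3.gff3 can be passed directly and the returned list written to a new file before running index_gff on it. Exon lines are left untouched because they already point at their transcript through Parent.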

Best, Yarden


On Jul 20, 2015, at 11:50 AM, Akula, Nirmala (NIH/NIMH) [C] <akulan at mail.nih.gov> wrote:

> Hi,
> 
> The GFF file has only "transcript" and "exon" entries. No "gene" entries.
> 
> Thanks,
> Nirmala
> 
> -----Original Message-----
> From: Yarden Katz [mailto:yarden.katz at gmail.com] On Behalf Of Yarden Katz
> Sent: Friday, July 17, 2015 5:55 PM
> To: Akula, Nirmala (NIH/NIMH) [C]
> Cc: miso-users at mit.edu
> Subject: Re: [miso-users] problem with GFF file
> 
> Hi,
> 
> Does your GFF file contain "gene" entries, or just "transcript" and "exon" entries?
> 
> MISO uses the "gene" entries to determine which transcripts belong to which gene.
> 
> Yarden
> 
> On Jul 17, 2015, at 5:16 PM, Akula, Nirmala (NIH/NIMH) [C] <akulan at mail.nih.gov> wrote:
> 
>> Hi,
>> 
>> I converted a GTF file (generated by Cufflinks) to GFF3 format with gffread, from the Cufflinks package, using the following command:
>> 
>> gffread -E merged.gtf -o- >merged_gtfToGff3.gff3
>> 
>> When I try to index the merged_gtfToGff3.gff3 file using MISO, I see that 0 genes were loaded and the genes.gff file is empty.
>> 
>> [akulan at helix stringtieGtfs_v1-0-3_cuffmerged]$ index_gff --index merged_gtfToGff3.gff3 mergedIndexedGff/
>> Indexing GFF...
>>  - GFF: /akulan/merged_gtfToGff3.gff3
>>  - Outputting to: / akulan/mergedIndexedGff
>> Loaded 0 genes
>>  - Loading of genes from GFF took 199.95 seconds
>> Outputting gene records in GFF format...
>>  - Output file: /gpfs/gsfs4/users/akulan/transcriptome/stringtie/stringtieGtfs_v1-0-3_cuffmerged/mergedIndexedGff/genes.gff
>>  - Serialization of genes from GFF took 13.24 seconds
>> Indexing of GFF took 213.18 seconds.
>> 
>> Here are the top 5 lines from the gff3 file
>> # gffread -E merged.gtf -o-
>> ##gff-version 3
>> chr1    Cufflinks       transcript      11869   14409   .       +       .       ID=TCONS_00000001;geneID=XLOC_000001;gene_name=DDX11L1
>> chr1    Cufflinks       exon    11869   12227   .       +       .       Parent=TCONS_00000001
>> chr1    Cufflinks       exon    12613   12721   .       +       .       Parent=TCONS_00000001
>> chr1    Cufflinks       exon    13221   14409   .       +       .       Parent=TCONS_00000001
>> chr1    Cufflinks       transcript      11869   29022   .       +       .       ID=TCONS_00000002;geneID=XLOC_000001;gene_name=DDX11L1
>> 
>> Any suggestions as to why the genes are not loaded from the gff file would be really helpful.
>> 
>> Thank you very much.
>> 
>> Regards,
>> Nirmala
>> _______________________________________________
>> miso-users mailing list
>> miso-users at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/miso-users
> 
