[miso-users] problem with GFF file

Akula, Nirmala (NIH/NIMH) [C] akulan at mail.nih.gov
Mon Jul 20 12:28:14 EDT 2015


Is there a way to format this file to add the gene annotation?

Thank you very much.
Nirmala

-----Original Message-----
From: Yarden Katz [mailto:yarden.katz at gmail.com] On Behalf Of Yarden Katz
Sent: Monday, July 20, 2015 12:03 PM
To: Akula, Nirmala (NIH/NIMH) [C]
Cc: miso-users at mit.edu
Subject: Re: [miso-users] problem with GFF file

Hi Nirmala,

That is the problem, then.  MISO has no way of knowing which transcripts belong to which genes without "gene" entries.  The GFF3 format is hierarchical: it describes genes as parent nodes that have "mRNA" entries as their children, with each "mRNA" entry having "exon" nodes as children.  A "transcript" entry can substitute for "mRNA", but "gene" entries are required.
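
One possible way to add the missing parent entries is to derive them from the geneID attribute that gffread already writes on each transcript line.  What follows is a minimal Python sketch, not something shipped with MISO: the function names (parse_attributes, add_gene_entries) and the choice to copy gene_name into a Name attribute are assumptions made for illustration.  It also assumes that all transcripts sharing a geneID lie on the same chromosome and strand, and the result should be checked with index_gff before it is relied on.

```python
from collections import OrderedDict


def parse_attributes(column9):
    """Split a GFF3 attribute column into an ordered key -> value mapping."""
    attrs = OrderedDict()
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def add_gene_entries(lines, gene_key="geneID"):
    """Return GFF3 lines with one "gene" record added per gene_key value.

    Each transcript/mRNA line gains a Parent attribute pointing at its gene,
    and the gene record spans all transcripts that share the same gene_key.
    """
    genes = OrderedDict()  # gene id -> coordinates and naming details
    records = []           # (is_gene_placeholder, payload) in input order

    for raw in lines:
        line = raw.rstrip("\n")
        cols = line.split("\t")
        is_transcript = (not line.startswith("#") and len(cols) == 9
                         and cols[2] in ("transcript", "mRNA"))
        attrs = parse_attributes(cols[8]) if is_transcript else {}
        gene_id = attrs.get(gene_key)
        if gene_id is None:
            # Comments, exons and anything without a gene id pass through.
            records.append((False, line))
            continue

        start, end = int(cols[3]), int(cols[4])
        if gene_id not in genes:
            genes[gene_id] = {"chrom": cols[0], "source": cols[1],
                              "start": start, "end": end,
                              "strand": cols[6],
                              "name": attrs.get("gene_name")}
            # Placeholder so the gene line precedes its first transcript.
            records.append((True, gene_id))
        else:
            gene = genes[gene_id]
            gene["start"] = min(gene["start"], start)
            gene["end"] = max(gene["end"], end)

        attrs["Parent"] = gene_id
        cols[8] = ";".join("%s=%s" % item for item in attrs.items())
        records.append((False, "\t".join(cols)))

    output = []
    for is_gene, payload in records:
        if not is_gene:
            output.append(payload)
            continue
        gene = genes[payload]
        attr = "ID=%s" % payload
        if gene["name"]:
            attr += ";Name=%s" % gene["name"]
        output.append("\t".join([gene["chrom"], gene["source"], "gene",
                                 str(gene["start"]), str(gene["end"]),
                                 ".", gene["strand"], ".", attr]))
    return output
```

To convert a file, read its lines (for example from merged_gtfToGff3.gff3), pass them to add_gene_entries, and write the returned lines, one per line, to a new GFF3 file; that new file is the one to hand to index_gff.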

Best, Yarden


On Jul 20, 2015, at 11:50 AM, Akula, Nirmala (NIH/NIMH) [C] <akulan at mail.nih.gov> wrote:

> Hi,
> 
> The GFF file has only "transcript" and "exon" entries. No "gene" entries.
> 
> Thanks,
> Nirmala
> 
> -----Original Message-----
> From: Yarden Katz [mailto:yarden.katz at gmail.com] On Behalf Of Yarden Katz
> Sent: Friday, July 17, 2015 5:55 PM
> To: Akula, Nirmala (NIH/NIMH) [C]
> Cc: miso-users at mit.edu
> Subject: Re: [miso-users] problem with GFF file
> 
> Hi,
> 
> Does your GFF file contain "gene" entries, or just "transcript" and "exon" entries?
> 
> MISO uses the "gene" entries to determine which genes to load.
> 
> Yarden
> 
> On Jul 17, 2015, at 5:16 PM, Akula, Nirmala (NIH/NIMH) [C] <akulan at mail.nih.gov> wrote:
> 
>> Hi,
>> 
>> I converted a GTF file (generated by Cufflinks) to GFF3 format with gffread (part of Cufflinks), using the following command:
>> 
>> gffread -E merged.gtf -o- >merged_gtfToGff3.gff3
>> 
>> When I try to index the merged_gtfToGff3.gff3 file using MISO, I see that 0 genes were loaded and the genes.gff file is empty (size 0).
>> 
>> [akulan at helix stringtieGtfs_v1-0-3_cuffmerged]$ index_gff --index merged_gtfToGff3.gff3 mergedIndexedGff/
>> Indexing GFF...
>>  - GFF: /akulan/merged_gtfToGff3.gff3
>>  - Outputting to: / akulan/mergedIndexedGff
>> Loaded 0 genes
>>  - Loading of genes from GFF took 199.95 seconds
>> Outputting gene records in GFF format...
>>  - Output file: /gpfs/gsfs4/users/akulan/transcriptome/stringtie/stringtieGtfs_v1-0-3_cuffmerged/mergedIndexedGff/genes.gff
>>  - Serialization of genes from GFF took 13.24 seconds
>> Indexing of GFF took 213.18 seconds.
>> 
>> Here are the top 5 lines from the gff3 file
>> # gffread -E merged.gtf -o-
>> ##gff-version 3
>> chr1    Cufflinks       transcript      11869   14409   .       +       .       ID=TCONS_00000001;geneID=XLOC_000001;gene_name=DDX11L1
>> chr1    Cufflinks       exon    11869   12227   .       +       .       Parent=TCONS_00000001
>> chr1    Cufflinks       exon    12613   12721   .       +       .       Parent=TCONS_00000001
>> chr1    Cufflinks       exon    13221   14409   .       +       .       Parent=TCONS_00000001
>> chr1    Cufflinks       transcript      11869   29022   .       +       .       ID=TCONS_00000002;geneID=XLOC_000001;gene_name=DDX11L1
>> 
>> Any suggestions as to why the genes are not loaded from the gff file would be really helpful.
>> 
>> Thank you very much.
>> 
>> Regards,
>> Nirmala
>> _______________________________________________
>> miso-users mailing list
>> miso-users at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/miso-users
> 



