[miso-users] problem with GFF file
Akula, Nirmala (NIH/NIMH) [C]
akulan at mail.nih.gov
Mon Jul 20 12:28:14 EDT 2015
Is there a way to format this file to add the gene annotation?
Thank you very much.
Nirmala
-----Original Message-----
From: Yarden Katz [mailto:yarden.katz at gmail.com] On Behalf Of Yarden Katz
Sent: Monday, July 20, 2015 12:03 PM
To: Akula, Nirmala (NIH/NIMH) [C]
Cc: miso-users at mit.edu
Subject: Re: [miso-users] problem with GFF file
Hi Nirmala,
That is the problem then. Without "gene" entries, MISO has no way of knowing which transcripts belong to which genes. The GFF3 format is hierarchical: it describes genes as parent nodes that have "mRNA" entries as their children, with each "mRNA" entry having "exon" nodes as children. A "transcript" entry can substitute for "mRNA", but "gene" entries are required.
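One possible way to add the missing "gene" entries is sketched below in Python. It is only a sketch, not part of MISO or Cufflinks: the function names are hypothetical, and it assumes the file is tab-delimited and that every "transcript" line carries a geneID attribute, as in the gffread output quoted further down. It creates one "gene" record per geneID, spanning all of that gene's transcripts, and adds a Parent attribute to each transcript so the gene -> transcript -> exon hierarchy is complete.

```python
from collections import OrderedDict


def parse_attrs(field):
    """Split a GFF3 attribute column ("k1=v1;k2=v2") into an ordered dict."""
    attrs = OrderedDict()
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def is_transcript(cols):
    """True for a 9-column feature line whose type is "transcript"."""
    return len(cols) == 9 and cols[2] == "transcript"


def add_gene_entries(lines, gene_key="geneID"):
    """Return GFF3 lines with a "gene" record added per gene_key value.

    Assumption: tab-delimited input where transcripts carry gene_key.
    """
    lines = [ln.rstrip("\n") for ln in lines]

    # Pass 1: compute each gene's span from its transcripts.
    spans = OrderedDict()
    for ln in lines:
        cols = ln.split("\t")
        if ln.startswith("#") or not is_transcript(cols):
            continue
        gene = parse_attrs(cols[8]).get(gene_key)
        if gene is None:
            continue
        start, end = int(cols[3]), int(cols[4])
        if gene in spans:
            spans[gene]["start"] = min(spans[gene]["start"], start)
            spans[gene]["end"] = max(spans[gene]["end"], end)
        else:
            spans[gene] = {"chrom": cols[0], "source": cols[1],
                           "start": start, "end": end, "strand": cols[6]}

    # Pass 2: emit a gene line before the first transcript of each gene
    # and point every transcript at its gene via Parent.
    out = []
    emitted = set()
    for ln in lines:
        cols = ln.split("\t")
        if ln.startswith("#") or not is_transcript(cols):
            out.append(ln)
            continue
        attrs = parse_attrs(cols[8])
        gene = attrs.get(gene_key)
        if gene is None:
            out.append(ln)  # no gene id available; leave line unchanged
            continue
        if gene not in emitted:
            s = spans[gene]
            # Name is set to the gene id here; that choice is an assumption.
            out.append("\t".join([
                s["chrom"], s["source"], "gene",
                str(s["start"]), str(s["end"]), ".", s["strand"], ".",
                "ID=%s;Name=%s" % (gene, gene)]))
            emitted.add(gene)
        attrs["Parent"] = gene
        cols[8] = ";".join("%s=%s" % kv for kv in attrs.items())
        out.append("\t".join(cols))
    return out
```

To use it, one could pass the lines of merged_gtfToGff3.gff3 to add_gene_entries, write the returned lines to a new file, and run index_gff on that file instead. Whether MISO then loads the genes has not been verified here.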
Best, Yarden
On Jul 20, 2015, at 11:50 AM, Akula, Nirmala (NIH/NIMH) [C] <akulan at mail.nih.gov> wrote:
> Hi,
>
> The GFF file has only "transcript" and "exon" entries. No "gene" entries.
>
> Thanks,
> Nirmala
>
> -----Original Message-----
> From: Yarden Katz [mailto:yarden.katz at gmail.com] On Behalf Of Yarden Katz
> Sent: Friday, July 17, 2015 5:55 PM
> To: Akula, Nirmala (NIH/NIMH) [C]
> Cc: miso-users at mit.edu
> Subject: Re: [miso-users] problem with GFF file
>
> Hi,
>
> Does your GFF file contain "gene" entries, or just "transcript" and "exon" entries?
>
> The "gene" entries are used to determine genes.
>
> Yarden
>
> On Jul 17, 2015, at 5:16 PM, Akula, Nirmala (NIH/NIMH) [C] <akulan at mail.nih.gov> wrote:
>
>> Hi,
>>
>> I converted a GTF file (generated by Cufflinks) to GFF3 format with gffread (part of Cufflinks), using the following command:
>>
>> gffread -E merged.gtf -o- >merged_gtfToGff3.gff3
>>
>> When I try to index the merged_gtfToGff3.gff3 file using MISO, I see that 0 genes were loaded and the genes.gff file has size 0.
>>
>> [akulan at helix stringtieGtfs_v1-0-3_cuffmerged]$ index_gff --index merged_gtfToGff3.gff3 mergedIndexedGff/
>> Indexing GFF...
>> - GFF: /akulan/merged_gtfToGff3.gff3
>> - Outputting to: /akulan/mergedIndexedGff
>> Loaded 0 genes
>> - Loading of genes from GFF took 199.95 seconds
>> Outputting gene records in GFF format...
>> - Output file: /gpfs/gsfs4/users/akulan/transcriptome/stringtie/stringtieGtfs_v1-0-3_cuffmerged/mergedIndexedGff/genes.gff
>> - Serialization of genes from GFF took 13.24 seconds
>> Indexing of GFF took 213.18 seconds.
>>
>> Here are the top 5 lines from the gff3 file
>> # gffread -E merged.gtf -o-
>> ##gff-version 3
>> chr1 Cufflinks transcript 11869 14409 . + . ID=TCONS_00000001;geneID=XLOC_000001;gene_name=DDX11L1
>> chr1 Cufflinks exon 11869 12227 . + . Parent=TCONS_00000001
>> chr1 Cufflinks exon 12613 12721 . + . Parent=TCONS_00000001
>> chr1 Cufflinks exon 13221 14409 . + . Parent=TCONS_00000001
>> chr1 Cufflinks transcript 11869 29022 . + . ID=TCONS_00000002;geneID=XLOC_000001;gene_name=DDX11L1
>>
>> Any suggestions as to why the genes are not loaded from the gff file would be really helpful.
>>
>> Thank you very much.
>>
>> Regards,
>> Nirmala
>> _______________________________________________
>> miso-users mailing list
>> miso-users at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/miso-users
>