<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi,<div><br></div><div><div>See replies below:</div><br>On Jul 7, 2013, at 2:04 PM, Gu Mi wrote:<br><br><blockquote type="cite">Dear All:<br><br>I am using MISO to test for differential exon usage between a control and a treatment group. I got an error when computing the insert length distribution using pe_utils.py --compute-insert-len. I list the steps I used below:<br><br>1. sort the BAM file from TopHat (by coordinate):<br>samtools sort control.bam control_sorted<br><br>2. index the BAM file:<br>samtools index control_sorted.bam control_sorted.bai<br><br>3. run pe_utils.py:<br>python pe_utils.py --compute-insert-len control.bam /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff --output-dir /directories/insert-dist/<br><br>After the command above, I got the error message:<br><br>Preparing to call bedtools 'tagBam'<br>tagBam -i control.bam -files /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff -labels gff -intervals -f 1 | samtools view - -h | egrep '^@|:gff:' | samtools view - -Shb -o /directories/insert-dist/bam2gff_Homo_sapiens.GRCh37.65.min_1000.const_exons.gff/control.bam<br>[samopen] SAM header is present: 25 sequences.<br>[sam_read1] reference 'ID:TopHat CL:/informatics/tools/Linux-AS5/bin/tophat -o Lane3 -g 1 --coverage-search --microexon -r 100 --phred64-quals --library-type fr-unstranded -p 4 -G gene_models/Homo_sapiens.GRCh37.72_norm.gtf --transcriptome-index=gene_models/transcripts /directories/Genomes/NCBI_Jul-09-2012/Human/bowtie/human_ref_genome Lane3_1.fq.gz Lane3_2.fq.gz VN:1.4.1<br>' is recognized as '*'.<br>[main_samview] truncated file.<br>Traceback (most recent call last):<br> File "/pe_utils.py", line 520, in &lt;module&gt;<br> main()<br> File "pe_utils.py", line 517, in main<br> sd_max=sd_max)<br> File "pe_utils.py", line 271, in compute_insert_len<br> output_dir)<br> File "exon_utils.py", line 185, in 
map_bam2gff<br> raise Exception, "Error: tagBam call failed."<br>Exception: Error: tagBam call failed.<br></blockquote><div><br></div><div>It sounds like tagBam could not map any of your reads to the GFF, which is usually caused by a chromosome header mismatch between the BAM and the GFF. </div><div><br></div><div>Could you confirm that your BAM file contains Ensembl-style chromosome headers (e.g. "1") and not UCSC-style ones (e.g. "chr1")? What is the output of this command for you:</div><div><br></div><div>samtools view -H /directories/insert-dist/bam2gff_Homo_sapiens.GRCh37.65.min_1000.const_exons.gff/control.bam</div><div><br></div>Also, what is the output of the following for you?</div><div><br></div><div><div>tagBam -i control.bam -files /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff -labels gff -intervals -f 1 | samtools view - -h | egrep '^@|:gff:'</div><div><br></div><div>If this produces no output, it means that tagBam could not match your BAM to the GFF, probably because of the chromosome header mismatch that I mentioned. </div><div><br></div><div>Best, --Yarden</div><br><blockquote type="cite"><br>I used Homo_sapiens.GRCh37.72_norm.gtf from Ensembl as the annotation file when preparing my data, but downloaded <br><div><span class="Apple-tab-span" style="white-space:pre">        </span>• Human genome (hg19) alternative events v2.0<br></div>from the MISO website and unzipped. I saw it is based on Homo_sapiens.GRCh37.65. Is this the version problem? If so, could anyone provide the latest GFF3 file for use? Thank you for your suggestions!<br><br>Best,<br>Gu<br>_______________________________________________<br>miso-users mailing list<br><a href="mailto:miso-users@mit.edu">miso-users@mit.edu</a><br>http://mailman.mit.edu/mailman/listinfo/miso-users<br></blockquote><br></div></body></html>