Hi Manuel,

You seem to be running into issues similar to the ones I had.

I am not familiar with your application, yet I may have some answers and ideas for you.

On c1.xlarge you have 8 virtual CPUs per node, so if I read your setup correctly you have 16 cores in your cluster. Such a machine has 7 GB of memory.

The 10 GB issue is something I encountered myself. Unless your application is installed on a different volume, the 10 GB restriction may also limit the disk space available for its temp files, and your runs may simply choke because temp is full. I am not sure about this, yet you can check it by running df during your runs.
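For example (just a sketch, assuming your cluster is named mycluster like in the commands further down), you can check disk usage on the master and on one of the nodes from your own machine while a job is running:

starcluster sshmaster mycluster "df -h"
starcluster sshnode mycluster node001 "df -h"

If / (or wherever your application keeps its temp files) is at or close to 100% during a run, disk space is very likely your problem.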
If I am correct, it seems you have 2 solutions:
1. Install your application on the large volume so that its tmp space will be there as well.
2. Share and mount a big volume to replace the tmp directory of your application.

I posted a few commands to the list that allow doing this, yet they did not reach the archives, probably due to the length of the post. So I am sending you the lines I used to share a bigger volume and run my simulation on it. Perhaps this will work for you, though you will have to rewrite the lines for your OS if you are not using Windows. And it is still possible your problem is not disk space at all.

Jacob

#### from my other post ####
Hi Rayson,

Thanks to your guidance, and thanks to Bruce Fields who provided additional guidance and shortcuts, I was able to reach a solution for using the larger ephemeral storage that comes with the larger AMI.

I am sending the solution that worked for me to this list to close this issue in a way that others can follow in the future.

The solution consists of 3 lines of code that should be executed after the cluster has been started and is running. The following lines are for a Windows cmd shell and assume there are 20 nodes in the cluster (the master plus node001 through node019).

FOR %n IN (001,002,003,004,005,006,007,008,009,010,011,012,013,014,015,016,017,018,019) DO (starcluster sshmaster mycluster "echo '/mnt/vol0 node%n(async,no_root_squash,no_subtree_check,rw)' >> /etc/exports")

starcluster sshmaster mycluster "service nfs start"

FOR %n IN (001,002,003,004,005,006,007,008,009,010,011,012,013,014,015,016,017,018,019) DO (starcluster sshnode mycluster node%n "mount master:/mnt/vol0 /mnt/vol0")

Those 3 lines NFS-share /mnt/vol0 on the master with all the nodes. A bash version of these lines should not be hard to write for Linux users (see the sketch below). The number of nodes is handled manually here, so this is not the most elegant solution, yet it works, and it is the only solution that seems to easily work at this point in time.
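For Linux or Mac users, a rough bash equivalent of those 3 lines might look like this (an untested sketch under the same assumptions: cluster named mycluster, master plus node001 through node019, sharing /mnt/vol0):

# add an export line on the master for each compute node
for n in $(seq -f "%03g" 1 19); do
    starcluster sshmaster mycluster "echo '/mnt/vol0 node$n(async,no_root_squash,no_subtree_check,rw)' >> /etc/exports"
done
# start the NFS server on the master
starcluster sshmaster mycluster "service nfs start"
# mount the shared volume on every compute node
for n in $(seq -f "%03g" 1 19); do
    starcluster sshnode mycluster node$n "mount master:/mnt/vol0 /mnt/vol0"
done

Adjust the 19 to your own number of compute nodes.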
There are other solution options in the works that may lead to superior solutions for the disk space problem. You can find leads to the ones I know about in the following links:

http://star.mit.edu/cluster/mlarchives/1803.html
https://github.com/jtriley/StarCluster/issues/44

I hope documenting this solution will help other users.

####### end of copied post #######

Sent from my iPhone

On Aug 2, 2013, at 3:51 PM, "Manuel J. Torres" <mjtorres.phd@gmail.com> wrote:

> I am trying to run tophat software mapping ~38 GB of RNA-seq reads in fastq format to a reference genome on a 2-node cluster with the following properties:
> NODE_IMAGE_ID = ami-999d49f0
> NODE_INSTANCE_TYPE = c1.xlarge
>
> Question: How many CPUs are there on this type of cluster?
>
> Here is a df -h listing of my cluster:
> root@master:~# df -h
> Filesystem      Size  Used  Avail  Use%  Mounted on
> /dev/xvda1      9.9G  9.9G      0  100%  /
> udev            3.4G  4.0K   3.4G    1%  /dev
> tmpfs           1.4G  184K   1.4G    1%  /run
> none            5.0M     0   5.0M    0%  /run/lock
> none            3.5G     0   3.5G    0%  /run/shm
> /dev/xvdb1      414G  199M   393G    1%  /mnt
> /dev/xvdz        99G   96G      0  100%  /home/large-data
> /dev/xvdy        20G  5.3G    14G   29%  /home/genomic-data
> I created a third volume for the output that does not appear in this list but is listed in my config file and which I determined I can read and write to. I wrote the output files to this larger empty volume.
>
> I can't get tophat to run to completion. It appears to be generating truncated intermediate files. Here is the tophat output:
>
> [2013-08-01 17:34:19] Beginning TopHat run (v2.0.9)
> -----------------------------------------------
> [2013-08-01 17:34:19] Checking for Bowtie
>     Bowtie version: 2.1.0.0
> [2013-08-01 17:34:21] Checking for Samtools
>     Samtools version: 0.1.19.0
> [2013-08-01 17:34:21] Checking for Bowtie index files (genome)..
> [2013-08-01 17:34:21] Checking for reference FASTA file
> [2013-08-01 17:34:21] Generating SAM header for /home/genomic-data/data/Nemve1.allmasked
>     format: fastq
>     quality scale: phred33 (default)
> [2013-08-01 17:34:27] Reading known junctions from GTF file
> [2013-08-01 17:36:56] Preparing reads
>     left reads: min. length=50, max. length=50, 165174922 kept reads (113024 discarded)
> [2013-08-01 18:24:07] Building transcriptome data files..
> [2013-08-01 18:26:43] Building Bowtie index from Nemve1.allmasked.fa
> [2013-08-01 18:29:01] Mapping left_kept_reads to transcriptome Nemve1.allmasked with Bowtie2
> [2013-08-02 07:34:40] Resuming TopHat pipeline with unmapped reads
> [bam_header_read] EOF marker is absent. The input is probably truncated.
> [bam_header_read] EOF marker is absent. The input is probably truncated.
> [2013-08-02 07:34:41] Mapping left_kept_reads.m2g_um to genome Nemve1.allmasked with Bowtie2
> [main_samview] truncated file.
> [main_samview] truncated file.
> [bam_header_read] EOF marker is absent. The input is probably truncated.
> [bam_header_read] invalid BAM binary header (this is not a BAM file).
> [main_samview] fail to read the header from "/home/results-data/top-results-8-01-2013/topout/tmp/left_kept_reads.m2g_um_unmapped.bam".
> [2013-08-02 07:34:54] Retrieving sequences for splices
> [2013-08-02 07:35:16] Indexing splices
> Warning: Empty fasta file: '/home/results-data/top-results-8-01-2013/topout/tmp/segment_juncs.fa'
> Warning: All fasta inputs were empty
> Error: Encountered internal Bowtie 2 exception (#1)
> Command: /home/genomic-data/bin/bowtie2-2.1.0/bowtie2-build /home/results-data/top-results-8-01-2013/topout/tmp/segment_juncs.fa /home/results-data/top-results-8-01-2013/topout/tmp/segment_juncs
>     [FAILED]
> Error: Splice sequence indexing failed with err = 1
>
> Questions:
>
> Am I running out of memory?
>
> How much RAM does the AMI have, and can I make that larger?
>
> No matter what starcluster configuration I define, I can't seem to make my root directory larger than 10 GB, and it appears to be full.
>
> Can I make the root directory larger than 10 GB?
>
> Thanks!
>
> --
> Manuel J Torres, PhD
> 219 Brannan Street Unit 6G
> San Francisco, CA 94107
> VOICE: 415-656-9548
> _______________________________________________
> StarCluster mailing list
> StarCluster@mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster