Hello,
I am performing a metagenome assembly with PacBio HiFi data and have run into a reproducible error. I am using canu 2.1 with the following command:
canu -d STD -p Zymo6331-STD -pacbio-hifi Zymo6331-STD.fastq genomeSize=100m maxInputCoverage=1000 batMemory=200
I am running on SGE with the same configuration I have used to run other assemblies successfully (gridEngineResourceOption="-pe smp THREADS -l mem_free=MEMORY" and gridOptions="-V -S /bin/bash -q bigmem").
I am working with three samples, each run with the same arguments as above (but with different -d, -p, and fastq names). Two finished without any issues, but the third eventually fails. The error message suggested restarting, so I did, and the run failed again with the same error. The log suggests the failure occurs while running Bogart:
Found perl:
/bin/perl
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
Found java:
/mnt/software/j/jre/1.8.0_144/bin/java
java version "1.8.0_144"
Found canu:
/mnt/software/c/canu/2.1/bin/canu
canu 2.1
-- canu 2.1
--
-- CITATIONS
--
-- For assemblies of PacBio HiFi reads:
-- Nurk S, Walenz BP, Rhiea A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S.
-- HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.
-- biorXiv. 2020.
-- https://doi.org/10.1101/2020.03.14.992248
--
-- Read and contig alignments during correction and consensus use:
-- Šošic M, Šikic M.
-- Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
-- Bioinformatics. 2017 May 1;33(9):1394-1395.
-- http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
-- Berlin K, et al.
-- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
-- Nat Biotechnol. 2015 Jun;33(6):623-30.
-- http://doi.org/10.1038/nbt.3238
--
-- Myers EW, et al.
-- A Whole-Genome Assembly of Drosophila.
-- Science. 2000 Mar 24;287(5461):2196-204.
-- http://doi.org/10.1126/science.287.5461.2196
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
-- Chin CS, et al.
-- Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
-- Nat Methods. 2013 Jun;10(6):563-9
-- http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_144' (from '/mnt/software/j/jre/1.8.0_144/bin/java') with -d64 support.
-- Detected gnuplot version '4.6 patchlevel 2 ' (from 'gnuplot') and image format 'png'.
-- Detected 72 CPUs and 504 gigabytes of memory.
-- Detected Sun Grid Engine in '/usr/share/gridengine/default'.
-- User supplied Parallel Environment 'smp'.
-- User supplied Memory Resource 'mem_free'.
--
-- Found 17 hosts with 80 cores and 754 GB memory under Sun Grid Engine control.
-- Found 1 host with 48 cores and 440 GB memory under Sun Grid Engine control.
-- Found 4 hosts with 24 cores and 47 GB memory under Sun Grid Engine control.
-- Found 18 hosts with 48 cores and 251 GB memory under Sun Grid Engine control.
-- Found 14 hosts with 16 cores and 46 GB memory under Sun Grid Engine control.
-- Found 14 hosts with 80 cores and 755 GB memory under Sun Grid Engine control.
-- Found 1 host with 24 cores and 39 GB memory under Sun Grid Engine control.
-- Found 14 hosts with 56 cores and 503 GB memory under Sun Grid Engine control.
-- Found 2 hosts with 24 cores and 31 GB memory under Sun Grid Engine control.
-- Found 14 hosts with 72 cores and 503 GB memory under Sun Grid Engine control.
-- Found 13 hosts with 24 cores and 94 GB memory under Sun Grid Engine control.
-- Found 26 hosts with 48 cores and 503 GB memory under Sun Grid Engine control.
-- Found 4 hosts with 256 cores and 503 GB memory under Sun Grid Engine control.
-- Found 2 hosts with 16 cores and 94 GB memory under Sun Grid Engine control.
-- Found 1 host with 48 cores and 188 GB memory under Sun Grid Engine control.
-- Found 1 host with 24 cores and 141 GB memory under Sun Grid Engine control.
--
-- (tag)Threads
-- (tag)Memory |
-- (tag) | | algorithm
-- ------- ---------- -------- -----------------------------
-- Grid: meryl 13.000 GB 8 CPUs (k-mer counting)
-- Grid: hap 10.000 GB 8 CPUs (read-to-haplotype assignment)
-- Grid: cormhap 10.000 GB 8 CPUs (overlap detection with mhap)
-- Grid: obtovl 8.000 GB 8 CPUs (overlap detection)
-- Grid: utgovl 8.000 GB 8 CPUs (overlap detection)
-- Grid: cor 16.000 GB 4 CPUs (read correction)
-- Grid: ovb 4.000 GB 1 CPU (overlap store bucketizer)
-- Grid: ovs 8.000 GB 1 CPU (overlap store sorting)
-- Grid: red 8.000 GB 4 CPUs (read error detection)
-- Grid: oea 8.000 GB 1 CPU (overlap error adjustment)
-- Grid: bat 200.000 GB 8 CPUs (contig construction with bogart)
-- Grid: cns -.--- GB 8 CPUs (consensus)
--
-- In 'Zymo6331-ULI.seqStore', found PacBio HiFi reads:
-- PacBio HiFi: 1
--
-- Corrected: 1
-- Corrected and Trimmed: 1
--
-- Generating assembly 'Zymo6331-ULI' in '/dept/appslab/projects/old/2020/dp_Zymo/HiCanu_Assembly/ULI':
-- - assemble HiFi reads.
--
-- Parameters:
--
-- genomeSize 100000000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.0000 ( 0.00%)
-- obtOvlErrorRate 0.0250 ( 2.50%)
-- utgOvlErrorRate 0.0100 ( 1.00%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.0000 ( 0.00%)
-- obtErrorRate 0.0250 ( 2.50%)
-- utgErrorRate 0.0100 ( 1.00%)
-- cnsErrorRate 0.0500 ( 5.00%)
--
--
-- BEGIN ASSEMBLY
--
--
-- Bogart failed, tried 2 times, giving up.
--
ABORT:
ABORT: canu 2.1
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
ABORT: Disk space available: 91264.777 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (unitigging/4-unitigger/unitigger.err):
ABORT:
ABORT: mergeOrphans()-- ignored 0 tigs with 0 reads; failed to place
ABORT: mergeOrphans()--
ABORT:
ABORT: ==> MARK SIMPLE BUBBLES.
ABORT: using 0.010000 user-specified threshold
ABORT:
ABORT:
ABORT: findPotentialOrphans()-- working on 4954 tigs.
ABORT: findPotentialOrphans()-- found 3866 potential orphans.
ABORT: mergeOrphans()-- flagged 3691 bubble tigs with 550991 reads
ABORT: mergeOrphans()-- placed 0 unique orphan tigs with 0 reads
ABORT: mergeOrphans()-- shattered 0 repeat orphan tigs with 0 reads
ABORT: mergeOrphans()-- ignored 1 tigs with 38 reads; failed to place
ABORT: mergeOrphans()--
ABORT: classifyAsUnassembled()-- 0 tigs 0 bases -- singleton
ABORT: classifyAsUnassembled()-- 0 tigs 0 bases -- too few reads (< 2 reads)
ABORT: classifyAsUnassembled()-- 0 tigs 0 bases -- too short (< 0 bp)
ABORT: classifyAsUnassembled()-- 0 tigs 0 bases -- single spanning read (> 1.000000 tig length)
ABORT: classifyAsUnassembled()-- 58 tigs 687914 bases -- low coverage (> 0.500000 tig length at < 3 coverage)
ABORT: classifyAsUnassembled()-- 4758 tigs 111950257 bases -- acceptable contigs
ABORT:
ABORT:
ABORT: ==> GENERATING ASSEMBLY GRAPH.
ABORT:
ABORT: computeErrorProfiles()-- Computing error profiles for 4954 tigs, with 8 threads.
ABORT: computeErrorProfiles()-- Finished.
ABORT:
ABORT: AssemblyGraph()-- allocating vectors for placements, 117.061MB
ABORT: AssemblyGraph()-- finding edges for 885032 reads (821084 contained), ignoring 1672196 unplaced reads, with 8 threads.
ABORT: AssemblyGraph()-- building reverse edges.
ABORT: AssemblyGraph()-- build complete.
ABORT:
ABORT: ==> BREAK REPEATS.
ABORT:
ABORT: computeErrorProfiles()-- Computing error profiles for 4954 tigs, with 8 threads.
ABORT: computeErrorProfiles()-- Finished.
ABORT: bogart: bogart/AS_BAT_MarkRepeatReads.C:887: std::vector<breakReadEnd> buildBreakPoints(TigVector&, Unitig*, intervalList<int>&, intervalList<int>&, std::vector<confusedEdge>&): Assertion `isRepeat == true' failed.
ABORT:
ABORT: Failed with 'Aborted'; backtrace (libbacktrace):
ABORT: utility/src/utility/system-stackTrace.C::83 in _Z17AS_UTL_catchCrashiP7siginfoPv()
ABORT: (null)::0 in (null)()
ABORT: (null)::0 in (null)()
ABORT: (null)::0 in (null)()
ABORT: (null)::0 in (null)()
ABORT: (null)::0 in (null)()
ABORT: bogart/AS_BAT_MarkRepeatReads.C::887 in _Z16buildBreakPointsR9TigVectorP6UnitigR12intervalListIiES5_RSt6vectorI12confusedEdgeSaIS7_EE()
ABORT: bogart/AS_BAT_MarkRepeatReads.C::1051 in _Z15markRepeatReadsP13AssemblyGraphR9TigVectordjdRSt6vectorI12confusedEdgeSaIS4_EE()
ABORT: bogart/bogart.C::656 in main()
ABORT: (null)::0 in (null)()
ABORT: (null)::0 in (null)()
ABORT:
Any suggestions as to what might be going wrong here? I have attached the unitigger.err file here too (unitigger.err.txt), in case that helps.
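For quick triage, the failing stage and assertion can be pulled out of unitigger.err with a few lines of Python. This is only a rough sketch and not part of Canu: the function name find_failure and both regular expressions are my own assumptions, based solely on the log layout shown above ("==> STAGE." markers followed by a glibc-style "Assertion ... failed" line).

```python
import re

# Assumed layout (taken from the log above, not from Canu documentation):
# stage markers look like "==> BREAK REPEATS." and the failure looks like
# "file.C:LINE: signature: Assertion `expr' failed."
STAGE_RE = re.compile(r"==>\s*(.+?)\.\s*$")
ASSERT_RE = re.compile(r"(\S+\.C):(\d+):.*Assertion `(.+?)' failed")


def find_failure(lines):
    """Return (stage, source_file, line_number, assertion) for the first
    failed assertion, or None if no assertion failure is found."""
    stage = None
    for raw in lines:
        line = raw.strip()
        # Canu's abort summary prefixes each log line with "ABORT:".
        if line.startswith("ABORT:"):
            line = line[len("ABORT:"):].strip()
        stage_match = STAGE_RE.search(line)
        if stage_match:
            stage = stage_match.group(1)
            continue
        assert_match = ASSERT_RE.search(line)
        if assert_match:
            return (stage, assert_match.group(1),
                    int(assert_match.group(2)), assert_match.group(3))
    return None


sample = [
    "ABORT: ==> GENERATING ASSEMBLY GRAPH.",
    "ABORT: ==> BREAK REPEATS.",
    "ABORT: computeErrorProfiles()-- Finished.",
    "ABORT: bogart: bogart/AS_BAT_MarkRepeatReads.C:887: "
    "std::vector<breakReadEnd> buildBreakPoints(TigVector&, Unitig*, "
    "intervalList<int>&, intervalList<int>&, std::vector<confusedEdge>&): "
    "Assertion `isRepeat == true' failed.",
]
print(find_failure(sample))
```

On the excerpt above this reports the BREAK REPEATS stage and the assertion at bogart/AS_BAT_MarkRepeatReads.C line 887. To run it on the full file, pass an open file handle for unitigging/4-unitigger/unitigger.err instead of the sample list.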
Thanks!
Dan