Hi,
overlapInCorePartition generated far too many partitions for my assembly, and canu then wrote a 67 GB correction/1-overlapper/overlap.sh shell script.
The script is practically one huge run of if blocks with a call to a perl utility afterwards. Just interpreting this file keeps several CPU cores busy.
$ canu/build/bin/overlapInCorePartition
ERROR: Hash length (-hl) must be specified.
ERROR: Reference length (-rl) must be specified.
ERROR: seqStore (-S) must be supplied.
usage: /storage/plzen1/home/mmokrejs/apps/canu/build/bin/overlapInCorePartition [opts]
Someone should write the command line help.
But this is only used interally to canu, so...
$
When canu configured itself, it logged:
--                                       (tag)Concurrency
--                            (tag)Threads    |
--               (tag)Memory       |          |
--        (tag)       |            |          |       total usage          algorithm
--        -------   ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl      64.000 GB    8 CPUs x  16 jobs  1024.000 GB 128 CPUs  (k-mer counting)
-- Local: hap        16.000 GB   64 CPUs x   2 jobs    32.000 GB 128 CPUs  (read-to-haplotype assignment)
-- Local: corovl      8.000 GB    1 CPU  x 128 jobs  1024.000 GB 128 CPUs  (overlap detection)
-- Local: obtovl     24.000 GB   16 CPUs x   8 jobs   192.000 GB 128 CPUs  (overlap detection)
-- Local: utgovl     24.000 GB   16 CPUs x   8 jobs   192.000 GB 128 CPUs  (overlap detection)
-- Local: cor        24.000 GB    4 CPUs x  32 jobs   768.000 GB 128 CPUs  (read correction)
-- Local: ovb         4.000 GB    1 CPU  x 128 jobs   512.000 GB 128 CPUs  (overlap store bucketizer)
-- Local: ovs        32.000 GB    1 CPU  x  62 jobs  1984.000 GB  62 CPUs  (overlap store sorting)
-- Local: red        64.000 GB    8 CPUs x  16 jobs  1024.000 GB 128 CPUs  (read error detection)
-- Local: oea         8.000 GB    1 CPU  x 128 jobs  1024.000 GB 128 CPUs  (overlap error adjustment)
-- Local: bat      1024.000 GB   64 CPUs x   1 job   1024.000 GB  64 CPUs  (contig construction with bogart)
-- Local: cns         -.--- GB    8 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
Then, it did its math:
-- OVERLAPPER (normal) (correction) erate=0.32
--
----------------------------------------
-- Starting command on Sat Feb 13 23:39:51 2021 with 270611.71 GB free disk space
cd correction/1-overlapper
/auto/plzen1/home/mmokrejs/apps/canu-2.1.1/build/bin/overlapInCorePartition \
-S ../../my_genome.seqStore \
-hl 2500000 \
-rl 2000000 \
-ol 500 \
-o ./my_genome.partition \
> ./my_genome.partition.err 2>&1
-- Finished on Sat Feb 13 23:56:51 2021 (1020 seconds) with 270368.935 GB free disk space
----------------------------------------
--
-- Configured 492959931 overlapInCore jobs.
-- Finished stage 'cor-overlapConfigure', reset canuIteration.
--
-- Running jobs. First attempt out of 2.
----------------------------------------
-- Starting 'corovl' concurrent execution on Sun Feb 14 04:54:43 2021 with 269805.128 GB free disk space (492959931 processes; 128 concurrently)
I think a roadblock should definitely be installed somewhere, so that a script this huge is never passed on for further processing. Then we can think about what to change in the code (I don't know where).
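Such a roadblock could be as simple as a job-count check right after partitioning. Below is a minimal sketch, assuming the job count is already known (canu logs it as "Configured N overlapInCore jobs"). The function name check_job_count and the limit of one million jobs are made up for illustration; neither is an existing canu option.

```shell
#!/bin/sh
# Hypothetical guard, not part of canu: refuse to continue when the
# partitioner planned more jobs than a site-specific limit allows.

check_job_count() {
    njobs=$1       # number of configured overlap jobs
    max_jobs=$2    # assumed limit chosen by the user
    if [ "$njobs" -gt "$max_jobs" ]; then
        echo "ERROR: $njobs overlap jobs configured, limit is $max_jobs;" \
             "check the -hl/-rl block sizes." >&2
        return 1
    fi
    return 0
}

# Example use with the job count from the log above:
#   check_job_count 492959931 1000000 || exit 1
```

With the numbers from this run the check fails immediately, before any job script is written or interpreted.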
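BTW, the separate if ... fi blocks should at least be turned into a single if ... elif chain, so that the shell stops testing once the job ID has matched. The generated script currently looks like this: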
if [ $jobid -eq 90 ] ; then
bat="001"
job="001/000090"
opt="-h 5595-5812 -r 752-1503 --hashdatalen 2525793"
fi
if [ $jobid -eq 91 ] ; then
bat="001"
job="001/000091"
opt="-h 5595-5812 -r 1504-2250 --hashdatalen 2525793"
fi
if [ $jobid -eq 92 ] ; then
bat="001"
job="001/000092"
opt="-h 5595-5812 -r 2251-2770 --hashdatalen 2525793"
fi
if [ $jobid -eq 93 ] ; then
bat="001"
job="001/000093"
opt="-h 5595-5812 -r 2771-3197 --hashdatalen 2525793"
fi
if [ $jobid -eq 94 ] ; then
bat="001"
job="001/000094"
opt="-h 5595-5812 -r 3198-3562 --hashdatalen 2525793"
fi
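Going one step further than elif, the per-job settings could live in a data file that the script queries, so the shell never has to parse one block per job. This is only a sketch under assumptions: the tab-separated table layout (jobid, bat, job, opt) and the function names lookup_job and load_job are hypothetical, not something canu provides.

```shell
#!/bin/sh
# Sketch only: look up one job's settings in a table instead of in a
# generated chain of if-blocks. Table layout and names are assumptions.

lookup_job() {
    table=$1
    jobid=$2
    # Print the first row whose first column equals jobid, then stop reading.
    awk -F '\t' -v id="$jobid" '$1 == id { print; exit }' "$table"
}

load_job() {
    tab=$(printf '\t')
    row=$(lookup_job "$1" "$2")
    # Fail when the job ID is not present in the table.
    [ -n "$row" ] || return 1
    # Split the row into the same variables the generated script sets.
    IFS=$tab read -r _id bat job opt <<EOF
$row
EOF
}

# Example use, with a hypothetical table file jobs.tsv:
#   load_job jobs.tsv "$jobid" || exit 1
#   echo "$bat $job $opt"
```

The lookup still reads the table sequentially up to the matching row, but it streams through awk rather than making the shell interpret hundreds of millions of if-blocks.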