Showing content from https://github.com/lh3/minimap2/issues/855 below:

High memory use when using Python and threads · Issue #855 · lh3/minimap2 · GitHub

The program align.py uses mappy to align reads in Python using multiple worker threads. After the index is loaded, memory usage quickly jumps to more than 20 GB and then continues to climb steadily through 40 GB and beyond.

This issue was first discovered in bonito and isolated to mappy. The data flow in the example mirrors that in bonito but is reduced to use only Python stdlib functionality.
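For readers without align.py to hand, the kind of stdlib-only data flow described above can be sketched roughly as follows. This is not the actual align.py: `align_read`, `worker`, and `run` are hypothetical names, and `align_read` is a stub standing in for the real aligner call (with mappy this would presumably be `Aligner.map(seq, buf=...)` with one `mappy.ThreadBuffer` per worker thread).

```python
import queue
import threading


def align_read(name, seq):
    """Hypothetical stand-in for the aligner call; returns a fake result."""
    return name, len(seq)


def worker(in_q, out_q):
    """Pull reads from in_q until a None sentinel arrives, push results to out_q."""
    while True:
        item = in_q.get()
        if item is None:
            break
        out_q.put(align_read(*item))


def run(reads, n_threads=4, maxsize=8):
    """Feed (name, seq) pairs through a pool of worker threads."""
    in_q = queue.Queue(maxsize=maxsize)  # bounded, so the producer cannot run far ahead
    out_q = queue.Queue()                # unbounded, so workers never block on put
    threads = [
        threading.Thread(target=worker, args=(in_q, out_q))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()

    n_reads = 0
    for read in reads:
        in_q.put(read)
        n_reads += 1

    # One sentinel per worker tells each thread to stop.
    for _ in threads:
        in_q.put(None)
    for t in threads:
        t.join()

    # Results arrive in completion order, not input order.
    return [out_q.get() for _ in range(n_reads)]
```

Bounding the input queue rules out one obvious source of memory growth (reads piling up ahead of the workers), which helps attribute any remaining growth to the aligner itself.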

mappy: v2.24
pysam: v0.18 (just for optionally reading fastq inputs)
python: v3.8.6

Run the program, creating query sequences from the index on the fly:

python align.py GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.mmi --threads 48

or using a directory containing *.fastq* files:

python align.py GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.mmi --fastq_dir FAQ32498 --threads 48

The inputs I am using are available in the AWS S3 bucket at:

s3://ont-research/misc/mappy-mem/FAQ32498.tar
s3://ont-research/misc/mappy-mem/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.mmi

I've not fully ascertained if using lots of threads exacerbates the problem or simply makes the symptom apparent more quickly.
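One way to separate the two possibilities would be to record peak memory at a fixed number of processed reads for several `--threads` values. A minimal sketch using only the stdlib `resource` module (Unix only) is shown below; `peak_rss_mb` is a hypothetical helper, not part of align.py or mappy.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MiB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor
```

Logging this value every few thousand reads would show whether the growth rate per read depends on the thread count or whether more threads only reach the same level sooner.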

