Showing content from https://github.com/duckdb/duckdb/issues/3969 below:

OOM when reading Parquet file · Issue #3969 · duckdb/duckdb · GitHub

What happens?

DuckDB uses all available memory while loading a Parquet file into a table, and the process is terminated by the OOM killer.

To Reproduce

Allocate a machine with 32 GB RAM, like c6a.4xlarge on AWS, with Ubuntu 22.04.
ssh into that machine.
Run the following commands:

sudo apt-get update
sudo apt-get install python3-pip
pip install duckdb
wget 'https://datasets.clickhouse.com/hits_compatible/hits.parquet'

Create the following run.py file:

#!/usr/bin/env python3

import duckdb
import timeit

con = duckdb.connect(database='my-db.duckdb', read_only=False)

print("Will load the data")

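# timeit.default_timer() returns a clock reading; timeit.timeit() with no
# arguments would instead measure an empty statement, not this load.
start = timeit.default_timer()
con.execute("CREATE TABLE hits AS SELECT * FROM parquet_scan('hits.parquet')")
end = timeit.default_timer()
print(end - start)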

Make it executable:

chmod +x run.py

Run it:

./run.py

Wait around 10 minutes...

Will load the data
Killed
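
The "Killed" line above is what a shell prints when a process dies from SIGKILL, which is the signal the Linux OOM killer sends. As a rough way to confirm this and to see how much memory the load used, here is a minimal sketch of a wrapper that runs a command in a child process and reports the elapsed time, whether it ended with SIGKILL, and the peak resident memory. The helper name `run_and_report` is hypothetical and not part of the original report, and the sketch assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
#!/usr/bin/env python3
# Hypothetical helper (not from the original report): run a command in a child
# process and report how it ended and the peak memory used by child processes.
import resource
import signal
import subprocess
import sys
import time


def run_and_report(cmd):
    """Run cmd and return (elapsed_seconds, killed_by_sigkill, peak_rss_kib)."""
    start = time.monotonic()
    proc = subprocess.run(cmd)
    elapsed = time.monotonic() - start
    # On POSIX, a return code of -N means the child was terminated by signal N.
    killed_by_sigkill = proc.returncode == -signal.SIGKILL
    # Peak resident set size of the largest waited-for child so far
    # (kilobytes on Linux; other platforms may use different units).
    peak_rss_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, killed_by_sigkill, peak_rss_kib
```

Calling `run_and_report([sys.executable, "run.py"])` from the directory containing run.py would return the three values for the load above. SIGKILL alone does not prove the OOM killer was responsible, so checking the kernel log (for example with `dmesg`) is still the way to confirm it.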

Environment (please complete the following information):

Identity Disclosure:

Because it runs out of memory, DuckDB cannot qualify for the ClickHouse benchmark.
