asyncpg is a new fully-featured open-source Python client library for PostgreSQL. It is built specifically for asyncio and Python 3.5 async
/ await
. asyncpg is the fastest driver among common Python, NodeJS and Go implementations.
We are building EdgeDBâthe next generation object database with PostgreSQL as a backing store. We need high-performance, low-latency access to the advanced features of PostgreSQL.
The most obvious option was psycopg2âthe most popular Python driver for PostgreSQL. It is well-supported, stable, proven technology. There is also aiopg, which provides async interface on top of psycopg2. With that there is an obvious question: why reinvent the wheel? Short answer is twofold: features and performance. We will cover each item in detail below.
Data Type SupportOur biggest gripe with psycopg2 is its mediocre support for handling PostgreSQL data types, especially arrays and composite types. Rich data type system is one of the hallmarks of PostgreSQL. And yet, out of the box psycopg2 only supports simple builtin types like integers, strings, and timestamps, forcing the users to write custom âtypecastersâ for everything else. This is cumbersome and inefficient.
The reason is fundamental: psycopg2 exchanges data with the database server in text format. This necessitates a non-trivial amount of parsing, especially so for complex types.
Unlike psycopg2, asyncpg implements PostgreSQL binary I/O protocol, which, aside from performance benefits, allows for generic support of container types (arrays, composites and ranges).
Prepared Statementsasyncpg extensively uses PostgreSQL prepared statements. This is an important optimization feature, as it allows to avoid repeated parsing, analysis, and planning of queries. Additionally, asyncpg caches the data I/O pipeline for each prepared statement.
Prepared statements in asyncpg can be created and used explicitly. They provide an API to fetch and introspect query results. Most query methods are also exposed on the connection object directly, and asyncpg will create and cache a prepared statement implicitly.
Ease of DeploymentAnother important feature of asyncpg is that it has zero dependencies. Direct implementation of PostgreSQL protocol means that there is no need for libpq to be installed, and you can just pip install asyncpg
. Additionally, we provide binary wheels for Linux and macOS (Windows support is planned.)
It soon became evident that by implementing PostgreSQL frontend/backend protocol directly, we can yield significant speed improvements. Our earlier experience with uvloop has shown that Cython can be used to build very efficient libraries. asyncpg is written almost entirely in Cython with careful buffer management and highly optimized data decoding pipeline.
The result is that asyncpg is, on average, at least 3x faster than psycopg2 (or aiopg). This result is remarkable, as psycopg2, written in C and optimized, is not slow at all. See the benchmarks section below for more.
Similarly to uvloop, we created a standalone toolbench to measure and report the performance of asyncpg and other PostgreSQL driver implementations. We measured query throughput (in rows per second) and latency. The main purpose of this benchmark is to measure the driver overhead.
For fairness, all tests were run in a single-thread (GOMAXPROCS=1
for Go code) in async mode. Python drivers were run under uvloop.
The benchmark results featured in this post were obtained from a bare-metal server with the following setup:
Driver Implementations:
Each test consisted of running queries in a tight loop with 8 concurrent connections to the database server for 10 seconds, with 5 second warmup time.
The charts show the geometric average of results obtained by running four types of queries:
pg_type
table (~350Â rows). This is relatively close to an average application query. The purpose is to test general data decoding performance. This is the titular benchmark, on which asyncpg achieves 1M rows/s. See details.We firmly believe that high-performance and scalable systems in Python are possible. For that we need to put maximum effort into making fast, high-quality drivers, event loops, and frameworks.
asyncpg is another step in that direction. It is the result of careful design fuelled by our experience creating uvloop and using Cython and asyncio efficiently.
Previous post: uvloop: Blazing fast Python networking.
magicstack is a Toronto-based team of software engineers. Feel free to drop us a line at hello@magic.io.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4