PEP 574 (scheduled for Python 3.8) introduces pickle protocol 5 with support for no-copy pickling of large mutable buffers.
I made a small proof-of-concept benchmark script using @pitrou's pickle5 backport of his draft implementation of PEP 574.
See: https://gist.github.com/ogrisel/a2b0e5ae4987a398caa7f9277cb3b90a
The meat lies in the following reducer:
import numpy as np
from pickle5 import PickleBuffer


def _array_from_buffer(buffer, dtype, shape):
    # Rebuild the array as a view on the (possibly out-of-band) buffer.
    return np.frombuffer(buffer, dtype=dtype).reshape(shape)


def reduce_ndarray_pickle5(a):
    # This reducer assumes protocol 5, as there is currently no way to
    # register a protocol-aware reduce function in the global copyreg
    # dispatch table.
    if not a.dtype.hasobject and a.flags.c_contiguous:
        # No-copy pickling for C-contiguous arrays and protocol 5
        return _array_from_buffer, (PickleBuffer(a), a.dtype, a.shape), None
    else:
        # Fall back to the generic method
        return a.__reduce__()
This works as expected (no extra copy when dumping and loading) and also fixes the in-memory speed overhead reported by @mrocklin in #7544.
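For illustration, here is a minimal round-trip sketch of how such a reducer could be exercised with out-of-band buffers. It is not taken from the gist: it assumes Python 3.8+, where the standard `pickle` module provides `PickleBuffer`, `buffer_callback` and `buffers` (in place of the pickle5 backport), and `NdarrayPickler` and `dumps_out_of_band` are hypothetical helper names introduced here.

```python
import copyreg
import io
import pickle  # assumption: Python 3.8+, used instead of the pickle5 backport

import numpy as np


def _array_from_buffer(buffer, dtype, shape):
    return np.frombuffer(buffer, dtype=dtype).reshape(shape)


def reduce_ndarray_pickle5(a):
    if not a.dtype.hasobject and a.flags.c_contiguous:
        return _array_from_buffer, (pickle.PickleBuffer(a), a.dtype, a.shape), None
    return a.__reduce__()


class NdarrayPickler(pickle.Pickler):
    # Hypothetical helper: a per-pickler dispatch table, so the reducer is
    # only used where protocol 5 is guaranteed.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[np.ndarray] = reduce_ndarray_pickle5


def dumps_out_of_band(obj):
    # Collect the large buffers separately instead of copying them
    # into the pickle stream.
    buffers = []
    f = io.BytesIO()
    NdarrayPickler(f, protocol=5, buffer_callback=buffers.append).dump(obj)
    return f.getvalue(), buffers


a = np.arange(1_000_000, dtype=np.float64).reshape(1000, 1000)
payload, buffers = dumps_out_of_band(a)

# The consumer must pass the same buffers back, in the same order.
b = pickle.loads(payload, buffers=buffers)
```

In this sketch the pickle stream only carries metadata (reconstructor, dtype, shape), and the array data travels through `buffers`, so `b` ends up as a view on the memory of `a` rather than a copy.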
To get this in numpy, we would need to make the reduce function protocol-aware: that is, have ndarray implement a __reduce_ex__ method that accepts a protocol argument, instead of relying only on the existing bytes-based implementation from array_reduce in https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/methods.c#L1577. This bytes-based implementation should probably be kept as a fallback when protocol < 5.
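A rough Python-level sketch of what such a protocol-aware method could look like is shown below. The real change would live in C next to array_reduce; here `ProtocolAwareArray` and `_subclass_from_buffer` are hypothetical names, an ndarray subclass is used only as a stand-in for ndarray itself, and the standard `pickle` module of Python 3.8+ is assumed in place of the pickle5 backport.

```python
import pickle  # assumption: Python 3.8+, used instead of the pickle5 backport

import numpy as np


def _subclass_from_buffer(cls, buffer, dtype, shape):
    # Rebuild the array as a view on the buffer, then restore the subclass.
    return np.frombuffer(buffer, dtype=dtype).reshape(shape).view(cls)


class ProtocolAwareArray(np.ndarray):
    # Hypothetical stand-in for ndarray with a protocol-aware reducer.

    def __reduce_ex__(self, protocol):
        if protocol >= 5 and not self.dtype.hasobject and self.flags.c_contiguous:
            # No-copy path: expose the data through a PickleBuffer.
            args = (type(self), pickle.PickleBuffer(self), self.dtype, self.shape)
            return _subclass_from_buffer, args
        # Fallback for protocol < 5 (and for object or non-contiguous
        # arrays): the existing bytes-based reduction.
        return super().__reduce__()


a = np.arange(12, dtype=np.int32).reshape(3, 4).view(ProtocolAwareArray)

# Protocol 5 with out-of-band buffers: no copy of the array data.
buffers = []
payload5 = pickle.dumps(a, protocol=5, buffer_callback=buffers.append)
b = pickle.loads(payload5, buffers=buffers)

# Protocol 4: falls back to the bytes-based path, which copies the data.
c = pickle.loads(pickle.dumps(a, protocol=4))
```

Because the pickler passes the protocol number to `__reduce_ex__`, the no-copy path is only taken when the consumer can actually handle `PickleBuffer` objects, and older protocols keep their current behavior.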