Add a new PyBytesWriter C API to create bytes objects.
Soft deprecate the PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() APIs. These APIs treat an immutable bytes object as a mutable object. They remain available and maintained and do not emit deprecation warnings, but they are no longer recommended for new code.
Creating a Python bytes object using PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() treats an immutable bytes object as mutable, which goes against the principle that bytes objects are immutable. It also creates an incomplete or “invalid” object, since the bytes are not initialized. In Python, a bytes object should always have its bytes fully initialized.
When creating a bytes string whose output size is unknown, one strategy is to allocate a short buffer and extend it (to the exact size) each time a larger write is needed.
This strategy is inefficient because it requires enlarging the buffer multiple times. It is more efficient to overallocate the buffer the first time that a larger write is needed: this reduces the number of expensive realloc() operations, each of which can imply a memory copy.
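The cost difference between the two strategies can be sketched with a small standalone simulation. It does not use the Python C API; the function names and the doubling growth factor are assumptions chosen for illustration, and the actual overallocation policy is left to the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many times a buffer is enlarged when `total` bytes are
 * written in chunks of `chunk` bytes and every enlargement grows the
 * buffer to exactly the size needed. */
static size_t count_resizes_exact(size_t total, size_t chunk)
{
    assert(chunk > 0);
    size_t allocated = 0, used = 0, resizes = 0;
    while (used < total) {
        size_t n = (total - used < chunk) ? total - used : chunk;
        if (used + n > allocated) {
            allocated = used + n;        /* exact fit */
            resizes++;
        }
        used += n;
    }
    return resizes;
}

/* Same workload, but each enlargement requests twice the needed size
 * (hypothetical growth factor). */
static size_t count_resizes_overallocate(size_t total, size_t chunk)
{
    assert(chunk > 0);
    size_t allocated = 0, used = 0, resizes = 0;
    while (used < total) {
        size_t n = (total - used < chunk) ? total - used : chunk;
        if (used + n > allocated) {
            allocated = (used + n) * 2;  /* overallocate */
            resizes++;
        }
        used += n;
    }
    return resizes;
}
```

With this model, writing 1000 bytes in 10-byte chunks enlarges the exact-fit buffer 100 times, while the overallocating buffer is enlarged 6 times.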
type PyBytesWriter
A Python bytes writer instance created by PyBytesWriter_Create().
The instance must be destroyed by PyBytesWriter_Finish() or PyBytesWriter_Discard().
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size)
Create a PyBytesWriter to write size bytes.
If size is greater than zero, allocate size bytes and set the writer size to size. The caller is responsible for writing size bytes using PyBytesWriter_GetData().
On error, set an exception and return NULL.
size must be positive or zero.
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer)
Finish a PyBytesWriter created by PyBytesWriter_Create().
On success, return a Python bytes object. On error, set an exception and return NULL.
The writer instance is invalid after the call in any case.
PyObject* PyBytesWriter_FinishWithSize(PyBytesWriter *writer, Py_ssize_t size)
Similar to PyBytesWriter_Finish(), but resize the writer to size bytes before creating the bytes object.
PyObject* PyBytesWriter_FinishWithPointer(PyBytesWriter *writer, void *buf)
Similar to PyBytesWriter_Finish(), but resize the writer using the buf pointer before creating the bytes object.
Set an exception and return NULL if the buf pointer is outside the internal buffer bounds.
Function pseudo-code:
    Py_ssize_t size = (char*)buf - (char*)PyBytesWriter_GetData(writer);
    return PyBytesWriter_FinishWithSize(writer, size);
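The pointer-to-size conversion in the pseudo-code, together with the bounds check described above, can be illustrated on a plain memory region. This is a minimal sketch that does not use the Python C API; size_from_pointer() is a hypothetical helper, and returning -1 stands in for "set an exception and return NULL".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the number of bytes between `start` and `pos`, or -1 if `pos`
 * lies outside the region [start, start + allocated].
 * Addresses are compared as integers so that an out-of-bounds pointer
 * can be rejected without relying on pointer comparison rules. */
static ptrdiff_t size_from_pointer(const char *start, size_t allocated,
                                   const char *pos)
{
    assert(start != NULL);
    uintptr_t lo = (uintptr_t)start;
    uintptr_t p = (uintptr_t)pos;
    if (p < lo || p - lo > allocated) {
        return -1;                /* outside the buffer bounds */
    }
    return (ptrdiff_t)(p - lo);   /* bytes written so far */
}
```

A pointer exactly at the end of the region is accepted, since it means the whole buffer was written.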
void PyBytesWriter_Discard(PyBytesWriter *writer)
Discard a PyBytesWriter created by PyBytesWriter_Create().
Do nothing if writer is NULL.
The writer instance is invalid after the call.
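int PyBytesWriter_WriteBytes(PyBytesWriter *writer, const void *bytes, Py_ssize_t size)
Grow the writer internal buffer by size bytes, write size bytes of bytes at the writer end, and add size to the writer size.
If size is equal to -1, call strlen(bytes) to get the string length.
On success, return 0. On error, set an exception and return -1.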
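The size convention can be sketched in isolation. The helper below is hypothetical and not part of the C API; it only shows how a size of -1 falls back to strlen(), and why an explicit size is needed when the data contains embedded null bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: resolve the number of bytes to write.
 * A size of -1 means "bytes is a null-terminated C string". */
static size_t resolve_length(const char *bytes, ptrdiff_t size)
{
    assert(bytes != NULL && size >= -1);
    if (size == -1) {
        return strlen(bytes);   /* stops at the first null byte */
    }
    return (size_t)size;        /* explicit size: null bytes are kept */
}
```

For the 3-byte data "a\0b", passing -1 resolves to a length of 1, whereas passing 3 keeps all three bytes.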
int PyBytesWriter_Format(PyBytesWriter *writer, const char *format, ...)
Similar to PyBytes_FromFormat(), but write the output directly at the writer end. Grow the writer internal buffer on demand, then add the written size to the writer size.
On success, return 0. On error, set an exception and return -1.
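void* PyBytesWriter_GetData(PyBytesWriter *writer)
Get the writer data: the start of the internal buffer.
The pointer is valid until PyBytesWriter_Finish() or PyBytesWriter_Discard() is called on writer.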
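int PyBytesWriter_Resize(PyBytesWriter *writer, Py_ssize_t size)
Resize the writer to size bytes. It can be used to enlarge or to shrink the writer.
Newly allocated bytes are left uninitialized.
On success, return 0. On error, set an exception and return -1.
size must be positive or zero.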
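int PyBytesWriter_Grow(PyBytesWriter *writer, Py_ssize_t size)
Resize the writer by adding size bytes.
Newly allocated bytes are left uninitialized.
On success, return 0. On error, set an exception and return -1.
size can be negative to shrink the writer.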
void* PyBytesWriter_GrowAndUpdatePointer(PyBytesWriter *writer, Py_ssize_t size, void *buf)
Similar to PyBytesWriter_Grow(), but also update the buf pointer.
The buf pointer is moved if the internal buffer is moved in memory. The relative position of buf within the internal buffer is left unchanged.
On error, set an exception and return NULL.
buf must not be NULL.
Function pseudo-code:
    Py_ssize_t pos = (char*)buf - (char*)PyBytesWriter_GetData(writer);
    if (PyBytesWriter_Grow(writer, size) < 0) {
        return NULL;
    }
    return (char*)PyBytesWriter_GetData(writer) + pos;
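The same "save the offset, grow, rebase the pointer" pattern can be tried on a plain heap buffer. The sketch below does not use the Python C API: toy_buffer and toy_grow_and_update_pointer() are hypothetical names, and realloc() stands in for whatever the writer does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy buffer, not the CPython PyBytesWriter structure. */
typedef struct {
    char *data;
    size_t size;
} toy_buffer;

/* Grow `b` by `extra` bytes and return `pos` rebased onto the storage,
 * which realloc() may have moved. On allocation failure, return NULL
 * and leave `b` unchanged. */
static char *toy_grow_and_update_pointer(toy_buffer *b, size_t extra,
                                         char *pos)
{
    assert(b != NULL && pos != NULL);
    /* Remember the position relative to the start before growing. */
    size_t offset = (size_t)(pos - b->data);
    char *moved = realloc(b->data, b->size + extra);
    if (moved == NULL) {
        return NULL;
    }
    b->data = moved;
    b->size += extra;
    /* Same relative position, possibly at a new address. */
    return b->data + offset;
}
```

The caller keeps writing through the returned pointer and must not reuse the old one, since the old storage may have been freed.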
PyBytesWriter_Resize() and PyBytesWriter_Grow() overallocate the internal buffer to reduce the number of realloc() calls and so reduce memory copies.
PyBytesWriter_Finish() trims overallocations: it shrinks the internal buffer to the exact size when creating the final bytes object.
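A toy model of this behavior, assuming a writer that tracks a logical size separately from its allocated capacity, might look as follows. It does not use the Python C API; the toy_writer type, its functions, and the 25% overallocation factor are illustrative assumptions, not the CPython implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy writer, not the CPython PyBytesWriter structure. */
typedef struct {
    char *data;
    size_t size;       /* logical size seen by the caller */
    size_t allocated;  /* capacity of `data`, always >= size */
} toy_writer;

/* Grow the logical size by `extra` bytes. Reallocate only when the
 * capacity is exceeded, and then request 25% more than needed
 * (hypothetical factor; size_t overflow is not handled here). */
static int toy_writer_grow(toy_writer *w, size_t extra)
{
    size_t needed = w->size + extra;
    if (needed > w->allocated) {
        size_t new_alloc = needed + needed / 4;
        char *p = realloc(w->data, new_alloc);
        if (p == NULL) {
            return -1;
        }
        w->data = p;
        w->allocated = new_alloc;
    }
    w->size = needed;
    return 0;
}

/* Hand the buffer over to the caller, trimmed to the exact size.
 * The writer is reset and must not be used afterwards. */
static char *toy_writer_finish(toy_writer *w, size_t *out_size)
{
    char *result = w->data;
    if (w->size > 0 && w->size < w->allocated) {
        char *trimmed = realloc(w->data, w->size);
        if (trimmed != NULL) {
            result = trimmed;   /* keep the untrimmed block on failure */
        }
    }
    *out_size = w->size;
    w->data = NULL;
    w->size = 0;
    w->allocated = 0;
    return result;
}
```

In this model, growing an empty writer by 100 bytes allocates 125 bytes, so a later growth of 20 bytes needs no reallocation; finishing then shrinks the block to the 120 bytes actually in use.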
The API is not thread safe: a writer should only be used by one thread at a time.
Soft deprecations
Soft deprecate the PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() APIs. These APIs treat an immutable bytes object as a mutable object. They remain available and maintained and do not emit deprecation warnings, but they are no longer recommended for new code.
PyBytes_FromStringAndSize(str, size) is not soft deprecated. Only calls with a NULL str are soft deprecated.
Create the bytes string b"Hello World!":
    PyObject* hello_world(void)
    {
        PyBytesWriter *writer = PyBytesWriter_Create(0);
        if (writer == NULL) {
            goto error;
        }
        if (PyBytesWriter_WriteBytes(writer, "Hello", -1) < 0) {
            goto error;
        }
        if (PyBytesWriter_Format(writer, " %s!", "World") < 0) {
            goto error;
        }
        return PyBytesWriter_Finish(writer);

    error:
        PyBytesWriter_Discard(writer);
        return NULL;
    }
Create the bytes string “abc”
Example creating the bytes string b"abc", with a fixed size of 3 bytes:
    PyObject* create_abc(void)
    {
        PyBytesWriter *writer = PyBytesWriter_Create(3);
        if (writer == NULL) {
            return NULL;
        }

        char *str = PyBytesWriter_GetData(writer);
        memcpy(str, "abc", 3);
        return PyBytesWriter_Finish(writer);
    }
GrowAndUpdatePointer() example
Example using a pointer to write bytes and to track the written size.
Create the bytes string b"Hello World":
    PyObject* grow_example(void)
    {
        // Allocate 10 bytes
        PyBytesWriter *writer = PyBytesWriter_Create(10);
        if (writer == NULL) {
            return NULL;
        }

        // Write some bytes
        char *buf = PyBytesWriter_GetData(writer);
        memcpy(buf, "Hello ", strlen("Hello "));
        buf += strlen("Hello ");

        // Allocate 10 more bytes
        buf = PyBytesWriter_GrowAndUpdatePointer(writer, 10, buf);
        if (buf == NULL) {
            PyBytesWriter_Discard(writer);
            return NULL;
        }

        // Write more bytes
        memcpy(buf, "World", strlen("World"));
        buf += strlen("World");

        // Truncate the string at 'buf' position
        // and create a bytes object
        return PyBytesWriter_FinishWithPointer(writer, buf);
    }
Update PyBytes_FromStringAndSize() code
Example of code using the soft deprecated PyBytes_FromStringAndSize(NULL, size) API:
    PyObject *result = PyBytes_FromStringAndSize(NULL, num_bytes);
    if (result == NULL) {
        return NULL;
    }
    if (copy_bytes(PyBytes_AS_STRING(result), start, num_bytes) < 0) {
        Py_CLEAR(result);
    }
    return result;
It can now be updated to:
    PyBytesWriter *writer = PyBytesWriter_Create(num_bytes);
    if (writer == NULL) {
        return NULL;
    }
    if (copy_bytes(PyBytesWriter_GetData(writer), start, num_bytes) < 0) {
        PyBytesWriter_Discard(writer);
        return NULL;
    }
    return PyBytesWriter_Finish(writer);
Update _PyBytes_Resize() code
Example of code using the soft deprecated _PyBytes_Resize() API:
    PyObject *v = PyBytes_FromStringAndSize(NULL, size);
    if (v == NULL) {
        return NULL;
    }
    char *p = PyBytes_AS_STRING(v);

    // ... fill bytes into 'p' ...

    if (_PyBytes_Resize(&v, (p - PyBytes_AS_STRING(v)))) {
        return NULL;
    }
    return v;
It can now be updated to:
    PyBytesWriter *writer = PyBytesWriter_Create(size);
    if (writer == NULL) {
        return NULL;
    }
    char *p = PyBytesWriter_GetData(writer);

    // ... fill bytes into 'p' ...

    return PyBytesWriter_FinishWithPointer(writer, p);
Reference Implementation
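Notes on the CPython reference implementation which are not part of the Specification:
* The implementation internally allocates a bytes object, so PyBytesWriter_Finish() just returns the object without having to copy memory.
* For small strings, a small internal raw buffer of bytes is used. It avoids having to resize a bytes object, which is inefficient. At the end, PyBytesWriter_Finish() creates the bytes object from this small buffer.
* A free list is used to reduce the cost of allocating a PyBytesWriter on the heap memory.
Backwards Compatibility
There is no impact on backward compatibility: only new APIs are added.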
The PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() APIs are soft deprecated. No new warnings are emitted when these functions are used, and they are not planned for removal.
_PyBytesWriter C API.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.