Showing content from http://mail.python.org/pipermail/python-dev/attachments/20180806/702762a1/attachment.html below:
<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt">Background:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">While implementing an alternate runtime, I've been poking around some of the class initialization routines and found a subtle bug involving PyType_Ready and the header initializer
PyVarObject_HEAD_INIT.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Looking through the codebase, I couldn't really find any pattern for when the type should be defined within PyVarObject_HEAD_INIT. Sometimes it was initialized to NULL (or 0) and sometimes to &PyType_Type (let's ignore Py_True
and Py_False for now).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">From PyType_Ready it turns out that setting the value to &PyType_Type is never actually needed outside of PyType_Type and PyBaseObject_Type. This is clear from the code:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">if (Py_TYPE(type) == NULL && base != NULL)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> Py_TYPE(type) = Py_TYPE(base);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Given that any PyTypeObject's base is of type PyType_Type, setting PyVarObject_HEAD_INIT(&PyType_Type) is superfluous. Therefore, setting the ob_type of all static PyTypeObjects to NULL should be a safe
assumption to make.<o:p></o:p></span></p>
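<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">The fallback above can be modeled in a few lines of plain C. This is only a sketch under stated assumptions: toy_type, toy_object, toy_metatype, toy_list_iter and toy_inherit_metatype are hypothetical stand-ins for PyTypeObject, PyBaseObject_Type, PyType_Type, an ordinary static type and the relevant part of PyType_Ready. Null pointers are written as 0 so that the sketch needs no headers.<o:p></o:p></span></p>
<pre>
```c
/* Toy model of the metatype fallback. toy_type is a hypothetical stand-in
   for PyTypeObject, not CPython's real struct. */
typedef struct toy_type {
    struct toy_type *ob_type;  /* metatype, i.e. what Py_TYPE(type) reads */
    struct toy_type *tp_base;  /* base class, 0 when left unset */
} toy_type;

/* The two root types name their metatype explicitly. */
static toy_type toy_metatype;  /* tentative definition, completed below */
static toy_type toy_object   = { &toy_metatype, 0 };
static toy_type toy_metatype = { &toy_metatype, &toy_object };

/* An ordinary static type that leaves both fields unset. */
static toy_type toy_list_iter = { 0, 0 };

/* Models only the part of PyType_Ready discussed here. */
static void toy_inherit_metatype(toy_type *type)
{
    toy_type *base = type->tp_base;

    /* A type without an explicit base defaults to the root object type. */
    if (base == 0 && type != &toy_object)
        base = type->tp_base = &toy_object;

    /* The fallback quoted above: a missing metatype is copied from the base. */
    if (type->ob_type == 0 && base != 0)
        type->ob_type = base->ob_type;
}
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt">After calling toy_inherit_metatype(&toy_list_iter), toy_list_iter.ob_type points at toy_metatype even though its initializer left it unset, which is why the explicit metatype looks redundant for ordinary types in this model.<o:p></o:p></span></p>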
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Uninitialized Types:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">A quick s/PyVarObject_HEAD_INIT(&PyType_Type/PyVarObject_HEAD_INIT(NULL/ shows that some objects do need to have their ob_type set from the outset, violating the previous assumption. After writing a quick
script, I found that out of the ~300 PyVarObject_HEAD_INIT present in CPython, only these 14 types segfaulted:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyByteArrayIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyBytesIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyDictIterKey_Type <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyDictIterValue_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyDictIterItem_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyClassMethod_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyAsyncGen_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyListIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyListRevIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyODictIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyLongRangeIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PySetIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyTupleIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">PyUnicodeIter_Type<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Bug:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">It turns out that these PyTypeObjects are never initialized through PyType_Ready, yet they are used as if they were fully initialized types. It is by pure chance that they work without the initializer ever being called
on them. This, though, is undefined behavior. Not only might it result in a weird future bug that is hard to chase down, it also affects other runtimes, since it depends on implementation details of CPython.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">This is a pervasive pattern that should be removed from the codebase and ideally extensions should follow as well.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Solution:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Here are my proposed solutions, ordered from least to most controversial. Note that I have already tried all of them in a local branch and they work:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">1) Initialize all uninitialized types.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Example: <a href="https://github.com/eduardo-elizondo/cpython/commit/bc53db3cf4e5a6923b0b1afa6181305553faf173">
https://github.com/eduardo-elizondo/cpython/commit/bc53db3cf4e5a6923b0b1afa6181305553faf173</a><br>
<br>
<o:p></o:p></span></p>
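<p class="MsoNormal"><span style="font-size:11.0pt">As a rough illustration of option 1, the sketch below readies every static type from one table at startup. It is a toy model and may differ from the linked commit: toy_type, toy_type_ready, toy_static_types and toy_ready_all are hypothetical names, with toy_type_ready standing in for PyType_Ready and returning 0 on success or -1 on failure.<o:p></o:p></span></p>
<pre>
```c
/* Hypothetical stand-in for PyTypeObject with only the fields needed here. */
typedef struct {
    const char *name;
    int ready;          /* set once the type has been fully initialized */
} toy_type;

/* Illustrative static types that start out unready. */
static toy_type toy_list_iter  = { "list_iterator", 0 };
static toy_type toy_tuple_iter = { "tuple_iterator", 0 };
static toy_type toy_set_iter   = { "set_iterator", 0 };

/* Stand-in for PyType_Ready(): 0 on success, -1 on failure. */
static int toy_type_ready(toy_type *type)
{
    if (type->name == 0)
        return -1;
    type->ready = 1;
    return 0;
}

/* Every static type is listed exactly once. */
static toy_type *const toy_static_types[] = {
    &toy_list_iter,
    &toy_tuple_iter,
    &toy_set_iter,
};

/* Ready all listed types at startup and stop at the first failure. */
static int toy_ready_all(void)
{
    int n = (int)(sizeof(toy_static_types) / sizeof(toy_static_types[0]));
    int i;

    for (i = 0; i < n; i++) {
        if (toy_type_ready(toy_static_types[i]) < 0)
            return -1;
    }
    return 0;
}
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt">Keeping the list in one place makes a forgotten type easier to spot than per-file initialization would, at the cost of one more table to maintain.<o:p></o:p></span></p>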
<p class="MsoNormal"><span style="font-size:11.0pt">2) Move all PyVarObject_HEAD_INIT to NULL except PyType_Type, PyBaseObject_Type and the booleans.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">3) Special-case the initialization of PyType_Type and PyBaseObject_Type within PyType_Ready so that all calls to PyVarObject_HEAD_INIT can use NULL. To enable this, a small change within PyType_Ready is needed
to initialize PyType_Type and PyBaseObject_Type:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">if (base == NULL) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> Py_TYPE(&PyType_Type) = &PyType_Type;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> Py_TYPE(type) = &PyType_Type;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Also, the booleans have to be fully initialized without calling PyVarObject_HEAD_INIT. I propose:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">struct _longobject _Py_FalseStruct = {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> PyObject_HEAD_INIT(&PyBool_Type), 0, { 0 }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">};<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">This will clean up this anti-pattern across the entire codebase.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Example: <a href="https://github.com/eduardo-elizondo/cpython/commit/542fd79e4279c64c077c127b175a8d856d3c5f0b">
https://github.com/eduardo-elizondo/cpython/commit/542fd79e4279c64c077c127b175a8d856d3c5f0b</a><br>
<br>
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">4) Modify PyVarObject_HEAD_INIT to ignore the input and initialize to NULL and 0.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">In order to prevent this antipattern within extension code as well, we should make PyVarObject_HEAD_INIT ignore the inputs and just set the value to NULL.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">#define PyVarObject_HEAD_INIT(type, size) \<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> { PyObject_HEAD_INIT(NULL) 0 },<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">This will prevent external code from having a semi-initialized type that is not initialized through PyType_Ready.<o:p></o:p></span></p>
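<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">The effect of option 4 on existing code can be sketched with a toy macro. Everything here is hypothetical: toy_type, toy_var_head, TOY_VAR_HEAD_INIT, toy_metatype and toy_ext_type only imitate the shape of PyTypeObject, the PyVarObject header, PyVarObject_HEAD_INIT, PyType_Type and an extension type.<o:p></o:p></span></p>
<pre>
```c
/* Hypothetical stand-ins for PyTypeObject and the PyVarObject header. */
typedef struct toy_type toy_type;

typedef struct {
    toy_type *ob_type;
    long ob_size;
} toy_var_head;

struct toy_type {
    toy_var_head head;
    const char *tp_name;
};

/* Still accepts two arguments for source compatibility, but expands to a
   null type and a zero size no matter what is passed. */
#define TOY_VAR_HEAD_INIT(type, size) { 0, 0 },

toy_type toy_metatype = { TOY_VAR_HEAD_INIT(0, 0) "type" };

/* Existing extension-style code keeps compiling, yet the metatype and
   size it passes are dropped by the macro. */
toy_type toy_ext_type = { TOY_VAR_HEAD_INIT(&toy_metatype, 7) "ext" };
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt">In this model toy_ext_type.head.ob_type stays null and ob_size stays 0 despite the arguments, so the type only gets a metatype once a PyType_Ready-style call fills it in.<o:p></o:p></span></p>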
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">5) Finally, I would go even further and suggest making PyVarObject_HEAD_INIT argumentless.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">#define PyVarObject_HEAD_INIT \<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> { PyObject_HEAD_INIT(NULL) 0 },<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">However, this breaks backward compatibility. That being said, this will make extension maintainers aware of this antipattern.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Example: <a href="https://github.com/eduardo-elizondo/cpython/commit/3869e53843008ff096764f4adaf26efbb5625996">
https://github.com/eduardo-elizondo/cpython/commit/3869e53843008ff096764f4adaf26efbb5625996</a><br>
<br>
Thoughts?<o:p></o:p></span></p>
</div>
</body>
</html>