A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://mail.python.org/pipermail/python-dev/2003-May/035639.html below:

[Python-Dev] Simple dicts

[Python-Dev] Simple dictsDamien Morton damien.morton@acm.org
Thu, 15 May 2003 18:25:13 -0400
Im currently working on an open-chaining dict. Between paying work and
coming to grips with the python innards, it might take a little while.

I was working on an implementation optimised for dicts with <256 entries
that attempted to squeeze the most out of memory by using bytes for the
'first' and 'next' pointers. This kind of hashtable can be _extremely_
sparse compared to the current dict implementation. With the
byte-oriented open-chaining approach, the break-even point for memory
usage (compared to the current approach) happens at a max load factor of
about 0.1. 

Im not sure that alloc()/free() for each dictentry is a win (if only
because of pymalloc call overhead), and instead imagine a scheme whereby
each dict would pre-alloc() a block of memory and manages its own
free-lists. Theoretically, this makes copying and disposing of dicts
much easier. It also helps ensure locality of reference. In fact,
immediately after a doubling, the open-addressing hashtable scheme still
'uses' (in the sense of potentially addressing) all of the memory
allocated to it, whereas the open-chaining approach 'uses' only the
first pointers and the actual dictentries in use - about 2/3 of the
space the open-addressing scheme uses.

On the other hand, as Tim pointed out to me in a private email, there is
so much overhead in just getting to the hashtable inner loop, going
around that loop one time instead of two or three seems inconsequential.

On the third hand, first-miss and first-hit lookups are simple enough
that they could easily be put into a macro. I will need to take a closer
look at Oren Tirosh's fastnames patch.

I have a question that someone may be able to answer:

There seem to be two different ways to get/set/del from a dictionary.

The first is using PyDict_[Get|Set|Del]Item()

The second is using the embarssingly named dict_ass_sub() and its
partner dict_subscript().

Which of these two access methods is most likely to be used?




RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4