
Fri Oct 19 14:36:39 CEST 2012 · https://mail.python.org/pipermail/python-dev/2012-October/122236.html

2012/10/19 Benjamin Peterson <benjamin at python.org>:
> It would be interesting to see how common it is for strings which have
> their hash computed to be compared.

I implemented a quick hack. When running "./python -m test test_os",
Python calls PyUnicode_RichCompare() 15206 times with the Py_EQ or Py_NE
operator. In 41.4% of these calls (6295), the hashes of both operands
are known. In 41.2% (6262 calls out of 15206), the hashes of both
operands are known *and are different*!
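
To make the measurement concrete, here is a minimal Python sketch of the idea. It is not the actual hack, which lives in the C code of PyUnicode_RichCompare(). The names CachedStr, equal and stats are hypothetical and exist only for this illustration; the sketch assumes a cached hash of -1 means "not computed yet", as CPython does for str objects.

```python
class CachedStr:
    """Toy stand-in (hypothetical) for a str object with a lazily cached hash."""

    def __init__(self, value):
        self.value = value
        self.cached_hash = -1  # -1 means "hash not computed yet"

    def __hash__(self):
        if self.cached_hash == -1:
            # The built-in hash() never returns -1 in CPython, so -1 stays
            # usable as the "unknown" marker.
            self.cached_hash = hash(self.value)
        return self.cached_hash


# Counters mirroring the three numbers reported above.
stats = {"calls": 0, "both_known": 0, "known_and_different": 0}


def equal(a, b):
    """Equality test that uses cached hashes as a fast negative check."""
    stats["calls"] += 1
    if a.cached_hash != -1 and b.cached_hash != -1:
        stats["both_known"] += 1
        if a.cached_hash != b.cached_hash:
            # Different hashes imply different strings: skip the comparison.
            stats["known_and_different"] += 1
            return False
    # Hashes unknown or equal: fall back to the full comparison.
    return a.value == b.value
```

Equal hashes do not prove equality, so the shortcut only helps for the "not equal" answer; the full comparison is still needed in every other case.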

The hit rate may depend on how long the process has been running. For
example, in a fresh interpreter, the hit rate is only 7% (189 hits /
2703 calls).

When running the test suite, the hit rate is around 80% (hashes are
known in 90% of cases) after running 70 tests. At the same time, the
average string length is 4.1 characters and almost all strings are
pure ASCII.

I created issue http://bugs.python.org/issue16286 to discuss this
optimization.

Victor

[Python-Dev] Why not using the hash when comparing strings?