Showing content from http://mail.python.org/pipermail/python-dev/attachments/20121019/345830f3/attachment.html below:
<br><br><div class="gmail_quote">On Fri, Oct 19, 2012 at 8:36 AM, Victor Stinner <span dir="ltr"><<a href="mailto:victor.stinner@gmail.com" target="_blank">victor.stinner@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
2012/10/19 Benjamin Peterson <<a href="mailto:benjamin@python.org">benjamin@python.org</a>>:<br>
<div class="im">> It would be interesting to see how common it is for strings which have<br>
> their hash computed to be compared.<br>
<br>
</div>I implemented a quick hack. When running "./python -m test test_os",<br>
Python calls PyUnicode_RichCompare() 15206 times with the Py_EQ or Py_NE<br>
operator. In 41.4% of the calls (6295), the hashes of both operands are<br>
known. In 41.2% (6262 of 15206), the hashes of both operands<br>
are known *and are different*!<br>
<br>
The hit rate may depend on how long the process has been running. For<br>
example, in a fresh interpreter, the hit rate is only 7% (189 hits /<br>
2703 calls).<br>
<br>
When running the test suite, the hit rate is around 80% (hashes are<br>
known in 90% of cases) after running 70 tests. At the same time, the average<br>
string length is 4.1 characters and almost all strings are pure ASCII.<br>
<br>
I created issue <a href="http://bugs.python.org/issue16286" target="_blank">http://bugs.python.org/issue16286</a> to discuss this<br>
optimization.<br></blockquote><div><br></div><div>If you want to measure the performance impact compared to a clean build, you can use the unladen benchmarks, as they now contain several Python 3-compatible benchmarks.</div>
</div>
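For readers following the thread, the idea being measured can be sketched in pure Python. This is only an illustration under stated assumptions, not the patch attached to the issue: `CachedHashStr`, `fast_eq` and the `stats` counters are hypothetical names, and the real change would live in C inside PyUnicode_RichCompare(). The sketch shows an equality check that returns early when both hashes are already cached and differ, while counting calls, known-hash cases and hits the way the quick hack above does.

```python
class CachedHashStr:
    """Toy wrapper (hypothetical) that caches its hash lazily,
    loosely mirroring how a str object remembers its hash once computed."""

    __slots__ = ("value", "_hash")

    def __init__(self, value):
        self.value = value
        self._hash = None  # hash not computed yet

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(self.value)
        return self._hash

    def hash_known(self):
        return self._hash is not None


# Counters analogous to the figures quoted in the message.
stats = {"calls": 0, "hash_known": 0, "hits": 0}


def fast_eq(a, b):
    """Equality with a hash shortcut: two different hashes imply
    two different strings, so the character comparison is skipped."""
    stats["calls"] += 1
    if a.hash_known() and b.hash_known():
        stats["hash_known"] += 1
        if a._hash != b._hash:
            stats["hits"] += 1
            return False
    # Equal or unknown hashes prove nothing: compare the contents.
    return a.value == b.value


if __name__ == "__main__":
    x, y = CachedHashStr("spam"), CachedHashStr("eggs")
    print(fast_eq(x, y))   # hashes unknown: full comparison
    hash(x), hash(y)       # e.g. both were used as dict keys
    print(fast_eq(x, y))   # hashes known: shortcut is possible
    print(stats)
```

The shortcut only ever proves inequality. Equal hashes still require the full comparison, which is why the figure of interest above is the share of calls where both hashes are known and different.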