On 05/06/14 22:51, Nathaniel Smith wrote:

> This gets evaluated as:
>
>   tmp1 = a + b
>   tmp2 = tmp1 + c
>   result = tmp2 / c
>
> All these temporaries are very expensive. Suppose that a, b, c are
> arrays with N bytes each, and N is large. For simple arithmetic like
> this, then costs are dominated by memory access. Allocating an N byte
> array requires the kernel to clear the memory, which incurs N bytes of
> memory traffic.

It seems to be the case that a large portion of the run-time in Python
code using NumPy can be spent in the kernel zeroing pages (which the
kernel does for security reasons).

I think this can also be seen as a 'malloc problem'. It comes about
because each new NumPy array starts with a fresh buffer allocated by
malloc. Perhaps buffers can be reused?

Sturla
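
The buffer-reuse question above can be illustrated with the `out=` argument that NumPy ufuncs already accept. This is only a minimal sketch, not a proposal from the thread: the function names `with_temporaries` and `with_reused_buffer` are hypothetical, and the arrays are assumed to share one shape and a floating-point dtype.

```python
import numpy as np


def with_temporaries(a, b, c):
    # Each operator allocates a fresh array: tmp1, tmp2 and the result.
    return (a + b + c) / c


def with_reused_buffer(a, b, c, out=None):
    # Hypothetical helper: every step writes into the same buffer,
    # so at most one new array is allocated (none if `out` is given).
    # Assumes a, b, c have the same shape and a float dtype.
    if out is None:
        out = np.empty_like(a)
    np.add(a, b, out=out)        # out = a + b
    np.add(out, c, out=out)      # out = out + c
    np.divide(out, c, out=out)   # out = out / c
    return out
```

Passing a preallocated `out` array lets a caller that evaluates the expression repeatedly avoid requesting fresh memory on each call. Whether this helps in practice depends on the allocator and on array size, so it is worth measuring rather than assuming.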