performance regression for record array access in numpy 1.10.1 · Issue #6467 · numpy/numpy · GitHub

It appears that accessing numpy record arrays by field name is significantly slower in numpy 1.10.1. Below is a simple test that illustrates the issue. (I am aware that this particular example is much better accomplished by other means. The point is that field access is slow, not that this is a representative problem.)

The test script is

#!/usr/bin/env python
import time
import sys
import numpy as np

def test(N=100000, verbose=False):
    d = np.zeros(1, dtype=[('col', 'f8')])

    t0 = time.time()
    for i in xrange(N):
        d['col'] += i
    t0 = time.time() - t0

    if verbose:
        print 'numpy version:', np.version.version
        print 'time: %g' % t0

if __name__ == "__main__":
    if len(sys.argv) > 1:
        N = int(sys.argv[1])
        test(N=N, verbose=True)
    else:
        test(verbose=True)

Here are the running times for different versions of numpy:

numpy version: 1.9.3
time: 0.262786

numpy version: 1.10.1
time: 3.57254
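As a side note, the per-iteration cost can be avoided in any numpy version by performing the field lookup once, outside the loop: `d['col']` returns a view into the record array, and in-place operations on that view write through to `d`. A minimal Python 3 sketch of this workaround (not part of the original report; `test_hoisted` is a hypothetical name):

```python
import time
import numpy as np

def test_hoisted(N=100000, verbose=False):
    """Same benchmark, but the field lookup happens once, not N times."""
    d = np.zeros(1, dtype=[('col', 'f8')])
    col = d['col']            # single field lookup; `col` is a view into d

    t0 = time.time()
    for i in range(N):
        col += i              # in-place add on the cached view, writes through to d
    t0 = time.time() - t0

    if verbose:
        print('numpy version:', np.version.version)
        print('time: %g' % t0)
    return float(d['col'][0])
```

For example, `test_hoisted(N=100)` returns `sum(range(100)) == 4950.0`, confirming the view writes through to the original array.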

@esheldon has reproduced the relative timing difference on Linux; my own tests were on a Mac.

I profiled the code for v1.10.1 and found this

         3200006 function calls (3000006 primitive calls) in 4.521 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   200000    1.386    0.000    1.883    0.000 _internal.py:372(_check_field_overlap)
600000/400000    0.751    0.000    0.906    0.000 _internal.py:337(_get_all_field_offsets)
        1    0.566    0.566    4.521    4.521 numpy_test.py:7(test)
   200000    0.401    0.000    3.189    0.000 _internal.py:425(_getfield_is_safe)
   200000    0.350    0.000    3.955    0.000 _internal.py:287(_index_fields)
   200000    0.323    0.000    3.513    0.000 {method 'getfield' of 'numpy.ndarray' objects}
   400000    0.279    0.000    0.279    0.000 {range}
   400000    0.155    0.000    0.155    0.000 {method 'update' of 'set' objects}
   400000    0.106    0.000    0.106    0.000 {method 'append' of 'list' objects}
   200000    0.093    0.000    0.093    0.000 {isinstance}
   200000    0.062    0.000    0.062    0.000 {method 'difference' of 'set' objects}
   200000    0.048    0.000    0.048    0.000 {method 'extend' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    4.521    4.521 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 {time.time}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

It appears that new error-checking code added at the Python level (the field-overlap safety checks in `_internal.py`, invoked on every field access) is significantly degrading performance.
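For reference, a profile like the one above can be collected programmatically with the standard-library profiler. A Python 3 sketch (the `bench` helper is illustrative, not from the original report; the hot function names will differ across numpy versions):

```python
import cProfile
import pstats
import numpy as np

def bench(N=200000):
    """Repeated field access on a record array, as in the bug report."""
    d = np.zeros(1, dtype=[('col', 'f8')])
    for i in range(N):
        d['col'] += i          # one field lookup per iteration
    return float(d['col'][0])

if __name__ == '__main__':
    pr = cProfile.Profile()
    pr.enable()
    bench()
    pr.disable()
    # Top 10 entries by internal time, matching the ordering shown above
    pstats.Stats(pr).sort_stats('tottime').print_stats(10)
```

On an affected 1.10.x install, the `_internal.py` entries (`_check_field_overlap`, `_getfield_is_safe`) dominate the listing; on 1.9.x they do not appear at all.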
