Showing content from https://github.com/opencv/opencv/issues/25011 below:

New shiny Core module for OpenCV 5.0 · Issue #25011 · opencv/opencv · GitHub

Introduction

The Core module is the crucial module in OpenCV. All other modules depend on it.
It must form a solid foundation for a future-proof OpenCV.

What are the desirable key properties of the module:

A bit of history and why core module should be somewhat like Python's numpy

OpenCV's Core module in its current form was created around 2009, when cv::Mat, a multi-dimensional dense array, was introduced as a complete replacement for CvMat, CvMatND and IplImage. The whole OpenCV API (a C API before 2009) was rebuilt around cv::Mat and a few other basic structures like std::vector<> (to handle point clouds etc.). The idea of combining image, matrix and multi-dimensional array (tensor) was borrowed from Matlab, where toolboxes, including the image processing toolbox, the basic linear algebra toolbox, Jean-Yves Bouguet's camera calibration toolbox etc., all happily use Matlab matrices, so it's super-easy to create pipelines that use algorithms from different areas.

It seems that Python's famous numerical extension numpy borrowed the same idea and implemented a ubiquitous matrix/array type called ndarray. On top of numpy, bigger packages such as scipy and scikit-learn have been developed. An efficient yet quite comprehensive set of basic operations, extended by the derived packages, mostly eliminated the problem of the very low speed of hand-written Python code (because all the kernels in numpy are implemented in efficient C or Fortran). That suddenly made Python a sound substitute for Fortran and Matlab in the new century.

With the rise of Deep Learning the idea has been greatly extended. An efficient, GPU-accelerated, comprehensive set of operations (very similar to numpy) that can be put together into graphs, together with automatic differentiation tools, formed the foundation of modern Deep Learning technology. If one looks at PyTorch, TensorFlow, JAX, the ONNX specification etc., one will find many similarities with numpy. In particular, many ONNX operations follow numpy quite closely and use numpy to illustrate those operations. Of course, there are some deep learning-specific operations like Convolution, SoftMax, Dropout or Attention, but most ONNX operations have numpy counterparts.

The Python community (since all the aforementioned frameworks, except for OpenCV, mainly use Python) noticed the close resemblance among many array processing frameworks and decided to introduce the so-called Python array API standard. It's clear that this is an emerging standard: its API lacks some important numpy functionality and some important PyTorch/ONNX operations, and it lacks the notion of an external accelerator (like a GPU or NPU) to which the user may want to transfer an array/tensor, perform a set of operations there and transfer the results back. This is crucial functionality for deep learning frameworks and for OpenCV, with its deep learning module and its GPU-accelerated image processing functionality. So the standard will definitely evolve, but it makes sense for us in OpenCV 5 to comply with it more or less even now. Besides implementing the already specified API, for us it's an opportunity to offer the community extra kernels that are important for computer vision and image processing use cases.

The list of functions to implement/improve in Core module in OpenCV 5.0

Basically, OpenCV's core module should implement a big subset of "Python array API standard" with certain extensions that we consider useful.

The list of functions below has been directly copied from https://data-apis.org/array-api/latest/API_specification/index.html. Probably, the following content should be presented in a table.

  1. Unary/binary arithmetic, math and logic operations. We already have implementations of many of those operations (sometimes under slightly different names), but we need to support broadcasting for binary operations.

    abs    // cv::absdiff with 0 as a second parameter
    acos
    acosh  // via cv::log()
    add    // cv::add
    asin
    asinh  // via cv::log()
    atan
    atan2  // cv::phase or cv::cartToPolar
    atanh  // via cv::log()
    bitwise_and // cv::bitwise_and
    bitwise_left_shift
    bitwise_invert // cv::bitwise_not
    bitwise_or // cv::bitwise_or
    bitwise_right_shift
    bitwise_xor // cv::bitwise_xor
    ceil
    conj
    cos    // only via cv::polarToCart
    cosh   // via cv::exp()
    divide // cv::divide
    equal  // cv::compare(..., CMP_EQ)
    exp    // cv::exp
    expm1
    floor
    floor_divide
    greater // cv::compare(..., CMP_GT)
    greater_equal // cv::compare(..., CMP_GE)
    imag
    isfinite
    isinf
    isnan
    less  // cv::compare(..., CMP_LT)
    less_equal // cv::compare(..., CMP_LE)
    log   // cv::log
    log1p
    log2
    log10
    logaddexp
    logical_and
    logical_not
    logical_or
    logical_xor
    multiply  // cv::multiply
    negative
    not_equal  // cv::compare(..., CMP_NE)
    positive   // copyTo()
    pow        // cv::pow
    real
    remainder
    round     // convertTo()
    sign
    sin      // only via cv::polarToCart
    sinh      // via cv::exp()
    square    // via cv::multiply()
    sqrt      // cv::sqrt()
    subtract  // cv::subtract()
    tan
    tanh    // no direct function. Can be computed via cv::exp()
    trunc
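
The broadcasting requirement for binary operations can be illustrated with a minimal sketch of the shape rule used by numpy and the array API standard: shapes are aligned at their trailing dimensions, and two sizes are compatible when they are equal or one of them is 1. The helper name `broadcast_shape` is hypothetical and is not an OpenCV function; the sketch computes only the result shape, not the element-wise operation itself.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape of two array shapes.

    Hypothetical helper for illustration only (not part of OpenCV).
    """
    result = []
    # Walk both shapes from the last dimension to the first;
    # a missing leading dimension is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcastable")
    return tuple(reversed(result))


# A 480x640x3 image combined with a per-channel vector of 3 values.
print(broadcast_shape((480, 640, 3), (3,)))  # (480, 640, 3)
# A column vector combined with a row vector gives a full matrix.
print(broadcast_shape((4, 1), (1, 5)))       # (4, 5)
```

With such a rule, an operation like add could accept an image and a per-channel vector directly, without the caller replicating the smaller operand first.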
    
  2. Linear algebra functions. Same situation.

    matmul // cv::gemm()
    matrix_transpose // cv::transpose()
    tensordot
    vecdot
    
  3. Array permutation functions. Same situation:

    broadcast_arrays
    broadcast_to   // +
    concat        // in OpenCV we have 2D hconcat and vconcat
    expand_dims
    flip          // 2D only for now
    permute_dims  // in cv::dnn we have general Transpose. in core we have 2D transpose
    reshape
    roll
    squeeze
    stack
    
  4. Statistical functions. Same situation:

    max   // minMaxIdx() computes min, max and their indices.
    mean  // cv::mean
    min   // via minMaxIdx()
    prod
    std   // cv::meanStdDev() computes both mean and standard deviation
    sum   // cv::sum
    var   // via cv::meanStdDev()
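
As a rough illustration of the "var via cv::meanStdDev()" note, the sketch below derives a variance from a standard deviation. It assumes the standard deviation was computed with a divisor of N (no correction), which is how the OpenCV documentation defines cv::meanStdDev, and it mirrors the `correction` parameter of `var` in the array API standard. The helper `var_from_std` is hypothetical, and Python's `statistics` module stands in for the OpenCV call.

```python
import statistics


def var_from_std(std_dev, n, correction=0.0):
    """Derive a variance from a population standard deviation.

    Hypothetical helper for illustration only (not part of OpenCV).
    std_dev is assumed to use a divisor of n; the result uses a
    divisor of (n - correction).
    """
    if n - correction <= 0:
        raise ValueError("n must be greater than correction")
    return std_dev * std_dev * n / (n - correction)


samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
pop_std = statistics.pstdev(samples)  # divisor n, as assumed above

# Population variance (correction = 0).
print(var_from_std(pop_std, len(samples)))
# Sample variance (correction = 1, i.e. Bessel's correction).
print(var_from_std(pop_std, len(samples), correction=1.0))
```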
    
  5. Misc functions from several other groups. Mostly implemented as well in one form or another:

    // searching functions
    argmax
    argmin
    nonzero // ~ cv::findNonZero(); cv::countNonZero() returns only the count
    where // element-wise ternary operator ?:
    
    // set functions
    unique_all
    unique_counts
    unique_inverse
    unique_values
    
    // sorting
    argsort  // called cv::sortIdx() in OpenCV
    sort     // cv::sort()
    
    // utility functions
    all    // as cv::countNonZero(m) == m.total()
    any   // cv::hasNonZero
    
    // initialization functions
    arange
    asarray // many non-Mat arrays can be converted to cv::Mat using getMat().
            // in Python bindings Mat is implicitly constructed from ndarray and vice versa
    empty   // Mat()
    empty_like
    eye     // Mat::eye()
    from_dlpack
    full
    full_like
    linspace
    meshgrid
    ones     // Mat::ones()
    ones_like
    tril
    triu
    zeros    // Mat::zeros()
    zeros_like
    
  6. Some useful extra operations not included in the "Python array API standard", but included in the numpy and/or ONNX specifications:

    einsum // already implemented in cv::dnn
    einops.* // a family of operations from excellent einops package:
             // https://github.com/arogozhnikov/einops
    reduce(..., sum|min|max|avg|...) // in core we already have 2D reduce(),
                                    // need to extend it to ND, as in ONNX or cv::dnn
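
Extending reduce() from 2D to ND can be sketched as follows, assuming a dense array stored as a flat row-major buffer. The function `reduce_axis` and its signature are hypothetical and do not correspond to the ONNX or cv::dnn implementations; the sketch supports only a single non-negative axis and drops the reduced dimension.

```python
from functools import reduce as fold
from math import prod
import operator


def reduce_axis(data, shape, axis, op):
    """Reduce a flat row-major array of the given shape along one axis.

    Hypothetical helper for illustration only (not part of OpenCV).
    `axis` must be non-negative. Returns (result, result_shape).
    """
    if len(data) != prod(shape):
        raise ValueError("data length does not match shape")
    outer = prod(shape[:axis])       # number of slices before the axis
    n = shape[axis]                  # length of the reduced axis
    inner = prod(shape[axis + 1:])   # contiguous elements after the axis
    result = []
    for o in range(outer):
        for i in range(inner):
            # Elements along the axis are `inner` positions apart.
            start = o * n * inner + i
            result.append(
                fold(op, (data[start + k * inner] for k in range(n))))
    return result, shape[:axis] + shape[axis + 1:]


# A 2x3x2 array stored in row-major order.
data = list(range(12))
shape = (2, 3, 2)
print(reduce_axis(data, shape, 1, operator.add))  # sum over the middle axis
print(reduce_axis(data, shape, 2, max))           # max over the last axis
```

The same outer/axis/inner decomposition works for any number of dimensions, so the 2D case (reducing rows or columns) is just the special case where either outer or inner equals 1.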
    

Also, in Core we already have a bunch of functions implemented in numpy, but missing in the standard, like various matrix decomposition and backward substitution algorithms (LU, Cholesky, SVD, QR), FFT etc.

As you can see, many of the operations are already implemented in Core or in DNN module.

What needs to be done basically:

So, once again, why is it important, besides the declaration that we 'sort of implemented' the emerging standard?
