Compute the distance matrix between each pair of row vectors from feature arrays X and Y.
For efficiency reasons, the Euclidean distance between a pair of row vectors x and y is computed as:
dist(x, y) = sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y))
This formulation has two advantages over other ways of computing distances. First, it is computationally efficient when dealing with sparse data. Second, if one argument varies but the other remains unchanged, then dot(x, x) and/or dot(y, y) can be pre-computed.
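The expanded formula follows from dot(x - y, x - y) = dot(x, x) - 2 * dot(x, y) + dot(y, y). A minimal NumPy sketch of how it can be applied to whole arrays at once, with the squared norms of an unchanged Y computed a single time and reused, might look like the following. The helper name pairwise_euclidean_sketch is hypothetical and the code is illustrative only; it is not scikit-learn's implementation.

```python
import numpy as np


def pairwise_euclidean_sketch(X, Y, Y_norm_squared=None):
    """Hypothetical helper: pairwise distances via the expanded formula.

    Illustrative only; this is not scikit-learn's implementation.
    """
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    # dot(x, x) for every row of X, as a column vector of shape (n_X, 1)
    XX = (X ** 2).sum(axis=1)[:, np.newaxis]
    if Y_norm_squared is None:
        # dot(y, y) for every row of Y, as a row vector of shape (1, n_Y)
        YY = (Y ** 2).sum(axis=1)[np.newaxis, :]
    else:
        YY = np.asarray(Y_norm_squared, dtype=np.float64).reshape(1, -1)
    # dot(x, x) - 2 * dot(x, y) + dot(y, y) for all pairs at once
    D2 = XX - 2.0 * (X @ Y.T) + YY
    # Rounding can leave tiny negative values; clip before the square root
    np.maximum(D2, 0.0, out=D2)
    return np.sqrt(D2)


# Y stays fixed, so its squared norms are computed once and reused
Y = np.array([[0.0, 0.0], [3.0, 4.0]])
Y_sq = (Y ** 2).sum(axis=1)
D_a = pairwise_euclidean_sketch([[0, 1], [1, 1]], Y, Y_norm_squared=Y_sq)
D_b = pairwise_euclidean_sketch([[3, 4]], Y, Y_norm_squared=Y_sq)
```

The only per-call work that depends on both arguments is the matrix product X @ Y.T, which is also the step that can exploit sparsity when the inputs are sparse.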
However, this is not the most precise way of doing this computation, because this equation potentially suffers from “catastrophic cancellation”. Also, the distance matrix returned by this function may not be exactly symmetric as required by, e.g., scipy.spatial.distance functions.
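To make the cancellation issue concrete, the sketch below compares the expanded formula with the direct difference-based computation for two nearby points that lie far from the origin. The vectors are chosen purely for illustration, and the code evaluates the formula by hand rather than calling scikit-learn, so it says nothing about what the library itself returns for these inputs.

```python
import numpy as np

# Two points exactly 1.0 apart, but far from the origin
x = np.array([1e8 + 1.0, 0.0])
y = np.array([1e8, 0.0])

# Direct computation: subtract first, then square (exact for these inputs)
direct = np.sqrt(np.sum((x - y) ** 2))

# Expanded formula: three terms of size ~1e16 nearly cancel, so the
# rounding error of each term dominates the small true result
expanded_sq = np.dot(x, x) - 2.0 * np.dot(x, y) + np.dot(y, y)
expanded = np.sqrt(max(expanded_sq, 0.0))

print("direct:  ", direct)
print("expanded:", expanded)
```

In float64, integers near 1e16 are spaced 2 apart, so dot(x, x) cannot be stored exactly and the expanded result cannot equal the true distance of 1.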
Read more in the User Guide.
Parameters
X: An array where each row is a sample and each column is a feature.
Y: An array where each row is a sample and each column is a feature. If None, the method uses Y=X.
Y_norm_squared: Pre-computed dot products of vectors in Y (e.g., (Y**2).sum(axis=1)). May be ignored in some cases; see the note below.
squared: Return squared Euclidean distances.
X_norm_squared: Pre-computed dot products of vectors in X (e.g., (X**2).sum(axis=1)). May be ignored in some cases; see the note below.
Returns
distances: The distances between the row vectors of X and the row vectors of Y.
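As a hedged usage sketch of the optional arguments described above, the following passes pre-computed squared norms for Y and requests squared distances. The input arrays are made up for illustration, and the norms are computed with the (Y**2).sum(axis=1) expression suggested in the parameter description.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

X = np.array([[0.0, 1.0], [1.0, 1.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

# Squared norms of the rows of Y, computed once up front
Y_norm_squared = (Y ** 2).sum(axis=1)

# Squared distances, reusing the pre-computed norms of Y
D_sq = euclidean_distances(X, Y, Y_norm_squared=Y_norm_squared, squared=True)

# Plain Euclidean distances for comparison
D = euclidean_distances(X, Y)
```

Here D_sq has one row per sample in X and one column per sample in Y, and squaring D gives the same values up to rounding.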
Notes
To achieve better accuracy, X_norm_squared and Y_norm_squared may be unused if they are passed as np.float32.
Examples
>>> from sklearn.metrics.pairwise import euclidean_distances
>>> X = [[0, 1], [1, 1]]
>>> # distance between rows of X
>>> euclidean_distances(X, X)
array([[0., 1.],
       [1., 0.]])
>>> # get distance to origin
>>> euclidean_distances(X, [[0, 0]])
array([[1.        ],
       [1.41421356]])