Dimension reduction with OT
Warning
Note that by default the module is not imported in ot. In order to use it you need to explicitly import ot.dr.
Compute the squared Euclidean distance between samples (autograd)
Entropic Wasserstein Component Analysis [52].
The function solves the following optimization problem:
\[\mathbf{U} = \mathop{\arg \min}_\mathbf{U} \quad W(\mathbf{X}, \mathbf{U}\mathbf{U}^T \mathbf{X})\]
where :
\(\mathbf{U}\) is a matrix in the Stiefel(d, k) manifold
\(W\) is the entropic-regularized Wasserstein distance
\(\mathbf{X}\) are the samples
X (ndarray, shape (n, d)) – Samples from measure \(\mu\).
U0 (ndarray, shape (d, k), optional) – Initial starting point for projection.
reg (float, optional) – Regularization term >0 (entropic regularization).
k (int, optional) – Subspace dimension.
method (str, optional) – Either ‘BCD’ or ‘MM’ (Block Coordinate Descent or Majorization-Minimization). Prefer ‘MM’ when d is large.
sinkhorn_method (str) – Method used for the Sinkhorn solver, see ot.bregman.sinkhorn for more details.
stopThr (float, optional) – Stop threshold on error (>0).
maxiter (int, optional) – Maximum number of iterations of the BCD/MM.
maxiter_sink (int, optional) – Maximum number of iterations of the Sinkhorn solver.
maxiter_MM (int, optional) – Maximum number of iterations of the MM (only used when method=’MM’).
verbose (int, optional) – Print information along iterations.
pi (ndarray, shape (n, n)) – Optimal transportation matrix for the given parameters.
U (ndarray, shape (d, k)) – Matrix on the Stiefel manifold (the learned projection).
References
ot.dr.ewca
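The EWCA objective above compares the samples to their rank-k reconstruction \(\mathbf{U}\mathbf{U}^T \mathbf{X}\). The following standalone numpy sketch (illustrative only, not POT code) shows what a point on the Stiefel manifold looks like and how it reconstructs centered samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 5, 2

# Centered samples, one per row
X = rng.standard_normal((n, d))
X -= X.mean(axis=0)

# A point on the Stiefel(d, k) manifold: a matrix with orthonormal
# columns, obtained here by orthonormalizing a Gaussian matrix via QR
U, _ = np.linalg.qr(rng.standard_normal((d, k)))

# Rank-k reconstruction of the samples, i.e. the second argument of
# the Wasserstein distance in the EWCA objective
X_hat = X @ U @ U.T
```

Where PCA would minimize the Frobenius reconstruction error between X and X_hat, EWCA replaces it with an entropic Wasserstein distance between the two point clouds.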
Fisher Discriminant Analysis
P (ndarray, shape (d, p)) – Optimal projection matrix for the given parameters.
proj (callable) – Projection function including mean centering.
ot.dr.fda
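For reference, classical Fisher Discriminant Analysis can be written in a few lines of numpy: build the within- and between-class scatter matrices and solve the resulting generalized eigenproblem. This is an illustrative sketch of the textbook method (the function name `fda_directions` and the ridge term are my own, not POT's API):

```python
import numpy as np

def fda_directions(X, y, p=1):
    """Classical Fisher discriminant directions (illustrative sketch).

    Solves the generalized eigenproblem S_b v = lambda S_w v and returns
    the top-p eigenvectors as the columns of a (d, p) matrix.
    """
    X = X - X.mean(axis=0)
    d = X.shape[1]
    mu = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    # Solve S_w^{-1} S_b v = lambda v (small ridge for numerical stability)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-8 * np.eye(d), Sb))
    order = np.argsort(evals.real)[::-1]
    return evecs.real[:, order[:p]]
```

WDA below generalizes this within/between ratio by replacing the scatter matrices with entropic Wasserstein distances between projected classes.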
Log-sum-exp reduction compatible with autograd (no numpy implementation)
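A numerically stable log-sum-exp reduction can be sketched as follows (illustrative, not POT's implementation). Subtracting the per-axis maximum before exponentiating is what prevents overflow in log-domain Sinkhorn iterations:

```python
import numpy as np

def logsumexp(M, axis):
    """Numerically stable log(sum(exp(M))) along the given axis.

    The max is factored out before exponentiating, so entries like 1000
    do not overflow exp().
    """
    M_max = np.max(M, axis=axis, keepdims=True)
    return np.squeeze(M_max, axis=axis) + np.log(
        np.sum(np.exp(M - M_max), axis=axis)
    )
```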
Projection Robust Wasserstein Distance [32]
The function solves the following optimization problem:
\[\max_{U \in St(d, k)} \ \min_{\pi \in \Pi(\mu,\nu)} \quad \sum_{i,j} \pi_{i,j} \|U^T(\mathbf{x}_i - \mathbf{y}_j)\|^2 - \mathrm{reg} \cdot H(\pi)\]
\(U\) is a linear projection operator in the Stiefel(d, k) manifold
\(H(\pi)\) is the entropic regularizer
\(\mathbf{x}_i\), \(\mathbf{y}_j\) are samples of measures \(\mu\) and \(\nu\) respectively
X (ndarray, shape (n, d)) – Samples from measure \(\mu\)
Y (ndarray, shape (n, d)) – Samples from measure \(\nu\)
a (ndarray, shape (n, )) – Weights for measure \(\mu\)
b (ndarray, shape (n, )) – Weights for measure \(\nu\)
tau (float) – Stepsize for the Riemannian gradient descent
U0 (ndarray, shape (d, k), optional) – Initial starting point for projection.
reg (float, optional) – Regularization term >0 (entropic regularization)
k (int) – Subspace dimension
stopThr (float, optional) – Stop threshold on error (>0)
verbose (int, optional) – Print information along iterations.
random_state (int, RandomState instance or None, default=None) – Determines random number generation for initial value of projection operator when U0 is not given.
pi (ndarray, shape (n, n)) – Optimal transportation matrix for the given parameters
U (ndarray, shape (d, k)) – Projection operator.
References
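The solver alternates Sinkhorn updates on \(\pi\) with Riemannian gradient steps of size tau on \(U\). Below is a minimal sketch of one such step on the Stiefel manifold, using the tangent-space projection and a QR retraction (the helper name `stiefel_step` is illustrative, not POT's API):

```python
import numpy as np

def stiefel_step(U, G, tau):
    """One Riemannian gradient step on the Stiefel manifold (sketch).

    Projects the Euclidean gradient G onto the tangent space at U,
    takes a step of size tau, and retracts back onto the manifold
    with a QR decomposition.
    """
    # Tangent-space projection: G - U sym(U^T G)
    UtG = U.T @ G
    xi = G - U @ (UtG + UtG.T) / 2
    # QR retraction; fix column signs so the factorization is unique
    Q, R = np.linalg.qr(U + tau * xi)
    Q = Q * np.sign(np.sign(np.diag(R)) + 0.5)
    return Q
```

The key invariant is that the output always has orthonormal columns, so each iterate stays a valid projection operator.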
Sinkhorn algorithm with a fixed number of iterations (autograd)
Sinkhorn algorithm in the log-domain with a fixed number of iterations (autograd)
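A standalone numpy sketch of a log-domain Sinkhorn loop with a fixed number of iterations (illustrative; POT's internal helper may differ). Working with the dual potentials f, g instead of the scaling vectors avoids under/overflow when the regularization is small:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_log(a, b, M, reg, n_iter=100):
    """Log-domain Sinkhorn with a fixed number of iterations (sketch).

    a, b: histograms summing to 1; M: cost matrix; reg: entropic
    regularization. Returns the transport plan.
    """
    f = np.zeros_like(a)
    g = np.zeros_like(b)
    for _ in range(n_iter):
        # Alternate dual updates, each written as a stable log-sum-exp
        f = reg * np.log(a) - reg * logsumexp((g[None, :] - M) / reg, axis=1)
        g = reg * np.log(b) - reg * logsumexp((f[:, None] - M) / reg, axis=0)
    return np.exp((f[:, None] + g[None, :] - M) / reg)
```

After convergence the plan's marginals match a and b, which is the stopping criterion the fixed-iteration variant forgoes.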
Split samples in \(\mathbf{X}\) by the classes in \(\mathbf{y}\)
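A minimal sketch of such a split (the helper name is illustrative):

```python
import numpy as np

def split_classes(X, y):
    """Split the rows of X into one array per class label in y."""
    return [X[y == c] for c in np.unique(y)]
```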
Wasserstein Discriminant Analysis [11]
The function solves the following optimization problem:
\[\mathbf{P} = \mathop{\arg \min}_\mathbf{P} \quad \frac{\sum\limits_i W(P \mathbf{X}^i, P \mathbf{X}^i)}{\sum\limits_{i, j \neq i} W(P \mathbf{X}^i, P \mathbf{X}^j)}\]
where :
\(P\) is a linear projection operator in the Stiefel(d, p) manifold
\(W\) is the entropic-regularized Wasserstein distance
\(\mathbf{X}^i\) are the samples in the dataset corresponding to class \(i\)
Choosing a Sinkhorn solver
By default, and when the regularization parameter is not too small, the default sinkhorn solver should be enough. If you need a small regularization to get sharper transport matrices, use the ot.dr.sinkhorn_log() solver, which avoids numerical errors but can be slow in practice.
X (ndarray, shape (n, d)) – Training samples.
y (ndarray, shape (n,)) – Labels for training samples.
p (int, optional) – Size of dimensionality reduction.
reg (float, optional) – Regularization term >0 (entropic regularization)
solver (None | str, optional) – None for steepest descent, ‘TrustRegions’ for the trust-region algorithm; otherwise a pymanopt solver
sinkhorn_method (str) – Method used for the Sinkhorn solver, either ‘sinkhorn’ or ‘sinkhorn_log’
P0 (ndarray, shape (d, p)) – Initial starting point for projection.
normalize (bool, optional) – Normalize the Wasserstein distance by the average distance on P0 (default: False)
verbose (int, optional) – Print information along iterations.
P (ndarray, shape (d, p)) – Optimal projection matrix for the given parameters
proj (callable) – Projection function including mean centering.
References
ot.dr.wda
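To make the WDA ratio concrete, the sketch below evaluates the within/between objective for a fixed projection \(P\), using a small self-contained entropic Sinkhorn loop. All helper names are illustrative; this is not POT's implementation, which additionally optimizes \(P\) over the Stiefel manifold:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_cost(Xs, Xt, reg, n_iter=50):
    """Entropic OT cost between two uniform point clouds (sketch)."""
    M = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)
    a = np.full(len(Xs), 1.0 / len(Xs))
    b = np.full(len(Xt), 1.0 / len(Xt))
    f, g = np.zeros(len(Xs)), np.zeros(len(Xt))
    for _ in range(n_iter):
        f = reg * np.log(a) - reg * logsumexp((g[None, :] - M) / reg, axis=1)
        g = reg * np.log(b) - reg * logsumexp((f[:, None] - M) / reg, axis=0)
    pi = np.exp((f[:, None] + g[None, :] - M) / reg)
    return (pi * M).sum()

def wda_ratio(X, y, P, reg=1.0):
    """Within/between Wasserstein ratio minimized by WDA, for fixed P."""
    Xc = [X[y == c] @ P for c in np.unique(y)]
    within = sum(sinkhorn_cost(Xi, Xi, reg) for Xi in Xc)
    between = sum(
        sinkhorn_cost(Xi, Xj, reg)
        for i, Xi in enumerate(Xc)
        for j, Xj in enumerate(Xc)
        if j != i
    )
    return within / between
```

A projection that separates the classes drives the between-class distances up and hence the ratio down, which is exactly what the WDA solver searches for over the Stiefel manifold.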