Partial OT solvers
FunctionsReturns the partial Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
The function solves the following optimization problem:
\[\gamma = \mathop{\arg \min}_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L: quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported
The formulation of the GW problem has been proposed in [12] and the partial GW in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.entropic_partial_gromov_wasserstein instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50), 2) array([[0.12, 0.13, 0. , 0. ], [0.13, 0.12, 0. , 0. ], [0. , 0. , 0.25, 0. ], [0. , 0. , 0. , 0.25]]) >>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50,0.25), 2) array([[0.02, 0.03, 0. , 0.03], [0.03, 0.03, 0. , 0.03], [0. , 0. , 0.03, 0. ], [0.02, 0.02, 0. , 0.03]])
math: gamma : (dim_a, dim_b) ndarray – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
References
Returns the partial Gromov-Wasserstein discrepancy between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
The function solves the following optimization problem:
\[GW = \min_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L : quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported
The formulation of the GW problem has been proposed in [12] and the partial GW in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.entropic_partial_gromov_wasserstein2 instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
partial_gw_dist (float) – Gromov-Wasserstein distance
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(entropic_partial_gromov_wasserstein2(C1, C2, a, b,50), 2) 1.87
References
Solves the partial optimal transport problem and returns the OT plan
The function considers the following problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\\begin{split}s.t. \gamma \mathbf{1} &\leq \mathbf{a} \\ \gamma^T \mathbf{1} &\leq \mathbf{b} \\ \gamma &\geq 0 \\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\} \\\end{split}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported
The formulation of the problem has been proposed in [3] (prop. 5)
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix
reg (float) – Regularization term > 0
m (float, optional) – Amount of mass to be transported
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a = [.1, .2] >>> b = [.1, .1] >>> M = [[0., 1.], [2., 3.]] >>> np.round(entropic_partial_wasserstein(a, b, M, 1, 0.1), 2) array([[0.06, 0.02], [0.01, 0. ]])
References
ot.partial.entropic_partial_wasserstein
Compute the GW gradient. Note: we can not use the trick in [12] as the marginals may not sum to 1.
Note
This function will be deprecated in a near future, please use
ot.gromov.gwggrad instead.
C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix
gradient
numpy.array of shape (n_p+nb_dummies, n_u)
References
Compute the GW loss.
Note
This function will be deprecated in a near future, please use
ot.gromov.gwloss instead.
C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix
GW loss
Solves the partial optimal transport problem and returns the OT plan
The function considers the following problem:
\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported
The formulation of the problem has been proposed in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.partial_gromov_wasserstein instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric costfr matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(partial_gromov_wasserstein(C1, C2, a, b),2) array([[0. , 0.25, 0. , 0. ], [0.25, 0. , 0. , 0. ], [0. , 0. , 0.25, 0. ], [0. , 0. , 0. , 0.25]]) >>> np.round(partial_gromov_wasserstein(C1, C2, a, b, m=0.25),2) array([[0. , 0. , 0. , 0. ], [0. , 0. , 0. , 0. ], [0. , 0. , 0.25, 0. ], [0. , 0. , 0. , 0. ]])
References
Solves the partial optimal transport problem and returns the partial Gromov-Wasserstein discrepancy
The function considers the following problem:
\[GW = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported
The formulation of the problem has been proposed in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.partial_gromov_wasserstein2 instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
partial_gw_dist (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(partial_gromov_wasserstein2(C1, C2, a, b),2) 1.69 >>> np.round(partial_gromov_wasserstein2(C1, C2, a, b, m=0.25),2) 0.0
References
Solves the partial optimal transport problem for the quadratic cost and returns the OT plan
The function considers the following problem:
\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a = [.1, .2] >>> b = [.1, .1] >>> M = [[0., 1.], [2., 3.]] >>> np.round(partial_wasserstein(a,b,M), 2) array([[0.1, 0. ], [0. , 0.1]]) >>> np.round(partial_wasserstein(a,b,M,m=0.1), 2) array([[0.1, 0. ], [0. , 0. ]])
References
ot.partial.partial_wasserstein
Solves the partial optimal transport problem for the quadratic cost and returns the partial GW discrepancy
The function considers the following problem:
\[\gamma = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
GW (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a=[.1, .2] >>> b=[.1, .1] >>> M=[[0., 1.], [2., 3.]] >>> np.round(partial_wasserstein2(a, b, M), 1) 0.3 >>> np.round(partial_wasserstein2(a,b,M,m=0.1), 1) 0.0
References
Solves the partial optimal transport problem for the quadratic cost and returns the OT plan
The function considers the following problem:
\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, (\mathbf{M} - \lambda) \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m & \leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
or equivalently (see Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2018). An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1), 1-44.)
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \sqrt{\frac{\lambda}{2} (\|\gamma \mathbf{1} - \mathbf{a}\|_1 + \|\gamma^T \mathbf{1} - \mathbf{b}\|_1)}\\s.t. \ \gamma \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\lambda\) is the lagrangian cost. Tuning its value allows attaining a given mass to be transported m
The formulation of the problem has been proposed in [28]
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
reg_m (float, optional) – Lagrangian cost
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a = [.1, .2] >>> b = [.1, .1] >>> M = [[0., 1.], [2., 3.]] >>> np.round(partial_wasserstein_lagrange(a,b,M), 2) array([[0.1, 0. ], [0. , 0.1]]) >>> np.round(partial_wasserstein_lagrange(a,b,M,reg_m=2), 2) array([[0.1, 0. ], [0. , 0. ]])
References
Returns the partial Gromov-Wasserstein transport between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
The function solves the following optimization problem:
\[\gamma = \mathop{\arg \min}_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L: quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported
The formulation of the GW problem has been proposed in [12] and the partial GW in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.entropic_partial_gromov_wasserstein instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50), 2) array([[0.12, 0.13, 0. , 0. ], [0.13, 0.12, 0. , 0. ], [0. , 0. , 0.25, 0. ], [0. , 0. , 0. , 0.25]]) >>> np.round(entropic_partial_gromov_wasserstein(C1, C2, a, b, 50,0.25), 2) array([[0.02, 0.03, 0. , 0.03], [0.03, 0.03, 0. , 0.03], [0. , 0. , 0.03, 0. ], [0.02, 0.02, 0. , 0.03]])
math: gamma : (dim_a, dim_b) ndarray – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
References
Returns the partial Gromov-Wasserstein discrepancy between \((\mathbf{C_1}, \mathbf{p})\) and \((\mathbf{C_2}, \mathbf{q})\)
The function solves the following optimization problem:
\[GW = \min_{\gamma} \quad \sum_{i,j,k,l} L(\mathbf{C_1}_{i,k}, \mathbf{C_2}_{j,l})\cdot \gamma_{i,j}\cdot\gamma_{k,l} + \mathrm{reg} \cdot\Omega(\gamma)\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma &\geq 0\\ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{C_1}\) is the metric cost matrix in the source space
\(\mathbf{C_2}\) is the metric cost matrix in the target space
\(\mathbf{p}\) and \(\mathbf{q}\) are the sample weights
L : quadratic loss function
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
m is the amount of mass to be transported
The formulation of the GW problem has been proposed in [12] and the partial GW in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.entropic_partial_gromov_wasserstein2 instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
reg (float) – entropic regularization parameter
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
numItermax (int, optional) – Max number of iterations
tol (float, optional) – Stop threshold on error (>0)
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
partial_gw_dist (float) – Gromov-Wasserstein distance
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(entropic_partial_gromov_wasserstein2(C1, C2, a, b,50), 2) 1.87
References
Solves the partial optimal transport problem and returns the OT plan
The function considers the following problem:
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \mathrm{reg} \cdot\Omega(\gamma)\\\begin{split}s.t. \gamma \mathbf{1} &\leq \mathbf{a} \\ \gamma^T \mathbf{1} &\leq \mathbf{b} \\ \gamma &\geq 0 \\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\} \\\end{split}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega=\sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported
The formulation of the problem has been proposed in [3] (prop. 5)
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix
reg (float) – Regularization term > 0
m (float, optional) – Amount of mass to be transported
numItermax (int, optional) – Max number of iterations
stopThr (float, optional) – Stop threshold on error (>0)
verbose (bool, optional) – Print information along iterations
log (bool, optional) – record log if True
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a = [.1, .2] >>> b = [.1, .1] >>> M = [[0., 1.], [2., 3.]] >>> np.round(entropic_partial_wasserstein(a, b, M, 1, 0.1), 2) array([[0.06, 0.02], [0.01, 0. ]])
References
Compute the GW gradient. Note: we can not use the trick in [12] as the marginals may not sum to 1.
Note
This function will be deprecated in a near future, please use
ot.gromov.gwggrad instead.
C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix
gradient
numpy.array of shape (n_p+nb_dummies, n_u)
References
Compute the GW loss.
Note
This function will be deprecated in a near future, please use
ot.gromov.gwloss instead.
C1 (array of shape (n_p,n_p)) – intra-source (P) cost matrix
C2 (array of shape (n_u,n_u)) – intra-target (U) cost matrix
T (array of shape(n_p+nb_dummies, n_u) (default: None)) – Transport matrix
GW loss
Solves the partial optimal transport problem and returns the OT plan
The function considers the following problem:
\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported
The formulation of the problem has been proposed in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.partial_gromov_wasserstein instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric costfr matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(partial_gromov_wasserstein(C1, C2, a, b),2) array([[0. , 0.25, 0. , 0. ], [0.25, 0. , 0. , 0. ], [0. , 0. , 0.25, 0. ], [0. , 0. , 0. , 0.25]]) >>> np.round(partial_gromov_wasserstein(C1, C2, a, b, m=0.25),2) array([[0. , 0. , 0. , 0. ], [0. , 0. , 0. , 0. ], [0. , 0. , 0.25, 0. ], [0. , 0. , 0. , 0. ]])
References
Solves the partial optimal transport problem and returns the partial Gromov-Wasserstein discrepancy
The function considers the following problem:
\[GW = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\Omega\) is the entropic regularization term, \(\Omega(\gamma) = \sum_{i,j} \gamma_{i,j}\log(\gamma_{i,j})\)
\(\mathbf{a}\) and \(\mathbf{b}\) are the sample weights
m is the amount of mass to be transported
The formulation of the problem has been proposed in [29]
Note
This function will be deprecated in a near future, please use
ot.gromov.partial_gromov_wasserstein2 instead.
C1 (ndarray, shape (ns, ns)) – Metric cost matrix in the source space
C2 (ndarray, shape (nt, nt)) – Metric cost matrix in the target space
p (ndarray, shape (ns,)) – Distribution in the source space
q (ndarray, shape (nt,)) – Distribution in the target space
m (float, optional) – Amount of mass to be transported (default: \(\min\{\|\mathbf{p}\|_1, \|\mathbf{q}\|_1\}\))
nb_dummies (int, optional) – Number of dummy points to add (avoid instabilities in the EMD solver)
G0 (ndarray, shape (ns, nt), optional) – Initialization of the transportation matrix
thres (float, optional) – quantile of the gradient matrix to populate the cost matrix when 0 (default: 1)
numItermax (int, optional) – Max number of iterations
tol (float, optional) – tolerance for stopping iterations
log (bool, optional) – return log if True
verbose (bool, optional) – Print information along iterations
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
partial_gw_dist (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> import scipy as sp >>> a = np.array([0.25] * 4) >>> b = np.array([0.25] * 4) >>> x = np.array([1,2,100,200]).reshape((-1,1)) >>> y = np.array([3,2,98,199]).reshape((-1,1)) >>> C1 = sp.spatial.distance.cdist(x, x) >>> C2 = sp.spatial.distance.cdist(y, y) >>> np.round(partial_gromov_wasserstein2(C1, C2, a, b),2) 1.69 >>> np.round(partial_gromov_wasserstein2(C1, C2, a, b, m=0.25),2) 0.0
References
Solves the partial optimal transport problem for the quadratic cost and returns the OT plan
The function considers the following problem:
\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a = [.1, .2] >>> b = [.1, .1] >>> M = [[0., 1.], [2., 3.]] >>> np.round(partial_wasserstein(a,b,M), 2) array([[0.1, 0. ], [0. , 0.1]]) >>> np.round(partial_wasserstein(a,b,M,m=0.1), 2) array([[0.1, 0. ], [0. , 0. ]])
References
Solves the partial optimal transport problem for the quadratic cost and returns the partial GW discrepancy
The function considers the following problem:
\[\gamma = \min_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m &\leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
m is the amount of mass to be transported
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
m (float, optional) – amount of mass to be transported
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
GW (float) – partial GW discrepancy
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a=[.1, .2] >>> b=[.1, .1] >>> M=[[0., 1.], [2., 3.]] >>> np.round(partial_wasserstein2(a, b, M), 1) 0.3 >>> np.round(partial_wasserstein2(a,b,M,m=0.1), 1) 0.0
References
Solves the partial optimal transport problem for the quadratic cost and returns the OT plan
The function considers the following problem:
\[\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, (\mathbf{M} - \lambda) \rangle_F\]
\[ \begin{align}\begin{aligned}s.t. \ \gamma \mathbf{1} &\leq \mathbf{a}\\ \gamma^T \mathbf{1} &\leq \mathbf{b}\\ \gamma &\geq 0\\ \mathbf{1}^T \gamma^T \mathbf{1} = m & \leq \min\{\|\mathbf{a}\|_1, \|\mathbf{b}\|_1\}\end{aligned}\end{align} \]
or equivalently (see Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F. X. (2018). An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1), 1-44.)
\[ \begin{align}\begin{aligned}\gamma = \mathop{\arg \min}_\gamma \quad \langle \gamma, \mathbf{M} \rangle_F + \sqrt{\frac{\lambda}{2} (\|\gamma \mathbf{1} - \mathbf{a}\|_1 + \|\gamma^T \mathbf{1} - \mathbf{b}\|_1)}\\s.t. \ \gamma \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{M}\) is the metric cost matrix
\(\mathbf{a}\) and \(\mathbf{b}\) are source and target unbalanced distributions
\(\lambda\) is the lagrangian cost. Tuning its value allows attaining a given mass to be transported m
The formulation of the problem has been proposed in [28]
a (np.ndarray (dim_a,)) – Unnormalized histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Unnormalized histograms of dimension dim_b
M (np.ndarray (dim_a, dim_b)) – cost matrix for the quadratic cost
reg_m (float, optional) – Lagrangian cost
nb_dummies (int, optional, default:1) – number of reservoir points to be added (to avoid numerical instabilities, increase its value if an error is raised)
log (bool, optional) – record log if True
**kwargs (dict) – parameters can be directly passed to the emd solver
Warning
When dealing with a large number of points, the EMD solver may face some instabilities, especially when the mass associated to the dummy point is large. To avoid them, increase the number of dummy points (allows a smoother repartition of the mass over the points).
gamma ((dim_a, dim_b) ndarray) – Optimal transportation matrix for the given parameters
log (dict) – log dictionary returned only if log is True
Examples
>>> import ot >>> a = [.1, .2] >>> b = [.1, .1] >>> M = [[0., 1.], [2., 3.]] >>> np.round(partial_wasserstein_lagrange(a,b,M), 2) array([[0.1, 0. ], [0. , 0.1]]) >>> np.round(partial_wasserstein_lagrange(a,b,M,reg_m=2), 2) array([[0.1, 0. ], [0. , 0. ]])
References
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4