Regularization path OT solvers
FunctionsThis function computes the inverse of the design matrix in the regularization path using the Schur complement. Two cases may arise:
Case 1: one variable is added to the active set
\[\begin{split}M_{k+1}^{-1} = \begin{bmatrix} M_{k}^{-1} + s^{-1} M_{k}^{-1} \mathbf{b} \mathbf{b}^T M_{k}^{-1} \ & - M_{k}^{-1} \mathbf{b} s^{-1} \\ - s^{-1} \mathbf{b}^T M_{k}^{-1} & s^{-1} \end{bmatrix}\end{split}\]
where :
\(M_k^{-1}\) is the inverse of the design matrix \(H_A^tH_A\) of the previous iteration
\(\mathbf{b}\) is the last column of \(M_{k}\)
\(s\) is the Schur complement, given by \(s = \mathbf{d} - \mathbf{b}^T M_{k}^{-1} \mathbf{b}\)
Case 2: one variable is removed from the active set.
\[M_{k+1}^{-1} = M^{-1}_{k \backslash q} - \frac{r_{-q,q} r^{T}_{-q,q}}{r_{q,q}}\]
where :
\(q\) is the index of column and row to delete
\(M^{-1}_{k \backslash q}\) is the previous inverse matrix deprived of the \(q\) th column and \(q\) th row
\(r_{-q,q}\) is the \(q\) th column of \(M^{-1}_{k}\) without the \(q\) th element
\(r_{q, q}\) is the element of \(q\) th column and \(q\) th row in \(M^{-1}_{k}\)
M_current (ndarray, shape (size(A)-1, size(A)-1)) – Inverse matrix of \(H_A^tH_A\) at the previous iteration, with size(A) the size of the active set
b (ndarray, shape (size(A)-1, )) – None for case 2 (removal), last column of \(M_{k}\) for case 1 (addition)
d (float) – should be equal to 2 when UOT and 1 for the semi-relaxed OT
id_pop (int) – Index of the variable to be removed, equal to -1 if no variable is deleted at the current iteration
M – Inverse matrix of \(H_A^tH_A\) of the current iteration
ndarray, shape (size(A), size(A))
References
This function computes the next gamma value if a variable is removed at the next iteration of the regularization path.
We look for the largest value of the regularization parameter such that an element of the current solution vanishes
\[\max_{j \in A} \frac{\phi_j}{\delta_j}\]
where :
A is the current active set
\(\phi_j\) is the \(j\) th element of the intercept of the current solution
\(\delta_j\) is the \(j\) th element of the slope of the current solution
phi (ndarray, shape (size(A), )) – Intercept of the solution at the current iteration
delta (ndarray, shape (size(A), )) – Slope of the solution at the current iteration
current_gamma (float) – Value of the regularization parameter at the beginning of the current iteration
next_removal_gamma (float) – Gamma value if a variable is removed at the next iteration
next_removal_index (int) – Index of the variable to be removed at the next iteration
References
Given the regularization path, this function computes the transport plan for any value of gamma thanks to the piecewise linearity of the path.
\[t(\gamma) = \phi(\gamma) - \gamma \delta(\gamma)\]
where:
\(\gamma\) is the regularization parameter
\(\phi(\gamma)\) is the corresponding intercept
\(\delta(\gamma)\) is the corresponding slope
\(\mathbf{t}\) is the flattened version of the transport matrix
t – Vectorization of the transport plan corresponding to the given value of gamma
np.ndarray (dim_a*dim_b, )
Examples
>>> import ot >>> import numpy as np >>> n = 3 >>> xs = np.array([1., 2., 3.]).reshape((n, 1)) >>> xt = np.array([5., 6., 7.]).reshape((n, 1)) >>> C = ot.dist(xs, xt) >>> C /= C.max() >>> a = np.array([0.2, 0.5, 0.3]) >>> b = np.array([0.2, 0.5, 0.3]) >>> t, pi_list, g_list = ot.regpath.regularization_path(a, b, C, reg=1e-4) >>> gamma = 1 >>> t2 = ot.regpath.compute_transport_plan(gamma, g_list, pi_list) >>> t2 array([0. , 0. , 0. , 0.19722222, 0.05555556, 0. , 0. , 0.24722222, 0. ])
References
ot.regpath.compute_transport_plan
This function constructs an augmented matrix for the first iteration of the semi-relaxed regularization path
\[\begin{split}\text{Augmented}_H = \begin{bmatrix} 0 & H_{c A} \\ H_{c A}^T & H_{r A}^T H_{r A} \end{bmatrix}\end{split}\]
where :
\(H_{r A}\) is the sub-matrix constructed by the columns of \(H_r\) whose indices belong to the active set A
\(H_{c A}\) is the sub-matrix constructed by the columns of \(H_c\) whose indices belong to the active set A
H_augmented – Augmented matrix for the first iteration of the semi-relaxed regularization path
np.ndarray (dim_b + size(A), dim_b + size(A))
This function gives the regularization path of l2-penalized UOT problem
The problem to optimize is the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|{H} \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \({C}\)
\(\gamma = 1/\lambda\) is the l2-regularization coefficient
\(\mathbf{y}\) is the concatenation of vectors \(\mathbf{a}\) and \(\mathbf{b}\), defined as \(\mathbf{y}^T = [\mathbf{a}^T \mathbf{b}^T]\)
\({H}\) is a design matrix, see [41] for the design of \({H}\). The matrix product \(H\mathbf{t}\) computes both the source marginal and the target marginals.
\(\mathbf{t}\) is the flattened version of the transport matrix
t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the optimal transport matrix
t_list (list) – List of solutions in the regularization path
gamma_list (list) – List of regularization coefficients in the regularization path
Examples
>>> import ot >>> import numpy as np >>> n = 3 >>> xs = np.array([1., 2., 3.]).reshape((n, 1)) >>> xt = np.array([5., 6., 7.]).reshape((n, 1)) >>> C = ot.dist(xs, xt) >>> C /= C.max() >>> a = np.array([0.2, 0.5, 0.3]) >>> b = np.array([0.2, 0.5, 0.3]) >>> t, _, _ = ot.regpath.fully_relaxed_path(a, b, C, 1e-4) >>> t array([1.99958333e-01, 0.00000000e+00, 0.00000000e+00, 3.88888889e-05, 4.99938889e-01, 0.00000000e+00, 0.00000000e+00, 3.88888889e-05, 2.99958333e-01])
References
This function computes the next value of gamma if a variable is added in the next iteration of the regularization path.
We look for the largest value of gamma such that the gradient of an inactive variable vanishes
\[\max_{i \in \bar{A}} \frac{\mathbf{h}_i^T(H_A \phi - \mathbf{y})} {\mathbf{h}_i^T H_A \delta - \mathbf{c}_i}\]
where :
A is the current active set
\(\mathbf{h}_i\) is the \(i\) th column of the design matrix \({H}\)
\({H}_A\) is the sub-matrix constructed by the columns of \({H}\) whose indices belong to the active set A
\(\mathbf{c}_i\) is the \(i\) th element of the cost vector \(\mathbf{c}\)
\(\mathbf{y}\) is the concatenation of the source and target distributions
\(\phi\) is the intercept of the solutions at the current iteration
\(\delta\) is the slope of the solutions at the current iteration
phi (np.ndarray (size(A), )) – Intercept of the solutions at the current iteration
delta (np.ndarray (size(A), )) – Slope of the solutions at the current iteration
HtH (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of \({H}^T {H}\)
Hty (np.ndarray (dim_a + dim_b, )) – Matrix product of \({H}^T \mathbf{y}\)
c (np.ndarray (dim_a * dim_b, )) – Flattened array of the cost matrix \({C}\)
active_index (list) – Indices of active variables
current_gamma (float) – Value of the regularization parameter at the beginning of the current iteration
next_gamma (float) – Value of gamma if a variable is added to active set in next iteration
next_active_index (int) – Index of variable to be activated
References
This function recasts the l2-penalized UOT problem as a Lasso problem.
Recall the l2-penalized UOT problem defined in [41]
\[ \begin{align}\begin{aligned}\text{UOT}_{\lambda} = \min_T <C, T> + \lambda \|T 1_m - \mathbf{a}\|_2^2 + \lambda \|T^T 1_n - \mathbf{b}\|_2^2\\s.t. T \geq 0\end{aligned}\end{align} \]
where :
\(C\) is the cost matrix
\(\lambda\) is the l2-regularization parameter
\(\mathbf{a}\) and \(\mathbf{b}\) are the source and target distributions
\(T\) is the transport plan to optimize
The problem above can be reformulated as a non-negative penalized linear regression problem, particularly Lasso
\[ \begin{align}\begin{aligned}\text{UOT2}_{\lambda} = \min_{\mathbf{t}} \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|H \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \(C\)
\(\mathbf{y}\) is the concatenation of vectors \(\mathbf{a}\) and \(\mathbf{b}\)
\(H\) is a metric matrix, see [41] for the design of \(H\). The matrix product \(H\mathbf{t}\) computes both the source marginal and the target marginals.
\(\mathbf{t}\) is the flattened version of the transport plan \(T\)
a (np.ndarray (dim_a,)) – Histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Histogram of dimension dim_b
C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix
H (np.ndarray (dim_a+dim_b, dim_a*dim_b)) – Design matrix that contains only 0 and 1
y (np.ndarray (ns + nt, )) – Concatenation of histograms \(\mathbf{a}\) and \(\mathbf{b}\)
c (np.ndarray (ns * nt, )) – Flattened array of the cost matrix
Examples
>>> import ot >>> a = np.array([0.2, 0.3, 0.5]) >>> b = np.array([0.1, 0.9]) >>> C = np.array([[16., 25.], [28., 16.], [40., 36.]]) >>> H, y, c = ot.regpath.recast_ot_as_lasso(a, b, C) >>> H.toarray() array([[1., 1., 0., 0., 0., 0.], [0., 0., 1., 1., 0., 0.], [0., 0., 0., 0., 1., 1.], [1., 0., 1., 0., 1., 0.], [0., 1., 0., 1., 0., 1.]]) >>> y array([0.2, 0.3, 0.5, 0.1, 0.9]) >>> c array([16., 25., 28., 16., 40., 36.])
References
This function recasts the semi-relaxed l2-UOT problem as Lasso problem.
\[ \begin{align}\begin{aligned}\text{semi-relaxed UOT} = \min_T <C, T> + \lambda \|T 1_m - \mathbf{a}\|_2^2\\s.t. T^T 1_n = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(C\) is the metric cost matrix
\(\lambda\) is the l2-regularization parameter
\(\mathbf{a}\) and \(\mathbf{b}\) are the source and target distributions
\(T\) is the transport plan to optimize
The problem above can be reformulated as follows
\[ \begin{align}\begin{aligned}\text{semi-relaxed UOT2} = \min_t \gamma \mathbf{c}^T t + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is flattened version of the cost matrix \(C\)
\(\gamma = 1/\lambda\) is the l2-regularization parameter
\(H_r\) is a metric matrix which computes the sum along the rows of the transport plan \(T\)
\(H_c\) is a metric matrix which computes the sum along the columns of the transport plan \(T\)
\(\mathbf{t}\) is the flattened version of \(T\)
a (np.ndarray (dim_a,)) – Histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Histogram of dimension dim_b
C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix
Hr (np.ndarray (dim_a, dim_a * dim_b)) – Auxiliary matrix constituted by 0 and 1, which computes the sum along the rows of transport plan \(T\)
Hc (np.ndarray (dim_b, dim_a * dim_b)) – Auxiliary matrix constituted by 0 and 1, which computes the sum along the columns of transport plan \(T\)
c (np.ndarray (ns * nt, )) – Flattened array of the cost matrix
Examples
>>> import ot >>> a = np.array([0.2, 0.3, 0.5]) >>> b = np.array([0.1, 0.9]) >>> C = np.array([[16., 25.], [28., 16.], [40., 36.]]) >>> Hr,Hc,c = ot.regpath.recast_semi_relaxed_as_lasso(a, b, C) >>> Hr.toarray() array([[1., 1., 0., 0., 0., 0.], [0., 0., 1., 1., 0., 0.], [0., 0., 0., 0., 1., 1.]]) >>> Hc.toarray() array([[1., 0., 1., 0., 1., 0.], [0., 1., 0., 1., 0., 1.]]) >>> c array([16., 25., 28., 16., 40., 36.])
This function provides all the solutions of the regularization path of the l2-UOT problem [41].
The problem to optimize is the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|{H} \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \({C}\)
\(\gamma = 1/\lambda\) is the l2-regularization coefficient
\(\mathbf{y}\) is the concatenation of vectors \(\mathbf{a}\) and \(\mathbf{b}\), defined as \(\mathbf{y}^T = [\mathbf{a}^T \mathbf{b}^T]\)
\({H}\) is a design matrix, see [41] for the design of \({H}\). The matrix product \(H\mathbf{t}\) computes both the source marginal and the target marginals.
\(\mathbf{t}\) is the flattened version of the transport matrix
For the semi-relaxed problem, it optimizes the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
a (np.ndarray (dim_a,)) – Histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Histogram of dimension dim_b
C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix
reg (float (optional)) – l2-regularization coefficient
semi_relaxed (bool (optional)) – Give the semi-relaxed path if True
itmax (int (optional)) – Maximum number of iteration
t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the (unregularized) optimal transport matrix
t_list (list) – List of all the optimal transport vectors of the regularization path
gamma_list (list) – List of the regularization parameters in the path
References
ot.regpath.regularization_path
This function computes the next value of gamma when a variable is active in the regularization path of semi-relaxed UOT.
By taking the Lagrangian form of the problem, we obtain a similar update as the two-sided relaxed UOT
\[\max_{i \in \bar{A}} \frac{\mathbf{h}_{ri}^T(H_{rA} \phi - \mathbf{a}) + \mathbf{h}_{c i}^T\phi_u}{\mathbf{h}_{r i}^T H_{r A} \delta + \ \mathbf{h}_{c i} \delta_u - \mathbf{c}_i}\]
where :
A is the current active set
\(\mathbf{h}_{r i}\) is the ith column of the matrix \(H_r\)
\(\mathbf{h}_{c i}\) is the ith column of the matrix \(H_c\)
\(H_{r A}\) is the sub-matrix constructed by the columns of \(H_r\) whose indices belong to the active set A
\(\mathbf{c}_i\) is the \(i\) th element of cost vector \(\mathbf{c}\)
\(\phi\) is the intercept of the solutions in current iteration
\(\delta\) is the slope of the solutions in current iteration
\(\phi_u\) is the intercept of Lagrange parameter at the current iteration
\(\delta_u\) is the slope of Lagrange parameter at the current iteration
phi (np.ndarray (size(A), )) – Intercept of the solutions at the current iteration
delta (np.ndarray (size(A), )) – Slope of the solutions at the current iteration
phi_u (np.ndarray (dim_b, )) – Intercept of the Lagrange parameter at the current iteration
delta_u (np.ndarray (dim_b, )) – Slope of the Lagrange parameter at the current iteration
HrHr (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of \(H_r^T H_r\)
Hc (np.ndarray (dim_b, dim_a * dim_b)) – Matrix that computes the sum along the columns of the transport plan \(T\)
Hra (np.ndarray (dim_a * dim_b, )) – Matrix product of \(H_r^T \mathbf{a}\)
c (np.ndarray (dim_a * dim_b, )) – Flattened array of cost matrix \(C\)
active_index (list) – Indices of active variables
current_gamma (float) – Value of regularization coefficient at the start of current iteration
next_gamma (float) – Value of gamma if a variable is added to active set in next iteration
next_active_index (int) – Index of variable to be activated
References
This function gives the regularization path of semi-relaxed l2-UOT problem.
The problem to optimize is the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T t + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \(C\)
\(\gamma = 1/\lambda\) is the l2-regularization parameter
\(H_r\) is a matrix that computes the sum along the rows of the transport plan \(T\)
\(H_c\) is a matrix that computes the sum along the columns of the transport plan \(T\)
\(\mathbf{t}\) is the flattened version of the transport plan \(T\)
t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the (unregularized) optimal transport matrix
t_list (list) – List of all the optimal transport vectors of the regularization path
gamma_list (list) – List of the regularization parameters in the path
Examples
>>> import ot >>> import numpy as np >>> n = 3 >>> xs = np.array([1., 2., 3.]).reshape((n, 1)) >>> xt = np.array([5., 6., 7.]).reshape((n, 1)) >>> C = ot.dist(xs, xt) >>> C /= C.max() >>> a = np.array([0.2, 0.5, 0.3]) >>> b = np.array([0.2, 0.5, 0.3]) >>> t, _, _ = ot.regpath.semi_relaxed_path(a, b, C, 1e-4) >>> t array([1.99980556e-01, 0.00000000e+00, 0.00000000e+00, 1.94444444e-05, 4.99980556e-01, 0.00000000e+00, 0.00000000e+00, 1.94444444e-05, 3.00000000e-01])
References
This function computes the inverse of the design matrix in the regularization path using the Schur complement. Two cases may arise:
Case 1: one variable is added to the active set
\[\begin{split}M_{k+1}^{-1} = \begin{bmatrix} M_{k}^{-1} + s^{-1} M_{k}^{-1} \mathbf{b} \mathbf{b}^T M_{k}^{-1} \ & - M_{k}^{-1} \mathbf{b} s^{-1} \\ - s^{-1} \mathbf{b}^T M_{k}^{-1} & s^{-1} \end{bmatrix}\end{split}\]
where :
\(M_k^{-1}\) is the inverse of the design matrix \(H_A^tH_A\) of the previous iteration
\(\mathbf{b}\) is the last column of \(M_{k}\)
\(s\) is the Schur complement, given by \(s = \mathbf{d} - \mathbf{b}^T M_{k}^{-1} \mathbf{b}\)
Case 2: one variable is removed from the active set.
\[M_{k+1}^{-1} = M^{-1}_{k \backslash q} - \frac{r_{-q,q} r^{T}_{-q,q}}{r_{q,q}}\]
where :
\(q\) is the index of column and row to delete
\(M^{-1}_{k \backslash q}\) is the previous inverse matrix deprived of the \(q\) th column and \(q\) th row
\(r_{-q,q}\) is the \(q\) th column of \(M^{-1}_{k}\) without the \(q\) th element
\(r_{q, q}\) is the element of \(q\) th column and \(q\) th row in \(M^{-1}_{k}\)
M_current (ndarray, shape (size(A)-1, size(A)-1)) – Inverse matrix of \(H_A^tH_A\) at the previous iteration, with size(A) the size of the active set
b (ndarray, shape (size(A)-1, )) – None for case 2 (removal), last column of \(M_{k}\) for case 1 (addition)
d (float) – should be equal to 2 when UOT and 1 for the semi-relaxed OT
id_pop (int) – Index of the variable to be removed, equal to -1 if no variable is deleted at the current iteration
M – Inverse matrix of \(H_A^tH_A\) of the current iteration
ndarray, shape (size(A), size(A))
References
This function computes the next gamma value if a variable is removed at the next iteration of the regularization path.
We look for the largest value of the regularization parameter such that an element of the current solution vanishes
\[\max_{j \in A} \frac{\phi_j}{\delta_j}\]
where :
A is the current active set
\(\phi_j\) is the \(j\) th element of the intercept of the current solution
\(\delta_j\) is the \(j\) th element of the slope of the current solution
phi (ndarray, shape (size(A), )) – Intercept of the solution at the current iteration
delta (ndarray, shape (size(A), )) – Slope of the solution at the current iteration
current_gamma (float) – Value of the regularization parameter at the beginning of the current iteration
next_removal_gamma (float) – Gamma value if a variable is removed at the next iteration
next_removal_index (int) – Index of the variable to be removed at the next iteration
References
Given the regularization path, this function computes the transport plan for any value of gamma thanks to the piecewise linearity of the path.
\[t(\gamma) = \phi(\gamma) - \gamma \delta(\gamma)\]
where:
\(\gamma\) is the regularization parameter
\(\phi(\gamma)\) is the corresponding intercept
\(\delta(\gamma)\) is the corresponding slope
\(\mathbf{t}\) is the flattened version of the transport matrix
t – Vectorization of the transport plan corresponding to the given value of gamma
np.ndarray (dim_a*dim_b, )
Examples
>>> import ot >>> import numpy as np >>> n = 3 >>> xs = np.array([1., 2., 3.]).reshape((n, 1)) >>> xt = np.array([5., 6., 7.]).reshape((n, 1)) >>> C = ot.dist(xs, xt) >>> C /= C.max() >>> a = np.array([0.2, 0.5, 0.3]) >>> b = np.array([0.2, 0.5, 0.3]) >>> t, pi_list, g_list = ot.regpath.regularization_path(a, b, C, reg=1e-4) >>> gamma = 1 >>> t2 = ot.regpath.compute_transport_plan(gamma, g_list, pi_list) >>> t2 array([0. , 0. , 0. , 0.19722222, 0.05555556, 0. , 0. , 0.24722222, 0. ])
References
This function constructs an augmented matrix for the first iteration of the semi-relaxed regularization path
\[\begin{split}\text{Augmented}_H = \begin{bmatrix} 0 & H_{c A} \\ H_{c A}^T & H_{r A}^T H_{r A} \end{bmatrix}\end{split}\]
where :
\(H_{r A}\) is the sub-matrix constructed by the columns of \(H_r\) whose indices belong to the active set A
\(H_{c A}\) is the sub-matrix constructed by the columns of \(H_c\) whose indices belong to the active set A
H_augmented – Augmented matrix for the first iteration of the semi-relaxed regularization path
np.ndarray (dim_b + size(A), dim_b + size(A))
This function gives the regularization path of l2-penalized UOT problem
The problem to optimize is the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|{H} \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \({C}\)
\(\gamma = 1/\lambda\) is the l2-regularization coefficient
\(\mathbf{y}\) is the concatenation of vectors \(\mathbf{a}\) and \(\mathbf{b}\), defined as \(\mathbf{y}^T = [\mathbf{a}^T \mathbf{b}^T]\)
\({H}\) is a design matrix, see [41] for the design of \({H}\). The matrix product \(H\mathbf{t}\) computes both the source marginal and the target marginals.
\(\mathbf{t}\) is the flattened version of the transport matrix
t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the optimal transport matrix
t_list (list) – List of solutions in the regularization path
gamma_list (list) – List of regularization coefficients in the regularization path
Examples
>>> import ot >>> import numpy as np >>> n = 3 >>> xs = np.array([1., 2., 3.]).reshape((n, 1)) >>> xt = np.array([5., 6., 7.]).reshape((n, 1)) >>> C = ot.dist(xs, xt) >>> C /= C.max() >>> a = np.array([0.2, 0.5, 0.3]) >>> b = np.array([0.2, 0.5, 0.3]) >>> t, _, _ = ot.regpath.fully_relaxed_path(a, b, C, 1e-4) >>> t array([1.99958333e-01, 0.00000000e+00, 0.00000000e+00, 3.88888889e-05, 4.99938889e-01, 0.00000000e+00, 0.00000000e+00, 3.88888889e-05, 2.99958333e-01])
References
This function computes the next value of gamma if a variable is added in the next iteration of the regularization path.
We look for the largest value of gamma such that the gradient of an inactive variable vanishes
\[\max_{i \in \bar{A}} \frac{\mathbf{h}_i^T(H_A \phi - \mathbf{y})} {\mathbf{h}_i^T H_A \delta - \mathbf{c}_i}\]
where :
A is the current active set
\(\mathbf{h}_i\) is the \(i\) th column of the design matrix \({H}\)
\({H}_A\) is the sub-matrix constructed by the columns of \({H}\) whose indices belong to the active set A
\(\mathbf{c}_i\) is the \(i\) th element of the cost vector \(\mathbf{c}\)
\(\mathbf{y}\) is the concatenation of the source and target distributions
\(\phi\) is the intercept of the solutions at the current iteration
\(\delta\) is the slope of the solutions at the current iteration
phi (np.ndarray (size(A), )) – Intercept of the solutions at the current iteration
delta (np.ndarray (size(A), )) – Slope of the solutions at the current iteration
HtH (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of \({H}^T {H}\)
Hty (np.ndarray (dim_a + dim_b, )) – Matrix product of \({H}^T \mathbf{y}\)
c (np.ndarray (dim_a * dim_b, )) – Flattened array of the cost matrix \({C}\)
active_index (list) – Indices of active variables
current_gamma (float) – Value of the regularization parameter at the beginning of the current iteration
next_gamma (float) – Value of gamma if a variable is added to active set in next iteration
next_active_index (int) – Index of variable to be activated
References
This function recasts the l2-penalized UOT problem as a Lasso problem.
Recall the l2-penalized UOT problem defined in [41]
\[ \begin{align}\begin{aligned}\text{UOT}_{\lambda} = \min_T <C, T> + \lambda \|T 1_m - \mathbf{a}\|_2^2 + \lambda \|T^T 1_n - \mathbf{b}\|_2^2\\s.t. T \geq 0\end{aligned}\end{align} \]
where :
\(C\) is the cost matrix
\(\lambda\) is the l2-regularization parameter
\(\mathbf{a}\) and \(\mathbf{b}\) are the source and target distributions
\(T\) is the transport plan to optimize
The problem above can be reformulated as a non-negative penalized linear regression problem, particularly Lasso
\[ \begin{align}\begin{aligned}\text{UOT2}_{\lambda} = \min_{\mathbf{t}} \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|H \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \(C\)
\(\mathbf{y}\) is the concatenation of vectors \(\mathbf{a}\) and \(\mathbf{b}\)
\(H\) is a metric matrix, see [41] for the design of \(H\). The matrix product \(H\mathbf{t}\) computes both the source marginal and the target marginals.
\(\mathbf{t}\) is the flattened version of the transport plan \(T\)
a (np.ndarray (dim_a,)) – Histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Histogram of dimension dim_b
C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix
H (np.ndarray (dim_a+dim_b, dim_a*dim_b)) – Design matrix that contains only 0 and 1
y (np.ndarray (ns + nt, )) – Concatenation of histograms \(\mathbf{a}\) and \(\mathbf{b}\)
c (np.ndarray (ns * nt, )) – Flattened array of the cost matrix
Examples
>>> import ot >>> a = np.array([0.2, 0.3, 0.5]) >>> b = np.array([0.1, 0.9]) >>> C = np.array([[16., 25.], [28., 16.], [40., 36.]]) >>> H, y, c = ot.regpath.recast_ot_as_lasso(a, b, C) >>> H.toarray() array([[1., 1., 0., 0., 0., 0.], [0., 0., 1., 1., 0., 0.], [0., 0., 0., 0., 1., 1.], [1., 0., 1., 0., 1., 0.], [0., 1., 0., 1., 0., 1.]]) >>> y array([0.2, 0.3, 0.5, 0.1, 0.9]) >>> c array([16., 25., 28., 16., 40., 36.])
References
This function recasts the semi-relaxed l2-UOT problem as Lasso problem.
\[ \begin{align}\begin{aligned}\text{semi-relaxed UOT} = \min_T <C, T> + \lambda \|T 1_m - \mathbf{a}\|_2^2\\s.t. T^T 1_n = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(C\) is the metric cost matrix
\(\lambda\) is the l2-regularization parameter
\(\mathbf{a}\) and \(\mathbf{b}\) are the source and target distributions
\(T\) is the transport plan to optimize
The problem above can be reformulated as follows
\[ \begin{align}\begin{aligned}\text{semi-relaxed UOT2} = \min_t \gamma \mathbf{c}^T t + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is flattened version of the cost matrix \(C\)
\(\gamma = 1/\lambda\) is the l2-regularization parameter
\(H_r\) is a metric matrix which computes the sum along the rows of the transport plan \(T\)
\(H_c\) is a metric matrix which computes the sum along the columns of the transport plan \(T\)
\(\mathbf{t}\) is the flattened version of \(T\)
a (np.ndarray (dim_a,)) – Histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Histogram of dimension dim_b
C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix
Hr (np.ndarray (dim_a, dim_a * dim_b)) – Auxiliary matrix constituted by 0 and 1, which computes the sum along the rows of transport plan \(T\)
Hc (np.ndarray (dim_b, dim_a * dim_b)) – Auxiliary matrix constituted by 0 and 1, which computes the sum along the columns of transport plan \(T\)
c (np.ndarray (ns * nt, )) – Flattened array of the cost matrix
Examples
>>> import ot >>> a = np.array([0.2, 0.3, 0.5]) >>> b = np.array([0.1, 0.9]) >>> C = np.array([[16., 25.], [28., 16.], [40., 36.]]) >>> Hr,Hc,c = ot.regpath.recast_semi_relaxed_as_lasso(a, b, C) >>> Hr.toarray() array([[1., 1., 0., 0., 0., 0.], [0., 0., 1., 1., 0., 0.], [0., 0., 0., 0., 1., 1.]]) >>> Hc.toarray() array([[1., 0., 1., 0., 1., 0.], [0., 1., 0., 1., 0., 1.]]) >>> c array([16., 25., 28., 16., 40., 36.])
This function provides all the solutions of the regularization path of the l2-UOT problem [41].
The problem to optimize is the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|{H} \mathbf{t} - \mathbf{y}\|_2^2\\s.t. \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \({C}\)
\(\gamma = 1/\lambda\) is the l2-regularization coefficient
\(\mathbf{y}\) is the concatenation of vectors \(\mathbf{a}\) and \(\mathbf{b}\), defined as \(\mathbf{y}^T = [\mathbf{a}^T \mathbf{b}^T]\)
\({H}\) is a design matrix, see [41] for the design of \({H}\). The matrix product \(H\mathbf{t}\) computes both the source marginal and the target marginals.
\(\mathbf{t}\) is the flattened version of the transport matrix
For the semi-relaxed problem, it optimizes the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T \mathbf{t} + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
a (np.ndarray (dim_a,)) – Histogram of dimension dim_a
b (np.ndarray (dim_b,)) – Histogram of dimension dim_b
C (np.ndarray, shape (dim_a, dim_b)) – Cost matrix
reg (float (optional)) – l2-regularization coefficient
semi_relaxed (bool (optional)) – Give the semi-relaxed path if True
itmax (int (optional)) – Maximum number of iteration
t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the (unregularized) optimal transport matrix
t_list (list) – List of all the optimal transport vectors of the regularization path
gamma_list (list) – List of the regularization parameters in the path
References
This function computes the next value of gamma when a variable is active in the regularization path of semi-relaxed UOT.
By taking the Lagrangian form of the problem, we obtain a similar update as the two-sided relaxed UOT
\[\max_{i \in \bar{A}} \frac{\mathbf{h}_{ri}^T(H_{rA} \phi - \mathbf{a}) + \mathbf{h}_{c i}^T\phi_u}{\mathbf{h}_{r i}^T H_{r A} \delta + \ \mathbf{h}_{c i} \delta_u - \mathbf{c}_i}\]
where :
A is the current active set
\(\mathbf{h}_{r i}\) is the ith column of the matrix \(H_r\)
\(\mathbf{h}_{c i}\) is the ith column of the matrix \(H_c\)
\(H_{r A}\) is the sub-matrix constructed by the columns of \(H_r\) whose indices belong to the active set A
\(\mathbf{c}_i\) is the \(i\) th element of cost vector \(\mathbf{c}\)
\(\phi\) is the intercept of the solutions in current iteration
\(\delta\) is the slope of the solutions in current iteration
\(\phi_u\) is the intercept of Lagrange parameter at the current iteration
\(\delta_u\) is the slope of Lagrange parameter at the current iteration
phi (np.ndarray (size(A), )) – Intercept of the solutions at the current iteration
delta (np.ndarray (size(A), )) – Slope of the solutions at the current iteration
phi_u (np.ndarray (dim_b, )) – Intercept of the Lagrange parameter at the current iteration
delta_u (np.ndarray (dim_b, )) – Slope of the Lagrange parameter at the current iteration
HrHr (np.ndarray (dim_a * dim_b, dim_a * dim_b)) – Matrix product of \(H_r^T H_r\)
Hc (np.ndarray (dim_b, dim_a * dim_b)) – Matrix that computes the sum along the columns of the transport plan \(T\)
Hra (np.ndarray (dim_a * dim_b, )) – Matrix product of \(H_r^T \mathbf{a}\)
c (np.ndarray (dim_a * dim_b, )) – Flattened array of cost matrix \(C\)
active_index (list) – Indices of active variables
current_gamma (float) – Value of regularization coefficient at the start of current iteration
next_gamma (float) – Value of gamma if a variable is added to active set in next iteration
next_active_index (int) – Index of variable to be activated
References
This function gives the regularization path of semi-relaxed l2-UOT problem.
The problem to optimize is the Lasso reformulation of the l2-penalized UOT:
\[ \begin{align}\begin{aligned}\min_t \gamma \mathbf{c}^T t + 0.5 * \|H_r \mathbf{t} - \mathbf{a}\|_2^2\\s.t. H_c \mathbf{t} = \mathbf{b}\\ \mathbf{t} \geq 0\end{aligned}\end{align} \]
where :
\(\mathbf{c}\) is the flattened version of the cost matrix \(C\)
\(\gamma = 1/\lambda\) is the l2-regularization parameter
\(H_r\) is a matrix that computes the sum along the rows of the transport plan \(T\)
\(H_c\) is a matrix that computes the sum along the columns of the transport plan \(T\)
\(\mathbf{t}\) is the flattened version of the transport plan \(T\)
t (np.ndarray (dim_a*dim_b, )) – Flattened vector of the (unregularized) optimal transport matrix
t_list (list) – List of all the optimal transport vectors of the regularization path
gamma_list (list) – List of the regularization parameters in the path
Examples
>>> import ot >>> import numpy as np >>> n = 3 >>> xs = np.array([1., 2., 3.]).reshape((n, 1)) >>> xt = np.array([5., 6., 7.]).reshape((n, 1)) >>> C = ot.dist(xs, xt) >>> C /= C.max() >>> a = np.array([0.2, 0.5, 0.3]) >>> b = np.array([0.2, 0.5, 0.3]) >>> t, _, _ = ot.regpath.semi_relaxed_path(a, b, C, 1e-4) >>> t array([1.99980556e-01, 0.00000000e+00, 0.00000000e+00, 1.94444444e-05, 4.99980556e-01, 0.00000000e+00, 0.00000000e+00, 1.94444444e-05, 3.00000000e-01])
References
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4