scico.optimize.admm¶
ADMM solver and auxiliary classes.
Classes
|
Solver for linear operators diagonalized in the DFT domain. |
|
Solver for linear operators block-diagonalized in the DFT domain. |
|
Solver for linear operators block-diagonalized in the DFT domain. |
|
Solver for generic problem without special structure. |
|
Solver for quadratic functionals. |
|
Solver for quadratic functionals involving matrix operators. |
Base class for solvers for the non-separable ADMM step. |
- class scico.optimize.admm.SubproblemSolver¶
Bases:
objectBase class for solvers for the non-separable ADMM step.
The ADMM solver implemented by
ADMMaddresses a general problem form for which one of the corresponding ADMM algorithm subproblems is separable into distinct subproblems for each of the \(g_i\), and another that is non-separable, involving function \(f\) and a sum over \(\ell_2\) norm terms involving all operators \(C_i\). This class is a base class for solvers of the latter subproblem\[\argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;.\]- Attributes:
admm (
ADMM) – ADMM solver object to which the solver is attached.
- internal_init(admm)[source]¶
Second stage initializer to be called by
ADMM.__init__.- Parameters:
admm (
ADMM) – Reference toADMMobject to which theSubproblemSolverobject is to be attached.
- class scico.optimize.admm.GenericSubproblemSolver(minimize_kwargs={'options': {'maxiter': 100}})¶
Bases:
SubproblemSolverSolver for generic problem without special structure.
Note that this solver is only suitable for small-scale problems.
- Attributes:
admm (
ADMM) – ADMM solver object to which the solver is attached.minimize_kwargs (dict) – Dictionary of arguments for
scico.solver.minimize.
Initialize a
GenericSubproblemSolverobject.- Parameters:
minimize_kwargs (
dict) – Dictionary of arguments forscico.solver.minimize.
- solve(x0)[source]¶
Solve the ADMM step.
- Parameters:
x0 (
Union[Array,BlockArray]) – Initial value.- Return type:
Union[Array,BlockArray]- Returns:
Computed solution.
- class scico.optimize.admm.LinearSubproblemSolver(cg_kwargs=None, cg_function='scico')¶
Bases:
SubproblemSolverSolver for quadratic functionals.
Solver for the case in which
fis a quadratic function of \(\mb{x}\). It is a specialization ofSubproblemSolverfor the case wherefis an \(\ell_2\) or weighted \(\ell_2\) norm, andf.Ais a linear operator, so that the subproblem involves solving a linear equation. This requires thatf.functionalbe an instance ofSquaredL2Lossand for the forward operatorf.Ato be an instance ofLinearOperator.The \(\mb{x}\)-update step is
\[\mb{x}^{(k+1)} = \argmin_{\mb{x}} \; \frac{1}{2} \norm{\mb{y} - A \mb{x}}_W^2 + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;,\]where \(W\) a weighting
Diagonaloperator or anIdentityoperator (i.e., no weighting). This update step reduces to the solution of the linear system\[\left(A^H W A + \sum_{i=1}^N \rho_i C_i^H C_i \right) \mb{x}^{(k+1)} = \; A^H W \mb{y} + \sum_{i=1}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;.\]- Attributes:
admm (
ADMM) – ADMM solver object to which the solver is attached.cg_kwargs (dict) – Dictionary of arguments for CG solver.
cg (func) – CG solver function (
scico.solver.cgorjax.scipy.sparse.linalg.cg) lhs (type): Function implementing the linear operator needed for the \(\mb{x}\) update step.
Initialize a
LinearSubproblemSolverobject.- Parameters:
cg_kwargs (
Optional[dict[str,Any]]) – Dictionary of arguments for CG solver. See documentation forscico.solver.cgorjax.scipy.sparse.linalg.cg, including how to specify a preconditioner. Default values are the same as those ofscico.solver.cg, except for “tol”: 1e-4 and “maxiter”: 100.cg_function (
str) – String indicating which CG implementation to use. One of “jax” or “scico”; default “scico”. If “scico”, usesscico.solver.cg. If “jax”, usesjax.scipy.sparse.linalg.cg. The “jax” option is slower on small-scale problems or problems involving external functions, but can be differentiated through. The “scico” option is faster on small-scale problems, but slower on large-scale problems where the forward operator is written entirely in jax.
- compute_rhs()[source]¶
Compute the right hand side of the linear equation to be solved.
Compute
\[A^H W \mb{y} + \sum_{i=1}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;.\]- Return type:
Union[Array,BlockArray]- Returns:
Computed solution.
- internal_init(admm)[source]¶
Second stage initializer to be called by
ADMM.__init__.- Parameters:
admm (
ADMM) – Reference toADMMobject to which theSubproblemSolverobject is to be attached.
- solve(x0)[source]¶
Solve the ADMM step.
- Parameters:
x0 (
Union[Array,BlockArray]) – Initial value.- Return type:
Union[Array,BlockArray]- Returns:
Computed solution.
- class scico.optimize.admm.MatrixSubproblemSolver(check_solve=False, solve_kwargs=None)¶
Bases:
LinearSubproblemSolverSolver for quadratic functionals involving matrix operators.
Solver for the case in which \(f\) is a quadratic function of \(\mb{x}\), and \(A\) and all the \(C_i\) are diagonal or matrix operators. It is a specialization of
LinearSubproblemSolver.As for
LinearSubproblemSolver, the \(\mb{x}\)-update step is\[\mb{x}^{(k+1)} = \argmin_{\mb{x}} \; \frac{1}{2} \norm{\mb{y} - A \mb{x}}_W^2 + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;,\]where \(W\) is a weighting
Diagonaloperator or anIdentityoperator (i.e., no weighting). This update step reduces to the solution of the linear system\[\left(A^H W A + \sum_{i=1}^N \rho_i C_i^H C_i \right) \mb{x}^{(k+1)} = \; A^H W \mb{y} + \sum_{i=1}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;,\]which is solved by factorization of the left hand side of the equation, using
MatrixATADSolver.- Attributes:
admm (
ADMM) – ADMM solver object to which the solver is attached.solve_kwargs (dict) – Dictionary of arguments for solver
MatrixATADSolverinitialization.
Initialize a
MatrixSubproblemSolverobject.- Parameters:
- internal_init(admm)[source]¶
Second stage initializer to be called by
ADMM.__init__.- Parameters:
admm (
ADMM) – Reference toADMMobject to which theSubproblemSolverobject is to be attached.
- class scico.optimize.admm.CircularConvolveSolver(ndims=None)¶
Bases:
LinearSubproblemSolverSolver for linear operators diagonalized in the DFT domain.
Specialization of
LinearSubproblemSolverfor the case wherefisNone, or an instance ofSquaredL2Losswith a forward operatorf.Athat is either an instance ofIdentityorCircularConvolve, and theC_iare all shift invariant linear operators, examples of which include instances ofIdentityas well as some instances (depending on initializer parameters) ofCircularConvolveandFiniteDifference. None of the instances ofCircularConvolvemay sum over any of their axes.- Attributes:
admm (
ADMM) – ADMM solver object to which the solver is attached.lhs_f (array) – Left hand side, in the DFT domain, of the linear equation to be solved.
Initialize a
CircularConvolveSolverobject.- Parameters:
ndims (
Optional[int]) – Number of trailing dimensions of the input and kernel involved in theCircularConvolveconvolutions. In most cases this value is automatically determined from the optimization problem specification, but this is not possible whenfisNoneand none of theC_iare of typeCircularConvolve. When notNone, this parameter overrides the automatic mechanism.
- internal_init(admm)[source]¶
Second stage initializer to be called by
ADMM.__init__.- Parameters:
admm (
ADMM) – Reference toADMMobject to which theSubproblemSolverobject is to be attached.
- solve(x0)[source]¶
Solve the ADMM step.
- Parameters:
x0 (
Union[Array,BlockArray]) – Initial value (unused, has no effect).- Return type:
Union[Array,BlockArray]- Returns:
Computed solution.
- class scico.optimize.admm.FBlockCircularConvolveSolver(ndims=None, check_solve=False)¶
Bases:
LinearSubproblemSolverSolver for linear operators block-diagonalized in the DFT domain.
Specialization of
LinearSubproblemSolverfor the case wherefis an instance ofSquaredL2Loss, the forward operatorf.Ais a composition of aSumoperator and aCircularConvolveoperator. The former must sum over the first axis of its input, and the latter must be initialized so that it convolves a set of filters, indexed by the first axis, with an input array that has the same number of axes as the filter array, and has an initial axis of the same length as that of the filter array. The \(C_i\) must all be shift invariant linear operators, examples of which include instances ofIdentityas well as some instances (depending on initializer parameters) ofCircularConvolveandFiniteDifference. None of the instances ofCircularConvolvemay be summed over any of their axes.The solver is based on the frequency-domain approach proposed in [59]. We have \(f = \omega \norm{A \mb{x} - \mb{y}}_2^2\), where typically \(\omega = 1/2\), and \(A\) is a block-row operator with circulant blocks, i.e. it can be written as
\[A = \left( \begin{array}{cccc} A_1 & A_2 & \ldots & A_{K} \end{array} \right) \;,\]where all of the \(A_k\) are circular convolution operators. The complete functional to be minimized is
\[\omega \norm{A \mb{x} - \mb{y}}_2^2 + \sum_{i=1}^N g_i(C_i \mb{x}) \;,\]where the \(C_i\) are either identity or circular convolutions, and the ADMM x-step is
\[\mb{x}^{(j+1)} = \argmin_{\mb{x}} \; \omega \norm{A \mb{x} - \mb{y}}_2^2 + \sum_i \frac{\rho_i}{2} \norm{C_i \mb{x} - (\mb{z}^{(j)}_i - \mb{u}^{(j)}_i)}_2^2 \;.\]This subproblem is most easily solved in the DFT transform domain, where the circular convolutions become diagonal operators. Denoting the frequency-domain versions of variables with a circumflex (e.g. \(\hat{\mb{x}}\) is the frequency-domain version of \(\mb{x}\)), the solution of the subproblem can be written as
\[\left( \hat{A}^H \hat{A} + \frac{1}{2 \omega} \sum_i \rho_i \hat{C}_i^H \hat{C}_i \right) \hat{\mathbf{x}} = \hat{A}^H \hat{\mb{y}} + \frac{1}{2 \omega} \sum_i \rho_i \hat{C}_i^H (\hat{\mb{z}}_i - \hat{\mb{u}}_i) \;.\]This linear equation is computational expensive to solve because the left hand side includes the term \(\hat{A}^H \hat{A}\), which corresponds to the outer product of \(\hat{A}^H\) and \(\hat{A}\). A computationally efficient solution is possible, however, by exploiting the Woodbury matrix identity
\[(D + U G V)^{-1} = D^{-1} - D^{-1} U (G^{-1} + V D^{-1} U)^{-1} V D^{-1} \;.\]Setting
\[\begin{split}D &= \frac{1}{2 \omega} \sum_i \rho_i \hat{C}_i^H \hat{C}_i \\ U &= \hat{A}^H \\ G &= I \\ V &= \hat{A}\end{split}\]we have
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} - D^{-1} \hat{A}^H (I + \hat{A} D^{-1} \hat{A}^H)^{-1} \hat{A} D^{-1}\]which can be simplified to
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} (I - \hat{A}^H E^{-1} \hat{A} D^{-1})\]by defining \(E = I + \hat{A} D^{-1} \hat{A}^H\). The right hand side is much cheaper to compute because the only matrix inversions involve \(D\), which is diagonal, and \(E\), which is a weighted inner product of \(\hat{A}^H\) and \(\hat{A}\).
Initialize a
FBlockCircularConvolveSolverobject.- Parameters:
check_solve (
bool) – IfTrue, compute solver accuracy after each solve.
- internal_init(admm)[source]¶
Second stage initializer to be called by
ADMM.__init__.- Parameters:
admm (
ADMM) – Reference toADMMobject to which theSubproblemSolverobject is to be attached.
- solve(x0)[source]¶
Solve the ADMM step.
- Parameters:
x0 (
Union[Array,BlockArray]) – Initial value (unused, has no effect).- Return type:
Union[Array,BlockArray]- Returns:
Computed solution.
- class scico.optimize.admm.G0BlockCircularConvolveSolver(ndims=None, check_solve=False)¶
Bases:
SubproblemSolverSolver for linear operators block-diagonalized in the DFT domain.
Specialization of
LinearSubproblemSolverfor the case where \(f = 0\) (i.e,fis aZeroFunctional), \(g_1\) is an instance ofSquaredL2Loss, \(C_1\) is a composition of aSumoperator an aCircularConvolveoperator. The former must sum over the first axis of its input, and the latter must be initialized so that it convolves a set of filters, indexed by the first axis, with an input array that has the same number of axes as the filter array, and has an initial axis of the same length as that of the filter array. The other \(C_i\) must all be shift invariant linear operators, examples of which include instances ofIdentityas well as some instances (depending on initializer parameters) ofCircularConvolveandFiniteDifference. None of these instances ofCircularConvolvemay be summed over any of their axes.The solver is based on the frequency-domain approach proposed in [59]. We have \(g_1 = \omega \norm{B A \mb{x} - \mb{y}}_2^2\), where typically \(\omega = 1/2\), \(B\) is the identity or a diagonal operator, and \(A\) is a block-row operator with circulant blocks, i.e. it can be written as
\[A = \left( \begin{array}{cccc} A_1 & A_2 & \ldots & A_{K} \end{array} \right) \;,\]where all of the \(A_k\) are circular convolution operators. The complete functional to be minimized is
\[\sum_{i=1}^N g_i(C_i \mb{x}) \;,\]where
\[\begin{split}g_1(\mb{z}) &= \omega \norm{B \mb{z} - \mb{y}}_2^2\\ C_1 &= A \;,\end{split}\]and the other \(C_i\) are either identity or circular convolutions. The ADMM x-step is
\[\mb{x}^{(j+1)} = \argmin_{\mb{x}} \; \rho_1 \omega \norm{ A \mb{x} - (\mb{z}^{(j)}_1 - \mb{u}^{(j)}_1)}_2^2 + \sum_{i=2}^N \frac{\rho_i}{2} \norm{C_i \mb{x} - (\mb{z}^{(j)}_i - \mb{u}^{(j)}_i)}_2^2 \;.\]This subproblem is most easily solved in the DFT transform domain, where the circular convolutions become diagonal operators. Denoting the frequency-domain versions of variables with a circumflex (e.g. \(\hat{\mb{x}}\) is the frequency-domain version of \(\mb{x}\)), the solution of the subproblem can be written as
\[\left( \hat{A}^H \hat{A} + \frac{1}{2 \omega \rho_1} \sum_{i=2}^N \rho_i \hat{C}_i^H \hat{C}_i \right) \hat{\mathbf{x}} = \hat{A}^H (\hat{\mb{z}}_1 - \hat{\mb{u}}_1) + \frac{1}{2 \omega \rho_1} \sum_{i=2}^N \rho_i \hat{C}_i^H (\hat{\mb{z}}_i - \hat{\mb{u}}_i) \;.\]This linear equation is computational expensive to solve because the left hand side includes the term \(\hat{A}^H \hat{A}\), which corresponds to the outer product of \(\hat{A}^H\) and \(\hat{A}\). A computationally efficient solution is possible, however, by exploiting the Woodbury matrix identity
\[(D + U G V)^{-1} = D^{-1} - D^{-1} U (G^{-1} + V D^{-1} U)^{-1} V D^{-1} \;.\]Setting
\[\begin{split}D &= \frac{1}{2 \omega \rho_1} \sum_{i=2}^N \rho_i \hat{C}_i^H \hat{C}_i \\ U &= \hat{A}^H \\ G &= I \\ V &= \hat{A}\end{split}\]we have
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} - D^{-1} \hat{A}^H (I + \hat{A} D^{-1} \hat{A}^H)^{-1} \hat{A} D^{-1}\]which can be simplified to
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} (I - \hat{A}^H E^{-1} \hat{A} D^{-1})\]by defining \(E = I + \hat{A} D^{-1} \hat{A}^H\). The right hand side is much cheaper to compute because the only matrix inversions involve \(D\), which is diagonal, and \(E\), which is a weighted inner product of \(\hat{A}^H\) and \(\hat{A}\).
Initialize a
G0BlockCircularConvolveSolverobject.- Parameters:
check_solve (
bool) – IfTrue, compute solver accuracy after each solve.
- compute_rhs()[source]¶
Compute the right hand side of the linear equation to be solved.
Compute
\[C_1^H ( \mb{z}^{(k)}_1 - \mb{u}^{(k)}_1) + \frac{1}{2 \omega \rho_1}\sum_{i=2}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;.\]- Return type:
Union[Array,BlockArray]- Returns:
Right hand side of the linear equation.
- internal_init(admm)[source]¶
Second stage initializer to be called by
ADMM.__init__.- Parameters:
admm (
ADMM) – Reference toADMMobject to which theSubproblemSolverobject is to be attached.
- solve(x0)[source]¶
Solve the ADMM step.
- Parameters:
x0 (
Union[Array,BlockArray]) – Initial value (unused, has no effect).- Return type:
Union[Array,BlockArray]- Returns:
Computed solution.