scico.optimize.admm#
ADMM solver and auxiliary classes.
Classes
Solver for linear operators diagonalized in the DFT domain. |
|
|
Solver for linear operators block-diagonalized in the DFT domain. |
|
Solver for linear operators block-diagonalized in the DFT domain. |
|
Solver for generic problem without special structure. |
|
Solver for quadratic functionals. |
|
Solver for quadratic functionals involving matrix operators. |
Base class for solvers for the non-separable ADMM step. |
- class scico.optimize.admm.SubproblemSolver#
Bases:
object
Base class for solvers for the non-separable ADMM step.
The ADMM solver implemented by
ADMM
addresses a general problem form for which one of the corresponding ADMM algorithm subproblems is separable into distinct subproblems for each of the \(g_i\), and another that is non-separable, involving function \(f\) and a sum over \(\ell_2\) norm terms involving all operators \(C_i\). This class is a base class for solvers of the latter subproblem\[\argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;.\]- Attributes:
admm (
ADMM
) – ADMM solver object to which the solver is attached.
- internal_init(admm)[source]#
Second stage initializer to be called by
ADMM.__init__
.- Parameters:
admm (
ADMM
) – Reference toADMM
object to which theSubproblemSolver
object is to be attached.
- class scico.optimize.admm.GenericSubproblemSolver(minimize_kwargs={'options': {'maxiter': 100}})#
Bases:
SubproblemSolver
Solver for generic problem without special structure.
Note that this solver is only suitable for small-scale problems.
- Attributes:
admm (
ADMM
) – ADMM solver object to which the solver is attached.minimize_kwargs (dict) – Dictionary of arguments for
scico.solver.minimize
.
Initialize a
GenericSubproblemSolver
object.- Parameters:
minimize_kwargs (
dict
) – Dictionary of arguments forscico.solver.minimize
.
- solve(x0)[source]#
Solve the ADMM step.
- Parameters:
x0 (
Union
[Array
,BlockArray
]) – Initial value.- Return type:
- Returns:
Computed solution.
- class scico.optimize.admm.LinearSubproblemSolver(cg_kwargs=None, cg_function='scico')#
Bases:
SubproblemSolver
Solver for quadratic functionals.
Solver for the case in which
f
is a quadratic function of \(\mb{x}\). It is a specialization ofSubproblemSolver
for the case wheref
is an \(\ell_2\) or weighted \(\ell_2\) norm, andf.A
is a linear operator, so that the subproblem involves solving a linear equation. This requires thatf.functional
be an instance ofSquaredL2Loss
and for the forward operatorf.A
to be an instance ofLinearOperator
.The \(\mb{x}\)-update step is
\[\mb{x}^{(k+1)} = \argmin_{\mb{x}} \; \frac{1}{2} \norm{\mb{y} - A \mb{x}}_W^2 + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;,\]where \(W\) a weighting
Diagonal
operator or anIdentity
operator (i.e., no weighting). This update step reduces to the solution of the linear system\[\left(A^H W A + \sum_{i=1}^N \rho_i C_i^H C_i \right) \mb{x}^{(k+1)} = \; A^H W \mb{y} + \sum_{i=1}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;.\]- Attributes:
admm (
ADMM
) – ADMM solver object to which the solver is attached.cg_kwargs (dict) – Dictionary of arguments for CG solver.
cg (func) – CG solver function (
scico.solver.cg
orjax.scipy.sparse.linalg.cg
) lhs (type): Function implementing the linear operator needed for the \(\mb{x}\) update step.
Initialize a
LinearSubproblemSolver
object.- Parameters:
cg_kwargs – Dictionary of arguments for CG solver. See documentation for
scico.solver.cg
orjax.scipy.sparse.linalg.cg
, including how to specify a preconditioner. Default values are the same as those ofscico.solver.cg
, except for “tol”: 1e-4 and “maxiter”: 100.cg_function – String indicating which CG implementation to use. One of “jax” or “scico”; default “scico”. If “scico”, uses
scico.solver.cg
. If “jax”, usesjax.scipy.sparse.linalg.cg
. The “jax” option is slower on small-scale problems or problems involving external functions, but can be differentiated through. The “scico” option is faster on small-scale problems, but slower on large-scale problems where the forward operator is written entirely in jax.
- compute_rhs()[source]#
Compute the right hand side of the linear equation to be solved.
Compute
\[A^H W \mb{y} + \sum_{i=1}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;.\]- Return type:
- Returns:
Computed solution.
- internal_init(admm)[source]#
Second stage initializer to be called by
ADMM.__init__
.- Parameters:
admm (
ADMM
) – Reference toADMM
object to which theSubproblemSolver
object is to be attached.
- solve(x0)[source]#
Solve the ADMM step.
- Parameters:
x0 (
Union
[Array
,BlockArray
]) – Initial value.- Return type:
- Returns:
Computed solution.
- class scico.optimize.admm.MatrixSubproblemSolver(check_solve=False, solve_kwargs=None)#
Bases:
LinearSubproblemSolver
Solver for quadratic functionals involving matrix operators.
Solver for the case in which \(f\) is a quadratic function of \(\mb{x}\), and \(A\) and all the \(C_i\) are diagonal or matrix operators. It is a specialization of
LinearSubproblemSolver
.As for
LinearSubproblemSolver
, the \(\mb{x}\)-update step is\[\mb{x}^{(k+1)} = \argmin_{\mb{x}} \; \frac{1}{2} \norm{\mb{y} - A \mb{x}}_W^2 + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;,\]where \(W\) is a weighting
Diagonal
operator or anIdentity
operator (i.e., no weighting). This update step reduces to the solution of the linear system\[\left(A^H W A + \sum_{i=1}^N \rho_i C_i^H C_i \right) \mb{x}^{(k+1)} = \; A^H W \mb{y} + \sum_{i=1}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;,\]which is solved by factorization of the left hand side of the equation, using
MatrixATADSolver
.- Attributes:
admm (
ADMM
) – ADMM solver object to which the solver is attached.solve_kwargs (dict) – Dictionary of arguments for solver
MatrixATADSolver
initialization.
Initialize a
MatrixSubproblemSolver
object.- Parameters:
check_solve – If
True
, compute solver accuracy after each solve.solve_kwargs – Dictionary of arguments for solver
MatrixATADSolver
initialization.
- internal_init(admm)[source]#
Second stage initializer to be called by
ADMM.__init__
.- Parameters:
admm (
ADMM
) – Reference toADMM
object to which theSubproblemSolver
object is to be attached.
- class scico.optimize.admm.CircularConvolveSolver#
Bases:
LinearSubproblemSolver
Solver for linear operators diagonalized in the DFT domain.
Specialization of
LinearSubproblemSolver
for the case wheref
is an instance ofSquaredL2Loss
, the forward operatorf.A
is either an instance ofIdentity
orCircularConvolve
, and theC_i
are all shift invariant linear operators, examples of which include instances ofIdentity
as well as some instances (depending on initializer parameters) ofCircularConvolve
andFiniteDifference
. None of the instances ofCircularConvolve
may sum over any of their axes.- Attributes:
admm (
ADMM
) – ADMM solver object to which the solver is attached.lhs_f (array) – Left hand side, in the DFT domain, of the linear equation to be solved.
Initialize a
CircularConvolveSolver
object.- internal_init(admm)[source]#
Second stage initializer to be called by
ADMM.__init__
.- Parameters:
admm (
ADMM
) – Reference toADMM
object to which theSubproblemSolver
object is to be attached.
- solve(x0)[source]#
Solve the ADMM step.
- Parameters:
x0 (
Union
[Array
,BlockArray
]) – Initial value (unused, has no effect).- Return type:
- Returns:
Computed solution.
- class scico.optimize.admm.FBlockCircularConvolveSolver(ndims=None, check_solve=False)#
Bases:
LinearSubproblemSolver
Solver for linear operators block-diagonalized in the DFT domain.
Specialization of
LinearSubproblemSolver
for the case wheref
is an instance ofSquaredL2Loss
, the forward operatorf.A
is a composition of aSum
operator and aCircularConvolve
operator. The former must sum over the first axis of its input, and the latter must be initialized so that it convolves a set of filters, indexed by the first axis, with an input array that has the same number of axes as the filter array, and has an initial axis of the same length as that of the filter array. The \(C_i\) must all be shift invariant linear operators, examples of which include instances ofIdentity
as well as some instances (depending on initializer parameters) ofCircularConvolve
andFiniteDifference
. None of the instances ofCircularConvolve
may be summed over any of their axes.The solver is based on the frequency-domain approach proposed in [53]. We have \(f = \omega \norm{A \mb{x} - \mb{y}}_2^2\), where typically \(\omega = 1/2\), and \(A\) is a block-row operator with circulant blocks, i.e. it can be written as
\[A = \left( \begin{array}{cccc} A_1 & A_2 & \ldots & A_{K} \end{array} \right) \;,\]where all of the \(A_k\) are circular convolution operators. The complete functional to be minimized is
\[\omega \norm{A \mb{x} - \mb{y}}_2^2 + \sum_{i=1}^N g_i(C_i \mb{x}) \;,\]where the \(C_i\) are either identity or circular convolutions, and the ADMM x-step is
\[\mb{x}^{(j+1)} = \argmin_{\mb{x}} \; \omega \norm{A \mb{x} - \mb{y}}_2^2 + \sum_i \frac{\rho_i}{2} \norm{C_i \mb{x} - (\mb{z}^{(j)}_i - \mb{u}^{(j)}_i)}_2^2 \;.\]This subproblem is most easily solved in the DFT transform domain, where the circular convolutions become diagonal operators. Denoting the frequency-domain versions of variables with a circumflex (e.g. \(\hat{\mb{x}}\) is the frequency-domain version of \(\mb{x}\)), the solution of the subproblem can be written as
\[\left( \hat{A}^H \hat{A} + \frac{1}{2 \omega} \sum_i \rho_i \hat{C}_i^H \hat{C}_i \right) \hat{\mathbf{x}} = \hat{A}^H \hat{\mb{y}} + \frac{1}{2 \omega} \sum_i \rho_i \hat{C}_i^H (\hat{\mb{z}}_i - \hat{\mb{u}}_i) \;.\]This linear equation is computational expensive to solve because the left hand side includes the term \(\hat{A}^H \hat{A}\), which corresponds to the outer product of \(\hat{A}^H\) and \(\hat{A}\). A computationally efficient solution is possible, however, by exploiting the Woodbury matrix identity
\[(D + U G V)^{-1} = D^{-1} - D^{-1} U (G^{-1} + V D^{-1} U)^{-1} V D^{-1} \;.\]Setting
\[\begin{split}D &= \frac{1}{2 \omega} \sum_i \rho_i \hat{C}_i^H \hat{C}_i \\ U &= \hat{A}^H \\ G &= I \\ V &= \hat{A}\end{split}\]we have
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} - D^{-1} \hat{A}^H (I + \hat{A} D^{-1} \hat{A}^H)^{-1} \hat{A} D^{-1}\]which can be simplified to
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} (I - \hat{A}^H E^{-1} \hat{A} D^{-1})\]by defining \(E = I + \hat{A} D^{-1} \hat{A}^H\). The right hand side is much cheaper to compute because the only matrix inversions involve \(D\), which is diagonal, and \(E\), which is a weighted inner product of \(\hat{A}^H\) and \(\hat{A}\).
Initialize a
FBlockCircularConvolveSolver
object.- Parameters:
check_solve (
bool
) – IfTrue
, compute solver accuracy after each solve.
- internal_init(admm)[source]#
Second stage initializer to be called by
ADMM.__init__
.- Parameters:
admm (
ADMM
) – Reference toADMM
object to which theSubproblemSolver
object is to be attached.
- solve(x0)[source]#
Solve the ADMM step.
- Parameters:
x0 (
Union
[Array
,BlockArray
]) – Initial value (unused, has no effect).- Return type:
- Returns:
Computed solution.
- class scico.optimize.admm.G0BlockCircularConvolveSolver(ndims=None, check_solve=False)#
Bases:
SubproblemSolver
Solver for linear operators block-diagonalized in the DFT domain.
Specialization of
LinearSubproblemSolver
for the case where \(f = 0\) (i.e,f
is aZeroFunctional
), \(g_1\) is an instance ofSquaredL2Loss
, \(C_1\) is a composition of aSum
operator an aCircularConvolve
operator. The former must sum over the first axis of its input, and the latter must be initialized so that it convolves a set of filters, indexed by the first axis, with an input array that has the same number of axes as the filter array, and has an initial axis of the same length as that of the filter array. The other \(C_i\) must all be shift invariant linear operators, examples of which include instances ofIdentity
as well as some instances (depending on initializer parameters) ofCircularConvolve
andFiniteDifference
. None of these instances ofCircularConvolve
may be summed over any of their axes.The solver is based on the frequency-domain approach proposed in [53]. We have \(g_1 = \omega \norm{B A \mb{x} - \mb{y}}_2^2\), where typically \(\omega = 1/2\), \(B\) is the identity or a diagonal operator, and \(A\) is a block-row operator with circulant blocks, i.e. it can be written as
\[A = \left( \begin{array}{cccc} A_1 & A_2 & \ldots & A_{K} \end{array} \right) \;,\]where all of the \(A_k\) are circular convolution operators. The complete functional to be minimized is
\[\sum_{i=1}^N g_i(C_i \mb{x}) \;,\]where
\[\begin{split}g_1(\mb{z}) &= \omega \norm{B \mb{z} - \mb{y}}_2^2\\ C_1 &= A \;,\end{split}\]and the other \(C_i\) are either identity or circular convolutions. The ADMM x-step is
\[\mb{x}^{(j+1)} = \argmin_{\mb{x}} \; \rho_1 \omega \norm{ A \mb{x} - (\mb{z}^{(j)}_1 - \mb{u}^{(j)}_1)}_2^2 + \sum_{i=2}^N \frac{\rho_i}{2} \norm{C_i \mb{x} - (\mb{z}^{(j)}_i - \mb{u}^{(j)}_i)}_2^2 \;.\]This subproblem is most easily solved in the DFT transform domain, where the circular convolutions become diagonal operators. Denoting the frequency-domain versions of variables with a circumflex (e.g. \(\hat{\mb{x}}\) is the frequency-domain version of \(\mb{x}\)), the solution of the subproblem can be written as
\[\left( \hat{A}^H \hat{A} + \frac{1}{2 \omega \rho_1} \sum_{i=2}^N \rho_i \hat{C}_i^H \hat{C}_i \right) \hat{\mathbf{x}} = \hat{A}^H (\hat{\mb{z}}_1 - \hat{\mb{u}}_1) + \frac{1}{2 \omega \rho_1} \sum_{i=2}^N \rho_i \hat{C}_i^H (\hat{\mb{z}}_i - \hat{\mb{u}}_i) \;.\]This linear equation is computational expensive to solve because the left hand side includes the term \(\hat{A}^H \hat{A}\), which corresponds to the outer product of \(\hat{A}^H\) and \(\hat{A}\). A computationally efficient solution is possible, however, by exploiting the Woodbury matrix identity
\[(D + U G V)^{-1} = D^{-1} - D^{-1} U (G^{-1} + V D^{-1} U)^{-1} V D^{-1} \;.\]Setting
\[\begin{split}D &= \frac{1}{2 \omega \rho_1} \sum_{i=2}^N \rho_i \hat{C}_i^H \hat{C}_i \\ U &= \hat{A}^H \\ G &= I \\ V &= \hat{A}\end{split}\]we have
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} - D^{-1} \hat{A}^H (I + \hat{A} D^{-1} \hat{A}^H)^{-1} \hat{A} D^{-1}\]which can be simplified to
\[(D + \hat{A}^H \hat{A})^{-1} = D^{-1} (I - \hat{A}^H E^{-1} \hat{A} D^{-1})\]by defining \(E = I + \hat{A} D^{-1} \hat{A}^H\). The right hand side is much cheaper to compute because the only matrix inversions involve \(D\), which is diagonal, and \(E\), which is a weighted inner product of \(\hat{A}^H\) and \(\hat{A}\).
Initialize a
G0BlockCircularConvolveSolver
object.- Parameters:
check_solve (
bool
) – IfTrue
, compute solver accuracy after each solve.
- compute_rhs()[source]#
Compute the right hand side of the linear equation to be solved.
Compute
\[C_1^H ( \mb{z}^{(k)}_1 - \mb{u}^{(k)}_1) + \frac{1}{2 \omega \rho_1}\sum_{i=2}^N \rho_i C_i^H ( \mb{z}^{(k)}_i - \mb{u}^{(k)}_i) \;.\]- Return type:
- Returns:
Right hand side of the linear equation.
- internal_init(admm)[source]#
Second stage initializer to be called by
ADMM.__init__
.- Parameters:
admm (
ADMM
) – Reference toADMM
object to which theSubproblemSolver
object is to be attached.
- solve(x0)[source]#
Solve the ADMM step.
- Parameters:
x0 (
Union
[Array
,BlockArray
]) – Initial value (unused, has no effect).- Return type:
- Returns:
Computed solution.