scico.optimize#
Optimization algorithms.
Classes

ADMM | Basic Alternating Direction Method of Multipliers (ADMM) algorithm.
AcceleratedPGM | Accelerated Proximal Gradient Method (AcceleratedPGM) base class.
LinearizedADMM | Linearized alternating direction method of multipliers algorithm.
NonLinearPADMM | Non-linear proximal alternating direction method of multipliers.
Optimizer | Base class for optimizer classes.
PDHG | Primal–dual hybrid gradient (PDHG) algorithm.
PGM | Proximal Gradient Method (PGM) base class.
ProximalADMM | Proximal alternating direction method of multipliers.
ProximalADMMBase | Base class for proximal ADMM solvers.
- class scico.optimize.ADMM(f, g_list, C_list, rho_list, alpha=1.0, x0=None, subproblem_solver=None, **kwargs)#
Bases:
Optimizer
Basic Alternating Direction Method of Multipliers (ADMM) algorithm.
Solve an optimization problem of the form
\[\argmin_{\mb{x}} \; f(\mb{x}) + \sum_{i=1}^N g_i(C_i \mb{x}) \;,\]

where \(f\) and the \(g_i\) are instances of Functional, and the \(C_i\) are LinearOperator.

The optimization problem is solved by introducing the splitting \(\mb{z}_i = C_i \mb{x}\) and solving

\[\argmin_{\mb{x}, \mb{z}_i} \; f(\mb{x}) + \sum_{i=1}^N g_i(\mb{z}_i) \; \text{such that}\; C_i \mb{x} = \mb{z}_i \;,\]

via an ADMM algorithm [25] [23] [12] consisting of the iterations (see step)

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \\ \mb{z}_i^{(k+1)} &= \argmin_{\mb{z}_i} \; g_i(\mb{z}_i) + \frac{\rho_i}{2} \norm{\mb{z}_i - \mb{u}^{(k)}_i - C_i \mb{x}^{(k+1)}}_2^2 \\ \mb{u}_i^{(k+1)} &= \mb{u}_i^{(k)} + C_i \mb{x}^{(k+1)} - \mb{z}^{(k+1)}_i \;. \end{aligned}\end{split}\]

- Attributes:
f (Functional) – Functional \(f\) (usually a Loss).
g_list (list of Functional) – List of \(g_i\) functionals. Must be same length as C_list and rho_list.
C_list (list of LinearOperator) – List of \(C_i\) operators.
rho_list (list of scalars) – List of \(\rho_i\) penalty parameters. Must be same length as C_list and g_list.
alpha (float) – Relaxation parameter.
u_list (list of array-like) – List of scaled Lagrange multipliers \(\mb{u}_i\) at current iteration.
x (array-like) – Solution.
subproblem_solver (SubproblemSolver) – Solver for \(\mb{x}\)-update step.
z_list (list of array-like) – List of auxiliary variables \(\mb{z}_i\) at current iteration.
z_list_old (list of array-like) – List of auxiliary variables \(\mb{z}_i\) at previous iteration.
Initialize an ADMM object.

- Parameters:
f (Functional) – Functional \(f\) (usually a loss function).
g_list (List[Functional]) – List of \(g_i\) functionals. Must be same length as C_list and rho_list.
C_list (List[LinearOperator]) – List of \(C_i\) operators.
rho_list (List[float]) – List of \(\rho_i\) penalty parameters. Must be same length as C_list and g_list.
alpha (float) – Relaxation parameter. No relaxation for default 1.0.
x0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{x}\). If None, defaults to an array of zeros.
subproblem_solver (Optional[SubproblemSolver]) – Solver for \(\mb{x}\)-update step. Defaults to None, which implies use of an instance of GenericSubproblemSolver.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
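The sketch below illustrates one way such a problem might be assembled and solved. It is not part of the reference text above; the denoising-style forward operator, the total-variation regularizer, and all parameter values are illustrative assumptions.

```python
import scico.numpy as snp
from scico import functional, linop, loss
from scico.optimize import ADMM
from scico.optimize.admm import LinearSubproblemSolver

y = snp.zeros((256, 256))                  # placeholder measurement data
A = linop.Identity(y.shape)                # forward operator (denoising case)
f = loss.SquaredL2Loss(y=y, A=A)           # f(x) = (1/2) ||A x - y||_2^2
C = linop.FiniteDifference(input_shape=y.shape, append=0)
g = 1e-1 * functional.L21Norm()            # isotropic TV regularizer g_1

solver = ADMM(
    f=f,
    g_list=[g],
    C_list=[C],
    rho_list=[1.0],
    x0=y,
    maxiter=50,
    subproblem_solver=LinearSubproblemSolver(),
)
x_hat = solver.solve()                     # run the ADMM iterations
```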
- norm_dual_residual()[source]#
Compute the \(\ell_2\) norm of the dual residual.
Compute the \(\ell_2\) norm of the dual residual
\[\left\| \sum_{i=1}^N \rho_i C_i^T \left( \mb{z}^{(k)}_i - \mb{z}^{(k-1)}_i \right) \right\|_2 \;.\]

- Return type:
- Returns:
Norm of dual residual.
- norm_primal_residual(x=None)[source]#
Compute the \(\ell_2\) norm of the primal residual.
Compute the \(\ell_2\) norm of the primal residual
\[\left( \sum_{i=1}^N \rho_i \left\| C_i \mb{x} - \mb{z}_i^{(k)} \right\|_2^2\right)^{1/2} \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
- Return type:
- Returns:
Norm of primal residual.
- objective(x=None, z_list=None)[source]#
Evaluate the objective function.
Evaluate the objective function
\[f(\mb{x}) + \sum_{i=1}^N g_i(\mb{z}_i) \;.\]

Note that this form is cheaper to compute, but may have very poor accuracy compared with the “true” objective function

\[f(\mb{x}) + \sum_{i=1}^N g_i(C_i \mb{x})\]

when the primal residual is large.

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
z_list (Optional[List[Union[Array, BlockArray]]]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.z_list.
- Return type:
- Returns:
Value of the objective function.
- step()[source]#
Perform a single ADMM iteration.
The primary variable \(\mb{x}\) is updated by solving the optimization problem

\[\mb{x}^{(k+1)} = \argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;.\]

The auxiliary variables \(\mb{z}_i\) and scaled Lagrange multipliers \(\mb{u}_i\) are then updated. The auxiliary variables are updated according to

\[\begin{split}\begin{aligned} \mb{z}_i^{(k+1)} &= \argmin_{\mb{z}_i} \; g_i(\mb{z}_i) + \frac{\rho_i}{2} \norm{\mb{z}_i - \mb{u}^{(k)}_i - C_i \mb{x}^{(k+1)}}_2^2 \\ &= \mathrm{prox}_{g_i}(C_i \mb{x} + \mb{u}_i, 1 / \rho_i) \;, \end{aligned}\end{split}\]

and the scaled Lagrange multipliers are updated according to

\[\mb{u}_i^{(k+1)} = \mb{u}_i^{(k)} + C_i \mb{x}^{(k+1)} - \mb{z}^{(k+1)}_i \;.\]
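As an illustrative sketch (assuming a solver constructed as in the ADMM example earlier on this page), the iteration can also be driven manually via step, using the residual and objective methods of this class for monitoring; solve performs the same loop internally.

```python
# Manual iteration loop; `solver` is assumed to be an ADMM instance built as in
# the earlier sketch. Each step() performs one x-, z-, and u-update.
for k in range(20):
    solver.step()
    print(
        f"iter {k:3d}  objective {solver.objective():.3e}  "
        f"primal {solver.norm_primal_residual():.3e}  "
        f"dual {solver.norm_dual_residual():.3e}"
    )
x_current = solver.x  # current estimate of the solution
```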
- u_init(x0)[source]#
Initialize scaled Lagrange multipliers \(\mb{u}_i\).
Initialized to

\[\mb{u}_i = \mb{0} \;.\]

Note that the parameter x0 is unused, but is provided for potential use in an overridden method.

- Parameters:
x0 (Union[Array, BlockArray]) – Initial value of \(\mb{x}\).
- Return type:
List[Union[Array, BlockArray]]
- z_init(x0)[source]#
Initialize auxiliary variables \(\mb{z}_i\).
Initialized to

\[\mb{z}_i = C_i \mb{x}^{(0)} \;.\]

z_list and z_list_old are initialized to the same value.

- Parameters:
x0 (Union[Array, BlockArray]) – Initial value of \(\mb{x}\).
- Return type:
Tuple[List[Union[Array, BlockArray]], List[Union[Array, BlockArray]]]
- class scico.optimize.Optimizer(**kwargs)#
Bases:
object
Base class for optimizer classes.
- Attributes:
itnum (int) – Optimizer iteration counter.
maxiter (int) – Maximum number of optimizer outer-loop iterations.
timer (Timer) – Iteration timer.
Initialize common attributes of Optimizer objects.

- Parameters:
**kwargs (Any) – Optional parameter dict. Valid keys are:
- iter0:
Initial value of iteration counter. Default value is 0.
- maxiter:
Maximum iterations on call to solve. Default value is 100.
- nanstop:
If True, stop iterations if a NaN or Inf value is encountered in a solver working variable. Default value is False.
- itstat_options:
A dict of named parameters to be passed to the diagnostics.IterationStats initializer. The dict may also include an additional key “itstat_func” with the corresponding value being a function with two parameters, an integer and an Optimizer object, responsible for constructing a tuple ready for insertion into the diagnostics.IterationStats object. If None, default values are used for the dict entries, otherwise the default dict is updated with the dict specified by this parameter.
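A hedged illustration of how these keyword arguments reach a concrete solver follows; the solver class and option values are arbitrary, and f, g, C, and y are assumed to be defined as in the ADMM sketch earlier on this page.

```python
# Options accepted via **kwargs of any Optimizer subclass initializer (here ADMM).
solver = ADMM(
    f=f, g_list=[g], C_list=[C], rho_list=[1.0], x0=y,
    maxiter=200,                                     # outer-loop iteration limit for solve()
    nanstop=True,                                    # stop if NaN/Inf appears in a working variable
    itstat_options={"display": True, "period": 10},  # print iteration diagnostics every 10 iterations
)
```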
- class scico.optimize.LinearizedADMM(f, g, C, mu, nu, x0=None, **kwargs)#
Bases:
Optimizer
Linearized alternating direction method of multipliers algorithm.
Solve an optimization problem of the form
\[\argmin_{\mb{x}} \; f(\mb{x}) + g(C \mb{x}) \;,\]

where \(f\) and \(g\) are instances of Functional (in most cases \(f\) will, more specifically, be an instance of Loss), and \(C\) is an instance of LinearOperator.

The optimization problem is solved by introducing the splitting \(\mb{z} = C \mb{x}\) and solving

\[\argmin_{\mb{x}, \mb{z}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}\; C \mb{x} = \mb{z} \;,\]

via a linearized ADMM algorithm [55] [41] (Sec. 4.4.2) consisting of the iterations (see step)

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \mathrm{prox}_{\mu f} \left( \mb{x}^{(k)} - (\mu / \nu) C^T \left(C \mb{x}^{(k)} - \mb{z}^{(k)} + \mb{u}^{(k)} \right) \right) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\nu g} \left(C \mb{x}^{(k+1)} + \mb{u}^{(k)} \right) \\ \mb{u}^{(k+1)} &= \mb{u}^{(k)} + C \mb{x}^{(k+1)} - \mb{z}^{(k+1)} \;. \end{aligned}\end{split}\]

Parameters \(\mu\) and \(\nu\) are required to satisfy

\[0 < \mu < \nu \| C \|_2^{-2} \;.\]

- Attributes:
f (Functional) – Functional \(f\) (usually a Loss).
g (Functional) – Functional \(g\).
C (LinearOperator) – \(C\) operator.
mu (scalar) – First algorithm parameter.
nu (scalar) – Second algorithm parameter.
u (array-like) – Scaled Lagrange multipliers \(\mb{u}\) at current iteration.
x (array-like) – Solution variable.
z (array-like) – Auxiliary variables \(\mb{z}\) at current iteration.
z_old (array-like) – Auxiliary variables \(\mb{z}\) at previous iteration.
Initialize a LinearizedADMM object.

- Parameters:
f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
C (LinearOperator) – Operator \(C\).
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
x0 (Union[Array, BlockArray, None]) – Starting point for \(\mb{x}\). If None, defaults to an array of zeros.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
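A minimal construction sketch follows (assuming f, g, C, and y defined as in the ADMM example earlier on this page); the main point is choosing mu and nu to satisfy \(0 < \mu < \nu \| C \|_2^{-2}\), here via a power-iteration estimate of the operator norm.

```python
from scico.linop import operator_norm
from scico.optimize import LinearizedADMM

# Illustrative parameter choice satisfying 0 < mu < nu ||C||_2^{-2}.
nu = 1e-1
mu = 0.9 * nu / operator_norm(C) ** 2
solver = LinearizedADMM(f=f, g=g, C=C, mu=mu, nu=nu, x0=y, maxiter=100)
x_hat = solver.solve()
```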
- norm_dual_residual()[source]#
Compute the \(\ell_2\) norm of the dual residual.
Compute the \(\ell_2\) norm of the dual residual
\[\norm{\mb{z}^{(k)} - \mb{z}^{(k-1)}}_2 \;.\]

- Return type:
- Returns:
Current norm of dual residual.
- norm_primal_residual(x=None)[source]#
Compute the \(\ell_2\) norm of the primal residual.
Compute the \(\ell_2\) norm of the primal residual
\[\norm{C \mb{x} - \mb{z}}_2 \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
- Return type:
- Returns:
Norm of primal residual.
- objective(x=None, z=None)[source]#
Evaluate the objective function.
Evaluate the objective function
\[f(\mb{x}) + g(\mb{z}) \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
z (Optional[List[Union[Array, BlockArray]]]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.z.
- Return type:
- Returns:
scalar – Value of the objective function.
- step()[source]#
Perform a single linearized ADMM iteration.
The primary variable \(\mb{x}\) is updated by computing

\[\mb{x}^{(k+1)} = \mathrm{prox}_{\mu f} \left( \mb{x}^{(k)} - (\mu / \nu) C^T \left(C \mb{x}^{(k)} - \mb{z}^{(k)} + \mb{u}^{(k)} \right) \right) \;.\]

The auxiliary variable is updated according to

\[\mb{z}^{(k+1)} = \mathrm{prox}_{\nu g} \left(C \mb{x}^{(k+1)} + \mb{u}^{(k)} \right) \;,\]

and the scaled Lagrange multiplier is updated according to

\[\mb{u}^{(k+1)} = \mb{u}^{(k)} + C \mb{x}^{(k+1)} - \mb{z}^{(k+1)} \;.\]
- u_init(x0)[source]#
Initialize scaled Lagrange multiplier \(\mb{u}\).
Initialized to

\[\mb{u} = \mb{0} \;.\]

Note that the parameter x0 is unused, but is provided for potential use in an overridden method.

- Parameters:
x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
- Return type:
- z_init(x0)[source]#
Initialize auxiliary variable \(\mb{z}\).
Initialized to

\[\mb{z} = C \mb{x}^{(0)} \;.\]

z and z_old are initialized to the same value.

- Parameters:
x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
- Return type:
Tuple[Union[Array, BlockArray], Union[Array, BlockArray]]
- class scico.optimize.PGM(f, g, L0, x0, step_size=None, **kwargs)#
Bases:
Optimizer
Proximal Gradient Method (PGM) base class.
Minimize a function of the form \(f(\mb{x}) + g(\mb{x})\), where \(f\) and \(g\) are instances of Functional.

Uses helper StepSize to provide an estimate of the Lipschitz constant \(L\) of \(f\). The step size \(\alpha\) is the reciprocal of \(L\), i.e. \(\alpha = 1 / L\).

- Parameters:
f (Union[Loss, Functional]) – Loss or Functional object with grad defined.
g (Functional) – Instance of Functional with defined prox method.
L0 (float) – Initial estimate of Lipschitz constant of f.
x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
step_size (Optional[PGMStepSize]) – Helper StepSize to estimate the Lipschitz constant of f.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
- f_quad_approx(x, y, L)[source]#
Evaluate the quadratic approximation to function \(f\).
Evaluate the quadratic approximation to function \(f\), corresponding to \(\hat{f}_{L}(\mb{x}, \mb{y}) = f(\mb{y}) + \nabla f(\mb{y})^H (\mb{x} - \mb{y}) + \frac{L}{2} \left\|\mb{x} - \mb{y}\right\|_2^2\).
- Return type:
- norm_residual()[source]#
Return the fixed point residual.
Return the fixed point residual (see Sec. 4.3 of [33]).
- Return type:
- class scico.optimize.AcceleratedPGM(f, g, L0, x0, step_size=None, **kwargs)#
Bases:
PGM
Accelerated Proximal Gradient Method (AcceleratedPGM) base class.
Minimize a function of the form \(f(\mb{x}) + g(\mb{x})\).
Minimize a function of the form \(f(\mb{x}) + g(\mb{x})\), where \(f\) and \(g\) are instances of Functional. The accelerated form of PGM is also known as FISTA [8].

For documentation on inherited attributes, see PGM.

- Parameters:
f (Union[Loss, Functional]) – Loss or Functional object with grad defined.
g (Functional) – Instance of Functional with defined prox method.
L0 (float) – Initial estimate of Lipschitz constant of f.
x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
step_size (Optional[PGMStepSize]) – Helper StepSize to estimate the Lipschitz constant of f.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
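A hedged sketch of the accelerated solver for a problem in which g applies directly to x (no linear operator splitting); the loss, regularizer, and initial Lipschitz estimate are illustrative assumptions, and BBStepSize is one of the step-size helpers provided in scico.optimize.pgm.

```python
import scico.numpy as snp
from scico import functional, linop, loss
from scico.optimize import AcceleratedPGM
from scico.optimize.pgm import BBStepSize

y = snp.zeros((128,))                  # placeholder measurements
A = linop.Identity(y.shape)
f = loss.SquaredL2Loss(y=y, A=A)       # smooth term with grad defined
g = 1e-2 * functional.L1Norm()         # non-smooth term with prox defined

solver = AcceleratedPGM(
    f=f,
    g=g,
    L0=1.0,                            # initial Lipschitz constant estimate
    x0=snp.zeros(y.shape),
    step_size=BBStepSize(),            # Barzilai-Borwein step-size adaptation
    maxiter=100,
)
x_hat = solver.solve()
```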
- class scico.optimize.PDHG(f, g, C, tau, sigma, alpha=1.0, x0=None, z0=None, **kwargs)#
Bases:
Optimizer
Primal–dual hybrid gradient (PDHG) algorithm.
Primal–dual hybrid gradient (PDHG) is a family of algorithms [21] that includes the Chambolle-Pock primal-dual algorithm [14]. The form implemented here is a minor variant [42] of the original Chambolle-Pock algorithm.
Solve an optimization problem of the form
\[\argmin_{\mb{x}} \; f(\mb{x}) + g(C \mb{x}) \;,\]

where \(f\) and \(g\) are instances of Functional (in most cases \(f\) will, more specifically, be an instance of Loss), and \(C\) is an instance of Operator or LinearOperator.

When C is a LinearOperator, the algorithm iterations are

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \mathrm{prox}_{\tau f} \left( \mb{x}^{(k)} - \tau C^T \mb{z}^{(k)} \right) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\sigma g^*} \left( \mb{z}^{(k)} + \sigma C \left( (1 + \alpha) \mb{x}^{(k+1)} - \alpha \mb{x}^{(k)} \right) \right) \;, \end{aligned}\end{split}\]

where \(g^*\) denotes the convex conjugate of \(g\). Parameters \(\tau > 0\) and \(\sigma > 0\) are also required to satisfy

\[\tau \sigma < \| C \|_2^{-2} \;,\]

and it is required that \(\alpha \in [0, 1]\).

When C is a non-linear Operator, a non-linear PDHG variant [49] is used, with the same iterations except for the \(\mb{x}\) update

\[\mb{x}^{(k+1)} = \mathrm{prox}_{\tau f} \left( \mb{x}^{(k)} - \tau [J_x C(\mb{x}^{(k)})]^T \mb{z}^{(k)} \right) \;.\]

- Attributes:
f (Functional) – Functional \(f\) (usually a Loss).
g (Functional) – Functional \(g\).
C (Operator) – \(C\) operator.
tau (scalar) – First algorithm parameter.
sigma (scalar) – Second algorithm parameter.
alpha (scalar) – Relaxation parameter.
x (array-like) – Primal variable \(\mb{x}\) at current iteration.
x_old (array-like) – Primal variable \(\mb{x}\) at previous iteration.
z (array-like) – Dual variable \(\mb{z}\) at current iteration.
z_old (array-like) – Dual variable \(\mb{z}\) at previous iteration.
Initialize a PDHG object.

- Parameters:
f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
C (Operator) – Operator \(C\).
tau (float) – First algorithm parameter.
sigma (float) – Second algorithm parameter.
alpha (float) – Relaxation parameter.
x0 (Union[Array, BlockArray, None]) – Starting point for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Starting point for \(\mb{z}\). If None, defaults to an array of zeros.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
- static estimate_parameters(C, x=None, ratio=1.0, factor=1.01, maxiter=100, key=None)[source]#
Estimate tau and sigma parameters of PDHG.

Find values of the tau and sigma parameters of PDHG that respect the constraint

\[\tau \sigma < \| C \|_2^{-2} \quad \text{or} \quad \tau \sigma < \| J_x C(\mb{x}) \|_2^{-2} \;,\]

depending on whether \(C\) is a LinearOperator or not.

- Parameters:
C (Operator) – Operator \(C\).
x (Union[Array, BlockArray, None]) – Value of \(\mb{x}\) at which to evaluate the Jacobian of \(C\) (when it is not a LinearOperator). If None, defaults to an array of zeros.
ratio (float) – Desired ratio between returned \(\tau\) and \(\sigma\) values (\(\sigma = \mathrm{ratio} \tau\)).
factor (Optional[float]) – Safety factor with which to multiply \(\| C \|_2^{-2}\) to ensure strict inequality compliance. If None, the value is set to 1.0.
maxiter (int) – Maximum number of power iterations to use in operator norm estimation (see operator_norm). Default: 100.
key (Optional[Array]) – Jax PRNG key to use in operator norm estimation (see operator_norm). Defaults to None, in which case a new key is created.
- Returns:
A tuple (tau, sigma) representing the estimated parameter values.
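A small usage sketch follows (the operator and functionals are illustrative stand-ins, not taken from the reference text): the returned pair can be passed directly to the PDHG initializer.

```python
import scico.numpy as snp
from scico import functional, linop, loss
from scico.optimize import PDHG

y = snp.zeros((256, 256))
f = loss.SquaredL2Loss(y=y)            # f(x) = (1/2) ||x - y||_2^2
C = linop.FiniteDifference(input_shape=y.shape, append=0)
g = 1e-1 * functional.L21Norm()

# Estimate tau and sigma such that tau * sigma < ||C||_2^{-2}.
tau, sigma = PDHG.estimate_parameters(C, ratio=1.0)
solver = PDHG(f=f, g=g, C=C, tau=tau, sigma=sigma, x0=y, maxiter=100)
x_hat = solver.solve()
```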
- norm_dual_residual()[source]#
Compute the \(\ell_2\) norm of the dual residual.
Compute the \(\ell_2\) norm of the dual residual
\[\sigma^{-1} \norm{\mb{z}^{(k)} - \mb{z}^{(k-1)}}_2 \;.\]

- Return type:
- Returns:
Current norm of dual residual.
- norm_primal_residual()[source]#
Compute the \(\ell_2\) norm of the primal residual.
Compute the \(\ell_2\) norm of the primal residual
\[\tau^{-1} \norm{\mb{x}^{(k)} - \mb{x}^{(k-1)}}_2 \;.\]

- Return type:
- Returns:
Current norm of primal residual.
- objective(x=None)[source]#
Evaluate the objective function.
Evaluate the objective function
\[f(\mb{x}) + g(C \mb{x}) \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
- Return type:
- Returns:
scalar – Value of the objective function.
- class scico.optimize.ProximalADMM(f, g, A, rho, mu, nu, B=None, c=None, x0=None, z0=None, u0=None, fast_dual_residual=True, **kwargs)#
Bases:
ProximalADMMBase
Proximal alternating direction method of multipliers.
Solve an optimization problem of the form
\[\argmin_{\mb{x}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}\; A \mb{x} + B \mb{z} = \mb{c} \;,\]

where \(f\) and \(g\) are instances of Functional (in most cases \(f\) will, more specifically, be an instance of Loss), and \(A\) and \(B\) are instances of LinearOperator.

The optimization problem is solved via a variant of the proximal ADMM algorithm [18], consisting of the iterations (see step)

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \mu^{-1} f} \left( \mb{x}^{(k)} - \mu^{-1} A^T \left(2 \mb{u}^{(k)} - \mb{u}^{(k-1)} \right) \right) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \nu^{-1} g} \left( \mb{z}^{(k)} - \nu^{-1} B^T \left( A \mb{x}^{(k+1)} + B \mb{z}^{(k)} - \mb{c} + \mb{u}^{(k)} \right) \right) \\ \mb{u}^{(k+1)} &= \mb{u}^{(k)} + A \mb{x}^{(k+1)} + B \mb{z}^{(k+1)} - \mb{c} \;. \end{aligned}\end{split}\]

Parameters \(\mu\) and \(\nu\) are required to satisfy

\[\mu > \norm{ A }_2^2 \quad \text{and} \quad \nu > \norm{ B }_2^2 \;.\]

- Attributes:
A (LinearOperator) – \(A\) linear operator.
B (LinearOperator) – \(B\) linear operator.
c (array-like) – Constant \(\mb{c}\).
Initialize a ProximalADMM object.

- Parameters:
f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
A (LinearOperator) – Linear operator \(A\).
rho (float) – Penalty parameter.
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
B (Optional[LinearOperator]) – Linear operator \(B\) (if None, \(B = -I\) where \(I\) is the identity operator).
c (Union[float, Array, BlockArray, None]) – Constant \(\mb{c}\). If None, defaults to zero.
x0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{z}\). If None, defaults to an array of zeros.
u0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{u}\). If None, defaults to an array of zeros.
fast_dual_residual (bool) – Flag indicating whether to use fast approximation to the dual residual, or a slower but more accurate calculation.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
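An illustrative sketch follows (the problem setup is an assumption, not taken from the reference text): with B left at its default of \(-I\) and c at zero, the constraint reduces to \(\mb{z} = A \mb{x}\), and mu and nu can be obtained from estimate_parameters, documented next.

```python
import scico.numpy as snp
from scico import functional, linop, loss
from scico.optimize import ProximalADMM

y = snp.zeros((256, 256))
f = loss.SquaredL2Loss(y=y)                    # f(x) = (1/2) ||x - y||_2^2, prox available
A = linop.FiniteDifference(input_shape=y.shape, append=0)
g = 1e-1 * functional.L21Norm()

# With B = -I (default) and c = 0, the constraint A x + B z = c becomes z = A x.
mu, nu = ProximalADMM.estimate_parameters(A)
solver = ProximalADMM(f=f, g=g, A=A, rho=1.0, mu=mu, nu=nu, x0=y, maxiter=100)
x_hat = solver.solve()
```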
- static estimate_parameters(A, B=None, factor=1.01, maxiter=100, key=None)[source]#
Estimate mu and nu parameters of ProximalADMM.

Find values of the mu and nu parameters of ProximalADMM that respect the constraints

\[\mu > \norm{ A }_2^2 \quad \text{and} \quad \nu > \norm{ B }_2^2 \;.\]

- Parameters:
A (LinearOperator) – Linear operator \(A\).
B (Optional[LinearOperator]) – Linear operator \(B\) (if None, \(B = -I\) where \(I\) is the identity operator).
factor (Optional[float]) – Safety factor with which to multiply estimated operator norms to ensure strict inequality compliance. If None, return the estimated squared operator norms.
maxiter (int) – Maximum number of power iterations to use in operator norm estimation (see operator_norm). Default: 100.
key (Optional[Array]) – Jax PRNG key to use in operator norm estimation (see operator_norm). Defaults to None, in which case a new key is created.
- Return type:
- Returns:
A tuple (mu, nu) representing the estimated parameter values or corresponding squared operator norm values, depending on the value of the factor parameter.
- norm_dual_residual()[source]#
Compute the \(\ell_2\) norm of the dual residual.
Compute the \(\ell_2\) norm of the dual residual. If the flag requesting a fast approximate calculation is set, it is computed as
\[\norm{\mb{z}^{(k+1)} - \mb{z}^{(k)}}_2 \;,\]

otherwise it is computed as

\[\norm{A^T B ( \mb{z}^{(k+1)} - \mb{z}^{(k)} ) }_2 \;.\]

- Return type:
- Returns:
Current norm of dual residual.
- norm_primal_residual(x=None, z=None)[source]#
Compute the \(\ell_2\) norm of the primal residual.
Compute the \(\ell_2\) norm of the primal residual
\[\norm{A \mb{x} + B \mb{z} - \mb{c}}_2 \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
z (Optional[List[Union[Array, BlockArray]]]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.z.
- Return type:
- Returns:
Norm of primal residual.
- class scico.optimize.NonLinearPADMM(f, g, H, rho, mu, nu, x0=None, z0=None, u0=None, fast_dual_residual=True, **kwargs)#
Bases:
ProximalADMMBase
Non-linear proximal alternating direction method of multipliers.
Solve an optimization problem of the form
\[\argmin_{\mb{x}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}\; H(\mb{x}, \mb{z}) = 0 \;,\]

where \(f\) and \(g\) are instances of Functional (in most cases \(f\) will, more specifically, be an instance of Loss), and \(H\) is a function.

The optimization problem is solved via a variant of the proximal ADMM algorithm for problems with a non-linear operator constraint [11], consisting of the iterations (see step)

\[\begin{split}\begin{aligned} A^{(k)} &= J_{\mb{x}} H(\mb{x}^{(k)}, \mb{z}^{(k)}) \\ \mb{x}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \mu^{-1} f} \left( \mb{x}^{(k)} - \mu^{-1} (A^{(k)})^T \left(2 \mb{u}^{(k)} - \mb{u}^{(k-1)} \right) \right) \\ B^{(k)} &= J_{\mb{z}} H(\mb{x}^{(k+1)}, \mb{z}^{(k)}) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \nu^{-1} g} \left( \mb{z}^{(k)} - \nu^{-1} (B^{(k)})^T \left( H(\mb{x}^{(k+1)}, \mb{z}^{(k)}) + \mb{u}^{(k)} \right) \right) \\ \mb{u}^{(k+1)} &= \mb{u}^{(k)} + H(\mb{x}^{(k+1)}, \mb{z}^{(k+1)}) \;. \end{aligned}\end{split}\]

Parameters \(\mu\) and \(\nu\) are required to satisfy

\[\mu > \norm{ A^{(k)} }_2^2 \quad \text{and} \quad \nu > \norm{ B^{(k)} }_2^2\]

for all \(A^{(k)}\) and \(B^{(k)}\).
- Attributes:
H (Function) – \(H\) function.
Initialize a NonLinearPADMM object.

- Parameters:
f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
H (Function) – Function \(H\).
rho (float) – Penalty parameter.
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
x0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{z}\). If None, defaults to an array of zeros.
u0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{u}\). If None, defaults to an array of zeros.
fast_dual_residual (bool) – Flag indicating whether to use fast approximation to the dual residual, or a slower but more accurate calculation.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
- static estimate_parameters(H, x=None, z=None, factor=1.01, maxiter=100, key=None)[source]#
Estimate mu and nu parameters of NonLinearPADMM.

Find values of the mu and nu parameters of NonLinearPADMM that respect the constraints

\[\mu > \norm{ J_x H(\mb{x}, \mb{z}) }_2^2 \quad \text{and} \quad \nu > \norm{ J_z H(\mb{x}, \mb{z}) }_2^2 \;.\]

- Parameters:
H (Function) – Constraint function \(H\).
x (Union[Array, BlockArray, None]) – Value of \(\mb{x}\) at which to evaluate the Jacobian. If None, defaults to an array of zeros.
z (Union[Array, BlockArray, None]) – Value of \(\mb{z}\) at which to evaluate the Jacobian. If None, defaults to an array of zeros.
factor (Optional[float]) – Safety factor with which to multiply estimated operator norms to ensure strict inequality compliance. If None, return the estimated squared operator norms.
maxiter (int) – Maximum number of power iterations to use in operator norm estimation (see operator_norm). Default: 100.
key (Optional[Array]) – Jax PRNG key to use in operator norm estimation (see operator_norm). Defaults to None, in which case a new key is created.
- Return type:
- Returns:
A tuple (mu, nu) representing the estimated parameter values or corresponding squared operator norm values, depending on the value of the factor parameter.
- norm_dual_residual()[source]#
Compute the \(\ell_2\) norm of the dual residual.
Compute the \(\ell_2\) norm of the dual residual. If the flag requesting a fast approximate calculation is set, it is computed as
\[\norm{\mb{z}^{(k+1)} - \mb{z}^{(k)}}_2 \;,\]

otherwise it is computed as

\[\norm{A^T B ( \mb{z}^{(k+1)} - \mb{z}^{(k)} ) }_2 \;,\]

where

\[\begin{split}A &= J_{\mb{x}} H(\mb{x}^{(k+1)}, \mb{z}^{(k+1)}) \\ B &= J_{\mb{z}} H(\mb{x}^{(k+1)}, \mb{z}^{(k+1)}) \;.\end{split}\]

- Return type:
- Returns:
Current norm of dual residual.
- norm_primal_residual(x=None, z=None)[source]#
Compute the \(\ell_2\) norm of the primal residual.
Compute the \(\ell_2\) norm of the primal residual
\[\norm{H(\mb{x}, \mb{z})}_2 \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
z (Optional[List[Union[Array, BlockArray]]]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.z.
- Return type:
- Returns:
Norm of primal residual.
- class scico.optimize.ProximalADMMBase(f, g, rho, mu, nu, xshape, zshape, ushape, xdtype, zdtype, udtype, x0=None, z0=None, u0=None, fast_dual_residual=True, **kwargs)#
Bases:
Optimizer
Base class for proximal ADMM solvers.
- Attributes:
f (Functional) – Functional \(f\) (usually a Loss).
g (Functional) – Functional \(g\).
rho (scalar) – Penalty parameter.
mu (scalar) – First algorithm parameter.
nu (scalar) – Second algorithm parameter.
x (array-like) – Solution variable.
z (array-like) – Auxiliary variables \(\mb{z}\) at current iteration.
z_old (array-like) – Auxiliary variables \(\mb{z}\) at previous iteration.
u (array-like) – Scaled Lagrange multipliers \(\mb{u}\) at current iteration.
u_old (array-like) – Scaled Lagrange multipliers \(\mb{u}\) at previous iteration.
Initialize a ProximalADMMBase object.

- Parameters:
f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
rho (float) – Penalty parameter.
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
xshape (Union[Tuple[int, ...], Tuple[Tuple[int, ...], ...]]) – Shape of variable \(\mb{x}\).
zshape (Union[Tuple[int, ...], Tuple[Tuple[int, ...], ...]]) – Shape of variable \(\mb{z}\).
ushape (Union[Tuple[int, ...], Tuple[Tuple[int, ...], ...]]) – Shape of variable \(\mb{u}\).
xdtype (DType) – Dtype of variable \(\mb{x}\).
zdtype (DType) – Dtype of variable \(\mb{z}\).
udtype (DType) – Dtype of variable \(\mb{u}\).
x0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{z}\). If None, defaults to an array of zeros.
u0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{u}\). If None, defaults to an array of zeros.
fast_dual_residual (bool) – Flag indicating whether to use fast approximation to the dual residual, or a slower but more accurate calculation.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.
- objective(x=None, z=None)[source]#
Evaluate the objective function.
Evaluate the objective function
\[f(\mb{x}) + g(\mb{z}) \;.\]

- Parameters:
x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
z (Optional[List[Union[Array, BlockArray]]]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.z.
- Return type:
- Returns:
scalar – Current value of the objective function.
Modules

scico.optimize.admm | ADMM solver and auxiliary classes.
scico.optimize.pgm | PGM solvers and auxiliary classes.