scico.optimize¶

Optimization algorithms.

Modules

`scico.optimize.admm`	ADMM solver and auxiliary classes.
`scico.optimize.pgm`	PGM solvers and auxiliary classes.

Classes

`ADMM`(f, g_list, C_list, rho_list[, alpha, ...])	Basic Alternating Direction Method of Multipliers (ADMM) algorithm.
`AcceleratedPGM`(f, g, L0, x0[, step_size])	Accelerated proximal gradient method (APGM) algorithm.
`LinearizedADMM`(f, g, C, mu, nu[, x0])	Linearized alternating direction method of multipliers algorithm.
`NonLinearPADMM`(f, g, H, rho, mu, nu[, x0, ...])	Non-linear proximal alternating direction method of multipliers.
`Optimizer`(**kwargs)	Base class for optimizer classes.
`PDHG`(f, g, C, tau, sigma[, alpha, x0, z0])	Primal–dual hybrid gradient (PDHG) algorithm.
`PGM`(f, g, L0, x0[, step_size])	Proximal gradient method (PGM) algorithm.
`ProximalADMM`(f, g, A, rho, mu, nu[, B, c, ...])	Proximal alternating direction method of multipliers.
`ProximalADMMBase`(f, g, rho, mu, nu, xshape, ...)	Base class for proximal ADMM solvers.

class scico.optimize.ADMM(f, g_list, C_list, rho_list, alpha=1.0, x0=None, subproblem_solver=None, **kwargs)¶

Bases: Optimizer

Basic Alternating Direction Method of Multipliers (ADMM) algorithm.

Solve an optimization problem of the form

\[\argmin_{\mb{x}} \; f(\mb{x}) + \sum_{i=1}^N g_i(C_i \mb{x}) \;,\]

where \(f\) and the \(g_i\) are instances of Functional, and the \(C_i\) are LinearOperator.

The optimization problem is solved by introducing the splitting \(\mb{z}_i = C_i \mb{x}\) and solving

\[\argmin_{\mb{x}, \mb{z}_i} \; f(\mb{x}) + \sum_{i=1}^N g_i(\mb{z}_i) \; \text{such that}\; C_i \mb{x} = \mb{z}_i \;,\]

via an ADMM algorithm [26] [24] [12] consisting of the iterations (see step)

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \\ \mb{z}_i^{(k+1)} &= \argmin_{\mb{z}_i} \; g_i(\mb{z}_i) + \frac{\rho_i}{2} \norm{\mb{z}_i - \mb{u}^{(k)}_i - C_i \mb{x}^{(k+1)}}_2^2 \\ \mb{u}_i^{(k+1)} &= \mb{u}_i^{(k)} + C_i \mb{x}^{(k+1)} - \mb{z}^{(k+1)}_i \; . \end{aligned}\end{split}\]

Attributes:

f (Functional) – Functional \(f\) (usually a Loss)
g_list (list of Functional) – List of \(g_i\) functionals. Must be same length as C_list and rho_list.
C_list (list of LinearOperator) – List of \(C_i\) operators.
rho_list (list of scalars) – List of \(\rho_i\) penalty parameters. Must be same length as C_list and g_list.
alpha (float) – Relaxation parameter.
u_list (list of array-like) – List of scaled Lagrange multipliers \(\mb{u}_i\) at current iteration.
x (array-like) – Solution.
subproblem_solver (SubproblemSolver) – Solver for \(\mb{x}\)-update step.
z_list (list of array-like) – List of auxiliary variables \(\mb{z}_i\) at current iteration.
z_list_old (list of array-like) – List of auxiliary variables \(\mb{z}_i\) at previous iteration.

Initialize an ADMM object.

Parameters:

f (Functional) – Functional \(f\) (usually a loss function).
g_list (List[Functional]) – List of \(g_i\) functionals. Must be same length as C_list and rho_list.
C_list (List[LinearOperator]) – List of \(C_i\) operators.
rho_list (List[float]) – List of \(\rho_i\) penalty parameters. Must be same length as C_list and g_list.
alpha (float) – Relaxation parameter. No relaxation for default 1.0.
x0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{x}\). If None, defaults to an array of zeros.
subproblem_solver (Optional[SubproblemSolver]) – Solver for \(\mb{x}\)-update step. Defaults to None, which implies use of an instance of GenericSubproblemSolver.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

minimizer()[source]¶

Return the current estimate of the functional mimimizer.

Return type:: Union[Array, BlockArray]
Returns:: Current estimate of the functional mimimizer.

norm_dual_residual()[source]¶

Compute the \(\ell_2\) norm of the dual residual.

Compute the \(\ell_2\) norm of the dual residual

\[\left\| \sum_{i=1}^N \rho_i C_i^T \left( \mb{z}^{(k)}_i - \mb{z}^{(k-1)}_i \right) \right\|_2 \;.\]

Return type:: float
Returns:: Norm of dual residual.

norm_primal_residual(x=None)[source]¶

Compute the \(\ell_2\) norm of the primal residual.

Compute the \(\ell_2\) norm of the primal residual

\[\left( \sum_{i=1}^N \rho_i \left\| C_i \mb{x} - \mb{z}_i^{(k)} \right\|_2^2\right)^{1/2} \;.\]

Parameters:: x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
Return type:: float
Returns:: Norm of primal residual.

objective(x=None, z_list=None)[source]¶

Evaluate the objective function.

Evaluate the objective function

\[f(\mb{x}) + \sum_{i=1}^N g_i(\mb{z}_i) \;.\]

Note that this form is cheaper to compute, but may have very poor accuracy compared with the “true” objective function

\[f(\mb{x}) + \sum_{i=1}^N g_i(C_i \mb{x}) \;.\]

when the primal residual is large.

Parameters:

x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
z_list (Optional[List[Union[Array, BlockArray]]]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.z_list.

Return type:

float

Returns:

Value of the objective function.

step()[source]¶

Perform a single ADMM iteration.

The primary variable \(\mb{x}\) is updated by solving the the optimization problem

\[\mb{x}^{(k+1)} = \argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2} \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;.\]

Update auxiliary variables \(\mb{z}_i\) and scaled Lagrange multipliers \(\mb{u}_i\). The auxiliary variables are updated according to

\[\begin{split}\begin{aligned} \mb{z}_i^{(k+1)} &= \argmin_{\mb{z}_i} \; g_i(\mb{z}_i) + \frac{\rho_i}{2} \norm{\mb{z}_i - \mb{u}^{(k)}_i - C_i \mb{x}^{(k+1)}}_2^2 \\ &= \mathrm{prox}_{g_i}(C_i \mb{x} + \mb{u}_i, 1 / \rho_i) \;, \end{aligned}\end{split}\]

and the scaled Lagrange multipliers are updated according to

\[\mb{u}_i^{(k+1)} = \mb{u}_i^{(k)} + C_i \mb{x}^{(k+1)} - \mb{z}^{(k+1)}_i \;.\]

u_init(x0)[source]¶

Initialize scaled Lagrange multipliers \(\mb{u}_i\).

Initialized to

\[\mb{u}_i = \mb{0} \;.\]

Note that the parameter x0 is unused, but is provided for potential use in an overridden method.

Parameters:: x0 (Union[Array, BlockArray]) – Initial value of \(\mb{x}\).
Return type:: List[Union[Array, BlockArray]]

z_init(x0)[source]¶

Initialize auxiliary variables \(\mb{z}_i\).

Initialized to

\[\mb{z}_i = C_i \mb{x}^{(0)} \;.\]

z_list and z_list_old are initialized to the same value.

Parameters:: x0 (Union[Array, BlockArray]) – Initial value of \(\mb{x}\).
Return type:: Tuple[List[Union[Array, BlockArray]], List[Union[Array, BlockArray]]]

class scico.optimize.Optimizer(**kwargs)¶

Bases: object

Base class for optimizer classes.

Attributes:

itnum (int) – Optimizer iteration counter.
maxiter (int) – Maximum number of optimizer outer-loop iterations.
timer (Timer) – Iteration timer.

Initialize common attributes of Optimizer objects.

Parameters:

**kwargs (Any) –

Optional parameter dict. Valid keys are:

iter0:: Initial value of iteration counter. Default value is 0.
maxiter:: Maximum iterations on call to solve. Default value is 100.
nanstop:: If True, stop iterations if a NaN or Inf value is encountered in a solver working variable. Default value is False.
itstat_options:: A dict of named parameters to be passed to the diagnostics.IterationStats initializer. The dict may also include an additional key “itstat_func” with the corresponding value being a function with two parameters, an integer and an Optimizer object, responsible for constructing a tuple ready for insertion into the diagnostics.IterationStats object. If None, default values are used for the dict entries, otherwise the default dict is updated with the dict specified by this parameter.

history(transpose=False)[source]¶

Retrieve record of algorithm iterations.

Parameters:: transpose (bool) – Flag indicating whether results should be returned in “transposed” form, i.e. as a namedtuple of lists rather than a list of namedtuples.
Return type:: Union[List[NamedTuple], Tuple[List]]
Returns:: Record of all iterations.

load_state(path)[source]¶

Load optimizer state from a file.

Restore optimizer state from a file.

Parameters:: path (str) – Filename of state file saved using save_state.

minimizer()[source]¶

Return the current estimate of the functional mimimizer.

Return type:: Union[Array, BlockArray]
Returns:: Current estimate of the functional mimimizer.

save_state(path)[source]¶

Save optimizer state to a file.

Parameters:: path (str) – Filename of file to which state should be saved.

solve(callback=None)[source]¶

Initialize and run the optimization algorithm.

Initialize and run the opimization algorithm for a total of self.maxiter iterations.

Parameters:: callback (Optional[Callable[[Optimizer], None]]) – An optional callback function, taking an a single argument of type Optimizer, that is called at the end of every iteration.
Return type:: Union[Array, BlockArray]
Returns:: Computed solution.

step()[source]¶: Perform a single optimizer step.

class scico.optimize.LinearizedADMM(f, g, C, mu, nu, x0=None, **kwargs)¶

Bases: Optimizer

Linearized alternating direction method of multipliers algorithm.

Solve an optimization problem of the form

\[\argmin_{\mb{x}} \; f(\mb{x}) + g(C \mb{x}) \;,\]

where \(f\) and \(g\) are instances of Functional, (in most cases \(f\) will, more specifically be an an instance of Loss), and \(C\) is an instance of LinearOperator.

The optimization problem is solved by introducing the splitting \(\mb{z} = C \mb{x}\) and solving

\[\argmin_{\mb{x}, \mb{z}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}\; C \mb{x} = \mb{z} \;,\]

via a linearized ADMM algorithm [61] [47] (Sec. 4.4.2) consisting of the iterations (see step)

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \mathrm{prox}_{\mu f} \left( \mb{x}^{(k)} - (\mu / \nu) C^T \left(C \mb{x}^{(k)} - \mb{z}^{(k)} + \mb{u}^{(k)} \right) \right) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\nu g} \left(C \mb{x}^{(k+1)} + \mb{u}^{(k)} \right) \\ \mb{u}^{(k+1)} &= \mb{u}^{(k)} + C \mb{x}^{(k+1)} - \mb{z}^{(k+1)} \;. \end{aligned}\end{split}\]

Parameters \(\mu\) and \(\nu\) are required to satisfy

\[0 < \mu < \nu \| C \|_2^{-2} \;.\]

Attributes:

f (Functional) – Functional \(f\) (usually a Loss).
g (Functional) – Functional \(g\).
C (LinearOperator) – \(C\) operator.
mu (scalar) – First algorithm parameter.
nu (scalar) – Second algorithm parameter.
u (array-like) – Scaled Lagrange multipliers \(\mb{u}\) at current iteration.
x (array-like) – Solution variable.
z (array-like) – Auxiliary variables \(\mb{z}\) at current iteration.
z_old (array-like) – Auxiliary variables \(\mb{z}\) at previous iteration.

Initialize a LinearizedADMM object.

Parameters:

f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
C (LinearOperator) – Operator \(C\).
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
x0 (Union[Array, BlockArray, None]) – Starting point for \(\mb{x}\). If None, defaults to an array of zeros.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

minimizer()[source]¶

Return the current estimate of the functional mimimizer.

Return type:: Union[Array, BlockArray]
Returns:: Current estimate of the functional mimimizer.

norm_dual_residual()[source]¶

Compute the \(\ell_2\) norm of the dual residual.

Compute the \(\ell_2\) norm of the dual residual

\[\norm{\mb{z}^{(k)} - \mb{z}^{(k-1)}}_2 \;.\]

Return type:: float
Returns:: Current norm of dual residual.

norm_primal_residual(x=None)[source]¶

Compute the \(\ell_2\) norm of the primal residual.

Compute the \(\ell_2\) norm of the primal residual

\[\norm{C \mb{x} - \mb{z}}_2 \;.\]

Parameters:: x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
Return type:: float
Returns:: Norm of primal residual.

objective(x=None, z=None)[source]¶

Evaluate the objective function.

Evaluate the objective function

\[f(\mb{x}) + g(\mb{z}) \;.\]

Parameters:

x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
z (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.z.

Return type:

float

Returns:

scalar – Value of the objective function.

step()[source]¶

Perform a single linearized ADMM iteration.

The primary variable \(\mb{x}\) is updated by computing

\[\mb{x}^{(k+1)} = \mathrm{prox}_{\mu f} \left( \mb{x}^{(k)} - (\mu / \nu) A^T \left(A \mb{x}^{(k)} - \mb{z}^{(k)} + \mb{u}^{(k)} \right) \right) \;.\]

The auxiliary variable is updated according to

\[\mb{z}^{(k+1)} = \mathrm{prox}_{\nu g} \left(A \mb{x}^{(k+1)} + \mb{u}^{(k)} \right) \;,\]

and the scaled Lagrange multiplier is updated according to

\[\mb{u}^{(k+1)} = \mb{u}^{(k)} + C \mb{x}^{(k+1)} - \mb{z}^{(k+1)} \;.\]

u_init(x0)[source]¶

Initialize scaled Lagrange multiplier \(\mb{u}\).

Initialized to

\[\mb{u} = \mb{0} \;.\]

Note that the parameter x0 is unused, but is provided for potential use in an overridden method.

Parameters:: x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
Return type:: Union[Array, BlockArray]

z_init(x0)[source]¶

Initialize auxiliary variable \(\mb{z}\).

Initialized to

\[\mb{z} = C \mb{x}^{(0)} \;.\]

z and z_old are initialized to the same value.

Parameters:: x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
Return type:: Tuple[Union[Array, BlockArray], Union[Array, BlockArray]]

class scico.optimize.PGM(f, g, L0, x0, step_size=None, **kwargs)¶

Bases: Optimizer

Proximal gradient method (PGM) algorithm.

Minimize a functional of the form \(f(\mb{x}) + g(\mb{x})\), where \(f\) and the \(g\) are instances of Functional. Functional \(f\) should be differentiable and have a Lipschitz continuous derivative, and functional \(g\) should have a proximal operator defined.

The step size \(\alpha\) of the algorithm is defined in terms of its reciprocal \(L\), i.e. \(\alpha = 1 / L\). The initial value for this parameter, L0, is required to satisfy

\[L_0 \geq K(\nabla f) \;,\]

where \(K(\nabla f)\) denotes the Lipschitz constant of the gradient of \(f\). When f is an instance of SquaredL2Loss with a LinearOperator A,

\[K(\nabla f) = \lambda_{ \mathrm{max} }( A^H A ) = \| A \|_2^2 \;,\]

where \(\lambda_{\mathrm{max}}(B)\) denotes the largest eigenvalue of \(B\).

The evolution of the step size is controlled by auxiliary class PGMStepSize and derived classes. The default PGMStepSize simply sets \(L = L_0\), while the derived classes implement a variety of adaptive strategies.

Parameters:

f (Union[Loss, Functional]) – Instance of Loss or Functional with defined grad method.
g (Functional) – Instance of Functional with defined prox method.
L0 (float) – Initial estimate of Lipschitz constant of gradient of f.
x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
step_size (Optional[PGMStepSize]) – Instance of an auxiliary class of type PGMStepSize determining the evolution of the algorithm step size.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

f_quad_approx(x, y, L)[source]¶

Evaluate the quadratic approximation to function \(f\).

Evaluate the quadratic approximation to function \(f\), corresponding to \(\hat{f}_{L}(\mb{x}, \mb{y}) = f(\mb{y}) + \nabla f(\mb{y})^H (\mb{x} - \mb{y}) + \frac{L}{2} \left\|\mb{x} - \mb{y}\right\|_2^2\).

Return type:: float

minimizer()[source]¶

Return the current estimate of the functional mimimizer.

Return type:: Union[Array, BlockArray]
Returns:: Current estimate of the functional mimimizer.

norm_residual()[source]¶

Return the fixed point residual.

Return the fixed point residual (see Sec. 4.3 of [38]).

Return type:: float

objective(x=None)[source]¶

Evaluate the objective function \(f(\mb{x}) + g(\mb{x})\).

Return type:: float

step()[source]¶: Take a single PGM step.

x_step(v, L)[source]¶

Compute update for variable x.

Return type:: Union[Array, BlockArray]

class scico.optimize.AcceleratedPGM(f, g, L0, x0, step_size=None, **kwargs)¶

Bases: PGM

Accelerated proximal gradient method (APGM) algorithm.

Minimize a function of the form \(f(\mb{x}) + g(\mb{x})\), where \(f\) and the \(g\) are instances of Functional. The accelerated form of PGM is also known as FISTA [8].

See PGM for more detailed documentation.

Parameters:

f (Union[Loss, Functional]) – Instance of Loss or Functional with defined grad method.
g (Functional) – Instance of Functional with defined prox method.
L0 (float) – Initial estimate of Lipschitz constant of gradient of f.
x0 (Union[Array, BlockArray]) – Starting point for \(\mb{x}\).
step_size (Optional[PGMStepSize]) – Instance of an auxiliary class of type PGMStepSize determining the evolution of the algorithm step size.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

step()[source]¶: Take a single AcceleratedPGM step.

class scico.optimize.PDHG(f, g, C, tau, sigma, alpha=1.0, x0=None, z0=None, **kwargs)¶

Bases: Optimizer

Primal–dual hybrid gradient (PDHG) algorithm.

Primal–dual hybrid gradient (PDHG) is a family of algorithms [22] that includes the Chambolle-Pock primal-dual algorithm [14]. The form implemented here is a minor variant [48] of the original Chambolle-Pock algorithm.

Solve an optimization problem of the form

\[\argmin_{\mb{x}} \; f(\mb{x}) + g(C \mb{x}) \;,\]

where \(f\) and \(g\) are instances of Functional, (in most cases \(f\) will, more specifically be an an instance of Loss), and \(C\) is an instance of Operator or LinearOperator.

When C is a LinearOperator, the algorithm iterations are

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \mathrm{prox}_{\tau f} \left( \mb{x}^{(k)} - \tau C^T \mb{z}^{(k)} \right) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\sigma g^*} \left( \mb{z}^{(k)} + \sigma C((1 + \alpha) \mb{x}^{(k+1)} - \alpha \mb{x}^{(k)} \right) \;, \end{aligned}\end{split}\]

where \(g^*\) denotes the convex conjugate of \(g\). Parameters \(\tau > 0\) and \(\sigma > 0\) are also required to satisfy

\[\tau \sigma < \| C \|_2^{-2} \;,\]

and it is required that \(\alpha \in [0, 1]\).

When C is a non-linear Operator, a non-linear PDHG variant [55] is used, with the same iterations except for \(\mb{x}\) update

\[\mb{x}^{(k+1)} = \mathrm{prox}_{\tau f} \left( \mb{x}^{(k)} - \tau [J_x C(\mb{x}^{(k)})]^T \mb{z}^{(k)} \right) \;.\]

Attributes:

f (Functional) – Functional \(f\) (usually a Loss).
g (Functional) – Functional \(g\).
C (Operator) – \(C\) operator.
tau (scalar) – First algorithm parameter.
sigma (scalar) – Second algorithm parameter.
alpha (scalar) – Relaxation parameter.
x (array-like) – Primal variable \(\mb{x}\) at current iteration.
x_old (array-like) – Primal variable \(\mb{x}\) at previous iteration.
z (array-like) – Dual variable \(\mb{z}\) at current iteration.
z_old (array-like) – Dual variable \(\mb{z}\) at previous iteration.

Initialize a PDHG object.

Parameters:

f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
C (Operator) – Operator \(C\).
tau (float) – First algorithm parameter.
sigma (float) – Second algorithm parameter.
alpha (float) – Relaxation parameter.
x0 (Union[Array, BlockArray, None]) – Starting point for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Starting point for \(\mb{z}\). If None, defaults to an array of zeros.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

static estimate_parameters(C, x=None, ratio=1.0, factor=1.01, maxiter=100, key=None)[source]¶

Estimate tau and sigma parameters of PDHG.

Find values of the tau and sigma parameters of PDHG that respect the constraint

\[\tau \sigma < \| C \|_2^{-2} \quad \text{or} \quad \tau \sigma < \| J_x C(\mb{x}) \|_2^{-2} \;,\]

depending on whether \(C\) is a LinearOperator or not.

Parameters:

C (Operator) – Operator \(C\).
x (Union[Array, BlockArray, None]) – Value of \(\mb{x}\) at which to evaluate the Jacobian of \(C\) (when it is not a LinearOperator). If None, defaults to an array of zeros.
ratio (float) – Desired ratio between return \(\tau\) and \(\sigma\) values (\(\sigma = \mathrm{ratio} \tau\)).
factor (Optional[float]) – Safety factor with which to multiply \(\| C \|_2^{-2}\) to ensure strict inequality compliance. If None, the value is set to 1.0.
maxiter (int) – Maximum number of power iterations to use in operator norm estimation (see operator_norm). Default: 100.
key (Optional[Array]) – Jax PRNG key to use in operator norm estimation (see operator_norm). Defaults to None, in which case a new key is created.

Returns:

A tuple (tau, sigma) representing the estimated parameter values.

minimizer()[source]¶

Return the current estimate of the functional mimimizer.

Return type:: Union[Array, BlockArray]
Returns:: Current estimate of the functional mimimizer.

norm_dual_residual()[source]¶

Compute the \(\ell_2\) norm of the dual residual.

Compute the \(\ell_2\) norm of the dual residual

\[\sigma^{-1} \norm{\mb{z}^{(k)} - \mb{z}^{(k-1)}}_2 \;.\]

Return type:: float
Returns:: Current norm of dual residual.

norm_primal_residual()[source]¶

Compute the \(\ell_2\) norm of the primal residual.

Compute the \(\ell_2\) norm of the primal residual

\[\tau^{-1} \norm{\mb{x}^{(k)} - \mb{x}^{(k-1)}}_2 \;.\]

Return type:: float
Returns:: Current norm of primal residual.

objective(x=None)[source]¶

Evaluate the objective function.

Evaluate the objective function

\[f(\mb{x}) + g(C \mb{x}) \;.\]

Parameters:: x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x
Return type:: float
Returns:: scalar – Value of the objective function.

step()[source]¶: Perform a single iteration.

class scico.optimize.ProximalADMM(f, g, A, rho, mu, nu, B=None, c=None, x0=None, z0=None, u0=None, fast_dual_residual=True, **kwargs)¶

Bases: ProximalADMMBase

Proximal alternating direction method of multipliers.

Solve an optimization problem of the form

\[\argmin_{\mb{x}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}\; A \mb{x} + B \mb{z} = \mb{c} \;,\]

where \(f\) and \(g\) are instances of Functional, (in most cases \(f\) will, more specifically be an instance of Loss), and \(A\) and \(B\) are instances of LinearOperator.

The optimization problem is solved via a variant of the proximal ADMM algorithm [19], consisting of the iterations (see step)

\[\begin{split}\begin{aligned} \mb{x}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \mu^{-1} f} \left( \mb{x}^{(k)} - \mu^{-1} A^T \left(2 \mb{u}^{(k)} - \mb{u}^{(k-1)} \right) \right) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \nu^{-1} g} \left( \mb{z}^{(k)} - \nu^{-1} B^T \left( A \mb{x}^{(k+1)} + B \mb{z}^{(k)} - \mb{c} + \mb{u}^{(k)} \right) \right) \\ \mb{u}^{(k+1)} &= \mb{u}^{(k)} + A \mb{x}^{(k+1)} + B \mb{z}^{(k+1)} - \mb{c} \;. \end{aligned}\end{split}\]

Parameters \(\mu\) and \(\nu\) are required to satisfy

\[\mu > \norm{ A }_2^2 \quad \text{and} \quad \nu > \norm{ B }_2^2 \;.\]

Attributes:

A (LinearOperator) – \(A\) linear operator.
B (LinearOperator) – \(B\) linear operator.
c (array-like) – constant \(\mb{c}\).

Initialize a ProximalADMM object.

Parameters:

f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
A (LinearOperator) – Linear operator \(A\).
rho (float) – Penalty parameter.
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
B (Optional[LinearOperator]) – Linear operator \(B\) (if None, \(B = -I\) where \(I\) is the identity operator).
c (Union[float, Array, BlockArray, None]) – Constant \(\mb{c}\). If None, defaults to zero.
x0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{z}\). If None, defaults to an array of zeros.
u0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{u}\). If None, defaults to an array of zeros.
fast_dual_residual (bool) – Flag indicating whether to use fast approximation to the dual residual, or a slower but more accurate calculation.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

static estimate_parameters(A, B=None, factor=1.01, maxiter=100, key=None)[source]¶

Estimate mu and nu parameters of ProximalADMM.

Find values of the mu and nu parameters of ProximalADMM that respect the constraints

\[\mu > \norm{ A }_2^2 \quad \text{and} \quad \nu > \norm{ B }_2^2 \;.\]

Parameters:

A (LinearOperator) – Linear operator \(A\).
B (Optional[LinearOperator]) – Linear operator \(B\) (if None, \(B = -I\) where \(I\) is the identity operator).
factor (Optional[float]) – Safety factor with which to multiply estimated operator norms to ensure strict inequality compliance. If None, return the estimated squared operator norms.
maxiter (int) – Maximum number of power iterations to use in operator norm estimation (see operator_norm). Default: 100.
key (Optional[Array]) – Jax PRNG key to use in operator norm estimation (see operator_norm). Defaults to None, in which case a new key is created.

Return type:

Tuple[float, float]

Returns:

A tuple (mu, nu) representing the estimated parameter values or corresponding squared operator norm values, depending on the value of the factor parameter.

norm_dual_residual()[source]¶

Compute the \(\ell_2\) norm of the dual residual.

Compute the \(\ell_2\) norm of the dual residual. If the flag requesting a fast approximate calculation is set, it is computed as

\[\norm{\mb{z}^{(k+1)} - \mb{z}^{(k)}}_2 \;,\]

otherwise it is computed as

\[\norm{A^T B ( \mb{z}^{(k+1)} - \mb{z}^{(k)} ) }_2 \;.\]

Return type:: float
Returns:: Current norm of dual residual.

norm_primal_residual(x=None, z=None)[source]¶

Compute the \(\ell_2\) norm of the primal residual.

Compute the \(\ell_2\) norm of the primal residual

\[\norm{A \mb{x} + B \mb{z} - \mb{c}}_2 \;.\]

Parameters:

x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
z (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.z.

Return type:

float

Returns:

Norm of primal residual.

step()[source]¶

Perform a single algorithm iteration.

class scico.optimize.NonLinearPADMM(f, g, H, rho, mu, nu, x0=None, z0=None, u0=None, fast_dual_residual=True, **kwargs)¶

Bases: ProximalADMMBase

Non-linear proximal alternating direction method of multipliers.

Solve an optimization problem of the form

\[\argmin_{\mb{x}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}\; H(\mb{x}, \mb{z}) = 0 \;,\]

where \(f\) and \(g\) are instances of Functional, (in most cases \(f\) will, more specifically be an instance of Loss), and \(H\) is a function.

The optimization problem is solved via a variant of the proximal ADMM algorithm for problems with a non-linear operator constraint [11], consisting of the iterations (see step)

\[\begin{split}\begin{aligned} A^{(k)} &= J_{\mb{x}} H(\mb{x}^{(k)}, \mb{z}^{(k)}) \\ \mb{x}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \mu^{-1} f} \left( \mb{x}^{(k)} - \mu^{-1} (A^{(k)})^T \left(2 \mb{u}^{(k)} - \mb{u}^{(k-1)} \right) \right) \\ B^{(k)} &= J_{\mb{z}} H(\mb{x}^{(k+1)}, \mb{z}^{(k)}) \\ \mb{z}^{(k+1)} &= \mathrm{prox}_{\rho^{-1} \nu^{-1} g} \left( \mb{z}^{(k)} - \nu^{-1} (B^{(k)})^T \left( H(\mb{x}^{(k+1)}, \mb{z}^{(k)}) + \mb{u}^{(k)} \right) \right) \\ \mb{u}^{(k+1)} &= \mb{u}^{(k)} + H(\mb{x}^{(k+1)}, \mb{z}^{(k+1)}) \;. \end{aligned}\end{split}\]

Parameters \(\mu\) and \(\nu\) are required to satisfy

\[\mu > \norm{ A^{(k)} }_2^2 \quad \text{and} \quad \nu > \norm{ B^{(k)} }_2^2\]

for all \(A^{(k)}\) and \(B^{(k)}\).

Attributes:: H (Function) – \(H\) function.

Initialize a NonLinearPADMM object.

Parameters:

f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
H (Function) – Function \(H\).
rho (float) – Penalty parameter.
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
x0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{z}\). If None, defaults to an array of zeros.
u0 (Union[Array, BlockArray, None]) – Starting value for \(\mb{u}\). If None, defaults to an array of zeros.
fast_dual_residual (bool) – Flag indicating whether to use fast approximation to the dual residual, or a slower but more accurate calculation.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

static estimate_parameters(H, x=None, z=None, factor=1.01, maxiter=100, key=None)[source]¶

Estimate mu and nu parameters of NonLinearPADMM.

Find values of the mu and nu parameters of NonLinearPADMM that respect the constraints

\[\mu > \norm{ J_x H(\mb{x}, \mb{z}) }_2^2 \quad \text{and} \quad \nu > \norm{ J_z H(\mb{x}, \mb{z}) }_2^2 \;.\]

Parameters:

H (Function) – Constraint function \(H\).
x (Union[Array, BlockArray, None]) – Value of \(\mb{x}\) at which to evaluate the Jacobian. If None, defaults to an array of zeros.
z (Union[Array, BlockArray, None]) – Value of \(\mb{z}\) at which to evaluate the Jacobian. If None, defaults to an array of zeros.
factor (Optional[float]) – Safety factor with which to multiply estimated operator norms to ensure strict inequality compliance. If None, return the estimated squared operator norms.
maxiter (int) – Maximum number of power iterations to use in operator norm estimation (see operator_norm). Default: 100.
key (Optional[Array]) – Jax PRNG key to use in operator norm estimation (see operator_norm). Defaults to None, in which case a new key is created.

Return type:

Tuple[float, float]

Returns:

A tuple (mu, nu) representing the estimated parameter values or corresponding squared operator norm values, depending on the value of the factor parameter.

norm_dual_residual()[source]¶

Compute the \(\ell_2\) norm of the dual residual.

Compute the \(\ell_2\) norm of the dual residual. If the flag requesting a fast approximate calculation is set, it is computed as

\[\norm{\mb{z}^{(k+1)} - \mb{z}^{(k)}}_2 \;,\]

otherwise it is computed as

\[\norm{A^T B ( \mb{z}^{(k+1)} - \mb{z}^{(k)} ) }_2 \;,\]

where

\[\begin{split}A &= J_{\mb{x}} H(\mb{x}^{(k+1)}, \mb{z}^{(k+1)}) \\ B &= J_{\mb{z}} H(\mb{x}^{(k+1)}, \mb{z}^{(k+1)}) \;.\end{split}\]

Return type:: float
Returns:: Current norm of dual residual.

norm_primal_residual(x=None, z=None)[source]¶

Compute the \(\ell_2\) norm of the primal residual.

Compute the \(\ell_2\) norm of the primal residual

\[\norm{H(\mb{x}, \mb{z})}_2 \;.\]

Parameters:

x (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.x.
z (Union[Array, BlockArray, None]) – Point at which to evaluate primal residual. If None, the primal residual is evaluated at the current iterate self.z.

Return type:

float

Returns:

Norm of primal residual.

step()[source]¶

Perform a single algorithm iteration.

class scico.optimize.ProximalADMMBase(f, g, rho, mu, nu, xshape, zshape, ushape, xdtype, zdtype, udtype, x0=None, z0=None, u0=None, fast_dual_residual=True, **kwargs)¶

Bases: Optimizer

Base class for proximal ADMM solvers.

Attributes:

f (Functional) – Functional \(f\) (usually a Loss).
g (Functional) – Functional \(g\).
rho (scalar) – Penalty parameter.
mu (scalar) – First algorithm parameter.
nu (scalar) – Second algorithm parameter.
x (array-like) – Solution variable.
z (array-like) – Auxiliary variables \(\mb{z}\) at current iteration.
z_old (array-like) – Auxiliary variables \(\mb{z}\) at previous iteration.
u (array-like) – Scaled Lagrange multipliers \(\mb{u}\) at current iteration.
u_old (array-like) – Scaled Lagrange multipliers \(\mb{u}\) at previous iteration.

Initialize a ProximalADMMBase object.

Parameters:

f (Functional) – Functional \(f\) (usually a loss function).
g (Functional) – Functional \(g\).
rho (float) – Penalty parameter.
mu (float) – First algorithm parameter.
nu (float) – Second algorithm parameter.
xshape (Union[Tuple[int, ...], Tuple[Tuple[int, ...], ...]]) – Shape of variable \(\mb{x}\).
zshape (Union[Tuple[int, ...], Tuple[Tuple[int, ...], ...]]) – Shape of variable \(\mb{z}\).
ushape (Union[Tuple[int, ...], Tuple[Tuple[int, ...], ...]]) – Shape of variable \(\mb{u}\).
xdtype (DType) – Dtype of variable \(\mb{x}\).
zdtype (DType) – Dtype of variable \(\mb{z}\).
udtype (DType) – Dtype of variable \(\mb{u}\).
x0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{x}\). If None, defaults to an array of zeros.
z0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{z}\). If None, defaults to an array of zeros.
u0 (Union[Array, BlockArray, None]) – Initial value for \(\mb{u}\). If None, defaults to an array of zeros.
fast_dual_residual (bool) – Flag indicating whether to use fast approximation to the dual residual, or a slower but more accurate calculation.
**kwargs – Additional optional parameters handled by initializer of base class Optimizer.

minimizer()[source]¶

Return the current estimate of the functional mimimizer.

Return type:: Union[Array, BlockArray]
Returns:: Current estimate of the functional mimimizer.

objective(x=None, z=None)[source]¶

Evaluate the objective function.

Evaluate the objective function

\[f(\mb{x}) + g(\mb{z}) \;.\]

Parameters:

x (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.x.
z (Union[Array, BlockArray, None]) – Point at which to evaluate objective function. If None, the objective is evaluated at the current iterate self.z.

Return type:

float

Returns:

scalar – Current value of the objective function.