Optimisation framework#

This package allows rapid prototyping of optimisation-based reconstruction problems, i.e. defining and solving different optimization problems to enforce different properties on the reconstructed image.

Firstly, it provides an object-oriented framework for defining mathematical operators and functions as well a collection of useful example operators and functions. Both smooth and non-smooth functions can be used.

Further, it provides a number of high-level generic implementations of optimisation algorithms to solve genericlly formulated optimisation problems constructed from operator and function objects.

The fundamental components are:

  • Operator: A class specifying a (currently linear) operator.

  • Function: A class specifying mathematical functions such as a least squares data fidelity.

  • Algorithm: Implementation of an iterative optimisation algorithm to solve a particular generic optimisation problem. Algorithms are iterable Python object which can be run in a for loop. Can be stopped and warm restarted.

Algorithms#

A number of generic algorithm implementations are provided including Gradient Descent (GD), Conjugate Gradient Least Squares (CGLS), Simultaneous Iterative Reconstruction Technique (SIRT), Primal Dual Hybrid Gradient (PDHG), Iterative Shrinkage Thresholding Algorithm (ISTA), and Fast Iterative Shrinkage Thresholding Algorithm (FISTA).

An algorithm is designed for a particular generic optimisation problem accepts and number of instances of Function derived classes and/or Operator derived classes as input to define a specific instance of the generic optimisation problem to be solved. They are iterable objects which can be run in a for loop. The user can provide a stopping criterion different than the default max_iteration.

New algorithms can be easily created by extending the Algorithm class. The user is required to implement only 4 methods: set_up, __init__, update and update_objective.

  • set_up and __init__ are used to configure the algorithm

  • update is the actual iteration updating the solution

  • update_objective defines how the objective is calculated.

For example, the implementation of the update of the Gradient Descent algorithm to minimise a Function will only be:

def update(self):
    self.x += -self.rate * self.objective_function.gradient(self.x)
def update_objective(self):
    self.loss.append(self.objective_function(self.x))

The Algorithm provides the infrastructure to continue iteration, to access the values of the objective function in subsequent iterations, the time for each iteration, and to provide a nice print to screen of the status of the optimisation.

Base class#

class cil.optimisation.algorithms.Algorithm(**kwargs)[source]#

Base class for iterative algorithms

provides the minimal infrastructure.

Algorithms are iterables so can be easily run in a for loop. They will stop as soon as the stop criterion is met. The user is required to implement the set_up, __init__, update and and update_objective methods

A courtesy method run is available to run n iterations. The method accepts a callback function that receives the current iteration number and the actual objective value and can be used to trigger print to screens and other user interactions. The run method will stop when the stopping criterion is met.

__init__(**kwargs)[source]#

Constructor

Set the minimal number of parameters:

Parameters
  • max_iteration (int, optional, default 0) – maximum number of iterations

  • update_objective_interval (int, optional, default 1) – the interval every which we would save the current objective. 1 means every iteration, 2 every 2 iteration and so forth. This is by default 1 and should be increased when evaluating the objective is computationally expensive.

  • log_file (str, optional, default None) – log verbose output to file

set_up(*args, **kwargs)[source]#

Set up the algorithm

update()[source]#

A single iteration of the algorithm

should_stop()[source]#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

__set_up_logger(fname)#

Set up the logger if desired

max_iteration_stop_criterion()[source]#

default stop criterion for iterative algorithm: max_iteration reached

__iter__()[source]#

Algorithm is an iterable

next()[source]#

Algorithm is an iterable

python2 backwards compatibility

__next__()[source]#

Algorithm is an iterable

calling this method triggers update and update_objective

_update_previous_solution()[source]#

Update the previous solution with the current one

The concrete algorithm calls update_previous_solution. Normally this would entail the swapping of pointers:

tmp = self.x_old
self.x_old = self.x
self.x = tmp
get_output()[source]#

Returns the current solution.

is_provably_convergent()[source]#

Check if the algorithm is convergent based on the provable convergence criterion.

get_last_loss(**kwargs)[source]#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)[source]#

alias to get_last_loss

update_objective()[source]#

calculates the objective with the current solution

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property objective#

alias of loss

property max_iteration#

gets the maximum number of iterations

run(iterations=None, verbose=1, callback=None, **kwargs)[source]#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

__weakref__#

list of weak references to the object (if defined)

verbose_output(verbose=False)[source]#

Creates a nice tabulated output

GD#

class cil.optimisation.algorithms.GD(initial=None, objective_function=None, step_size=None, **kwargs)[source]#

Gradient Descent algorithm

set_up(initial, objective_function, step_size)[source]#

initialisation of the algorithm

Parameters
  • initial – initial guess

  • objective_function – objective function to be minimised

  • step_size – step size

update()[source]#

Single iteration

update_objective()[source]#

calculates the objective with the current solution

armijo_rule()[source]#

Applies the Armijo rule to calculate the step size (step_size)

https://projecteuclid.org/download/pdf_1/euclid.pjm/1102995080

The Armijo rule runs a while loop to find the appropriate step_size by starting from a very large number (alpha). The step_size is found by dividing alpha by 2 in an iterative way until a certain criterion is met. To avoid infinite loops, we add a maximum number of times the while loop is run.

This rule would allow to reach a minimum step_size of 10^-alpha.

if alpha = numpy.power(10,gamma) delta = 3 step_size = numpy.power(10, -delta) with armijo rule we can get to step_size from initial alpha by repeating the while loop k times where alpha / 2^k = step_size 10^gamma / 2^k = 10^-delta 2^k = 10^(gamma+delta) k = gamma+delta / log10(2) approx 3.3 * (gamma+delta)

if we would take by default delta = gamma kmax = numpy.ceil ( 2 * gamma / numpy.log10(2) ) kmax = numpy.ceil (2 * numpy.log10(alpha) / numpy.log10(2))

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

get_output()#

Returns the current solution.

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

property objective#

alias of loss

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

verbose_output(verbose=False)#

Creates a nice tabulated output

should_stop()[source]#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

CGLS#

class cil.optimisation.algorithms.CGLS(initial=None, operator=None, data=None, tolerance=1e-06, **kwargs)[source]#

Conjugate Gradient Least Squares algorithm

Problem:

\[\min || A x - b ||^2_2\]

Parameters :

:parameter operator : Linear operator for the inverse problem :parameter initial : Initial guess ( Default initial = 0) :parameter data : Acquired data to reconstruct :parameter tolerance: Tolerance/ Stopping Criterion to end CGLS algorithm

Reference:

https://web.stanford.edu/group/SOL/software/cgls/

set_up(initial, operator, data, tolerance=1e-06)[source]#

initialisation of the algorithm

Parameters
  • operator – Linear operator for the inverse problem

  • initial – Initial guess ( Default initial = 0)

  • data – Acquired data to reconstruct

  • tolerance – Tolerance/ Stopping Criterion to end CGLS algorithm

update()[source]#

single iteration

update_objective()[source]#

calculates the objective with the current solution

should_stop()[source]#

stopping criterion

flag()[source]#

returns whether the tolerance has been reached

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

get_output()#

Returns the current solution.

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

property objective#

alias of loss

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

verbose_output(verbose=False)#

Creates a nice tabulated output

SIRT#

class cil.optimisation.algorithms.SIRT(initial, operator, data, lower=None, upper=None, constraint=None, **kwargs)[source]#

Simultaneous Iterative Reconstruction Technique, see [6].

Simultaneous Iterative Reconstruction Technique (SIRT) solves the following problem

\[A x = b\]

The SIRT algorithm is

\[x^{k+1} = \mathrm{proj}_{C}( x^{k} + \omega * D ( A^{T} ( M * (b - Ax) ) ) ),\]

where, \(M = \frac{1}{A*\mathbb{1}}\), \(D = \frac{1}{A^{T}\mathbb{1}}\), \(\mathbb{1}\) is a DataContainer of ones, \(\mathrm{prox}_{C}\) is the projection over a set \(C\), and \(\omega\) is the relaxation parameter.

Parameters

Note

If constraint is not passed, lower and upper are used to create an IndicatorBox and apply its proximal.

If constraint is passed, proximal method is required to be implemented.

Note

The preconditioning arrays (weights) M and D used in SIRT are defined as

\[M = \frac{1}{A*\mathbb{1}} = \frac{1}{\sum_{j}a_{i,j}}\]
\[D = \frac{1}{A*\mathbb{1}} = \frac{1}{\sum_{i}a_{i,j}}\]

Examples

\[\underset{x}{\mathrm{argmin}} \| x - d\|^{2}\]
>>> sirt = SIRT(initial = ig.allocate(0), operator = A, data = d, max_iteration = 5)
set_up(initial, operator, data, lower=None, upper=None, constraint=None)[source]#

Initialisation of the algorithm

set_relaxation_parameter(value=1.0)[source]#

Set the relaxation parameter \(\omega\)

Parameters

value (float) – The relaxation parameter to be applied to the update. Must be between 0 and 2 to guarantee asymptotic convergence.

update()[source]#

Performs a single iteration of the SIRT algorithm

\[x^{k+1} = \mathrm{proj}_{C}( x^{k} + \omega * D ( A^{T} ( M * (b - Ax) ) ) )\]
update_objective()[source]#

Returns the objective

\[\|A x - b\|^{2}\]
get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

get_output()#

Returns the current solution.

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

property objective#

alias of loss

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

should_stop()#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

verbose_output(verbose=False)#

Creates a nice tabulated output

ISTA#

class cil.optimisation.algorithms.ISTA(initial, f, g, step_size=None, **kwargs)[source]#

Iterative Shrinkage-Thresholding Algorithm, see [1], [2].

Iterative Shrinkage-Thresholding Algorithm (ISTA)

\[x^{k+1} = \mathrm{prox}_{\alpha^{k} g}(x^{k} - \alpha^{k}\nabla f(x^{k}))\]

is used to solve

\[\min_{x} f(x) + g(x)\]

where \(f\) is differentiable, \(g\) has a simple proximal operator and \(\alpha^{k}\) is the step_size per iteration.

Note

For a constant step size, i.e., \(a^{k}=a\) for \(k\geq1\), convergence of ISTA is guaranteed if

\[\alpha\in(0, \frac{2}{L}),\]

where \(L\) is the Lipschitz constant of \(f\), see [].

Parameters
  • initial (DataContainer) – Initial guess of ISTA.

  • f (Function) – Differentiable function

  • g (Function) – Convex function with simple proximal operator

  • step_size (positive float, default = None) – Step size for the gradient step of ISTA. The default step_size is \(\frac{0.99 * 2}{L}.\)

  • kwargs (Keyword arguments) – Arguments from the base class Algorithm.

Examples

\[\underset{x}{\mathrm{argmin}}\|A x - b\|^{2}_{2}\]
>>> f = LeastSquares(A, b=b, c=0.5)
>>> g = ZeroFunction()
>>> ig = Aop.domain
>>> ista = ISTA(initial = ig.allocate(), f = f, g = g, max_iteration=10)
>>> ista.run()

See also

FISTA, GD

set_step_size(step_size)[source]#

Set default step size.

__init__(initial, f, g, step_size=None, **kwargs)[source]#

Constructor

Set the minimal number of parameters:

Parameters
  • max_iteration (int, optional, default 0) – maximum number of iterations

  • update_objective_interval (int, optional, default 1) – the interval every which we would save the current objective. 1 means every iteration, 2 every 2 iteration and so forth. This is by default 1 and should be increased when evaluating the objective is computationally expensive.

  • log_file (str, optional, default None) – log verbose output to file

set_up(initial, f, g, step_size, **kwargs)[source]#

Set up of the algorithm

update()[source]#

Performs a single iteration of ISTA

\[x_{k+1} = \mathrm{prox}_{\alpha g}(x_{k} - \alpha\nabla f(x_{k}))\]
get_output()[source]#

Returns the current solution.

update_objective()[source]#

Updates the objective

\[f(x) + g(x)\]
__delattr__(name, /)#

Implement delattr(self, name).

__dir__()#

Default dir() implementation.

__eq__(value, /)#

Return self==value.

__format__(format_spec, /)#

Default object formatter.

__ge__(value, /)#

Return self>=value.

__getattribute__(name, /)#

Return getattr(self, name).

__gt__(value, /)#

Return self>value.

__hash__()#

Return hash(self).

__init_subclass__()#

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__iter__()#

Algorithm is an iterable

__le__(value, /)#

Return self<=value.

__lt__(value, /)#

Return self<value.

__ne__(value, /)#

Return self!=value.

__new__(**kwargs)#

Create and return a new object. See help(type) for accurate signature.

__next__()#

Algorithm is an iterable

calling this method triggers update and update_objective

__reduce__()#

Helper for pickle.

__reduce_ex__(protocol, /)#

Helper for pickle.

__repr__()#

Return repr(self).

__setattr__(name, value, /)#

Implement setattr(self, name, value).

__sizeof__()#

Size of object in memory, in bytes.

__str__()#

Return str(self).

__subclasshook__()#

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

__weakref__#

list of weak references to the object (if defined)

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

property objective#

alias of loss

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

should_stop()#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

verbose_output(verbose=False)#

Creates a nice tabulated output

FISTA#

class cil.optimisation.algorithms.FISTA(initial, f, g, step_size=None, **kwargs)[source]#

Fast Iterative Shrinkage-Thresholding Algorithm, see [1], [2].

Fast Iterative Shrinkage-Thresholding Algorithm (FISTA)

\[\begin{split}\begin{cases} y_{k} = x_{k} - \alpha\nabla f(x_{k}) \\ x_{k+1} = \mathrm{prox}_{\alpha g}(y_{k})\\ t_{k+1} = \frac{1+\sqrt{1+ 4t_{k}^{2}}}{2}\\ y_{k+1} = x_{k} + \frac{t_{k}-1}{t_{k-1}}(x_{k} - x_{k-1}) \end{cases}\end{split}\]

is used to solve

\[\min_{x} f(x) + g(x)\]

where \(f\) is differentiable, \(g\) has a simple proximal operator and \(\alpha^{k}\) is the step_size per iteration.

Parameters
  • initial (DataContainer) – Starting point of the algorithm

  • f (Function) – Differentiable function

  • g (Function) – Convex function with simple proximal operator

  • step_size (positive float, default = None) – Step size for the gradient step of FISTA. The default step_size is \(\frac{1}{L}\).

  • kwargs (Keyword arguments) – Arguments from the base class Algorithm.

Examples

\[\underset{x}{\mathrm{argmin}}\|A x - b\|^{2}_{2}\]
>>> f = LeastSquares(A, b=b, c=0.5)
>>> g = ZeroFunction()
>>> ig = Aop.domain
>>> fista = FISTA(initial = ig.allocate(), f = f, g = g, max_iteration=10)
>>> fista.run()

See also

FISTA, GD

set_step_size(step_size)[source]#

Set the default step size

__init__(initial, f, g, step_size=None, **kwargs)[source]#

Constructor

Set the minimal number of parameters:

Parameters
  • max_iteration (int, optional, default 0) – maximum number of iterations

  • update_objective_interval (int, optional, default 1) – the interval every which we would save the current objective. 1 means every iteration, 2 every 2 iteration and so forth. This is by default 1 and should be increased when evaluating the objective is computationally expensive.

  • log_file (str, optional, default None) – log verbose output to file

update()[source]#

Performs a single iteration of FISTA

\[\begin{split}\begin{cases} x_{k+1} = \mathrm{prox}_{\alpha g}(y_{k} - \alpha\nabla f(y_{k}))\\ t_{k+1} = \frac{1+\sqrt{1+ 4t_{k}^{2}}}{2}\\ y_{k+1} = x_{k} + \frac{t_{k}-1}{t_{k-1}}(x_{k} - x_{k-1}) \end{cases}\end{split}\]
__delattr__(name, /)#

Implement delattr(self, name).

__dir__()#

Default dir() implementation.

__eq__(value, /)#

Return self==value.

__format__(format_spec, /)#

Default object formatter.

__ge__(value, /)#

Return self>=value.

__getattribute__(name, /)#

Return getattr(self, name).

__gt__(value, /)#

Return self>value.

__hash__()#

Return hash(self).

__init_subclass__()#

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__iter__()#

Algorithm is an iterable

__le__(value, /)#

Return self<=value.

__lt__(value, /)#

Return self<value.

__ne__(value, /)#

Return self!=value.

__new__(**kwargs)#

Create and return a new object. See help(type) for accurate signature.

__next__()#

Algorithm is an iterable

calling this method triggers update and update_objective

__reduce__()#

Helper for pickle.

__reduce_ex__(protocol, /)#

Helper for pickle.

__repr__()#

Return repr(self).

__setattr__(name, value, /)#

Implement setattr(self, name, value).

__sizeof__()#

Size of object in memory, in bytes.

__str__()#

Return str(self).

__subclasshook__()#

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

__weakref__#

list of weak references to the object (if defined)

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

get_output()#

Returns the current solution.

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

property objective#

alias of loss

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

set_up(initial, f, g, step_size, **kwargs)#

Set up of the algorithm

should_stop()#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

update_objective()#

Updates the objective

\[f(x) + g(x)\]
verbose_output(verbose=False)#

Creates a nice tabulated output

PDHG#

class cil.optimisation.algorithms.PDHG(f, g, operator, tau=None, sigma=None, initial=None, **kwargs)[source]#

Primal Dual Hybrid Gradient (PDHG) algorithm, see [3], [4].

Parameters
  • f (Function) – A convex function with a “simple” proximal method of its conjugate.

  • g (Function) – A convex function with a “simple” proximal.

  • operator (LinearOperator) – A Linear Operator.

  • sigma (positive float, or np.ndarray, DataContainer, BlockDataContainer, optional, default=None) – Step size for the dual problem.

  • tau (positive float, or np.ndarray, DataContainer, BlockDataContainer, optional, default=None) – Step size for the primal problem.

  • initial (DataContainer, optional, default=None) – Initial point for the PDHG algorithm.

  • gamma_g (positive float, optional, default=None) – Strongly convex constant if the function g is strongly convex. Allows primal acceleration of the PDHG algorithm.

  • gamma_fconj (positive float, optional, default=None) – Strongly convex constant if the convex conjugate of f is strongly convex. Allows dual acceleration of the PDHG algorithm.

  • **kwargs

    Keyward arguments used from the base class Algorithm.

    max_iterationint, optional, default=0

    Maximum number of iterations of the PDHG.

    update_objective_intervalint, optional, default=1

    Evaluates objectives, e.g., primal/dual/primal-dual gap every update_objective_interval.

    check_convergenceboolean, default=True

    Checks scalar sigma and tau values satisfy convergence criterion

Example

In our CIL-Demos repositoryyou can find examples using the PDHG algorithm for different imaging problems, such as Total Variation denoising, Total Generalised Variation inpaintingand Total Variation Tomography reconstruction. More examples can also be found in [5], [7].

Note

Currently, the strongly convex constants are passed as parameters of PDHG. In the future, these parameters will be properties of the corresponding functions.

Notes

A first-order primal-dual algorithm for convex optimization problems with known saddle-point structure with applications in imaging.

The general problem considered in the PDHG algorithm is the generic saddle-point problem

\[\min_{x\in X}\max_{y\in Y} \langle Kx, y \rangle + g(x) - f^{*}(x)\]

where \(f\) and \(g\) are convex functions with “simple” proximal operators.

\(X\) and \(Y\) are two two finite-dimensional vector spaces with an inner product and representing the domain of \(g\) and \(f^{*}\), the convex conjugate of \(f\), respectively.

The operator \(K\) is a continuous linear operator with operator norm defined as

\[\|K\| = \max\{ \|Kx\| : x\in X, \|x\|\leq1\}\]

The saddle point problem is decomposed into the primal problem:

\[\min_{x\in X} f(Kx) + g(x),\]

and its corresponding dual problem

\[\max_{y\in Y} - g^{*}(-K^{*}y) - f^{*}(y).\]

The PDHG algorithm consists of three steps:

  • gradient ascent step for the dual problem,

  • gradient descent step for the primal problem and

  • an over-relaxation of the primal variable.

\[y^{n+1} = \mathrm{prox}_{\sigma f^{*}}(y^{n} + \sigma K \bar{x}^{n})\]
\[x^{n+1} = \mathrm{prox}_{\tau g}(x^{n} - \tau K^{*}y^{n+1})\]
\[\bar{x}^{n+1} = x^{n+1} + \theta (x^{n+1} - x^{n})\]

Notes

  • Convergence is guaranteed if \(\theta\) = 1.0, the operator norm \(\|K\|\), the dual step size \(\sigma\) and the primal step size \(\tau\), satisfy the following inequality:

\[\tau \sigma \|K\|^2 < 1\]
  • By default, the step sizes \(\sigma\) and \(\tau\) are positive scalars and defined as below:

    • If sigma is None and tau is None:

    \[\sigma = \frac{1}{\|K\|}, \tau = \frac{1}{\|K\|}\]
    • If tau is None:

    \[\tau = \frac{1}{\sigma\|K\|^{2}}\]
    • If sigma is None:

    \[\sigma = \frac{1}{\tau\|K\|^{2}}\]
  • To monitor the convergence of the algorithm, we compute the primal/dual objectives and the primal-dual gap in update_objective().

    The primal objective is

    \[f(Kx) + g(x)\]

    and the dual objective is

    \[- g^{*}(-K^{*}y) - f^{*}(y)\]

    The primal-dual gap (or duality gap) is

    \[f(Kx) + g(x) + g^{*}(-K^{*}y) + f^{*}(y)\]

    and measures how close is the primal-dual pair (x,y) to the primal-dual solution. It is always non-negative and is used to monitor convergence of the PDHG algorithm. For more information, see Duality Gap.

Note

  • The primal objective is printed if verbose=1, pdhg.run(verbose=1).

  • All the objectives are printed if verbose=2, pdhg.run(verbose=2).

Computing these objectives can be costly, so it is better to compute every some iterations. To do this, use update_objective_interval = #number.

  • PDHG algorithm can be accelerated if the functions \(f^{*}\) and/or \(g\) are strongly convex. In these cases, the step-sizes \(\sigma\) and \(\tau\) are updated using the update_step_sizes() method. A function \(f\) is strongly convex with constant \(\gamma>0\) if

    \[f(x) - \frac{\gamma}{2}\|x\|^{2} \quad\mbox{ is convex. }\]
    • For instance the function \(\frac{1}{2}\|x\|^{2}_{2}\) is \(\gamma\) strongly convex for \(\gamma\in(-\infty,1]\). We say it is 1-strongly convex because it is the largest constant for which \(f - \frac{1}{2}\|\cdot\|^{2}\) is convex.

    • The \(\|\cdot\|_{1}\) norm is not strongly convex. For more information, see Strongly Convex.

    • If \(g\) is strongly convex with constant \(\gamma\) then the step-sizes \(\sigma\), \(\tau\) and \(\theta\) are updated as:

    \begin{aligned} \theta_{n} & = \frac{1}{\sqrt{1 + 2\gamma\tau_{n}}}\\ \tau_{n+1} & = \theta_{n}\tau_{n}\\ \sigma_{n+1} & = \frac{\sigma_{n}}{\theta_{n}} \end{aligned}
    • If \(f^{*}\) is strongly convex, we swap \(\sigma\) with \(\tau\).

Note

The case where both functions are strongly convex is not available at the moment.

Todo

Implement acceleration of PDHG when both functions are strongly convex.

set_gamma_g(value)[source]#

Set the value of the strongly convex constant for function g

Parameters

value (a positive number or None) –

set_gamma_fconj(value)[source]#

Set the value of the strongly convex constant for the convex conjugate of function f

Parameters

value (a positive number or None) –

set_up(f, g, operator, tau=None, sigma=None, initial=None, **kwargs)[source]#

Initialisation of the algorithm

Parameters
  • f (Function) – A convex function with a “simple” proximal method of its conjugate.

  • g (Function) – A convex function with a “simple” proximal.

  • operator (LinearOperator) – A Linear Operator.

  • sigma (positive float, or np.ndarray, DataContainer, BlockDataContainer, optional, default=None) – Step size for the dual problem.

  • tau (positive float, or np.ndarray, DataContainer, BlockDataContainer, optional, default=None) – Step size for the primal problem.

  • initial (DataContainer, optional, default=None) – Initial point for the PDHG algorithm.

  • theta (Relaxation parameter, Number, default 1.0) –

get_output()[source]#

Returns the current solution.

update()[source]#

Performs a single iteration of the PDHG algorithm

check_convergence()[source]#

Check whether convergence criterion for PDHG is satisfied with scalar values of tau and sigma

Returns

True if convergence criterion is satisfied. False if not satisfied or convergence is unknown.

Return type

Boolean

set_step_sizes(sigma=None, tau=None)[source]#

Sets sigma and tau step-sizes for the PDHG algorithm. The step sizes can be either scalar or array-objects.

Parameters
  • sigma (positive float, or np.ndarray, DataContainer, BlockDataContainer, optional, default=None) – Step size for the dual problem.

  • tau (positive float, or np.ndarray, DataContainer, BlockDataContainer, optional, default=None) – Step size for the primal problem.

The user can set either, both or none. Values passed by the user will be accepted as long as they are positive numbers, or correct shape array like objects.

update_step_sizes()[source]#

Updates step sizes in the cases of primal or dual acceleration using the strongly convexity property. The case where both functions are strongly convex is not available at the moment.

update_objective()[source]#

Evaluates the primal objective, the dual objective and the primal-dual gap.

property objective#

alias of loss

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

should_stop()#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

verbose_output(verbose=False)#

Creates a nice tabulated output

LADMM#

class cil.optimisation.algorithms.LADMM(f=None, g=None, operator=None, tau=None, sigma=1.0, initial=None, **kwargs)[source]#

LADMM is the Linearized Alternating Direction Method of Multipliers (LADMM)

General form of ADMM : min_{x} f(x) + g(y), subject to Ax + By = b

Case: A = Id, B = -K, b = 0 ==> min_x f(Kx) + g(x)

The quadratic term in the augmented Lagrangian is linearized for the x-update.

Main algorithmic difference is that in ADMM we compute two proximal subproblems, where in the PDHG a proximal and proximal conjugate.

Reference (Section 8) : https://link.springer.com/content/pdf/10.1007/s10107-018-1321-1.pdf

x^{k} = prox_{ au f } (x^{k-1} - tau/sigma A^{T}(Ax^{k-1} - z^{k-1} + u^{k-1} )

z^{k} = prox_{sigma g} (Ax^{k} + u^{k-1})

u^{k} = u^{k-1} + Ax^{k} - z^{k}

set_up(f, g, operator, tau=None, sigma=1.0, initial=None)[source]#

Set up the algorithm

update()[source]#

A single iteration of the algorithm

update_objective()[source]#

calculates the objective with the current solution

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

get_output()#

Returns the current solution.

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

property objective#

alias of loss

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

should_stop()#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

verbose_output(verbose=False)#

Creates a nice tabulated output

SPDHG#

class cil.optimisation.algorithms.SPDHG(f=None, g=None, operator=None, tau=None, sigma=None, initial=None, prob=None, gamma=1.0, **kwargs)[source]#

Stochastic Primal Dual Hybrid Gradient

Problem:

\[\min_{x} f(Kx) + g(x) = \min_{x} \sum f_i(K_i x) + g(x)\]
Parameters
  • f (BlockFunction) – Each must be a convex function with a “simple” proximal method of its conjugate

  • g (Function) – A convex function with a “simple” proximal

  • operator (BlockOperator) – BlockOperator must contain Linear Operators

  • tau (positive float, optional, default=None) – Step size parameter for Primal problem

  • sigma (list of positive float, optional, default=None) – List of Step size parameters for Dual problem

  • initial (DataContainer, optional, default=None) – Initial point for the SPDHG algorithm

  • prob (list of floats, optional, default=None) – List of probabilities. If None each subset will have probability = 1/number of subsets

  • gamma (float) – parameter controlling the trade-off between the primal and dual step sizes

  • **kwargs

  • norms (list of floats) – precalculated list of norms of the operators

Example

Example of usage: See https://github.com/vais-ral/CIL-Demos/blob/master/Tomography/Simulated/Single%20Channel/PDHG_vs_SPDHG.py

Note

Convergence is guaranteed provided that [2, eq. (12)]:

\[\]

|sigma[i]^{1/2} * K[i] * tau^{1/2} |^2 < p_i for all i

Note

Notation for primal and dual step-sizes are reversed with comparison

to PDHG.py

Note

this code implements serial sampling only, as presented in [2]

(to be extended to more general case of [1] as future work)

References

[1]”Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications”, Chambolle, Antonin, Matthias J. Ehrhardt, Peter Richtárik, and Carola-Bibiane Schonlieb, SIAM Journal on Optimization 28, no. 4 (2018): 2783-2808.

[2]”Faster PET reconstruction with non-smooth priors by randomization and preconditioning”, Matthias J Ehrhardt, Pawel Markiewicz and Carola-Bibiane Schönlieb, Physics in Medicine & Biology, Volume 64, Number 22, 2019.

set_up(f, g, operator, tau=None, sigma=None, initial=None, prob=None, gamma=1.0, norms=None)[source]#

set-up of the algorithm :param f: Each must be a convex function with a “simple” proximal method of its conjugate :type f: BlockFunction :param g: A convex function with a “simple” proximal :type g: Function :param operator: BlockOperator must contain Linear Operators :type operator: BlockOperator :param tau: Step size parameter for Primal problem :type tau: positive float, optional, default=None :param sigma: List of Step size parameters for Dual problem :type sigma: list of positive float, optional, default=None :param initial: Initial point for the SPDHG algorithm :type initial: DataContainer, optional, default=None :param prob: List of probabilities. If None each subset will have probability = 1/number of subsets :type prob: list of floats, optional, default=None :param gamma: parameter controlling the trade-off between the primal and dual step sizes :type gamma: float :param **kwargs: :param norms: precalculated list of norms of the operators :type norms: list of floats

update()[source]#

A single iteration of the algorithm

update_objective()[source]#

calculates the objective with the current solution

property objective#

alias of loss

get_last_loss(**kwargs)#

Returns the last stored value of the loss function

if update_objective_interval is 1 it is the value of the objective at the current iteration. If update_objective_interval > 1 it is the last stored value.

get_last_objective(**kwargs)#

alias to get_last_loss

get_output()#

Returns the current solution.

is_provably_convergent()#

Check if the algorithm is convergent based on the provable convergence criterion.

property iterations#

returns the iterations at which the objective has been evaluated

property loss#

returns the list of the values of the objective during the iteration

The length of this list may be shorter than the number of iterations run when the update_objective_interval > 1

property max_iteration#

gets the maximum number of iterations

max_iteration_stop_criterion()#

default stop criterion for iterative algorithm: max_iteration reached

next()#

Algorithm is an iterable

python2 backwards compatibility

run(iterations=None, verbose=1, callback=None, **kwargs)#

run n iterations and update the user with the callback if specified

Parameters
  • iterations – number of iterations to run. If not set the algorithm will run until max_iteration or until stop criterion is reached

  • verbose – sets the verbosity output to screen, 0 no verbose, 1 medium, 2 highly verbose

  • callback – is a function that receives: current iteration number, last objective function value and the current solution and gets executed at each update_objective_interval

  • print_interval – integer, controls every how many iteration there’s a print to screen. Notice that printing will not evaluate the objective function and so the print might be out of sync wrt the calculation of the objective. In such cases nan will be printed.

should_stop()#

default stopping criterion: number of iterations

The user can change this in concrete implementation of iterative algorithms.

verbose_output(verbose=False)#

Creates a nice tabulated output

Operators#

The two most important methods are direct and adjoint methods that describe the result of applying the operator, and its adjoint respectively, onto a compatible DataContainer input. The output is another DataContainer object or subclass hereof. An important special case is to represent the tomographic forward and backprojection operations.

Operator base classes#

All operators extend the Operator class. A special class is the LinearOperator which represents an operator for which the adjoint operation is defined. A ScaledOperator represents the multiplication of any operator with a scalar.

class cil.optimisation.operators.Operator(domain_geometry, **kwargs)[source]#

Operator that maps from a space X -> Y

Parameters
__init__(domain_geometry, **kwargs)[source]#

Initialize self. See help(type(self)) for accurate signature.

is_linear()[source]#

Returns if the operator is linear

direct(x, out=None)[source]#

Returns the application of the Operator on x

norm(**kwargs)[source]#

Returns the norm of the Operator. On first call the norm will be calculated using the operator’s calculate_norm method. Subsequent calls will return the cached norm.

Returns

norm

Return type

positive:float

set_norm(norm=None)[source]#

Sets the norm of the operator to a custom value.

calculate_norm()[source]#

Calculates the norm of the Operator

range_geometry()[source]#

Returns the range of the Operator: Y space

domain_geometry()[source]#

Returns the domain of the Operator: X space

__rmul__(scalar)[source]#

Defines the multiplication by a scalar on the left

returns a ScaledOperator

__neg__()[source]#

Return -self

__sub__(other)[source]#

Returns the subtraction of the operators.

__weakref__#

list of weak references to the object (if defined)

class cil.optimisation.operators.LinearOperator(domain_geometry, **kwargs)[source]#

Linear operator that maps from a space X <-> Y

Parameters
__init__(domain_geometry, **kwargs)[source]#

Initialize self. See help(type(self)) for accurate signature.

is_linear()[source]#

Returns if the operator is linear

adjoint(x, out=None)[source]#

returns the adjoint/inverse operation

only available to linear operators

static PowerMethod(operator, max_iteration=10, initial=None, tolerance=1e-05, return_all=False, method='auto')[source]#

Power method or Power iteration algorithm

The Power method computes the largest (dominant) eigenvalue of a matrix in magnitude, e.g., absolute value in the real case and modulus in the complex case.

Parameters
  • operator (LinearOperator) –

  • max_iteration (positive:int, default=10) – Number of iterations for the Power method algorithm.

  • initial (DataContainer, default = None) – Starting point for the Power method.

  • tolerance (positive:float, default = 1e-5) – Stopping criterion for the Power method. Check if two consecutive eigenvalue evaluations are below the tolerance.

  • return_all (boolean, default = False) – Toggles the verbosity of the return

  • method (string one of “auto”, “composed_with_adjoint” and “direct_only”, default = “auto”) – The default auto lets the code choose the method, this can be specified with “direct_only” or “composed_with_adjoint”

Returns

  • dominant eigenvalue (positive:float)

  • number of iterations (positive:int) – Number of iterations run. Only returned if return_all is True.

  • eigenvector (DataContainer) – Corresponding eigenvector of the dominant eigenvalue. Only returned if return_all is True.

  • list of eigenvalues (list) – List of eigenvalues. Only returned if return_all is True.

  • convergence (boolean) – Check on wether the difference between the last two iterations is less than tolerance. Only returned if return_all is True.

Note

The power method contains two different algorithms chosen by the method flag.

In the case method=”direct_only”, for operator, \(A\), the power method computes the iterations \(x_{k+1} = A (x_k/\|x_{k}\|)\) initialised with a random vector \(x_0\) and returning the largest (dominant) eigenvalue in magnitude given by \(\|x_k\|\).

In the case method=”composed_with_adjoint”, the algorithm computes the largest (dominant) eigenvalue of \(A^{T}A\) returning the square root of this value, i.e. the iterations: \(x_{k+1} = A^TA (x_k/\|x_{k}\|)\) and returning \(\sqrt{\|x_k\|}\).

The default flag is method=”auto”, the algorithm checks to see if the operator.domain_geometry() == operator.range_geometry() and if so uses the method “direct_only” and if not the method “composed_with_adjoint”.

Examples

>>> M = np.array([[1.,0],[1.,2.]])
>>> Mop = MatrixOperator(M)
>>> Mop_norm = Mop.PowerMethod(Mop)
>>> Mop_norm
2.0000654846240296

PowerMethod is called when we compute the norm of a matrix or a LinearOperator.

>>> Mop_norm = Mop.norm()
2.0005647295658866
calculate_norm()[source]#

Returns the norm of the LinearOperator calculated by the PowerMethod with default values.

static dot_test(operator, domain_init=None, range_init=None, tolerance=1e-06, **kwargs)[source]#

Does a dot linearity test on the operator Evaluates if the following equivalence holds .. math:

Ax\times y = y \times A^Tx
Parameters
  • operator – operator to test the dot_test

  • range_init – optional initialisation container in the operator range

  • domain_init – optional initialisation container in the operator domain

  • seed – Seed random generator

:type : int, default = 1 :param tolerance: Check if the following expression is below the tolerance .. math:

|Ax\times y - y \times A^Tx|/(\|A\|\|x\|\|y\| + 1e-12) < tolerance

:type : float, default 1e-6 :returns: boolean, True if the test is passed.

class cil.optimisation.operators.ScaledOperator(operator, scalar, **kwargs)[source]#

A class to represent the scalar multiplication of an Operator with a scalar. It holds an operator and a scalar. Basically it returns the multiplication of the result of direct and adjoint of the operator with the scalar. For the rest it behaves like the operator it holds.

Parameters
  • operator – a Operator or LinearOperator

  • scalar – a scalar multiplier

Example

The scaled operator behaves like the following:

sop = ScaledOperator(operator, scalar)
sop.direct(x) = scalar * operator.direct(x)
sop.adjoint(x) = scalar * operator.adjoint(x)
sop.norm() = operator.norm()
sop.range_geometry() = operator.range_geometry()
sop.domain_geometry() = operator.domain_geometry()
__init__(operator, scalar, **kwargs)[source]#

creator

Parameters
  • operator – a Operator or LinearOperator

  • scalar (Number) – a scalar multiplier

direct(x, out=None)[source]#

direct method

adjoint(x, out=None)[source]#

adjoint method

norm(**kwargs)[source]#

norm of the operator

is_linear()[source]#

returns whether the operator is linear

Returns

boolean

class cil.optimisation.operators.CompositionOperator(*operators, **kwargs)[source]#
__init__(*operators, **kwargs)[source]#

Initialize self. See help(type(self)) for accurate signature.

direct(x, out=None)[source]#

Returns the application of the Operator on x

is_linear()[source]#

Returns if the operator is linear

calculate_norm()[source]#

Returns the norm of the CompositionOperator, that is the product of the norms of its operators.

class cil.optimisation.operators.DiagonalOperator(diagonal, domain_geometry=None)[source]#

Performs an element-wise multiplication, i.e., Hadamard Product of a DataContainer x and DataContainer diagonal, d .

\[(D\circ x) = \sum_{i,j}^{M,N} D_{i,j} x_{i, j}\]

In matrix-vector interpretation, if D is a \(M\times N\) dense matrix and is flattened, we have a \(M*N \times M*N\) vector. A sparse diagonal matrix, i.e., DigaonalOperator can be created if we add the vector above to the main diagonal. If the DataContainer x is also flattened, we have a \(M*N\) vector. Now, matrix-vector multiplcation is allowed and results to a \((M*N,1)\) vector. After reshaping we recover a \(M\times N\) DataContainer.

Parameters
  • diagonal (DataContainer) – DataContainer with the same dimensions as the data to be operated on.

  • domain_geometry (ImageGeometry) – Specifies the geometry of the operator domain. If ‘None’ will use the diagonal geometry directly. default=None .

__init__(diagonal, domain_geometry=None)[source]#

Initialize self. See help(type(self)) for accurate signature.

direct(x, out=None)[source]#

Returns \(D\circ x\)

adjoint(x, out=None)[source]#

Returns \(D\circ x\)

calculate_norm(**kwargs)[source]#

Returns the operator norm of DiagonalOperator which is the \(\infty\) norm of diagonal

\[\|D\|_{\infty} = \max_{i}\{|D_{i}|\}\]
class cil.optimisation.operators.ChannelwiseOperator(op, channels, dimension='prepend')[source]#

ChannelwiseOperator: takes in a single-channel operator op and the number of channels to be used, and creates a new multi-channel ChannelwiseOperator, which will apply the operator op independently on each channel for the number of channels specified.

ChannelwiseOperator supports simple operators as input but not BlockOperators. Typically if such behaviour is desired, it can be achieved by creating instead a BlockOperator of ChannelwiseOperators.

param op

Single-channel operator

param channels

Number of channels

param dimension

‘prepend’ (default) or ‘append’ channel dimension onto existing dimensions

__init__(op, channels, dimension='prepend')[source]#

Initialize self. See help(type(self)) for accurate signature.

direct(x, out=None)[source]#

Returns D(x)

adjoint(x, out=None)[source]#

Returns D^{*}(y)

calculate_norm(**kwargs)[source]#

Evaluates operator norm of DiagonalOperator

Trivial operators#

Trivial operators are the following.

class cil.optimisation.operators.IdentityOperator(domain_geometry, range_geometry=None)[source]#

IdentityOperator: Id: X -> Y, Id(x) = xin Y

X : gm_domain Y : gm_range ( Default: Y = X )

__init__(domain_geometry, range_geometry=None)[source]#

Initialize self. See help(type(self)) for accurate signature.

direct(x, out=None)[source]#

Returns Id(x)

adjoint(x, out=None)[source]#

Returns Id(x)

calculate_norm(**kwargs)[source]#

Evaluates operator norm of IdentityOperator

class cil.optimisation.operators.ZeroOperator(domain_geometry, range_geometry=None)[source]#

ZeroOperator: O: X -> Y, maps any element of \(x\in X\) into the zero element \(\in Y, O(x) = O_{Y}\)

Parameters
  • gm_domain – domain of the operator

  • gm_range – range of the operator, default: same as domain

Note:

\[ \begin{align}\begin{aligned}O^{*}: Y^{*} -> X^{*} \text{(Adjoint)}\\< O(x), y > = < x, O^{*}(y) >\end{aligned}\end{align} \]
__init__(domain_geometry, range_geometry=None)[source]#

Initialize self. See help(type(self)) for accurate signature.

direct(x, out=None)[source]#

Returns O(x)

adjoint(x, out=None)[source]#

Returns O^{*}(y)

calculate_norm(**kwargs)[source]#

Evaluates operator norm of ZeroOperator

class cil.optimisation.operators.MatrixOperator(A)[source]#

Matrix wrapped into a LinearOperator

Param

a numpy matrix

__init__(A)[source]#

creator

Parameters

A – numpy ndarray representing a matrix

direct(x, out=None)[source]#

Returns the application of the Operator on x

adjoint(x, out=None)[source]#

returns the adjoint/inverse operation

only available to linear operators

class cil.optimisation.operators.MaskOperator(mask, domain_geometry=None)[source]#
Parameters
  • mask (DataContainer) – Boolean array with the same dimensions as the data to be operated on.

  • domain_geometry (ImageGeometry) – Specifies the geometry of the operator domain. If ‘None’ will use the mask geometry size and spacing as float32. default = None .

__init__(mask, domain_geometry=None)[source]#

Initialize self. See help(type(self)) for accurate signature.

GradientOperator#

class cil.optimisation.operators.GradientOperator(domain_geometry, method='forward', bnd_cond='Neumann', **kwargs)[source]#

Gradient Operator: Computes first-order forward/backward differences on 2D, 3D, 4D ImageData under Neumann/Periodic boundary conditions

Parameters
  • domain_geometry (ImageGeometry) – Set up the domain of the function

  • method (str, default 'forward') – Accepts: ‘forward’, ‘backward’, ‘centered’, note C++ optimised routine only works with ‘forward’

  • bnd_cond (str, default, 'Neumann') – Set the boundary conditions to use ‘Neumann’ or ‘Periodic’

  • **kwargs

    correlation: str, default ‘Space’

    ’Space’ will compute the gradient on only the spatial dimensions, ‘SpaceChannels’ will include the channel dimension direction

    backend: str, default ‘c’

    ’c’ or ‘numpy’, defaults to ‘c’ if correlation is ‘SpaceChannels’ or channels = 1

    num_threads: int

    If backend is ‘c’ specify the number of threads to use. Default is number of cpus/2

    split: boolean

    If ‘True’, and backend ‘c’ will return a BlockDataContainer with grouped spatial domains. i.e. [Channel, [Z, Y, X]], otherwise [Channel, Z, Y, X]

Returns

a BlockDataContainer containing images of the derivatives order given by dimension_labels i.e. [‘horizontal_y’,’horizontal_x’] will return [d(‘horizontal_y’), d(‘horizontal_x’)]

Return type

BlockDataContainer

Example

2D example

\begin{eqnarray} \nabla : X \rightarrow Y\\ u \in X, \nabla(u) &=& [\partial_{y} u, \partial_{x} u]\\ u^{*} \in Y, \nabla^{*}(u^{*}) &=& \partial_{y} v1 + \partial_{x} v2 \end{eqnarray}
direct(x, out=None)[source]#

Computes the first-order forward differences

Parameters
Returns

result data if out not specified

Return type

BlockDataContainer

adjoint(x, out=None)[source]#

Computes the first-order backward differences

Parameters
  • x (BlockDataContainer) – Gradient images for each dimension in ImageGeometry domain

  • out (ImageData, optional) – pre-allocated output memory to store result

Returns

result data if out not specified

Return type

ImageData

calculate_norm()[source]#

Returns the analytical norm of the GradientOperator.

\[\begin{split}(\partial_{z}, \partial_{y}, \partial_{x}) &= \sqrt{\|\partial_{z}\|^{2} + \|\partial_{y}\|^{2} + \|\partial_{x}\|^{2} } \\ &= \sqrt{ \frac{4}{h_{z}^{2}} + \frac{4}{h_{y}^{2}} + \frac{4}{h_{x}^{2}}}\end{split}\]

Where the voxel sizes in each dimension are equal to 1 this simplifies to:

  • 2D geometries \(norm = \sqrt{8}\)

  • 3D geometries \(norm = \sqrt{12}\)

class cil.optimisation.operators.FiniteDifferenceOperator(domain_geometry, range_geometry=None, direction=None, method='forward', bnd_cond='Neumann')[source]#

Computes forward/backward/centered finite differences of a DataContainer under Neumann/Periodic boundary conditions

Parameters
  • domain_geometry – Domain geometry for the FiniteDifferenceOperator

  • direction (string label from domain geometry or integer number) – Direction to evaluate finite differences

  • method ('forward', 'backward', 'centered') – Method for finite differences

  • bnd_cond – ‘Neumann’, ‘Periodic’

direct(x, out=None)[source]#

Returns the application of the Operator on x

adjoint(x, out=None)[source]#

returns the adjoint/inverse operation

only available to linear operators

class cil.optimisation.operators.SparseFiniteDifferenceOperator(domain_geometry, range_geometry=None, direction=0, bnd_cond='Neumann')[source]#

Create Sparse Matrices for the Finite Difference Operator

direct(x)[source]#

Returns the application of the Operator on x

class cil.optimisation.operators.SymmetrisedGradientOperator(domain_geometry, bnd_cond='Neumann', **kwargs)[source]#

Symmetrized Gradient Operator: E: V -> W

V : range of the Gradient Operator W : range of the Symmetrized Gradient

Example (2D):

\[ \begin{align}\begin{aligned}\begin{split}v = (v1, v2) \\\end{split}\\\begin{split}Ev = 0.5 * ( \nabla\cdot v + (\nabla\cdot c)^{T} ) \\\end{split}\\\begin{split}\begin{matrix} \partial_{y} v1 & 0.5 * (\partial_{x} v1 + \partial_{y} v2) \\ 0.5 * (\partial_{x} v1 + \partial_{y} v2) & \partial_{x} v2 \end{matrix}\end{split}\end{aligned}\end{align} \]
direct(x, out=None)[source]#

Returns E(v)

adjoint(x, out=None)[source]#

returns the adjoint/inverse operation

only available to linear operators

Functions#

A Function represents a mathematical function of one or more inputs and is intended to accept DataContainers as input as well as any additional parameters.

Fixed parameters can be passed in during the creation of the function object. The methods of the function reflect the properties of it, for example, if the function represented is differentiable the function should contain a method gradient which should return the gradient of the function evaluated at an input point. If the function is not differentiable but allows a simple proximal operator, the method proximal should return the proximal operator evaluated at an input point. The function value is evaluated by calling the function itself, e.g. f(x) for a Function f and input point x.

Base classes#

class cil.optimisation.functions.Function(L=None)[source]#

Abstract class representing a function

Parameters
  • L (number, positive, default None) – Lipschitz constant of the gradient of the function F(x), when it is differentiable.

  • domain – The domain of the function.

Lipschitz of the gradient of the function; it is a positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__init__(L=None)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the function F at x: \(F(x)\)

gradient(x, out=None)[source]#

Returns the value of the gradient of function F at x, if it is differentiable

\[F'(x)\]
proximal(x, tau, out=None)[source]#

Returns the proximal operator of function \(\tau F\) at x .. math:: mathrm{prox}_{tau F}(x) = underset{z}{mathrm{argmin}} frac{1}{2}|z - x|^{2} + tau F(z)

convex_conjugate(x)[source]#

Returns the convex conjugate of function \(F\) at \(x^{*}\),

\[F^{*}(x^{*}) = \underset{x^{*}}{\sup} <x^{*}, x> - F(x)\]
proximal_conjugate(x, tau, out=None)[source]#

Returns the proximal operator of the convex conjugate of function \(\tau F\) at \(x^{*}\)

\[\mathrm{prox}_{\tau F^{*}}(x^{*}) = \underset{z^{*}}{\mathrm{argmin}} \frac{1}{2}\|z^{*} - x^{*}\|^{2} + \tau F^{*}(z^{*})\]

Due to Moreau’s identity, we have an analytic formula to compute the proximal operator of the convex conjugate \(F^{*}\)

\[\mathrm{prox}_{\tau F^{*}}(x) = x - \tau\mathrm{prox}_{\tau^{-1} F}(\tau^{-1}x)\]
__add__(other)[source]#

Returns the sum of the functions.

Cases: a) the sum of two functions \((F_{1}+F_{2})(x) = F_{1}(x) + F_{2}(x)\)
  1. the sum of a function with a scalar \((F_{1}+scalar)(x) = F_{1}(x) + scalar\)

__radd__(other)[source]#

Making addition commutative.

__sub__(other)[source]#

Returns the subtraction of the functions.

__rmul__(scalar)[source]#

Returns a function multiplied by a scalar.

centered_at(center)[source]#

Returns a translated function, namely if we have a function \(F(x)\) the center is at the origin. TranslateFunction is \(F(x - b)\) and the center is at point b.

property L#

Lipschitz of the gradient of function f.

L is positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__weakref__#

list of weak references to the object (if defined)

class cil.optimisation.functions.SumFunction(*functions)[source]#

SumFunction represents the sum of \(n\geq2\) functions

\[(F_{1} + F_{2} + ... + F_{n})(\cdot) = F_{1}(\cdot) + F_{2}(\cdot) + ... + F_{n}(\cdot)\]
Parameters

*functions (Functions) – Functions to set up a SumFunction

Raises

ValueError – If the number of function is strictly less than 2.

Examples

\[F(x) = \|x\|^{2} + \frac{1}{2}\|x - 1\|^{2}\]
>>> from cil.optimisation.functions import L2NormSquared
>>> from cil.framework import ImageGeometry
>>> f1 = L2NormSquared()
>>> f2 = 0.5 * L2NormSquared(b = ig.allocate(1))
>>> F = SumFunction(f1, f2)
\[F(x) = \sum_{i=1}^{50} \|x - i\|^{2}\]
>>> F = SumFunction(*[L2NormSquared(b=i) for i in range(50)])
__init__(*functions)[source]#

Initialize self. See help(type(self)) for accurate signature.

property L#

Returns the Lipschitz constant for the SumFunction

\[L = \sum_{i} L_{i}\]

where \(L_{i}\) is the Lipschitz constant of the smooth function \(F_{i}\).

property Lmax#

Returns the maximum Lipschitz constant for the SumFunction

\[L = \max_{i}\{L_{i}\}\]

where \(L_{i}\) is the Lipschitz constant of the smooth function \(F_{i}\).

__call__(x)[source]#

Returns the value of the sum of functions at \(x\).

\[(F_{1} + F_{2} + ... + F_{n})(x) = F_{1}(x) + F_{2}(x) + ... + F_{n}(x)\]
gradient(x, out=None)[source]#

Returns the value of the sum of the gradient of functions at \(x\), if all of them are differentiable.

\[(F'_{1} + F'_{2} + ... + F'_{n})(x) = F'_{1}(x) + F'_{2}(x) + ... + F'_{n}(x)\]
__add__(other)[source]#

Addition for the SumFunction.

  • SumFunction + SumFunction is a SumFunction.

  • SumFunction + Function is a SumFunction.

class cil.optimisation.functions.ScaledFunction(function, scalar)[source]#

ScaledFunction represents the scalar multiplication with a Function.

Let a function F then and a scalar \(\alpha\).

If \(G(x) = \alpha F(x)\) then:

  1. \(G(x) = \alpha F(x)\) ( __call__ method )

  2. \(G'(x) = \alpha F'(x)\) ( gradient method )

  3. \(G^{*}(x^{*}) = \alpha F^{*}(\frac{x^{*}}{\alpha})\) ( convex_conjugate method )

  4. \(\mathrm{prox}_{\tau G}(x) = \mathrm{prox}_{(\tau\alpha) F}(x)\) ( proximal method )

__init__(function, scalar)[source]#

Initialize self. See help(type(self)) for accurate signature.

property L#

Lipschitz of the gradient of function f.

L is positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__call__(x, out=None)[source]#

Returns the value of the scaled function.

\[G(x) = \alpha F(x)\]
convex_conjugate(x)[source]#

Returns the convex conjugate of the scaled function.

\[G^{*}(x^{*}) = \alpha F^{*}(\frac{x^{*}}{\alpha})\]
gradient(x, out=None)[source]#

Returns the gradient of the scaled function.

\[G'(x) = \alpha F'(x)\]
proximal(x, tau, out=None)[source]#

Returns the proximal operator of the scaled function.

\[\mathrm{prox}_{\tau G}(x) = \mathrm{prox}_{(\tau\alpha) F}(x)\]
proximal_conjugate(x, tau, out=None)[source]#

This returns the proximal operator for the function at x, tau

class cil.optimisation.functions.SumScalarFunction(function, constant)[source]#

SumScalarFunction represents the sum a function with a scalar.

\[(F + scalar)(x) = F(x) + scalar\]

Although SumFunction has no general expressions for

  1. convex_conjugate

  2. proximal

  3. proximal_conjugate

if the second argument is a ConstantFunction then we can derive the above analytically.

__init__(function, constant)[source]#

Initialize self. See help(type(self)) for accurate signature.

convex_conjugate(x)[source]#

Returns the convex conjugate of a \((F+scalar)\)

\[(F+scalar)^{*}(x^{*}) = F^{*}(x^{*}) - scalar\]
proximal(x, tau, out=None)[source]#

Returns the proximal operator of \(F+scalar\)

\[\mathrm{prox}_{ au (F+scalar)}(x) = \mathrm{prox}_{ au F}\]
property L#

Returns the Lipschitz constant for the SumFunction

\[L = \sum_{i} L_{i}\]

where \(L_{i}\) is the Lipschitz constant of the smooth function \(F_{i}\).

class cil.optimisation.functions.TranslateFunction(function, center)[source]#

TranslateFunction represents the translation of function F with respect to the center b.

Let a function F and consider \(G(x) = F(x - center)\).

Function F is centered at 0, whereas G is centered at point b.

If \(G(x) = F(x - b)\) then:

  1. \(G(x) = F(x - b)\) ( __call__ method )

  2. \(G'(x) = F'(x - b)\) ( gradient method )

  3. \(G^{*}(x^{*}) = F^{*}(x^{*}) + <x^{*}, b >\) ( convex_conjugate method )

  4. \(\mathrm{prox}_{\tau G}(x) = \mathrm{prox}_{\tau F}(x - b) + b\) ( proximal method )

__init__(function, center)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the translated function.

\[G(x) = F(x - b)\]
gradient(x, out=None)[source]#

Returns the gradient of the translated function.

\[G'(x) = F'(x - b)\]
proximal(x, tau, out=None)[source]#

Returns the proximal operator of the translated function.

\[\mathrm{prox}_{\tau G}(x) = \mathrm{prox}_{\tau F}(x-b) + b\]
convex_conjugate(x)[source]#

Returns the convex conjugate of the translated function.

\[G^{*}(x^{*}) = F^{*}(x^{*}) + <x^{*}, b >\]

Simple functions#

class cil.optimisation.functions.ConstantFunction(constant=0)[source]#

ConstantFunction: \(F(x) = constant, constant\in\mathbb{R}\)

__init__(constant=0)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the function, \(F(x) = constant\)

gradient(x, out=None)[source]#

Returns the value of the gradient of the function, \(F'(x)=0\)

convex_conjugate(x)[source]#

The convex conjugate of constant function \(F(x) = c\in\mathbb{R}\) is

\[\begin{split}F(x^{*}) = \begin{cases} -c, & if x^{*} = 0\\ \infty, & \mbox{otherwise} \end{cases}\end{split}\]

However, \(x^{*} = 0\) only in the limit of iterations, so in fact this can be infinity. We do not want to have inf values in the convex conjugate, so we have to penalise this value accordingly. The following penalisation is useful in the PDHG algorithm, when we compute primal & dual objectives for convergence purposes.

\[F^{*}(x^{*}) = \sum \max\{x^{*}, 0\}\]
proximal(x, tau, out=None)[source]#

Returns the proximal operator of the constant function, which is the same element, i.e.,

\[\mathrm{prox}_{ au F}(x) = x\]
property L#

Lipschitz of the gradient of function f.

L is positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__rmul__(other)[source]#

defines the right multiplication with a number

class cil.optimisation.functions.ZeroFunction[source]#

ZeroFunction represents the zero function, \(F(x) = 0\)

__init__()[source]#

Initialize self. See help(type(self)) for accurate signature.

class cil.optimisation.functions.Rosenbrock(alpha, beta)[source]#

Rosenbrock function

\[\]

F(x,y) = (alpha - x)^2 + beta(y-x^2)^2

The function has a global minimum at .. math:: (x,y)=(alpha, alpha^2)

__init__(alpha, beta)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the function F at x: \(F(x)\)

gradient(x, out=None)[source]#

Gradient of the Rosenbrock function

\[\]

nabla f(x,y) = left[ 2*((x-alpha) - 2beta x(y-x^2)) ; 2beta (y - x^2) right]

Composition of operator and a function#

This class allows the user to write a function which does the following:

\[F ( x ) = G ( Ax )\]

where \(A\) is an operator. For instance the least squares function l2norm_ Norm2Sq can be expressed as

\[F(x) = || Ax - b ||^2_2\]
class cil.optimisation.functions.OperatorCompositionFunction(function, operator)[source]#

Composition of a function with an operator as : \((F \otimes A)(x) = F(Ax)\)

parameter function

Function F

parameter operator

Operator A

For general operator, we have no explicit formulas for convex_conjugate, proximal and proximal_conjugate

__init__(function, operator)[source]#

creator

Parameters
  • A (Operator) – operator

  • f (Function) – function

property L#

Lipschitz of the gradient of function f.

L is positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__call__(x)[source]#

Returns \(F(Ax)\)

gradient(x, out=None)[source]#

Return the gradient of F(Ax),

..math :: (F(Ax))’ = A^{T}F’(Ax)

Indicator box#

class cil.optimisation.functions.IndicatorBox(lower=None, upper=None, accelerated=True)[source]#

Indicator function for box constraint

\[\begin{split}f(x) = \mathbb{I}_{[a, b]} = \begin{cases} 0, \text{ if } x \in [a, b] \\ \infty, \text{otherwise} \end{cases}\end{split}\]
Parameters
  • lower (float, DataContainer or numpy array, default None) – Lower bound. If set to None, it is equivalent to -np.inf.

  • upper (float, DataContainer or numpy array, default None) – Upper bound. If set to None, it is equivalent to np.inf.

  • accelerated (bool, default True) – Specifies whether to use the accelerated version or not, using numba or numpy backends respectively.

If lower or upper are passed a DataContainer (or derived class such as ImageData or AcquisitionData) or a numpy array, the bounds can be set to different values for each element.

In order to save computing time it is possible to suppress the evaluation of the function. This is achieved by setting suppress_evaluation to True. IndicatorBox evaluated on any input will then return 0.

If accelerated is set to True (default), the Numba backend is used. Otherwise, the Numpy backend is used. An optional parameter to set the number of threads used by Numba can be set with set_num_threads. Setting the number of threads when accelerate is set to False will not have any effect. The default number of threads is defined in the cil.utilities.multiprocessing module, and it is equivalent to half of the CPU cores available.

In order to save computing time it is possible to suppress the evaluation of the function.

ib = IndicatorBox(lower=0, upper=1)
ib.set_suppress_evaluation(True)
ib.evaluate(x) # returns 0

Set the number of threads used in accelerated mode.

num_threads = 4
ib = IndicatorBox(lower=0, upper=1)
ib.set_num_threads(num_threads)
static __new__(cls, lower=None, upper=None, accelerated=True)[source]#

Create and return a new object. See help(type) for accurate signature.

__init__(lower=None, upper=None, accelerated=True)[source]#
set_suppress_evaluation(value)[source]#

Suppresses the evaluation of the function

Parameters

value (bool) – If True, the function evaluation on any input will return 0, without calculation.

__call__(x)[source]#

Evaluates IndicatorBox at x

Parameters

x (DataContainer) –

Evaluates the IndicatorBox at x. If suppress_evaluation is True, returns 0.

proximal(x, tau, out=None)[source]#

Proximal operator of IndicatorBox at x

\[prox_{\tau * f}(x)\]
Parameters
  • x (DataContainer) – Input to the proximal operator

  • tau (float) – Step size. Notice it is ignored in IndicatorBox

  • out (DataContainer, optional) – Output of the proximal operator. If not provided, a new DataContainer is created.

Note

tau is ignored but it is in the signature of the generic Function class

gradient(x)[source]#

IndicatorBox is not differentiable, so calling gradient will raise a ValueError

property num_threads#

Get the optional number of threads parameter to use for the accelerated version.

Defaults to the value set in the CIL multiprocessing module.

set_num_threads(value)[source]#

Set the optional number of threads parameter to use for the accelerated version.

This is discarded if accelerated=False.

KullbackLeibler#

class cil.optimisation.functions.KullbackLeibler(b, eta=None, mask=None, backend='numba')[source]#

Kullback Leibler

\[\begin{split}F(u, v) = \begin{cases} u \log(\frac{u}{v}) - u + v & \mbox{ if } u > 0, v > 0\\ v & \mbox{ if } u = 0, v \ge 0 \\ \infty, & \mbox{otherwise} \end{cases}\end{split}\]

where the \(0\log0 := 0\) convention is used.

Parameters
  • b (DataContainer, non-negative) – Translates the function at point b.

  • eta (DataContainer, default = 0) – Background noise

  • mask (DataContainer, default = None) – Mask for the data b

  • backend ({'numba','numpy'}, optional) – Backend for the KullbackLeibler methods. Numba is the default backend.

Note

The Kullback-Leibler function is used in practice as a fidelity term in minimisation problems where the acquired data follow Poisson distribution. If we denote the acquired data with b then, we write

\[\underset{i}{\sum} F(b_{i}, (v + \eta)_{i})\]

where, \(\eta\) is an additional noise.

In the case of Positron Emission Tomography reconstruction \(\eta\) represents scatter and random events contribution during the PET acquisition. Hence, in that case the KullbackLeibler fidelity measures the distance between \(\mathcal{A}v + \eta\) and acquisition data \(b\), where \(\mathcal{A}\) is the projection operator. This is related to PoissonLogLikelihoodWithLinearModelForMean , definition that is used in PET reconstruction in the SIRF software.

Note

The default implementation uses the build-in function kl_div from scipy. The methods of the KullbackLeibler are accelerated provided that numba library is installed.

Examples

>>> from cil.optimisation.functions import KullbackLeibler
>>> from cil.framework import ImageGeometry
>>> ig = ImageGeometry(3,4)
>>> data = ig.allocate('random')
>>> F = KullbackLeibler(b = data)
static __new__(cls, b, eta=None, mask=None, backend='numba')[source]#

Create and return a new object. See help(type) for accurate signature.

__init__(b, eta=None, mask=None, backend='numba')[source]#

Initialize self. See help(type(self)) for accurate signature.

L1 Norm#

class cil.optimisation.functions.L1Norm(**kwargs)[source]#

L1Norm function

Consider the following cases:
  1. \[F(x) = ||x||_{1}\]
  2. \[F(x) = ||x - b||_{1}\]
__init__(**kwargs)[source]#

creator

Cases considered (with/without data): a) \(f(x) = ||x||_{1}\) b) \(f(x) = ||x - b||_{1}\)

Parameters

b (DataContainer, optional) – translation of the function

__call__(x)[source]#

Returns the value of the L1Norm function at x.

Consider the following cases:
  1. \[F(x) = ||x||_{1}\]
  2. \[F(x) = ||x - b||_{1}\]
convex_conjugate(x)[source]#

Returns the value of the convex conjugate of the L1Norm function at x. Here, we need to use the convex conjugate of L1Norm, which is the Indicator of the unit \(L^{\infty}\) norm

Consider the following cases:

  1. \[F^{*}(x^{*}) = \mathbb{I}_{\{\|\cdot\|_{\infty}\leq1\}}(x^{*})\]
  2. \[F^{*}(x^{*}) = \mathbb{I}_{\{\|\cdot\|_{\infty}\leq1\}}(x^{*}) + <x^{*},b>\]
\[\begin{split}\mathbb{I}_{\{\|\cdot\|_{\infty}\leq1\}}(x^{*}) = \begin{cases} 0, \mbox{if } \|x^{*}\|_{\infty}\leq1\\ \infty, \mbox{otherwise} \end{cases}\end{split}\]
proximal(x, tau, out=None)[source]#

Returns the value of the proximal operator of the L1Norm function at x.

Consider the following cases:

  1. \[\mathrm{prox}_{\tau F}(x) = \mathrm{ShinkOperator}(x)\]
  2. \[\mathrm{prox}_{\tau F}(x) = \mathrm{ShinkOperator}(x) + b\]

where,

\[\mathrm{prox}_{\tau F}(x) = \mathrm{ShinkOperator}(x) = sgn(x) * \max\{ |x| - \tau, 0 \}\]

Squared L2 norm squared#

class cil.optimisation.functions.L2NormSquared(**kwargs)[source]#

L2NormSquared function: \(F(x) = \| x\|^{2}_{2} = \underset{i}{\sum}x_{i}^{2}\)

Following cases are considered:

  1. \(F(x) = \|x\|^{2}_{2}\)

  2. \(F(x) = \|x - b\|^{2}_{2}\)

Note

For case b) case we can use F = L2NormSquared().centered_at(b), see TranslateFunction.

Example
>>> F = L2NormSquared()
>>> F = L2NormSquared(b=b)
>>> F = L2NormSquared().centered_at(b)
__init__(**kwargs)[source]#

creator

Cases considered (with/without data):
  1. \[f(x) = \|x\|^{2}_{2}\]
  2. \[f(x) = \|\|x - b\|\|^{2}_{2}\]
Parameters

b (DataContainer, optional) – translation of the function

__call__(x)[source]#

Returns the value of the L2NormSquared function at x.

Following cases are considered:

  1. \(F(x) = \|x\|^{2}_{2}\)

  2. \(F(x) = \|x - b\|^{2}_{2}\)

Param

\(x\)

Returns

\(\underset{i}{\sum}x_{i}^{2}\)

gradient(x, out=None)[source]#

Returns the value of the gradient of the L2NormSquared function at x.

Following cases are considered:

  1. \(F'(x) = 2x\)

  2. \(F'(x) = 2(x-b)\)

convex_conjugate(x)[source]#

Returns the value of the convex conjugate of the L2NormSquared function at x.

Consider the following cases:

  1. \[F^{*}(x^{*}) = \frac{1}{4}\|x^{*}\|^{2}_{2}\]
  2. \[F^{*}(x^{*}) = \frac{1}{4}\|x^{*}\|^{2}_{2} + <x^{*}, b>\]
proximal(x, tau, out=None)[source]#

Returns the value of the proximal operator of the L2NormSquared function at x.

Consider the following cases:

  1. \[\mathrm{prox}_{\tau F}(x) = \frac{x}{1+2\tau}\]
  2. \[\mathrm{prox}_{\tau F}(x) = \frac{x-b}{1+2\tau} + b\]
class cil.optimisation.functions.WeightedL2NormSquared(**kwargs)[source]#

WeightedL2NormSquared function: \(F(x) = \| x\|_{w}^{2}_{2} = \underset{i}{\sum}w_{i}*x_{i}^{2} = <x, w*x> = x^{T}*w*x\)

__init__(**kwargs)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the function F at x: \(F(x)\)

gradient(x, out=None)[source]#

Returns the value of the gradient of function F at x, if it is differentiable

\[F'(x)\]
convex_conjugate(x)[source]#

Returns the convex conjugate of function \(F\) at \(x^{*}\),

\[F^{*}(x^{*}) = \underset{x^{*}}{\sup} <x^{*}, x> - F(x)\]
proximal(x, tau, out=None)[source]#

Returns the proximal operator of function \(\tau F\) at x .. math:: mathrm{prox}_{tau F}(x) = underset{z}{mathrm{argmin}} frac{1}{2}|z - x|^{2} + tau F(z)

Least Squares#

class cil.optimisation.functions.LeastSquares(A, b, c=1.0, weight=None)[source]#

(Weighted) Least Squares function

\[F(x) = c\|Ax-b\|_2^2\]

or if weighted

\[F(x) = c\|Ax-b\|_{2,W}^{2}\]

A : LinearOperator

b : Data, DataContainer

c : Scaling Constant, float, default 1.0

weight: DataContainer with all positive elements of size of the range of operator A, default None

L : Lipshitz Constant of the gradient of \(F\) which is \(2 c ||A||_2^2 = 2 c s1(A)^2\), or

L : Lipshitz Constant of the gradient of \(F\) which is \(2 c ||weight|| ||A||_2^2 = 2s1(A)^2\),

where s1(A) is the largest singular value of A.

__init__(A, b, c=1.0, weight=None)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of \(F(x) = c\|Ax-b\|_2^2\) or c|Ax-b|_{2,weight}^2

gradient(x, out=None)[source]#

Returns the value of the gradient of \(F(x) = c*\|A*x-b\|_2^2\)

\[F'(x) = 2cA^T(Ax-b)\]
\[F'(x) = 2cA^T(weight(Ax-b))\]
property L#

Lipschitz of the gradient of function f.

L is positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__rmul__(other)[source]#

defines the right multiplication with a number

Mixed L21 norm#

class cil.optimisation.functions.MixedL21Norm(**kwargs)[source]#

MixedL21Norm function: \(F(x) = ||x||_{2,1} = \sum |x|_{2} = \sum \sqrt{ (x^{1})^{2} + (x^{2})^{2} + \dots}\)

where x is a BlockDataContainer, i.e., \(x=(x^{1}, x^{2}, \dots)\)

__init__(**kwargs)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the MixedL21Norm function at x.

Parameters

xBlockDataContainer

convex_conjugate(x)[source]#

Returns the value of the convex conjugate of the MixedL21Norm function at x.

This is the Indicator function of \(\mathbb{I}_{\{\|\cdot\|_{2,\infty}\leq1\}}(x^{*})\),

i.e.,

\[\begin{split}\mathbb{I}_{\{\|\cdot\|_{2, \infty}\leq1\}}(x^{*}) = \begin{cases} 0, \mbox{if } \|x\|_{2, \infty}\leq1\\ \infty, \mbox{otherwise} \end{cases}\end{split}\]

where,

\[\|x\|_{2,\infty} = \max\{ \|x\|_{2} \} = \max\{ \sqrt{ (x^{1})^{2} + (x^{2})^{2} + \dots}\}\]
proximal(x, tau, out=None)[source]#

Returns the value of the proximal operator of the MixedL21Norm function at x.

\[\mathrm{prox}_{\tau F}(x) = \frac{x}{\|x\|_{2}}\max\{ \|x\|_{2} - \tau, 0 \}\]

where the convention 0 · (0/0) = 0 is used.

Smooth Mixed L21 norm#

class cil.optimisation.functions.SmoothMixedL21Norm(epsilon)[source]#

SmoothMixedL21Norm function: \(F(x) = ||x||_{2,1} = \sum |x|_{2} = \sum \sqrt{ (x^{1})^{2} + (x^{2})^{2} + \epsilon^2 + \dots}\)

where x is a BlockDataContainer, i.e., \(x=(x^{1}, x^{2}, \dots)\)

Conjugate, proximal and proximal conjugate methods no closed-form solution

__init__(epsilon)[source]#
Parameters

epsilon – smoothing parameter making MixedL21Norm differentiable

__call__(x)[source]#

Returns the value of the SmoothMixedL21Norm function at x.

gradient(x, out=None)[source]#

Returns the value of the gradient of the SmoothMixedL21Norm function at x.

frac{x}{|x|}

Mixed L11 norm#

class cil.optimisation.functions.MixedL11Norm(**kwargs)[source]#

MixedL11Norm function

\[F(x) = ||x||_{1,1} = \sum |x_{1}| + |x_{2}| + \cdots + |x_{n}|\]

Note

MixedL11Norm is a separable function, therefore it can also be defined using the BlockFunction.

See also

L1Norm, MixedL21Norm

__init__(**kwargs)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the MixedL11Norm function at x.

Parameters

xBlockDataContainer

proximal(x, tau, out=None)[source]#

Returns the value of the proximal operator of the MixedL11Norm function at x.

\[\mathrm{prox}_{\tau F}(x) = \mathrm{ShinkOperator}(x)\]

where,

\[\mathrm{prox}_{\tau F}(x) = \mathrm{ShinkOperator}(x) := sgn(x) * \max\{ |x| - \tau, 0 \}\]

Total variation#

class cil.optimisation.functions.TotalVariation(max_iteration=10, tolerance=None, correlation='Space', backend='c', lower=None, upper=None, isotropic=True, split=False, info=False, strong_convexity_constant=0, warm_start=True)[source]#

Total variation Function

\[\mathrm{TV}(u) := \|\nabla u\|_{2,1} = \sum \|\nabla u\|_{2},\, (\mbox{isotropic})\]
\[\mathrm{TV}(u) := \|\nabla u\|_{1,1} = \sum \|\nabla u\|_{1}\, (\mbox{anisotropic})\]

Notes

The TotalVariation (TV) Function acts as a composite function, i.e., the composition of the MixedL21Norm function and the GradientOperator operator,

\[f(u) = \|u\|_{2,1}, \Rightarrow (f\circ\nabla)(u) = f(\nabla x) = \mathrm{TV}(u)\]

In that case, the proximal operator of TV does not have an exact solution and we use an iterative algorithm to solve:

\[\mathrm{prox}_{\tau \mathrm{TV}}(b) := \underset{u}{\mathrm{argmin}} \frac{1}{2\tau}\|u - b\|^{2} + \mathrm{TV}(u)\]

The algorithm used for the proximal operator of TV is the Fast Gradient Projection algorithm (or FISTA) applied to the _dual problem_ of the above problem, see [1], [2], [9].

See also “Multicontrast MRI Reconstruction with Structure-Guided Total Variation”, Ehrhardt, Betcke, 2016.

Parameters
  • max_iteration (int, default = 5) – Maximum number of iterations for the FGP algorithm to solve to solve the dual problem of the Total Variation Denoising problem (ROF). If warm_start=False, this should be around 100, or larger, with a set tolerance.

  • tolerance (float, default = None) –

    Stopping criterion for the FGP algorithm used to to solve the dual problem of the Total Variation Denoising problem (ROF). If the difference between iterates in the FGP algorithm is less than the tolerance the iterations end before the max_iteration number.

    \[\|x^{k+1} - x^{k}\|_{2} < \mathrm{tolerance}\]

  • correlation (str, default = Space) – Correlation between Space and/or SpaceChannels for the GradientOperator.

  • backend (str, default = c) – Backend to compute the GradientOperator

  • lower ('float, default = None) – A constraint is enforced using the IndicatorBox function, e.g., IndicatorBox(lower, upper).

  • upper ('float, default = None) – A constraint is enforced using the IndicatorBox function, e.g., IndicatorBox(lower, upper).

  • isotropic (boolean, default = True) –

    Use either isotropic or anisotropic definition of TV.

    \[|x|_{2} = \sqrt{x_{1}^{2} + x_{2}^{2}},\, (\mbox{isotropic})\]
    \[|x|_{1} = |x_{1}| + |x_{2}|\, (\mbox{anisotropic})\]

  • split (boolean, default = False) – Splits the Gradient into spatial gradient and spectral or temporal gradient for multichannel data.

  • info (boolean, default = False) – Information is printed for the stopping criterion of the FGP algorithm used to solve the dual problem of the Total Variation Denoising problem (ROF).

  • strong_convexity_constant (float, default = 0) –

    A strongly convex term weighted by the strong_convexity_constant (\(\gamma\)) parameter is added to the Total variation. Now the TotalVariation function is \(\gamma\) - strongly convex and the proximal operator is

    \[\underset{u}{\mathrm{argmin}} \frac{1}{2\tau}\|u - b\|^{2} + \mathrm{TV}(u) + \frac{\gamma}{2}\|u\|^{2} \Leftrightarrow\]
    \[\underset{u}{\mathrm{argmin}} \frac{1}{2\frac{\tau}{1+\gamma\tau}}\|u - \frac{b}{1+\gamma\tau}\|^{2} + \mathrm{TV}(u)\]

  • warm_start (boolean, default = True) – If set to true, the FGP algorithm used to solve the dual problem of the Total Variation Denoising problem (ROF) is initiated by the final value from the previous iteration and not at zero. This allows the max_iteration value to be reduced to 5-10 iterations.

Note

With warm_start set to the default, True, the TV function will keep in memory the range of the gradient of the image to be denoised, i.e. N times the dimensionality of the image. This increases the memory requirements. However, during the evaluation of proximal the memory requirements will be unchanged as the same amount of memory will need to be allocated and deallocated.

Note

In the case where the Total variation becomes a \(\gamma\) - strongly convex function, i.e.,

\[\mathrm{TV}(u) + \frac{\gamma}{2}\|u\|^{2}\]

\(\gamma\) should be relatively small, so as the second term above will not act as an additional regulariser. For more information, see [8], [3].

Examples

\[\underset{u}{\mathrm{argmin}} \frac{1}{2}\|u - b\|^{2} + \alpha\|\nabla u\|_{2,1}\]
>>> alpha = 2.0
>>> TV = TotalVariation()
>>> sol = TV.proximal(b, tau = alpha)

Examples

\[\underset{u}{\mathrm{argmin}} \frac{1}{2}\|u - b\|^{2} + \alpha\|\nabla u\|_{1,1} + \mathbb{I}_{C}(u)\]

where \(C = \{1.0\leq u\leq 2.0\}\).

>>> alpha = 2.0
>>> TV = TotalVariation(isotropic=False, lower=1.0, upper=2.0)
>>> sol = TV.proximal(b, tau = alpha)

Examples

\[\underset{u}{\mathrm{argmin}} \frac{1}{2}\|u - b\|^{2} + (\alpha\|\nabla u\|_{2,1} + \frac{\gamma}{2}\|u\|^{2})\]
>>> alpha = 2.0
>>> gamma = 1e-3
>>> TV = alpha * TotalVariation(isotropic=False, strong_convexity_constant=gamma)
>>> sol = TV.proximal(b, tau = 1.0)
__init__(max_iteration=10, tolerance=None, correlation='Space', backend='c', lower=None, upper=None, isotropic=True, split=False, info=False, strong_convexity_constant=0, warm_start=True)[source]#

Initialize self. See help(type(self)) for accurate signature.

__call__(x)[source]#

Returns the value of the TotalVariation function at x .

proximal(x, tau, out=None)[source]#

Returns the proximal operator of the TotalVariation function at x .

convex_conjugate(x)[source]#

Returns the value of convex conjugate of the TotalVariation function at x .

calculate_Lipschitz()[source]#

Default value for the Lipschitz constant.

property gradient#

GradientOperator is created if it is not instantiated yet. The domain of the _gradient, is created in the __call__ and proximal methods.

__rmul__(scalar)[source]#

Returns a function multiplied by a scalar.

Block Framework#

To be able to express more advanced optimisation problems we developed the Block Framework, which provides a generic strategy to treat variational problems in the following form:

\[\min \text{Regulariser} + \text{Fidelity}\]

The block framework consists of:

The block framework allows writing more advanced `optimisation problems`_. Consider the typical Tikhonov regularisation:

\[\underset{u}{\mathrm{argmin}}\begin{Vmatrix}A u - b \end{Vmatrix}^2_2 + \alpha^2\|Lu\|^2_2\]

where,

  • \(A\) is the projection operator

  • \(b\) is the acquired data

  • \(u\) is the unknown image to be solved for

  • \(\alpha\) is the regularisation parameter

  • \(L\) is a regularisation operator

The first term measures the fidelity of the solution to the data. The second term measures the fidelity to the prior knowledge we have imposed on the system, operator \(L\).

This can be re-written equivalently in the block matrix form:

\[\underset{u}{\mathrm{argmin}}\begin{Vmatrix}\binom{A}{\alpha L} u - \binom{b}{0}\end{Vmatrix}^2_2\]

With the definitions:

  • \(\tilde{A} = \binom{A}{\alpha L}\)

  • \(\tilde{b} = \binom{b}{0}\)

this can now be recognised as a least squares problem which can be solved by any algorithm in the cil.optimisation which can solve least squares problem, e.g. CGLS.

\[\underset{u}{\mathrm{argmin}}\begin{Vmatrix}\tilde{A} u - \tilde{b}\end{Vmatrix}^2_2\]

To be able to express our optimisation problems in the matrix form above, we developed the so-called, Block Framework comprising 4 main actors: BlockGeometry, BlockDataContainer, BlockFunction and BlockOperator.

BlockDataContainer#

BlockDataContainer holds `DataContainer`_ as column vector. It is possible to do basic algebra between BlockDataContainer s and with numbers, list or numpy arrays.

\[ \begin{align}\begin{aligned}x = [x_{1}, x_{2} ]\in (X_{1}\times X_{2})\\y = [y_{1}, y_{2}, y_{3} ]\in(Y_{1}\times Y_{2} \times Y_{3})\end{aligned}\end{align} \]
class cil.framework.BlockDataContainer(*args, **kwargs)[source]#

Class to hold DataContainers as column vector

Provides basic algebra between BlockDataContainer’s, DataContainer’s and subclasses and Numbers

  1. algebra between `BlockDataContainer`s will be element-wise, only if the shape of the 2 `BlockDataContainer`s is the same, otherwise it will fail

  2. algebra between BlockDataContainer`s and `list or numpy array will work as long as the number of rows and element of the arrays match, independently on the fact that the BlockDataContainer could be nested

  3. algebra between BlockDataContainer and one DataContainer is possible. It will require all the DataContainers in the block to be compatible with the DataContainer we want to operate with.

  4. algebra between BlockDataContainer and a Number is possible and it will be done with each element of the BlockDataContainer even if nested

A = [ [B,C] , D] A * 3 = [ 3 * [B,C] , 3* D] = [ [ 3*B, 3*C] , 3*D ]

__iter__()[source]#

BlockDataContainer is Iterable

next()[source]#

python2 backwards compatibility

is_compatible(other)[source]#

basic check if the size of the 2 objects fit

add(other, *args, **kwargs)[source]#

Algebra: add method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer

Param

out (optional): provides a placehold for the resul.

subtract(other, *args, **kwargs)[source]#

Algebra: subtract method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer

Param

out (optional): provides a placeholder for the result.

multiply(other, *args, **kwargs)[source]#

Algebra: multiply method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer)

Param

out (optional): provides a placeholder for the result.

divide(other, *args, **kwargs)[source]#

Algebra: divide method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer)

Param

out (optional): provides a placeholder for the result.

power(other, *args, **kwargs)[source]#

Algebra: power method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer

Param

out (optional): provides a placeholder for the result.

maximum(other, *args, **kwargs)[source]#

Algebra: power method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer)

Param

out (optional): provides a placeholder for the result.

minimum(other, *args, **kwargs)[source]#

Algebra: power method of BlockDataContainer with number/DataContainer or BlockDataContainer

Param

other (number, DataContainer or subclasses or BlockDataContainer)

Param

out (optional): provides a placeholder for the result.

sapyb(a, y, b, out, num_threads=1)[source]#

performs axpby element-wise on the BlockDataContainer containers

Does the operation .. math:: a*x+b*y and stores the result in out, where x is self

Parameters
  • a – scalar

  • b – scalar

  • y – compatible (Block)DataContainer

  • out – (Block)DataContainer to store the result

>>> a = 2
>>> b = 3
>>> ig = ImageGeometry(10,11)
>>> x = ig.allocate(1)
>>> y = ig.allocate(2)
>>> bdc1 = BlockDataContainer(2*x, y)
>>> bdc2 = BlockDataContainer(x, 2*y)
>>> out = bdc1.sapyb(a,bdc2,b)
axpby(a, b, y, out, dtype=<class 'numpy.float32'>, num_threads=1)[source]#

Deprecated method. Alias of sapyb

binary_operations(operation, other, *args, **kwargs)[source]#

Algebra: generic method of algebric operation with BlockDataContainer with number/DataContainer or BlockDataContainer

Provides commutativity with DataContainer and subclasses, i.e. this class’s reverse algebraic methods take precedence w.r.t. direct algebraic methods of DataContainer and subclasses.

This method is not to be used directly

unary_operations(operation, *args, **kwargs)[source]#

Unary operation on BlockDataContainer:

generic method of unary operation with BlockDataContainer: abs, sign, sqrt and conjugate

This method is not to be used directly

copy()[source]#

alias of clone

__radd__(other)[source]#

Reverse addition

to make sure that this method is called rather than the __mul__ of a numpy array the class constant __array_priority__ must be set > 0 https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.classes.html#numpy.class.__array_priority__

__rsub__(other)[source]#

Reverse subtraction

to make sure that this method is called rather than the __mul__ of a numpy array the class constant __array_priority__ must be set > 0 https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.classes.html#numpy.class.__array_priority__

__rmul__(other)[source]#

Reverse multiplication

to make sure that this method is called rather than the __mul__ of a numpy array the class constant __array_priority__ must be set > 0 https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.classes.html#numpy.class.__array_priority__

__rdiv__(other)[source]#

Reverse division

to make sure that this method is called rather than the __mul__ of a numpy array the class constant __array_priority__ must be set > 0 https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.classes.html#numpy.class.__array_priority__

__rtruediv__(other)[source]#

Reverse truedivision

to make sure that this method is called rather than the __mul__ of a numpy array the class constant __array_priority__ must be set > 0 https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.classes.html#numpy.class.__array_priority__

__rpow__(other)[source]#

Reverse power

to make sure that this method is called rather than the __mul__ of a numpy array the class constant __array_priority__ must be set > 0 https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.classes.html#numpy.class.__array_priority__

__iadd__(other)[source]#

Inline addition

__isub__(other)[source]#

Inline subtraction

__imul__(other)[source]#

Inline multiplication

__idiv__(other)[source]#

Inline division

__itruediv__(other)[source]#

Inline truedivision

__neg__()[source]#

Return - self

__weakref__#

list of weak references to the object (if defined)

Block Function#

BlockFunction acts on BlockDataContainer as a separable sum function:

\[ \begin{align}\begin{aligned}f = [f_1,...,f_n] \newline\\f([x_1,...,x_n]) = f_1(x_1) + .... + f_n(x_n)\end{aligned}\end{align} \]
\[ \begin{align}\begin{aligned}\begin{split}Y = \begin{bmatrix} y_{1}\\ y_{2}\\ y_{3}\\ \end{bmatrix}, \quad F = [ f_{1}, f_{2}, f_{3} ]\end{split}\\F(Y) : = f_{1}(y_{1}) + f_{2}(y_{2}) + f_{3}(y_{3})\end{aligned}\end{align} \]
class cil.optimisation.functions.BlockFunction(*functions)[source]#

BlockFunction represents a separable sum function \(F\) defined as

\[F:X_{1}\times X_{2}\cdots\times X_{m} \rightarrow (-\infty, \infty]\]

where \(F\) is the separable sum of functions \((f_{i})_{i=1}^{m}\),

\[F(x_{1}, x_{2}, \cdots, x_{m}) = \overset{m}{\underset{i=1}{\sum}}f_{i}(x_{i}), \mbox{ with } f_{i}: X_{i} \rightarrow (-\infty, \infty].\]

A nice property (due to it’s separability structure) is that the proximal operator can be decomposed along the proximal operators of each function \(f_{i}\).

\[\mathrm{prox}_{\tau F}(x) = ( \mathrm{prox}_{\tau f_{i}}(x_{i}) )_{i=1}^{m}\]

In addition, if \(\tau := (\tau_{1},\dots,\tau_{m})\), then

\[\mathrm{prox}_{\tau F}(x) = ( \mathrm{prox}_{\tau_{i} f_{i}}(x_{i}) )_{i=1}^{m}\]
__init__(*functions)[source]#

Initialize self. See help(type(self)) for accurate signature.

property L#

Lipschitz of the gradient of function f.

L is positive real number, such that |f'(x) - f'(y)| <= L ||x-y||, assuming f: IG –> R

__call__(x)[source]#

Returns the value of the BlockFunction \(F\)

\[F(x) = \overset{m}{\underset{i=1}{\sum}}f_{i}(x_{i}), \mbox{ where } x = (x_{1}, x_{2}, \cdots, x_{m}), \quad i = 1,2,\dots,m\]

Parameter:

x : BlockDataContainer and must have as many rows as self.length

returns ..math:: sum(f_i(x_i))

convex_conjugate(x)[source]#

Returns the value of the convex conjugate of the BlockFunction at \(x^{*}\).

\[F^{*}(x^{*}) = \overset{m}{\underset{i=1}{\sum}}f_{i}^{*}(x^{*}_{i})\]

Parameter:

x : BlockDataContainer and must have as many rows as self.length

proximal(x, tau, out=None)[source]#

Proximal operator of the BlockFunction at x:

\[\mathrm{prox}_{\tau F}(x) = (\mathrm{prox}_{\tau f_{i}}(x_{i}))_{i=1}^{m}\]

Parameter:

x : BlockDataContainer and must have as many rows as self.length

gradient(x, out=None)[source]#

Returns the value of the gradient of the BlockFunction function at x.

\[F'(x) = [f_{1}'(x_{1}), ... , f_{m}'(x_{m})]\]

Parameter:

x : BlockDataContainer and must have as many rows as self.length

proximal_conjugate(x, tau, out=None)[source]#

Proximal operator of the convex conjugate of BlockFunction at x:

\[\mathrm{prox}_{\tau F^{*}}(x) = (\mathrm{prox}_{\tau f^{*}_{i}}(x^{*}_{i}))_{i=1}^{m}\]

Parameter:

x : BlockDataContainer and must have as many rows as self.length

__rmul__(other)[source]#

Define multiplication with a scalar

Parameters

other – number

Returns a new BlockFunction containing the product of the scalar with all the functions in the block

Block Operator#

BlockOperator represent a block matrix with operators

\[\begin{split}K = \begin{bmatrix} A_{1} & A_{2} \\ A_{3} & A_{4} \\ A_{5} & A_{6} \end{bmatrix}_{(3,2)} * \quad \underbrace{\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}_{(2,1)}}_{\textbf{x}} = \begin{bmatrix} A_{1}x_{1} + A_{2}x_{2}\\ A_{3}x_{1} + A_{4}x_{2}\\ A_{5}x_{1} + A_{6}x_{2}\\ \end{bmatrix}_{(3,1)} = \begin{bmatrix} y_{1}\\ y_{2}\\ y_{3} \end{bmatrix}_{(3,1)} = \textbf{y}\end{split}\]

Column: Share the same domains \(X_{1}, X_{2}\)

Rows: Share the same ranges \(Y_{1}, Y_{2}, Y_{3}\)

\[K : (X_{1}\times X_{2}) \rightarrow (Y_{1}\times Y_{2} \times Y_{3})\]

\(A_{1}, A_{3}, A_{5}\): share the same domain \(X_{1}\) and \(A_{2}, A_{4}, A_{6}\): share the same domain \(X_{2}\)

\[\begin{split}A_{1}: X_{1} \rightarrow Y_{1} \\ A_{3}: X_{1} \rightarrow Y_{2} \\ A_{5}: X_{1} \rightarrow Y_{3} \\ A_{2}: X_{2} \rightarrow Y_{1} \\ A_{4}: X_{2} \rightarrow Y_{2} \\ A_{6}: X_{2} \rightarrow Y_{3}\end{split}\]

For instance with these ingredients one may write the following objective function,

\[\alpha ||\nabla u||_{2,1} + ||u - g||_2^2\]

where \(g\) represent the measured values, \(u\) the solution \(\nabla\) is the gradient operator, \(|| ~~ ||_{2,1}\) is a norm for the output of the gradient operator and \(|| x-g ||^2_2\) is least squares fidelity function as

\[ \begin{align}\begin{aligned}\begin{split}K = \begin{bmatrix} \nabla \\ \mathbb{1} \end{bmatrix}\end{split}\\F(x) = \Big[ \alpha \lVert ~x~ \rVert_{2,1} ~~ , ~~ || x - g||_2^2 \Big]\\w = [ u ]\end{aligned}\end{align} \]

Then we have rewritten the problem as

\[F(Kw) = \alpha \left\lVert \nabla u \right\rVert_{2,1} + ||u-g||^2_2\]

Which in Python would be like

op1 = GradientOperator(ig, correlation=GradientOperator.CORRELATION_SPACE)
op2 = IdentityOperator(ig, ag)

# Create BlockOperator
K = BlockOperator(op1, op2, shape=(2,1) )

# Create functions
F = BlockFunction(alpha * MixedL21Norm(), 0.5 * L2NormSquared(b=noisy_data))
class cil.optimisation.operators.BlockOperator(*args, **kwargs)[source]#

A Block matrix containing Operators

The Block Framework is a generic strategy to treat variational problems in the following form:

\[\min Regulariser + Fidelity\]

BlockOperators have a generic shape M x N, and when applied on an Nx1 BlockDataContainer, will yield and Mx1 BlockDataContainer. Notice: BlockDatacontainer are only allowed to have the shape of N x 1, with N rows and 1 column.

User may specify the shape of the block, by default is a row vector

Operators in a Block are required to have the same domain column-wise and the same range row-wise.

__init__(*args, **kwargs)[source]#

Class creator

Note

Do not include the self parameter in the Args section.

:param : param: vararg (Operator): Operators in the block. :param : param: shape (tuple, optional): If shape is passed the Operators in

vararg are considered input in a row-by-row fashion. Shape and number of Operators must match.

Example

BlockOperator(op0,op1) results in a row block BlockOperator(op0,op1,shape=(1,2)) results in a column block

column_wise_compatible()[source]#

Operators in a Block should have the same domain per column

row_wise_compatible()[source]#

Operators in a Block should have the same range per row

get_item(row, col)[source]#

returns the Operator at specified row and col

norm(**kwargs)[source]#

Returns the norm of the BlockOperator

if the operator in the block do not have method norm defined, i.e. they are SIRF AcquisitionModel’s we use PowerMethod if applicable, otherwise we raise an Error

direct(x, out=None)[source]#

Direct operation for the BlockOperator

BlockOperator work on BlockDataContainer, but they will work on DataContainers and inherited classes by simple wrapping the input in a BlockDataContainer of shape (1,1)

adjoint(x, out=None)[source]#

Adjoint operation for the BlockOperator

BlockOperator may contain both LinearOperator and Operator This method exists in BlockOperator as it is not known what type of Operator it will contain.

BlockOperator work on BlockDataContainer, but they will work on DataContainers and inherited classes by simple wrapping the input in a BlockDataContainer of shape (1,1)

Raises: ValueError if the contained Operators are not linear

is_linear()[source]#

returns whether all the elements of the BlockOperator are linear

get_output_shape(xshape, adjoint=False)[source]#

returns the shape of the output BlockDataContainer

A(N,M) direct u(M,1) -> N,1 A(N,M)^T adjoint u(N,1) -> M,1

__rmul__(scalar)[source]#

Defines the left multiplication with a scalar

Paramer scalar

(number or iterable containing numbers):

Returns: a block operator with Scaled Operators inside

property T#

Return the transposed of self

input in a row-by-row

domain_geometry()[source]#

returns the domain of the BlockOperator

If the shape of the BlockOperator is (N,1) the domain is a ImageGeometry or AcquisitionGeometry. Otherwise it is a BlockGeometry.

range_geometry()[source]#

returns the range of the BlockOperator

__getitem__(index)[source]#

returns the index-th operator in the block irrespectively of it’s shape

get_as_list()[source]#

returns the list of operators

Return Home

References#

1(1,2,3)

Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009. URL: https://doi.org/10.1137/080716542, arXiv:https://doi.org/10.1137/080716542, doi:10.1137/080716542.

2(1,2,3)

Amir Beck and Marc Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing, 18(11):2419–2434, 2009. doi:10.1109/TIP.2009.2028250.

3(1,2)

Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, May 2011. URL: https://doi.org/10.1007/s10851-010-0251-1, doi:10.1007/s10851-010-0251-1.

4

Ernie Esser, Xiaoqun Zhang, and Tony F. Chan. A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM Journal on Imaging Sciences, 3(4):1015–1046, 2010. URL: https://doi.org/10.1137/09076934X, arXiv:https://doi.org/10.1137/09076934X, doi:10.1137/09076934X.

5

J. S. Jörgensen, E. Ametova, G. Burca, G. Fardell, E. Papoutsellis, E. Pasca, K. Thielemans, M. Turner, R. Warr, W. R. B. Lionheart, and P. J. Withers. Core imaging library - part i: a versatile python framework for tomographic imaging. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 379(2204):20200192, 2021. URL: https://royalsocietypublishing.org/doi/abs/10.1098/rsta.2020.0192, arXiv:https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2020.0192, doi:10.1098/rsta.2020.0192.

6

Avinash C. Kak and Malcolm Slaney. Principles of Computerized Tomographic Imaging. Society for Industrial and Applied Mathematics, January 2001. URL: https://doi.org/10.1137/1.9780898719277, doi:10.1137/1.9780898719277.

7

Evangelos Papoutsellis, Evelina Ametova, Claire Delplancke, Gemma Fardell, Jakob S. Jörgensen, Edoardo Pasca, Martin Turner, Ryan Warr, William R. B. Lionheart, and Philip J. Withers. Core imaging library - part ii: multichannel reconstruction for dynamic and spectral tomography. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 379(2204):20200193, 2021. URL: https://royalsocietypublishing.org/doi/abs/10.1098/rsta.2020.0193, arXiv:https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2020.0193, doi:10.1098/rsta.2020.0193.

8

Julian Rasch and Antonin Chambolle. Inexact first-order primal–dual algorithms. Computational Optimization and Applications, 76(2):381–430, Jun 2020. URL: https://doi.org/10.1007/s10589-020-00186-y, doi:10.1007/s10589-020-00186-y.

9

Mingqiang Zhu, Stephen J. Wright, and Tony F. Chan. Duality-based algorithms for total-variation-regularized image restoration. Computational Optimization and Applications, 47(3):377–400, Nov 2010. URL: https://doi.org/10.1007/s10589-008-9225-2, doi:10.1007/s10589-008-9225-2.