SGHMC𝞡

`posteriors.sgmcmc.sghmc.build(log_posterior, lr, alpha=0.01, beta=0.0, sigma=1.0, temperature=1.0, momenta=None)` 𝞡

Builds SGHMC transform.

Algorithm from Chen et al, 2014:

\[\begin{align} θ_{t+1} &= θ_t + ε σ^{-2} m_t \\ m_{t+1} &= m_t + ε \nabla \log p(θ_t, \text{batch}) - ε σ^{-2} α m_t + N(0, ε T (2 α - ε β T) \mathbb{I})\ \end{align}\]

for learning rate \(\epsilon\) and temperature \(T\)

Targets \(p_T(θ, m) \propto \exp( (\log p(θ) - \frac{1}{2σ^2} m^Tm) / T)\) with temperature \(T\).

The log posterior and temperature are recommended to be constructed in tandem to ensure robust scaling for a large amount of data and variable batch size.

Parameters:

Name	Type	Description	Default
`log_posterior`	`LogProbFn`	Function that takes parameters and input batch and returns the log posterior value (which can be unnormalised) as well as auxiliary information, e.g. from the model call.	required
`lr`	`float \| Schedule`	Learning rate, scalar or schedule (callable taking step index, returning scalar).	required
`alpha`	`float`	Friction coefficient.	`0.01`
`beta`	`float`	Gradient noise coefficient (estimated variance).	`0.0`
`sigma`	`float`	Standard deviation of momenta target distribution.	`1.0`
`temperature`	`float \| Schedule`	Temperature of the joint parameter + momenta distribution. Scalar or schedule (callable taking step index, returning scalar).	`1.0`
`momenta`	`TensorTree \| float \| None`	Initial momenta. Can be tree like params or scalar. Defaults to random iid samples from N(0, 1).	`None`

Returns:

Type	Description
`Transform`	SGHMC transform instance.

Source code in posteriors/sgmcmc/sghmc.py

def build(
    log_posterior: LogProbFn,
    lr: float | Schedule,
    alpha: float = 0.01,
    beta: float = 0.0,
    sigma: float = 1.0,
    temperature: float | Schedule = 1.0,
    momenta: TensorTree | float | None = None,
) -> Transform:
    """Builds SGHMC transform.

    Algorithm from [Chen et al, 2014](https://arxiv.org/abs/1402.4102):

    \\begin{align}
    θ_{t+1} &= θ_t + ε σ^{-2} m_t \\\\
    m_{t+1} &= m_t + ε \\nabla \\log p(θ_t, \\text{batch}) - ε σ^{-2} α m_t
    + N(0, ε T (2 α - ε β T) \\mathbb{I})\\
    \\end{align}

    for learning rate $\\epsilon$ and temperature $T$

    Targets $p_T(θ, m) \\propto \\exp( (\\log p(θ) - \\frac{1}{2σ^2} m^Tm) / T)$
    with temperature $T$.

    The log posterior and temperature are recommended to be [constructed in tandem](../../log_posteriors.md)
    to ensure robust scaling for a large amount of data and variable batch size.

    Args:
        log_posterior: Function that takes parameters and input batch and
            returns the log posterior value (which can be unnormalised)
            as well as auxiliary information, e.g. from the model call.
        lr: Learning rate,
            scalar or schedule (callable taking step index, returning scalar).
        alpha: Friction coefficient.
        beta: Gradient noise coefficient (estimated variance).
        sigma: Standard deviation of momenta target distribution.
        temperature: Temperature of the joint parameter + momenta distribution.
            Scalar or schedule (callable taking step index, returning scalar).
        momenta: Initial momenta. Can be tree like params or scalar.
            Defaults to random iid samples from N(0, 1).

    Returns:
        SGHMC transform instance.
    """
    init_fn = partial(init, momenta=momenta)
    update_fn = partial(
        update,
        log_posterior=log_posterior,
        lr=lr,
        alpha=alpha,
        beta=beta,
        sigma=sigma,
        temperature=temperature,
    )
    return Transform(init_fn, update_fn)

`posteriors.sgmcmc.sghmc.SGHMCState` 𝞡

Bases: TensorClass['frozen']

State encoding params and momenta for SGHMC.

Attributes:

Name	Type	Description
`params`	`TensorTree`	Parameters.
`momenta`	`TensorTree`	Momenta for each parameter.
`log_posterior`	`Tensor`	Log posterior evaluation.
`step`	`Tensor`	Current step count.

Source code in posteriors/sgmcmc/sghmc.py

class SGHMCState(TensorClass["frozen"]):
    """State encoding params and momenta for SGHMC.

    Attributes:
        params: Parameters.
        momenta: Momenta for each parameter.
        log_posterior: Log posterior evaluation.
        step: Current step count.
    """

    params: TensorTree
    momenta: TensorTree
    log_posterior: torch.Tensor = torch.tensor(torch.nan)
    step: torch.Tensor = torch.tensor(0)

`posteriors.sgmcmc.sghmc.init(params, momenta=None)` 𝞡

Initialise momenta for SGHMC.

Parameters:

Name	Type	Description	Default
`params`	`TensorTree`	Parameters for which to initialise.	required
`momenta`	`TensorTree \| float \| None`	Initial momenta. Can be tree like params or scalar. Defaults to random iid samples from N(0, 1).	`None`

Returns:

Type	Description
`SGHMCState`	Initial SGHMCState containing momenta.

Source code in posteriors/sgmcmc/sghmc.py

def init(params: TensorTree, momenta: TensorTree | float | None = None) -> SGHMCState:
    """Initialise momenta for SGHMC.

    Args:
        params: Parameters for which to initialise.
        momenta: Initial momenta. Can be tree like params or scalar.
            Defaults to random iid samples from N(0, 1).

    Returns:
        Initial SGHMCState containing momenta.
    """
    if momenta is None:
        momenta = tree_map(
            lambda x: torch.randn_like(x, requires_grad=x.requires_grad),
            params,
        )
    elif is_scalar(momenta):
        momenta = tree_map(
            lambda x: torch.full_like(x, momenta, requires_grad=x.requires_grad),
            params,
        )

    return SGHMCState(params, momenta)

`posteriors.sgmcmc.sghmc.update(state, batch, log_posterior, lr, alpha=0.01, beta=0.0, sigma=1.0, temperature=1.0, inplace=False)` 𝞡

Updates parameters and momenta for SGHMC.

Update rule from Chen et al, 2014, see build for details.

Parameters:

Name	Type	Description	Default
`state`	`SGHMCState`	SGHMCState containing params and momenta.	required
`batch`	`Any`	Data batch to be send to log_posterior.	required
`log_posterior`	`LogProbFn`	Function that takes parameters and input batch and returns the log posterior value (which can be unnormalised) as well as auxiliary information, e.g. from the model call.	required
`lr`	`float \| Schedule`	Learning rate, scalar or schedule (callable taking step index, returning scalar).	required
`alpha`	`float`	Friction coefficient.	`0.01`
`beta`	`float`	Gradient noise coefficient (estimated variance).	`0.0`
`sigma`	`float`	Standard deviation of momenta target distribution.	`1.0`
`temperature`	`float \| Schedule`	Temperature of the joint parameter + momenta distribution. Scalar or schedule (callable taking step index, returning scalar).	`1.0`
`inplace`	`bool`	Whether to modify state in place.	`False`

Returns:

Type	Description
`tuple[SGHMCState, TensorTree]`	Updated state (which are pointers to the inputted state tensors if inplace=True) and auxiliary information.

Source code in posteriors/sgmcmc/sghmc.py

def update(
    state: SGHMCState,
    batch: Any,
    log_posterior: LogProbFn,
    lr: float | Schedule,
    alpha: float = 0.01,
    beta: float = 0.0,
    sigma: float = 1.0,
    temperature: float | Schedule = 1.0,
    inplace: bool = False,
) -> tuple[SGHMCState, TensorTree]:
    """Updates parameters and momenta for SGHMC.

    Update rule from [Chen et al, 2014](https://arxiv.org/abs/1402.4102),
    see [build](sghmc.md#posteriors.sgmcmc.sghmc.build) for details.

    Args:
        state: SGHMCState containing params and momenta.
        batch: Data batch to be send to log_posterior.
        log_posterior: Function that takes parameters and input batch and
            returns the log posterior value (which can be unnormalised)
            as well as auxiliary information, e.g. from the model call.
        lr: Learning rate,
            scalar or schedule (callable taking step index, returning scalar).
        alpha: Friction coefficient.
        beta: Gradient noise coefficient (estimated variance).
        sigma: Standard deviation of momenta target distribution.
        temperature: Temperature of the joint parameter + momenta distribution.
            Scalar or schedule (callable taking step index, returning scalar).
        inplace: Whether to modify state in place.

    Returns:
        Updated state (which are pointers to the inputted state tensors if inplace=True)
            and auxiliary information.
    """
    with torch.no_grad(), CatchAuxError():
        grads, (log_post, aux) = grad_and_value(log_posterior, has_aux=True)(
            state.params, batch
        )

    lr = lr(state.step) if callable(lr) else lr
    temperature = temperature(state.step) if callable(temperature) else temperature
    prec = sigma**-2

    def transform_params(p, m):
        return p + lr * prec * m

    def transform_momenta(m, g):
        return (
            m
            + lr * g
            - lr * prec * alpha * m
            + (temperature * lr * (2 * alpha - temperature * lr * beta)) ** 0.5
            * torch.randn_like(m)
        )

    params = flexi_tree_map(
        transform_params, state.params, state.momenta, inplace=inplace
    )
    momenta = flexi_tree_map(transform_momenta, state.momenta, grads, inplace=inplace)

    if inplace:
        tree_insert_(state.log_posterior, log_post.detach())
        tree_insert_(state.step, state.step + 1)
        return state, aux
    return SGHMCState(params, momenta, log_post.detach(), state.step + 1), aux

SGHMC𝞡

posteriors.sgmcmc.sghmc.build(log_posterior, lr, alpha=0.01, beta=0.0, sigma=1.0, temperature=1.0, momenta=None) 𝞡

posteriors.sgmcmc.sghmc.SGHMCState 𝞡

posteriors.sgmcmc.sghmc.init(params, momenta=None) 𝞡

posteriors.sgmcmc.sghmc.update(state, batch, log_posterior, lr, alpha=0.01, beta=0.0, sigma=1.0, temperature=1.0, inplace=False) 𝞡

`posteriors.sgmcmc.sghmc.build(log_posterior, lr, alpha=0.01, beta=0.0, sigma=1.0, temperature=1.0, momenta=None)` 𝞡

`posteriors.sgmcmc.sghmc.SGHMCState` 𝞡

`posteriors.sgmcmc.sghmc.init(params, momenta=None)` 𝞡

`posteriors.sgmcmc.sghmc.update(state, batch, log_posterior, lr, alpha=0.01, beta=0.0, sigma=1.0, temperature=1.0, inplace=False)` 𝞡