# Posterior Sampling Example
This example demonstrates GP posterior sampling using the full covariance matrix, with a GP fit to sparse observations of a sinusoid.
## Overview
We will:
- Fit an isotropic GP to sparse sinusoidal data
- Use `pred_gp(lite=false)` to get the full posterior covariance
- Draw posterior samples from the GP using the Student-t distribution
- Visualize the posterior uncertainty
## Setup
```julia
using laGP
using Distributions
using PDMats
using LinearAlgebra
using Random

Random.seed!(42)
```

## Generate Training Data
Create sparse observations from `sin(x)`:
```julia
# Sparse training points over [0, 2π]
X_train = reshape(collect(range(0, 2π, length=6)), :, 1)
Y_train = sin.(X_train[:, 1])

println("Training data:")
println("  X: ", vec(X_train))
println("  Y: ", Y_train)
```

## Fit Isotropic GP
```julia
# Initial hyperparameters
d_init = 2.0    # lengthscale
g_init = 1e-6   # small nugget for near-interpolation

# Create GP
gp = new_gp(X_train, Y_train, d_init, g_init)

# MLE for the lengthscale only (keep the small nugget fixed)
mle_gp!(gp, :d; tmax=20.0)

println("Optimized hyperparameters:")
println("  d = ", gp.d)
println("  g = ", gp.g)
println("  log-likelihood = ", llik_gp(gp))
```

## Full Posterior Prediction
Get the full covariance matrix for sampling:
```julia
# Dense test grid
xx = collect(range(-1, 2π + 1, length=499))
XX = reshape(xx, :, 1)

# Get the full posterior (lite=false returns a GPPredictionFull)
pred_full = pred_gp(gp, XX; lite=false)

println("Prediction:")
println("  Test points: ", length(xx))
println("  Sigma size: ", size(pred_full.Sigma))
```

## Draw Posterior Samples
The laGP package uses a concentrated (profile) likelihood to estimate the variance parameter τ². This introduces additional uncertainty that should be captured using a Student-t distribution rather than a Normal distribution:
```julia
# Draw posterior samples using MvTDist (multivariate Student-t).
# This mirrors R's rmvt and correctly accounts for uncertainty
# in the τ² estimate.
mvt = MvTDist(pred_full.df, pred_full.mean, PDMat(Symmetric(pred_full.Sigma)))

n_samples = 100
samples = rand(mvt, n_samples)   # 499 × 100: one column per draw
println("Posterior samples: ", size(samples))
```

### Why Student-t?
The concentrated likelihood marginalizes out τ² analytically, which means the posterior predictive distribution is Student-t with `df = n` (the number of training observations). Using `MvNormal` would underestimate uncertainty, especially with small training sets.
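To see the difference concretely, compare the 97.5th-percentile critical values of the two distributions at this example's `df = n = 6` (a minimal sketch using Distributions.jl):

```julia
using Distributions

# Student-t critical value at df = n = 6 vs. the Normal critical value
t_crit = quantile(TDist(6), 0.975)    # ≈ 2.45
z_crit = quantile(Normal(), 0.975)    # ≈ 1.96

# Student-t intervals are ~25% wider here; the gap shrinks as n grows,
# since TDist(n) approaches Normal() for large n
println("t: ", t_crit, "  z: ", z_crit, "  ratio: ", t_crit / z_crit)
```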
## Visualization (with CairoMakie)
```julia
using CairoMakie

fig = Figure(size=(700, 500))
ax = Axis(fig[1, 1],
    xlabel="x",
    ylabel="Y(x) | θ̂",
    title="Simple Sinusoidal Example: GP Posterior Samples"
)

# Draw posterior samples (gray, semi-transparent)
for i in 1:n_samples
    lines!(ax, xx, samples[:, i], color=(:gray, 0.3), linewidth=0.5)
end

# Draw the posterior mean (blue)
lines!(ax, xx, pred_full.mean, color=:blue, linewidth=2, label="Posterior mean")

# Draw the true function (dashed green)
lines!(ax, xx, sin.(xx), color=:green, linewidth=1.5, linestyle=:dash, label="sin(x)")

# Draw the training points (black circles)
scatter!(ax, vec(X_train), Y_train, color=:black, markersize=12, label="Training data")

axislegend(ax, position=:lb)
xlims!(ax, -1, 2π + 1)
ylims!(ax, -2.0, 2.0)

fig
```

The resulting visualization shows the posterior samples with the GP mean and the true function:

## Key Concepts

### Posterior Mean and Variance
The posterior mean interpolates the training data (due to small nugget), while the variance:
- Is near zero at training points
- Increases between and beyond training points
- Reflects uncertainty about the true function (checked numerically in the sketch after this list)
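A quick numerical check of these claims, reusing `xx`, `X_train`, and `pred_full` from above (the nearest-grid-point lookup is only approximate):

```julia
using LinearAlgebra: diag

# Pointwise predictive standard deviation from the full covariance
std_pred = sqrt.(diag(pred_full.Sigma))

# Grid index nearest a training input, and one near the midpoint
# between two neighboring training inputs
i_train = argmin(abs.(xx .- X_train[3, 1]))
i_mid   = argmin(abs.(xx .- (X_train[3, 1] + X_train[4, 1]) / 2))

println("sd at a training point: ", std_pred[i_train])   # ≈ 0 (small nugget)
println("sd between two points:  ", std_pred[i_mid])     # noticeably larger
```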
### Posterior Samples
Each sample represents a plausible function consistent with:
- The training data
- The GP prior (smoothness encoded by kernel)
- The estimated hyperparameters (see the sanity check below)
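As a sanity check, the average over draws should track the analytic posterior mean (a sketch reusing `samples` and `pred_full`; with only 100 heavy-tailed draws the agreement is rough):

```julia
using Statistics: mean

# Column i of `samples` is the i-th posterior draw, so average across columns
emp_mean = vec(mean(samples, dims=2))
println("max |empirical - analytic| mean: ",
        maximum(abs.(emp_mean .- pred_full.mean)))
```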
### Practical Notes
- Nugget size: A very small nugget (`1e-6`) gives near-interpolation. Increase it for noisy data.
- Memory: The full covariance requires O(n²) storage. For large test sets, use `lite=true` (see the sketch after this list).
- Student-t vs. Normal: Always use `MvTDist` for posterior samples when using the concentrated likelihood. The degrees of freedom are `df = n` (the training set size).
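For large test grids, the pointwise path looks roughly like the sketch below. Note that the `s2` field name is an assumption here, mirroring R's laGP output; check the actual field of the lite prediction type in your version:

```julia
# Pointwise prediction avoids storing the O(n²) covariance matrix.
# NOTE: `s2` (pointwise predictive variance) is an assumed field name,
# mirroring R's laGP; verify it against your version's lite return type.
pred_lite = pred_gp(gp, XX; lite=true)
std_lite = sqrt.(pred_lite.s2)
println("Pointwise sd at first test point: ", std_lite[1])
```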
## Credible Intervals
Instead of sampling, you can compute pointwise intervals:
```julia
using Distributions: TDist, quantile

# 95% credible intervals using the Student-t distribution
t_dist = TDist(pred_full.df)
t_crit = quantile(t_dist, 0.975)

# Standard deviation at each test point
std_pred = sqrt.(diag(pred_full.Sigma))

# Intervals
lower = pred_full.mean .- t_crit .* std_pred
upper = pred_full.mean .+ t_crit .* std_pred
```

This is more efficient than sampling when you only need intervals.
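If you want to plot the intervals, CairoMakie's `band!` can shade them directly (a sketch reusing `xx`, `lower`, `upper`, and the training data from above):

```julia
fig2 = Figure(size=(700, 500))
ax2 = Axis(fig2[1, 1], xlabel="x", ylabel="Y(x) | θ̂",
           title="Pointwise 95% Credible Band")

# Shaded credible band, posterior mean, and the training points
band!(ax2, xx, lower, upper, color=(:blue, 0.15), label="95% band")
lines!(ax2, xx, pred_full.mean, color=:blue, linewidth=2, label="Posterior mean")
scatter!(ax2, vec(X_train), Y_train, color=:black, markersize=12, label="Training data")

axislegend(ax2, position=:lb)
fig2
```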